
Performance Engineer, Inference Systems

Company: Anthropic (Generative AI)
Location: San Francisco, United States
Level: Mid
Data & AI

About the role

Investigate and improve inference fleet performance and correctness across hardware and serving layers.

  • Drive cross-layer performance and correctness for Anthropic's inference fleet, analyzing throughput, latency, reliability, and correctness across hardware and serving stacks.

Key Responsibilities

  • Run cross-layer performance investigations and roofline analysis to find root causes and value of fixes
  • Own and improve correctness evaluation pipelines and investigate regressions
  • Build observability, dashboards, and modeling tools for performance and cost trade-offs
  • Partner with kernel, serving, routing, autoscaling, and capacity teams to implement optimizations

Requirements

  • Hands-on performance engineering: profiling, latency/throughput optimization, root-cause analysis
  • Proficiency in Python and ability to work in large production codebases
  • Data analysis skills (SQL, pandas) to turn telemetry into findings
  • Strong communication of quantitative results and interest in numerical correctness

Tech stack

Python, SQL, pandas
