Researcher, Connectors - Agent Post-Training
OpenAI
Generative AI company
San Francisco, United States
Level: Mid
Data & AI
About the role
Train and evaluate agent models to connect LLMs to professional software and ship model improvements.
Train and evaluate agentic models to interface with professional software; build data, evals, and training pipelines; and ship improvements into products.
Key Responsibilities
- Design and run experiments that improve agent behavior for complex software and plugins.
- Own post-training stack improvements: RL, data pipelines, graders, reward signals, and evals.
- Build evals and environments, and convert failures into training data or product fixes.
- Partner with product teams to translate product signals into model improvements.
- Improve large-scale training machinery for reliability, observability, and production readiness.
Requirements
- Strong technical fundamentals across ML, software engineering, systems, or statistics.
- Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, evals, or graders.
- Experience building experiments, synthetic data, and evaluation loops for model behavior.
- Ability to work cross-functionally and to debug hard failures and turn them into concrete fixes.
Tech stack
Python, OpenAI API, Notion, Linear, Salesforce, Slack