Researcher, Connectors - Agent Post-Training

OpenAI | Generative AI company

San Francisco, United States | Mid

Data & AI

About the role

Train and evaluate agent models to connect LLMs to professional software and ship model improvements.

Train and evaluate agentic models that interface with professional software; build data, evals, and training pipelines; and ship improvements into products.

Key Responsibilities

• Design and run experiments that improve agent behavior for complex software and plugins.
• Own improvements to the post-training stack: RL, data pipelines, graders, reward signals, and evals.
• Build evals and environments, and convert failures into training data or product fixes.
• Partner with product teams to translate product signals into model improvements.
• Improve large-scale training machinery for reliability, observability, and production readiness.

Requirements

• Strong technical fundamentals across ML, software engineering, systems, or statistics.
• Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, evals, or graders.
• Experience building experiments, synthetic data, and evaluation loops for model behavior.
• Ability to work cross-functionally and turn hard-to-debug failures into concrete fixes.

Tech: Python, OpenAI API, Notion, Linear, Salesforce, Slack

Level: Mid

Location: San Francisco, United States