
Researcher, Connectors - Agent Post-Training

OpenAI · Generative AI company
San Francisco, United States · Mid
Data & AI

About the role

Train and evaluate agentic models that connect LLMs to professional software; build the data, evals, and training pipelines behind them; and ship model improvements into products.

Key Responsibilities

  • Design and run experiments that improve agent behavior in complex software and plugins.
  • Own post-training stack improvements: RL, data pipelines, graders, reward signals, and evals.
  • Build evals and environments, and convert failures into training data or product fixes.
  • Partner with product teams to translate product signals into model improvements.
  • Improve large-scale training machinery for reliability, observability, and production readiness.

Requirements

  • Strong technical fundamentals across ML, software engineering, systems, or statistics.
  • Hands-on experience with LLMs, RL, RLHF/RLAIF, post-training, evals, or graders.
  • Experience building experiments, synthetic data, and evaluation loops for model behavior.
  • Ability to work cross-functionally and to debug hard failures and turn them into concrete fixes.

Tech stack

Python, OpenAI API, Notion, Linear, Salesforce, Slack

Match insights

Tech: Python, OpenAI API, Notion, Linear, Salesforce
Level: Mid
Location: San Francisco, United States