Senior Product Manager, Conversational AI Chatbot & Agent Quality at OKX
About the Role
Who We Are
About The Opportunity
We are looking for an execution-focused Product Manager who has built and improved conversational AI products in production — and has business results to prove it. A strong plus is hands-on experience with agent evaluation harnesses or internal agent platform product design: you've defined the systems that test, score, and operate agents at scale, not just shipped the agents themselves.
You work in logs and specs, not just decks. You know what a bad retrieval chunk looks like, you've personally written labeling guidelines, and you can point to a quarter where your work moved resolution rate by double digits.
What We Are Looking For
You have hands-on experience building and operating conversational AI products in production: not just shipping agents, but owning the quality systems, data pipelines, and operational platforms that keep them reliable at scale. Ideal candidates will have a background in one or more of the following areas:
- Knowledge Base & Data Quality — knowledge base architecture, retrieval quality tuning, content governance, labeling pipelines, annotation guidelines, training data impact tracking, and dataset freshness management
- Agent Evaluation & Quality Assurance — evaluation harness design, test case schemas, automated scoring rubrics (correctness, groundedness, tool-use accuracy), LLM-as-judge evaluation, regression testing for non-deterministic systems, and feedback-driven improvement loops
- Chatbot Operations & Dialogue Design — SOP-to-agent-flow translation, edge case handling, escalation path design, log-based failure triage, and metrics ownership (resolution rate, fallback rate, per-intent accuracy, CSAT)
- Agent Runtime & Observability Platforms — agent runtime product requirements, tool permission models, task configuration interfaces, developer-facing observability dashboards, failure alerting logic, and debugging workflows
- Human-in-the-Loop Workflows — low-confidence case routing, reviewer task interface design, correction data capture, and feedback loop integration back into training or knowledge pipelines
Chatbot & Knowledge Base (Core)
- Built or rebuilt a knowledge base — defined structure, wrote/reviewed content, fixed retrieval quality, saw metrics improve
- Designed SOPs that became agent flows — mapped real business processes, handled edge cases, shipped as working dialogue flows
- Owned a labeling pipeline — wrote annotation guidelines, QA'd batches, tracked whether labeled data moved production metrics
- Moved a metric that mattered — resolution rate, fallback rate, CSAT — and can explain exactly what changed
Agent Harness & Platform Product (Strong Plus)
- Designed an agent evaluation harness: defined test case schemas, scoring rubrics, and spec'd automated evaluation pipelines with engineering
- Led product design for an internal agent platform: defined requirements for the agent runtime (tool permission models, task configuration interfaces, developer-facing observability dashboards, and failure debugging workflows), owned the roadmap, and shipped iteratively