Uber (via Indeed)

Staff Voice AI Engineer - Applied AI

San Francisco, CA, US · $232K - $258K/yr · Posted 4mo ago
ML Engineer · Staff+ · Full-time · #llm #spark #ray

About the Role:

Applied AI at Uber builds intelligent systems that power next-generation product experiences for riders, drivers, merchants, and couriers. As a Staff Voice AI Engineer, you will lead the design and deployment of large-scale, real-time Voice AI systems that enable natural, reliable, and intelligent voice interactions across Uber's ecosystem.

You will operate as a full-stack technical leader across speech modeling, LLM-powered conversational intelligence, and low-latency backend infrastructure - owning Voice AI systems end-to-end, from model development and evaluation to highly available, distributed production services. This includes advancing capabilities in automatic speech recognition (ASR), text-to-speech (TTS), spoken language understanding, and LLM-driven dialogue systems.

You will partner closely with product, design, and infrastructure teams to translate customer pain points into seamless voice-first experiences - setting the foundation for how Voice AI is built, deployed, and operated across Uber's global platform.

What You Will Do:
  • Design and build end-to-end Voice AI solutions, from understanding customer pain points and defining product requirements to deploying LLM-powered, real-time voice interfaces in production.
  • Benchmark and evaluate voice AI systems, including speech recognition, speech synthesis, and spoken language understanding, by designing evaluations, analyzing results, and identifying systematic weaknesses.
  • Improve voice model performance through system prompt tuning, fine-tuning voice- and speech-specific models, and optimizing architectures for low-latency, real-time voice interactions.
  • Analyze voice request logs, prompt traces, and audio inputs to diagnose failure modes and improve transcription accuracy, conversational quality, and overall user experience.
  • Build and maintain internal tools and platforms to automate Voice AI workflows, such as large-scale transcription pipelines, real-time audio processing services, and evaluation harnesses for voice quality.
  • Own Voice AI systems in production end-to-end, including rollout strategies, monitoring, alerting, quality regression detection, and on-call readiness.
  • Collaborate closely with product, design, and research teams to translate user needs into Voice AI capabilities with measurable business and customer impact.
Basic Qualifications:
  • 10+ years of experience in software engineering, data science, or machine learning, including a track record of shipping production AI systems.
  • Deep understanding of large language models, including fine-tuning, prompt engineering, embeddings, and retrieval-augmented generation (RAG).
  • Strong backend and distributed systems expertise, with experience designing and operating highly available, scalable services in production.
  • Deep experience with ML infrastructure, including model training pipelines, online serving systems, feature stores, experiment platforms, and evaluation frameworks.
  • Hands-on experience with distributed data processing systems (e.g., Spark, Flink, Ray) and workflow orchestration (e.g., Airflow or equivalent).
  • Ability to analyze data, run experiments, and derive insights for model and product improvement.
  • Excellent communication and collaboration skills across technical and non-technical teams.
Preferred Qualifications:
  • Experience building evaluation frameworks for Voice AI, including metrics and human/LLM-assisted evaluations for speech recognition accuracy, latency, robustness, and naturalness of synthesized speech.
  • Demonstrated expertise in machine learning fundamentals applied to voice, including model evaluation, training, and fine-tuning of ASR, TTS, or speech-language models.
  • Proven experience deploying Voice AI systems to production, with an emphasis on low-latency, high-reliability, real-time environments.
  • Experience writing developer documentation, creating voice-specific SDKs, or enabling internal teams to build on shared Voice AI platforms.
  • Hands-on work with large-scale audio datasets, including data curation, labeling strategies, and optimization of voice processing pipelines at scale.


For San Francisco, CA-based roles: The base salary range for this role is USD$232,000 per year - USD$258,000 per year. For Sunnyvale, CA-based roles: The base salary range for this role is USD$232,000 per year - USD$258,000 per year. For all US locations, you will be eligible to participate in Uber's bonus program, and may be offered an equity award and other types of compensation. All full-time employees are eligible to participate in a 401(k) plan. You will also be eligible for various benefits. More details can be found at the following link: https://jobs.uber.com/en/benefits.