Doctolib via Greenhouse

Senior Data Engineer Python/GCP (x/f/m)

Paris, Paris, France · Posted 1w ago

Data Engineer · Senior · Full-time

About the Role

Your Impact

We are looking for a Senior Data Engineer to join the AI Team working on our AI Medical Companion.

Your mission will be to build and optimize the data foundations that power safe, scalable, and impactful AI models. You will work on data infrastructure for LLM, VLM, and RAG-based systems, ensuring our engineers and data scientists can train, evaluate, and deploy AI models efficiently on high-quality, well-structured, and compliant data. Your work will directly support health professionals in delivering better care while improving their work-life balance, ultimately impacting 80 million patients and 400,000 healthcare professionals across Europe.

Working in the tech team at Doctolib means building innovative products and features to improve the daily lives of care teams and patients.

What you'll do

Your responsibilities include but are not limited to:

Design, build, and maintain scalable data pipelines on Google Cloud Platform (GCP) for AI and machine learning use cases
Implement data ingestion and transformation frameworks that power retrieval systems and training datasets for LLMs and multimodal models
Architect and manage NoSQL and vector databases to store and retrieve embeddings, documents, and model inputs efficiently
Collaborate with ML and platform teams to define data schemas, partitioning strategies, and governance rules that ensure privacy, scalability, and reliability
Integrate unstructured and structured data sources (text, speech, image, documents, metadata) into unified data models ready for AI consumption
Optimize the performance and cost of data pipelines using GCP-native services (BigQuery, Dataflow, Pub/Sub, Cloud Storage, Vertex AI)
Contribute to data quality and lineage frameworks, ensuring AI models are trained on validated, auditable, and compliant datasets
Continuously evaluate and improve our data stack to accelerate AI experimentation and deployment

Who you are

Before you read on: if you don't have the exact profile described below, but you feel this job description matches your skill set, we still encourage you to apply.

You'll be a great fit if:

You have 5+ years of experience in Data Engineering, ideally supporting AI or ML workloads
You have strong experience with the GCP data ecosystem and proficiency in Python and SQL
You have a deep understanding of NoSQL systems (e.g., MongoDB) and vector databases (e.g., FAISS, Vector Search)
You have experience designing data architectures for RAG, embeddings, or model training pipelines
You have knowledge of data governance, security, and compliance for sensitive or regulated data
You are fluent in English

It would be fantastic if:

You hold a Master's or Ph.D. degree in Computer Science, Data Engineering, or a related field