Embedding Vc via Ashby

Member of Technical Staff - Efficient ML

San Francisco Bay Area · Posted 4mo ago

ML Engineer · Staff+ · Full-time

About the Role

Introducing Moonlake, AI for creating world simulations.

SCOPE OF WORK

Training efficiency

- Dataloaders, operator fusion, activation rematerialization, gradient checkpointing.

- FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning.

GPU + kernel performance

- Nsight profiling, Triton/CUDA kernels, fused ops.

- Flash-attention–style speedups, sequence packing, KV-cache tricks.

Inference optimization

- Low-latency serving, continuous batching, speculative decoding.

- Quantization (GPTQ/AWQ), distillation, pruning.

Infra + reliability

- SLURM/K8s multi-node jobs, checkpoint hygiene.

- Determinism, env pinning, GPU failure handling.

We are committed to being an on-site, in-person team, currently based in San Mateo.
