SMACK Technologies via Indeed

Simulation-to-ML Infrastructure Engineer

El Segundo, CA, US · $140K - $180K/yr · Posted 2mo ago

MLOps · Mid Level · Full-time
#python #pytorch #tensorflow #reinforcement-learning #kubernetes #docker #java

About the Role

Role Overview

SMACK Technologies is building the infrastructure that turns large-scale simulation output into usable training data for reinforcement learning systems. As a Simulation-to-ML Infrastructure Engineer on Applied Engineering, you will own the end-to-end pipeline from raw simulation states through data processing, storage, versioning, and delivery into ML training workflows.

This is a greenfield role. Early work focuses on standing up foundational infrastructure that enables iteration. Over time, the system must scale to handle massive simulation-driven data generation and repeated training cycles. The emphasis is on building working systems first, then evolving them as usage and scale increase.

You will operate at the intersection of simulation, data engineering, infrastructure, and ML consumption, working closely with simulation engineers and ML researchers to ensure the system supports real training needs.

What You’ll Do

Stand up foundational infrastructure to support simulation execution and data collection.

Design and implement data storage and management practices that scale with growing volume and complexity.

Build initial data pipelines that ingest simulation outputs and prepare them for reinforcement learning training.

Implement basic validation, quality checks, and data organization to support early experimentation.

Establish data versioning and lineage practices to support reproducibility as the system evolves.

Set up experiment tracking, dataset management, and model artifact storage for training workflows.

Support training job execution and infrastructure required for iterative RL experimentation.

Work with simulation and ML teams to define data interfaces and end-to-end flow requirements.

Build bridges between simulation systems and ML training infrastructure, with future feedback loops in mind.

Implement containerization, deployment pipelines, and basic observability to support rapid iteration.

Continuously evaluate scaling bottlenecks and evolve the system as usage patterns emerge.

Document architectural decisions and patterns to keep the system understandable as it grows.

Contribute to adjacent infrastructure or tooling work as needed to unblock progress.

Must-Have Qualifications

Active TS/SCI clearance
Experience building and operating infrastructure in greenfield environments
Strong background in distributed systems, data pipelines, or ML infrastructure
Experience with containerization and orchestration using Docker and Kubernetes
Comfort working with cloud platforms and infrastructure automation
Solid understanding of Linux systems, networking, and storage
Experience designing systems that start simple and evolve toward scale
Strong programming skills in Go, Python, Java, or similar
Ability to work across simulation, data, and ML boundaries
Comfort operating with ambiguity and making pragmatic tradeoffs

Core Technologies & Concepts

Infrastructure: Docker, Kubernetes, CI/CD
Data Pipelines: ingestion, validation, versioning, storage
ML Training Support: experiment tracking, dataset management, artifact storage
Systems: distributed execution, scalability, reliability
Languages: Go, Python, or similar
Deployment Contexts: TS/SCI, IL-7, on-prem environments

Nice-to-Have Qualifications

Prior experience in ML infrastructure or MLOps
Experience supporting reinforcement learning or large-scale training systems
Familiarity with ML frameworks such as PyTorch or TensorFlow
Experience with experiment tracking tools or workflow orchestration systems
Background in simulation, scientific computing, or HPC environments
Experience evolving systems from early prototypes to large-scale platforms

Pay: $140,000.00 - $180,000.00 per year

Benefits:

401(k)
Dental insurance
Health insurance
Paid time off
Vision insurance

Application Question(s):

Do you have experience building infrastructure and data pipelines from scratch to support simulation or machine learning workflows?
Do you have experience implementing data validation, versioning, and experiment tracking to support reproducible ML workflows?

Security clearance:

Top Secret (Required)

Ability to Commute:

El Segundo, CA 90245 (Required)

Work Location: In person
