Helsing via Greenhouse

AI Research Engineer - AI Safety

Berlin; London; Munich · Posted 2w ago
Research · Mid Level · Full-time · #ai-lab

About the Role

Who we are

At Helsing we deliver AI-based capabilities and the enabling foundation that allow machines to perceive and assist human decision-making. You will have the unique opportunity to shape AI capabilities in one of the most challenging sectors, where high generalisation capabilities need to be paired with hardware constraints and robustness against adversarial attacks.

You will join a team focused on AI Assurance, where you will develop cutting-edge techniques for scalable evaluation of AI products across the company, design data collection and experimentation strategies to extract causal insights, and enhance responsible decision-making via uncertainty quantification and safety mechanisms.

The Role

At Helsing we deliver AI-based capabilities and the enabling foundation that allow machines to perceive and assist human decision-making. You will have the unique opportunity to shape AI capabilities in one of the most challenging sectors, where high generalisation capabilities need to be paired with robustness against adversarial attacks and the highest standards of operational safety.

You will be responsible for defining operational domains and evaluating the reliability of AI capabilities developed in-house. Your work will span the full assurance lifecycle: from characterising distribution shifts and failure modes, to developing and extending the state of the art in uncertainty quantification and calibration. You will interface deeply with our AI systems, design rigorous evaluation frameworks, and assess their robustness under real-world and adversarial conditions, collaborating across research, engineering, and product teams to translate assurance findings into actionable improvements.

You should apply if you

  • Hold an MSc in Mathematics, Statistics, Machine Learning, or a closely related field, with a strong mathematical and statistical foundation.

  • Have hands-on experience in model evaluation, uncertainty quantification, or calibration. You understand the difference between epistemic and aleatoric uncertainty and know how to measure and reduce them in deep learning models.

  • Are familiar with methods for distribution shift detection, out-of-distribution detection, and adversarial robustness evaluation, and can design experiments that surface genuine failure modes rather than benchmark artefacts.

  • Possess solid software engineering skills, writing clean and well-structured code in Python and/or languages like Rust or modern C++, and have experience deploying AI software to production including testing, QA, and monitoring.

  • Have excellent communication skills and the ability to report and present research findings clearly and efficiently, both internally and externally.

  • Are passionate about keeping up to date with current research and enjoy reimplementing and extending state-of-the-art approaches in deep learning evaluation and assurance.

Note: We operate at an intersection where women, as well as other minority groups, are systemically under-represented. We encourage you to apply even if you don’t meet all the listed qualifications; ability and impact cannot be summarised in a few bullet points.

Nice to have

  • PhD in model evaluation, uncertainty quantification, robustness, experimental design, causal inference, or a related field, with publications in top-tier venues (e.g. NeurIPS, ICML, ICLR, CVPR).

  • Previous industrial experience assuring the safe deployment of AI products in high-stakes or safety-critical systems.

  • Familiarity with formal methods, interpretability techniques, or Bayesian approaches to reasoning about model behaviour under uncertainty.

  • Experience with adversarial machine learning, red-teaming, or systematic stress-testing of AI systems in operational settings.

  • Experience with conformal prediction, calibration methods (e.g. temperature scaling, Platt scaling), or Bayesian deep learning.

Join Helsing and work with world-leading experts in their fields

  • Helsing’s work is important. You’ll be directly contributing to the protection of democratic countries while balancing both ethical and geopolitical concerns. 

  • The work is unique. We operate in a domain that has highly unusual technical requirements and constraints, and where robustness, safety, and ethical considerations are vital. You will face unique Engineering and AI challenges that make a meaningful impact in the world. 

  • Our work frequently takes us right up to the state of the art in technical innovation, be it reinforcement learning, distributed systems, generative AI, or deployment infrastructure. The defence industry is entering the most exciting phase of the technological development curve. Advances in our field of work are not incremental: Helsing is part of, and often leading, historic leaps forward.

  • In our domain, success is a matter of order-of-magnitude improvements and novel capabilities. This means we take bets, aim high, and focus on big opportunities. Despite being a relatively young company, Helsing has already been selected for multiple significant government contracts.  

  • We actively encourage healthy, proactive, and diverse debate internally about what we do and how we choose to do it. Teams and individual engineers are trusted (and encouraged) to practise responsible autonomy and critical thinking, and to focus on outcomes, not conformity. At Helsing you will have a say in how we (and you!) work, the opportunity to engage on what does and doesn’t work, and to take ownership of aspects of our culture that you care deeply about. 

What we
