Founding Machine Learning Engineer - Post Training, RL
A stealth-stage venture backed by Lux Capital (including backers of DeepMind and OpenAI) is on a mission to transform drug development with frontier-scale AI. Their goal: make large language models and multimodal AI systems practical for real-world biomedical applications, accelerating discovery and saving billions in R&D costs.
As a Founding Engineer, you'll work end-to-end (data engine → training recipe → evaluation → deployment) to make cutting-edge models useful for drug development. You'll own post-training pipelines (SFT, DPO, RLHF), reward modeling, and evaluation systems, while collaborating closely with product and research teams.
Core Responsibilities
- Build and optimize post-training workflows for large-scale LLMs and multimodal models.
- Architect scalable data processing and filtering pipelines for proprietary biomedical datasets.
- Design and implement distributed training systems for foundation models.
- Rapidly iterate on prototypes and ship production-ready systems in a fast-paced, collaborative environment.
Skills
- Strong software engineering skills and experience building and deploying AI/ML systems at scale.
- Deep understanding of LLM training and post-training techniques (RLHF, instruction tuning, reward modeling).
- Proficiency in Python and modern ML frameworks (PyTorch, JAX).
- Familiarity with distributed training, multi-cloud infrastructure, and data pipeline design.
- Bonus: Prior startup experience, a background in life sciences, or experience shipping frontier models end-to-end.
Why This Role Is Unique
- Frontier-Scale Modeling: Architect and train a multimodal biomedical foundation model on a dataset comparable in scale to the one used to train GPT-4.
- Applied LLMs for Science: Build systems that reason over heterogeneous biomedical data to accelerate decision-making in drug development.
- Massive-Scale Data Infrastructure: Design pipelines for ingesting and processing terabytes of structured and unstructured data across modalities.
- Founding-Level Impact: Own the core AI stack, shape model architecture, and define scaling laws for applied life sciences AI.