ML Engineer & Post-Training Lead | Hugging Face
Co-authored 'NLP with Transformers' (O'Reilly), leads alignment and post-training at Hugging Face. Created Zephyr, SetFit, Alignment Handbook, and co-built TRL.
Biography
Lewis Tunstall is a Machine Learning Engineer at Hugging Face, where he leads post-training efforts and builds open-source tools to align language models with human preferences. He co-authored the O'Reilly bestseller "Natural Language Processing with Transformers" (2022) with Leandro von Werra and Thomas Wolf, and is a core contributor to TRL (Transformer Reinforcement Learning) and the Alignment Handbook. Before transitioning to ML, Tunstall earned a PhD in Theoretical Physics from the University of Adelaide, was a 2010 Fulbright Scholar, and held research positions in Australia, the USA, and Switzerland (University of Bern). He grew up in northwest Tasmania, Australia, and is now based in Bern, Switzerland.
Bestselling book co-authored with Leandro von Werra and Thomas Wolf, providing a hands-on guide to building NLP applications with Hugging Face Transformers. Revised edition published in full color.
Core contributor to Hugging Face's library for fine-tuning and aligning language models via RLHF, DPO, and other preference learning methods. One of the most widely used alignment libraries in the open-source ecosystem.
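To make the preference-learning objective concrete, here is a minimal sketch of the DPO loss for a single preference pair. The function and variable names are my own, and this is a toy illustration of the formula, not TRL's implementation:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair of responses.

    Each argument is the summed log-probability of a response under
    the policy or the frozen reference model; beta scales how far the
    policy may drift from the reference.
    """
    # Log-ratios of policy vs. reference for each response
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # DPO objective: -log sigmoid(beta * (chosen_ratio - rejected_ratio))
    logits = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When policy and reference agree exactly, the loss sits at log 2; it falls as the policy assigns relatively more probability to the chosen response than the rejected one, which is the signal that drives preference alignment without an explicit reward model.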
Led the Zephyr project demonstrating that distilled DPO (dDPO) with AI feedback can produce a 7B-parameter chat model surpassing Llama2-Chat-70B on MT-Bench, without requiring human annotation.
Open-source collection of robust recipes to align language models with human and AI preferences, covering SFT, DPO, Constitutional AI, and model-specific fine-tuning guides.
Efficient few-shot learning framework that fine-tunes Sentence Transformers contrastively on small labeled sets, achieving competitive accuracy with orders of magnitude fewer parameters than prompt-based methods.
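The contrastive stage can be illustrated with a toy sketch of how training pairs are built from a small labeled set: same-label texts become positive pairs and different-label texts become negative pairs. The helper name is hypothetical and this is not SetFit's actual API, just the underlying idea:

```python
from itertools import combinations

def generate_contrastive_pairs(examples):
    """Build (text_a, text_b, similar) triples from a few labeled examples.

    examples: list of (text, label) tuples.
    similar is 1 when the two texts share a label, else 0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(examples, 2):
        pairs.append((text_a, text_b, 1 if label_a == label_b else 0))
    return pairs
```

Because every pair of examples yields a training signal, even a handful of labeled texts produces many contrastive pairs, which is what lets the sentence-embedding fine-tuning work in the few-shot regime.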
Teamed up with Numina to create an 850k math problem-solution dataset and won the first AI Math Olympiad progress prize, demonstrating the power of high-quality open datasets for mathematical reasoning.
Co-authored SmolLM2 and SmolLM3 small language models and co-launched the Open-R1 initiative to reproduce open reasoning model training, including OlympicCoder for competitive programming.
Co-developed the free Hugging Face NLP course bridging software engineers into the Transformers ecosystem, covering tokenization, fine-tuning, and deployment across multiple modalities.
A few lines of code could outperform features that had been carefully designed by physicists over many years.
You could skip the costly human annotation step altogether and focus on generating data for specific tasks.
Just start. Just start coding. Just start contributing if you want to do open-source. You can always find reasons not to do it but you just have to get your hands dirty.
The biggest lesson I learned when I was starting out in the field is to use baseline models... you can actually get quite far with regular expressions and linear models like logistic regression.
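In that spirit, a minimal sketch of what such a baseline might look like before reaching for a neural model: a regex keyword classifier for toy sentiment labels (the keyword lists and function name are my own invention for illustration):

```python
import re

# Hypothetical keyword lists for a toy sentiment baseline.
POSITIVE = re.compile(r"\b(great|excellent|love|wonderful)\b", re.IGNORECASE)
NEGATIVE = re.compile(r"\b(terrible|awful|hate|broken)\b", re.IGNORECASE)

def baseline_sentiment(text):
    """Classify text by counting positive vs. negative keyword matches."""
    pos = len(POSITIVE.findall(text))
    neg = len(NEGATIVE.findall(text))
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"
```

A baseline like this (or a logistic regression over bag-of-words features) takes minutes to build and gives a floor that any more complex model has to beat, which is exactly the point of the quote.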
History generally shows that one shouldn't bet against deep learning!
Although there are several open weight models for mathematics, the training datasets are rarely, if ever, made public.
Research generated March 19, 2026