CEO & Co-Founder of Unsloth AI
Australian engineer who created Unsloth, which makes LLM fine-tuning up to 30x faster with up to 90% less memory through custom Triton kernels and optimized LoRA/QLoRA training.
Biography
Daniel Han is an Australian software engineer and CEO/co-founder of Unsloth AI (YC S24), based in San Francisco. Together with his brother Michael Han, he created Unsloth, an open-source framework that accelerates LLM fine-tuning by up to 30x while using up to 90% less memory, making custom model training accessible on consumer GPUs. Before founding Unsloth, Daniel spent over 8 years in production ML engineering and numerical optimization: at NVIDIA he sped up t-SNE by 2,000x and reduced SVD memory usage in CuPy by roughly 50%, and he maintained HyperLearn, an open-source ML acceleration package used by NASA and Microsoft engineers. He studied Data Science and Actuarial/Law at UNSW Australia, and declined a lifetime offer from NVIDIA to pursue Unsloth full-time. He is also known for finding and fixing more than 20 bugs in major open-source LLMs, including Gemma, Llama, Mistral, and Phi, and for pioneering dynamic quantization techniques that preserve model accuracy at extremely low bit-widths. Unsloth's models have surpassed 10 million monthly downloads on Hugging Face.
Unsloth
Open-source LLM fine-tuning framework that achieves up to a 30x speedup with up to 90% less memory by rewriting PyTorch modules into custom Triton kernels and manually deriving backpropagation steps. Over 56,000 GitHub stars and 10 million monthly Hugging Face downloads.
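The "manually deriving backpropagation steps" idea can be illustrated with a minimal, framework-free sketch: for softmax cross-entropy the gradient has the closed form softmax(z) - onehot(y), so the backward pass can be written by hand and verified against finite differences. This is a pure-Python analogy for exposition, not Unsloth's actual Triton code; all function names here are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    """Forward pass: negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])

def manual_grad(logits, target):
    """Hand-derived backward pass: d(loss)/d(logits) = softmax(logits) - onehot(target)."""
    p = softmax(logits)
    return [p[i] - (1.0 if i == target else 0.0) for i in range(len(logits))]

def numeric_grad(logits, target, eps=1e-6):
    """Finite-difference check of the hand-derived gradient."""
    base = cross_entropy(logits, target)
    grads = []
    for i in range(len(logits)):
        bumped = list(logits)
        bumped[i] += eps
        grads.append((cross_entropy(bumped, target) - base) / eps)
    return grads

logits, target = [2.0, -1.0, 0.5], 0
g_manual = manual_grad(logits, target)
g_numeric = numeric_grad(logits, target)
assert all(abs(a - b) < 1e-4 for a, b in zip(g_manual, g_numeric))
```

Deriving the closed-form gradient by hand, as above, is what lets a fused kernel skip autograd's intermediate tensors; the finite-difference comparison is the standard sanity check for such derivations.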
Dynamic Quantization
Novel quantization method that selectively leaves sensitive parameters unquantized, enabling 1.58-bit and 2-bit model compression while preserving accuracy. Applied to DeepSeek-R1, Llama, Qwen, and other models, it addresses the output degradation that normally occurs at such extreme compression levels.
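A toy sketch of the selective idea: under uniform symmetric quantization, a few outlier weights inflate the shared scale and wash out everything else, so keeping the largest-magnitude weights in full precision reduces overall error. This assumes "sensitive" simply means outlier magnitude, which is a simplification of the actual method; the helper names are hypothetical.

```python
import random

def quantize_symmetric(ws, bits):
    """Uniform symmetric quantization of a list of weights to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in ws) / levels or 1.0
    return [round(w / scale) * scale for w in ws]

def selective_quantize(ws, bits, keep_frac=0.02):
    """Keep the top `keep_frac` largest-magnitude weights in full precision
    and quantize the rest, so outliers no longer inflate the shared scale."""
    n_keep = max(1, int(len(ws) * keep_frac))
    keep = set(sorted(range(len(ws)), key=lambda i: -abs(ws[i]))[:n_keep])
    q_rest = iter(quantize_symmetric([ws[i] for i in range(len(ws)) if i not in keep], bits))
    return [ws[i] if i in keep else next(q_rest) for i in range(len(ws))]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
weights = [random.gauss(0, 0.1) for _ in range(256)] + [4.0, -5.0]  # two outliers
err_naive = mse(weights, quantize_symmetric(weights, 2))
err_selective = mse(weights, selective_quantize(weights, 2))
assert err_selective < err_naive  # skipping outliers shrinks everyone's error
```

At 2 bits the naive scheme collapses nearly every small weight to zero because the scale is set by the outliers; exempting a tiny fraction of parameters restores resolution for the remaining ones, which is the intuition behind quantizing selectively.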
Open-Source LLM Bug Fixes
Found and fixed over 20 bugs in major open-source LLMs, including Google Gemma (8 bugs), Meta Llama, Mistral, and Microsoft Phi, improving model reliability across the ecosystem. Partnered with Google, OpenAI, Meta, and NVIDIA on quality assurance.
HyperLearn
Open-source ML acceleration package that makes machine learning algorithms significantly faster, adopted by engineers at NASA, Microsoft, NVIDIA, Facebook, HP, VMware, and Intel.
GRPO and FP8 Reinforcement Learning
Implementation of Group Relative Policy Optimization (GRPO) and FP8-precision reinforcement learning in Unsloth, enabling efficient reasoning-model training that runs 1.4x faster with 60% less VRAM.
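GRPO's central trick is replacing a learned value baseline with group statistics: sample several completions per prompt, score them with a reward function, and normalize each reward against its own group's mean and standard deviation. A minimal sketch of that advantage computation (not Unsloth's implementation; the function name is hypothetical):

```python
import statistics

def grpo_advantages(group_rewards, eps=1e-8):
    """Group-relative advantages: (r - group mean) / (group std + eps),
    computed over the rewards of completions sampled for one prompt."""
    mu = statistics.fmean(group_rewards)
    sigma = statistics.pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Four sampled completions for one prompt, scored by some reward function.
advs = grpo_advantages([1.0, 0.0, 0.5, 0.5])
assert abs(sum(advs)) < 1e-6  # advantages are zero-mean within the group
```

Because the baseline is the group mean rather than a critic network, no value model needs to be trained or held in memory, which is part of why GRPO suits memory-constrained reasoning-model training.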
Research generated March 19, 2026