CEO & Co-Founder of Unsloth AI
Australian engineer who created Unsloth, which makes LLM fine-tuning up to 30x faster with up to 90% less memory through custom Triton kernels and optimized LoRA/QLoRA training.
Biography
Daniel Han is an Australian software engineer and CEO/co-founder of Unsloth AI (YC S24), based in San Francisco. Together with his brother Michael Han, he created Unsloth, an open-source framework that accelerates LLM fine-tuning by up to 30x while using up to 90% less memory, making custom model training accessible on consumer GPUs. Before founding Unsloth, Daniel spent over 8 years in production ML engineering and numerical optimization: at NVIDIA he sped up t-SNE by 2,000x and reduced SVD memory usage in CuPy by roughly 50%, and he maintained HyperLearn, an open-source ML acceleration package used by NASA and Microsoft engineers. He studied Data Science and Actuarial/Law at UNSW Australia, and declined a lifetime offer from NVIDIA to pursue Unsloth full-time. He is also known for finding and fixing more than 20 bugs in major open-source LLMs, including Gemma, Llama, Mistral, and Phi, and for pioneering dynamic quantization techniques that preserve model accuracy at extremely low bit-widths. Unsloth's models have surpassed 10 million monthly downloads on Hugging Face.
Unsloth
Open-source LLM fine-tuning framework that achieves up to a 30x speedup with up to 90% less memory by rewriting PyTorch modules into custom Triton kernels and manually deriving backpropagation steps. Over 56,000 GitHub stars and 10 million monthly Hugging Face downloads.
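The "manually deriving backpropagation steps" idea can be illustrated with a minimal, framework-free sketch: for softmax cross-entropy the gradient has the closed form softmax(z) - onehot(y), so the backward pass can be written by hand and verified against finite differences. This is a pure-Python analogy for exposition, not Unsloth's actual Triton code; all function names here are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    """Forward pass: negative log-probability of the target class."""
    return -math.log(softmax(logits)[target])

def manual_grad(logits, target):
    """Hand-derived backward pass: d(loss)/d(logits) = softmax(logits) - onehot(target)."""
    p = softmax(logits)
    return [p[i] - (1.0 if i == target else 0.0) for i in range(len(logits))]

def numeric_grad(logits, target, eps=1e-6):
    """Finite-difference check of the hand-derived gradient."""
    base = cross_entropy(logits, target)
    grads = []
    for i in range(len(logits)):
        bumped = list(logits)
        bumped[i] += eps
        grads.append((cross_entropy(bumped, target) - base) / eps)
    return grads

logits, target = [2.0, -1.0, 0.5], 0
g_manual = manual_grad(logits, target)
g_numeric = numeric_grad(logits, target)
assert all(abs(a - b) < 1e-4 for a, b in zip(g_manual, g_numeric))
```

Deriving the closed-form gradient by hand, as above, is what lets a fused kernel skip autograd's intermediate tensors; the finite-difference comparison is the standard sanity check for such derivations.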
Dynamic Quantization
Novel quantization method that selectively leaves sensitive parameters unquantized, enabling 1.58-bit and 2-bit model compression while preserving accuracy. Applied to DeepSeek-R1, Llama, Qwen, and other models, it addresses the output degradation that normally occurs at such extreme compression levels.
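A toy sketch of the selective idea: under uniform symmetric quantization, a few outlier weights inflate the shared scale and wash out everything else, so keeping the largest-magnitude weights in full precision reduces overall error. This assumes "sensitive" simply means outlier magnitude, which is a simplification of the actual method; the helper names are hypothetical.

```python
import random

def quantize_symmetric(ws, bits):
    """Uniform symmetric quantization of a list of weights to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in ws) / levels or 1.0
    return [round(w / scale) * scale for w in ws]

def selective_quantize(ws, bits, keep_frac=0.02):
    """Keep the top `keep_frac` largest-magnitude weights in full precision
    and quantize the rest, so outliers no longer inflate the shared scale."""
    n_keep = max(1, int(len(ws) * keep_frac))
    keep = set(sorted(range(len(ws)), key=lambda i: -abs(ws[i]))[:n_keep])
    q_rest = iter(quantize_symmetric([ws[i] for i in range(len(ws)) if i not in keep], bits))
    return [ws[i] if i in keep else next(q_rest) for i in range(len(ws))]

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

random.seed(0)
weights = [random.gauss(0, 0.1) for _ in range(256)] + [4.0, -5.0]  # two outliers
err_naive = mse(weights, quantize_symmetric(weights, 2))
err_selective = mse(weights, selective_quantize(weights, 2))
assert err_selective < err_naive  # skipping outliers shrinks everyone's error
```

At 2 bits the naive scheme collapses nearly every small weight to zero because the scale is set by the outliers; exempting a tiny fraction of parameters restores resolution for the remaining ones, which is the intuition behind quantizing selectively.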
Open-Source LLM Bug Fixes
Found and fixed over 20 bugs in major open-source LLMs, including Google Gemma (8 bugs), Meta Llama, Mistral, and Microsoft Phi, improving model reliability across the ecosystem. Partnered with Google, OpenAI, Meta, and NVIDIA on quality assurance.
HyperLearn
Open-source ML acceleration package that makes machine learning algorithms significantly faster, adopted by engineers at NASA, Microsoft, NVIDIA, Facebook, HP, VMware, and Intel.
GRPO and FP8 Reinforcement Learning
Implementation of Group Relative Policy Optimization (GRPO) and FP8-precision reinforcement learning in Unsloth, enabling efficient reasoning-model training that runs 1.4x faster with 60% less VRAM.
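GRPO's central trick is replacing a learned value baseline with group statistics: sample several completions per prompt, score them with a reward function, and normalize each reward against its own group's mean and standard deviation. A minimal sketch of that advantage computation (not Unsloth's implementation; the function name is hypothetical):

```python
import statistics

def grpo_advantages(group_rewards, eps=1e-8):
    """Group-relative advantages: (r - group mean) / (group std + eps),
    computed over the rewards of completions sampled for one prompt."""
    mu = statistics.fmean(group_rewards)
    sigma = statistics.pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Four sampled completions for one prompt, scored by some reward function.
advs = grpo_advantages([1.0, 0.0, 0.5, 0.5])
assert abs(sum(advs)) < 1e-6  # advantages are zero-mean within the group
```

Because the baseline is the group mean rather than a critic network, no value model needs to be trained or held in memory, which is part of why GRPO suits memory-constrained reasoning-model training.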
Research generated March 19, 2026