Assistant Professor (MLD & CSD) | Carnegie Mellon University / Ai2
Creator of bitsandbytes and QLoRA. Pioneering k-bit quantization for accessible LLM inference and fine-tuning. Leading open-source coding agents (SERA) at Ai2.
Biography
Tim Dettmers is an Assistant Professor at Carnegie Mellon University (Machine Learning and Computer Science departments) and a Research Scientist at the Allen Institute for Artificial Intelligence (Ai2). He earned his PhD from the University of Washington under Luke Zettlemoyer. Dettmers is the creator and maintainer of bitsandbytes (8,000+ stars, 2.2 million monthly installs), the foundational open-source library for k-bit quantization in PyTorch that powers efficient LLM inference and fine-tuning across the ecosystem. He is the lead author of QLoRA, an efficient fine-tuning method that enables training a 65B-parameter model on a single 48GB GPU while preserving full 16-bit performance, and LLM.int8(), which brought 8-bit matrix multiplication to transformers at scale. His current research focuses on open-source coding agents (SERA) and making foundation models accessible on consumer hardware. Before his PhD, he worked for three years in factory automation and studied psychology. He describes himself as dyslexic with bottom-5% working memory, which he credits with driving him toward simpler, more elegant solutions.
bitsandbytes -- Foundational open-source library for k-bit quantization in PyTorch. Enables accessible LLM inference and training through 4-bit/8-bit quantization and 8-bit optimizers. 8,000+ stars, 2.2 million monthly installs; received Google Open Source and PyTorch Foundation awards.
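The core trick behind k-bit quantization can be sketched in a few lines of numpy: split a tensor into small blocks and scale each block by its own absolute maximum, so a single outlier cannot degrade the precision of the whole tensor. This is an illustrative re-implementation of blockwise absmax int8 quantization, not the library's actual CUDA kernels; the function names are hypothetical.

```python
import numpy as np

def quantize_blockwise(x, block_size=64):
    """Blockwise absmax quantization to int8: each block of `block_size`
    values gets its own scale, limiting the blast radius of outliers."""
    x = x.reshape(-1, block_size)
    absmax = np.abs(x).max(axis=1, keepdims=True)   # per-block scale
    q = np.round(x / absmax * 127).astype(np.int8)  # int8 codes
    return q, absmax

def dequantize_blockwise(q, absmax):
    """Recover a float approximation of the original values."""
    return (q.astype(np.float32) / 127) * absmax

rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)
q, scale = quantize_blockwise(w)
w_hat = dequantize_blockwise(q, scale).reshape(-1)
err = np.abs(w - w_hat).max()
```

With 64-element blocks, the worst-case error per value is half a quantization step (absmax / 254 per block), which is why blockwise scaling preserves quality far better than a single global scale.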
QLoRA -- Efficient fine-tuning method using 4-bit NormalFloat quantization and double quantization that enables training a 65B-parameter model on a single 48GB GPU. Guanaco models reached 99.3% of ChatGPT performance on the Vicuna benchmark. 10,800+ stars. NeurIPS 2023.
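The NormalFloat idea can be approximated in plain Python: because pretrained weights are roughly normally distributed, placing the 16 available 4-bit levels at quantiles of a standard normal gives each code equal probability mass. This sketch is illustrative only -- the actual NF4 codebook construction differs in detail (it is asymmetric and reserves an exact zero), and the helper names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def normal_quantile_levels(k=16):
    """Build k quantization levels at quantiles of N(0, 1), then
    rescale to [-1, 1]. Midpoints of k equal-probability bins are
    used to avoid the infinite tails."""
    nd = NormalDist()
    probs = (np.arange(k) + 0.5) / k
    levels = np.array([nd.inv_cdf(p) for p in probs])
    return levels / np.abs(levels).max()

def quantize_to_levels(x, levels):
    """Absmax-normalize, then snap each value to the nearest level."""
    absmax = np.abs(x).max()
    idx = np.abs(x[:, None] / absmax - levels[None, :]).argmin(axis=1)
    return idx.astype(np.uint8), absmax

levels = normal_quantile_levels()
rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)
codes, absmax = quantize_to_levels(w, levels)
w_hat = levels[codes] * absmax
```

Double quantization then compresses the per-block `absmax` constants themselves, shaving a further fraction of a bit per parameter off the memory footprint.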
LLM.int8() -- 8-bit matrix multiplication method for transformers at scale, enabling efficient inference for billion-parameter models without significant quality degradation. A key building block in the quantization ecosystem.
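The key insight of LLM.int8() -- vector-wise int8 quantization plus a separate float path for the few emergent outlier feature dimensions that dominate error at scale -- can be sketched in numpy. This is a toy illustration under assumed names and a hand-picked outlier threshold, not the paper's CUDA implementation.

```python
import numpy as np

def int8_matmul_vectorwise(x, w, outlier_threshold=6.0):
    """Quantize X per-row and W per-column to int8 and multiply in
    integer space, but route hidden dimensions with outlier magnitudes
    through a float path, since a handful of such dimensions would
    otherwise dominate the quantization error."""
    # 1. find outlier feature dimensions in the input
    outlier_cols = np.abs(x).max(axis=0) > outlier_threshold
    regular = ~outlier_cols

    # 2. int8 path: vector-wise absmax scaling on the regular dims
    xr, wr = x[:, regular], w[regular, :]
    sx = np.abs(xr).max(axis=1, keepdims=True) / 127   # per row of X
    sw = np.abs(wr).max(axis=0, keepdims=True) / 127   # per col of W
    xq = np.round(xr / sx).astype(np.int8)
    wq = np.round(wr / sw).astype(np.int8)
    y_int8 = (xq.astype(np.int32) @ wq.astype(np.int32)) * sx * sw

    # 3. float path for the outlier dims, added back in
    y_fp = x[:, outlier_cols] @ w[outlier_cols, :]
    return y_int8 + y_fp

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 64)).astype(np.float32)
x[:, 3] += 20.0                      # inject an outlier feature
w = rng.normal(size=(64, 32)).astype(np.float32)
y = int8_matmul_vectorwise(x, w)
rel_err = np.abs(y - x @ w).max() / np.abs(x @ w).max()
```

Keeping roughly 0.1% of dimensions in float is what lets the remaining 99.9% run in int8 with negligible quality loss on billion-parameter transformers.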
SERA -- Open-source coding agent from Ai2 that solves 54.2% of SWE-Bench Verified. Built with 32 GPUs and 5 researchers, reproducible for ~$400, and 26x more efficient than RL approaches.
Convolutional 2D Knowledge Graph Embeddings -- a pioneering approach for link prediction in knowledge graphs using 2D convolutions over embedding matrices. 692 stars.
Sparse learning library implementing sparse momentum for training sparse neural networks. 385 stars.
The thinking around AGI and superintelligence is not just optimistic, but fundamentally flawed.
We have maybe one, maybe two more years of scaling left before further improvements become physically infeasible.
Sometimes constraints force you to find simpler solutions -- and sometimes those solutions turn out to be better.
Because our method is cheap, it opens coding agent research to everyone. You do not need large teams or thousands of dollars.
Switching fields from one that you are well established in to something completely new is probably one of the hardest things you can do in research.
Open source can be competitive and might actually overtake closed source APIs.
Research generated March 19, 2026