Founder, Eureka Labs | Ex-OpenAI/Tesla
OpenAI founding member, former Tesla AI Director, and creator of nanoGPT/micrograd/llm.c. Founded Eureka Labs for AI-native education. Coined 'vibe coding' and 'Software 3.0.' His YouTube 'Zero to Hero' series has 1M+ subscribers. Released nanochat, autoresearch, and LLM101n.
GitHub
27 repositories · 344.5k total stars
The simplest, fastest repository for training/finetuning medium-sized GPTs.
The best ChatGPT that $100 can buy.
AI agents automatically running research on single-GPU nanochat training
LLM101n: Let's build a Storyteller
LLM training in simple, raw C/CUDA
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
Neural Networks: Zero to Hero
Inference Llama 2 in one file of pure C
Biography
Andrej Karpathy is a Slovak-Canadian AI researcher, educator, and engineer who has shaped modern deep learning practice across research, industry, and education. Born in Bratislava, he moved to Toronto at 15, earned a BSc in Computer Science and Physics from the University of Toronto (2009), an MSc from the University of British Columbia (2011), and a PhD from Stanford under Fei-Fei Li (2015) with the thesis 'Connecting Images and Natural Language.' He co-designed and was the lead instructor of CS231n, Stanford's first deep learning course. As a founding member of OpenAI (2015-2017) he worked on deep learning, generative models, and reinforcement learning. As Director of AI at Tesla (2017-2022) he led the Autopilot computer vision team. After a brief return to OpenAI (2023-2024), where he worked on midtraining and synthetic data, he founded Eureka Labs, an AI-native education company, in July 2024. Karpathy is the creator of iconic open-source projects including char-rnn, convnetjs, nanoGPT, minGPT, micrograd, llm.c, llama2.c, minbpe, nanochat, autoresearch, and LLM101n. His YouTube channel 'Neural Networks: Zero to Hero' has over 1 million subscribers. He coined the terms 'Software 2.0' and 'vibe coding,' and popularized 'context engineering' as a replacement for 'prompt engineering.' His GitHub account has 149,000+ followers, making him one of the most followed developers on the platform.
The simplest, fastest repository for training/finetuning medium-sized GPTs. Arguably the most influential educational ML codebase, with 55k+ GitHub stars. Made transformer training accessible to individual developers.
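As a rough illustration of what that training reduces to, here is a toy sketch (not nanoGPT's code; the model size, hyperparameters, and random token data are assumptions made for brevity): a small causal Transformer optimized with AdamW on next-token cross-entropy, which is essentially the loop nanoGPT runs at scale on real text with mixed precision and checkpointing.

```python
# Toy next-token training loop (illustrative only; not nanoGPT's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, block_size, d_model = 256, 64, 128  # assumed toy sizes

class TinyGPT(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(block_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=4 * d_model,
            batch_first=True, norm_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, idx):
        T = idx.size(1)
        pos = torch.arange(T, device=idx.device)
        x = self.tok_emb(idx) + self.pos_emb(pos)
        causal = nn.Transformer.generate_square_subsequent_mask(T)
        x = self.blocks(x, mask=causal.to(idx.device))  # causal self-attention
        return self.lm_head(x)                          # next-token logits

model = TinyGPT()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(100):
    # Random tokens stand in for a tokenized text corpus.
    batch = torch.randint(0, vocab_size, (8, block_size + 1))
    x, y = batch[:, :-1], batch[:, 1:]                  # targets shifted by one
    logits = model(x)
    loss = F.cross_entropy(logits.reshape(-1, vocab_size), y.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Scaling this shape up, with real tokenized data, learning-rate schedules, mixed precision, and checkpointing, is roughly what nanoGPT packages into a few readable files.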
Full-stack ChatGPT clone: pretraining, SFT, RLHF, and inference in ~8,000 lines of PyTorch. Trainable end-to-end on a single GPU for under $100. 49k+ stars.
AI agents that autonomously run ML experiments on a single GPU overnight. 630 lines of Python, built on the nanochat training core. 43k+ stars.
LLM training in simple, raw C/CUDA with no PyTorch or Python dependency. Achieved multi-GPU bfloat16 training with flash attention that ran 7% faster than PyTorch nightly. 29k+ stars.
Eureka Labs' first course: 'Let's Build a Storyteller.' An undergraduate-level guide to training your own AI from scratch, designed to be guided by an AI Teaching Assistant. 36k+ stars for the course repo alone.
A tiny scalar-valued autograd engine and neural net library with PyTorch-like API. The canonical educational resource for understanding backpropagation. 15k+ stars.
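To make that concrete, below is a minimal sketch in the spirit of micrograd (this is not its actual source; the two operators and the example numbers are illustrative assumptions): each scalar value remembers its inputs and a local gradient rule, and backward() applies the chain rule in reverse topological order.

```python
# Minimal scalar autograd sketch (micrograd-style, not micrograd's source).
# Each Value stores its data, an accumulated gradient, the Values it was
# built from, and a closure that pushes gradients to those parents.

class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            self.grad += out.grad    # d(a+b)/da = 1
            other.grad += out.grad   # d(a+b)/db = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._backward = _backward
        return out

    def backward(self):
        # Order the graph topologically, then run local rules in reverse.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)

        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

a, b = Value(2.0), Value(-3.0)
loss = a * b + b          # loss = -9.0
loss.backward()
print(a.grad, b.grad)     # -3.0 (= b), 3.0 (= a + 1)
```

The actual library layers more operators, a relu nonlinearity, and a small neural-net module on top of this same mechanism.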
YouTube lecture series covering backpropagation, language modeling, tokenization, attention, and GPT training from scratch. Over 1 million subscribers. Required viewing for aspiring ML engineers.
Co-designed and was primary instructor for Stanford's first deep learning course, Convolutional Neural Networks for Visual Recognition. Grew from 150 to 750 students and became one of the most popular CS courses at Stanford.
As Director of AI at Tesla (2017-2022), led the computer vision team responsible for Autopilot's vision-based self-driving pipeline, moving Tesla from radar+camera to a vision-only approach.
Coined 'Software 2.0' (neural networks as code written by optimization) and later 'Software 3.0' (natural language as the programming interface for LLMs), reshaping how the industry thinks about software development paradigms.
Context engineering is the delicate art and science of filling the context window with just the right information for the next step.
I've never felt this much behind as a programmer. The profession is being dramatically refactored as the bits contributed by the programmer are increasingly sparse and in between.
Vibe coding is a new kind of coding where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.
The most important quality in a startup founder right now is taste. You need to know what is good because AI can produce infinite slop.
Research generated March 19, 2026