nanogpt


Educational GPT implementation in ~300 lines. Reproduces GPT-2 (124M) on OpenWebText. Clean, hackable code for learning transformers. By Andrej Karpathy. Perfect for understanding GPT architecture from scratch. Train on Shakespeare (CPU) or OpenWebText (multi-GPU).

AI & Automation · 6,478 stars · 505 forks · Updated 1 month ago · MIT



Skill Content

# nanoGPT - Minimalist GPT Training

## Quick start

nanoGPT is a simplified GPT implementation designed for learning and experimentation.

**Installation**:

```bash
pip install torch numpy transformers datasets tiktoken wandb tqdm
```

**Train on Shakespeare** (CPU-friendly):

```bash
# Prepare data
python data/shakespeare_char/prepare.py

# Train (5 minutes on CPU)
python train.py config/train_shakespeare_char.py

# Generate text
python sample.py --out_dir=out-shakespeare-char
```

**Output**:

```
ROMEO:
What say'st thou? Shall I speak, and be a man?

JULIET:
I am afeard, and yet I'll speak; for thou art
One that hath been a man, and yet I know not
What thou art.
```

## Common workflows

### Workflow 1: Character-level Shakespeare

**Complete training pipeline**:

```bash
# Step 1: Prepare data (creates train.bin, val.bin)
python data/shakespeare_char/prepare.py

# Step 2: Train small model
python train.py config/train_shakespeare_char.py

# Step 3: Generate text
python sample.py --out_dir=out-shakespeare-char
```

**Config** (`config/train_shakespeare_char.py`):

```python
# Model config
n_layer = 6        # 6 transformer layers
n_head = 6         # 6 attention heads
n_embd = 384       # 384-dim embeddings
block_size = 256   # 256 char context

# Training config
batch_size = 64
learning_rate = 1e-3
max_iters = 5000
eval_interval = 500

# Hardware
device = 'cpu'     # Or 'cuda'
compile = False    # Set True for PyTorch 2.0
```

**Training time**: ~5 minutes (CPU), ~1 minute...
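
**Model size (rough estimate)**: To get a feel for how small the model described by the config above is, the following sketch counts parameters for a GPT-2-style decoder. It is an illustration and not code from the repository: the helper name `estimate_gpt_params` is hypothetical, and it assumes bias-free linear and LayerNorm layers, a 4x MLP expansion, tied input/output embeddings, and a character vocabulary of 65 symbols (none of which are stated in the config shown here).

```python
def estimate_gpt_params(n_layer, n_embd, block_size, vocab_size):
    """Rough parameter count for a GPT-2-style decoder (hypothetical helper).

    Assumptions: no bias terms, 4x MLP expansion, and a token embedding
    matrix that is shared with the output head.
    """
    token_emb = vocab_size * n_embd   # token embedding (shared with output head)
    pos_emb = block_size * n_embd     # learned position embedding
    attn = 4 * n_embd * n_embd        # QKV projection (3x) + output projection (1x)
    mlp = 8 * n_embd * n_embd         # up-projection (4x) + down-projection (4x)
    layer_norms = 2 * n_embd          # two LayerNorm weight vectors per block
    per_block = attn + mlp + layer_norms
    final_norm = n_embd               # LayerNorm before the output head
    return token_emb + pos_emb + n_layer * per_block + final_norm


# Values from the config above; vocab_size=65 is an assumed character vocabulary.
total = estimate_gpt_params(n_layer=6, n_embd=384, block_size=256, vocab_size=65)
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
```

Under these assumptions the estimate is roughly 10.7M parameters, almost all of it in the six transformer blocks. `n_head` does not appear in the formula because splitting attention into heads changes how the projections are partitioned, not how many weights they contain.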

Details

Author: Orchestra-Research
Repository: Orchestra-Research/AI-Research-SKILLs
Created: 6 months ago
Last Updated: 1 month ago
Language: TeX
License: MIT

Integrates with

Hugging Face · AI
