Week 4: Scaling + Decoders + Capstone

Phase II · Days 22–28 · 17.5 hours

This week dives into tokenization, builds GPT from scratch with nanoGPT, explores scaling laws and emergence, and covers sampling, generation, and encoder-decoder architectures.

Daily Lessons

Day  Topic                     Focus
22   Tokenization Deep Dive    BPE, WordPiece, tiktoken
23   GPT & nanoGPT Day 1       Decoder-only transformers
24   nanoGPT Ablations Day 2   Systematic experiments
25   Scaling Laws & Emergence  Chinchilla, power laws
26   Stop & Reflect #2         Scaling + compression
27   Sampling & Generation     Top-k, nucleus, temperature
28   T5 & Encoder-Decoder LMs  Text-to-text framework

Study Notes Reference