Phase II · Days 22–28 · 17.5 hours
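This week covers tokenization, builds a decoder-only GPT with nanoGPT and runs systematic ablations on it, examines scaling laws and emergence, pauses for a reflection day on scaling and compression, and finishes with sampling and generation and with encoder-decoder models such as T5.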
| Day | Topic | Focus |
|---|---|---|
| 22 | Tokenization Deep Dive | BPE, WordPiece, tiktoken |
| 23 | GPT & nanoGPT Day 1 | Decoder-only transformers |
| 24 | nanoGPT Ablations Day 2 | Systematic experiments |
| 25 | Scaling Laws & Emergence | Chinchilla, power laws |
| 26 | Stop & Reflect #2 | Scaling + compression |
| 27 | Sampling & Generation | Top-k, nucleus, temperature |
| 28 | T5 & Encoder-Decoder LMs | Text-to-text framework |