ML Systems & Compilers

70-day curriculum — from GPU architecture to TVM, Triton, and distributed training

📅 10 weeks ⏱ 175 hours 🎯 5 phases ⚡ 70 daily lessons

Phases

Phase I
Hardware & Compute Foundations: GPU Architecture → PyTorch Internals (Days 1–14)
Phase II
Compiler Infrastructure: IRs, Passes, Triton & torch.compile (Days 15–28)
Phase III
Apache TVM Deep Dive: Relay → TIR → Tuning → MLIR & XLA (Days 29–49)
Phase IV
Inference Optimization: Quantization, TensorRT & LLM Serving (Days 50–63)
Phase V
Training at Scale: Distributed Training & Capstone Project (Days 64–70)

Weekly Schedule

Phase I

Week 1: GPU Architecture & CUDA

Week 2: PyTorch Internals

Phase II

Week 3: IR & Compiler Passes

Week 4: Triton & Kernel Engineering

Phase III

Week 5: TVM Foundations

Week 6: TVM Tuning & Backends

Week 7: TVM Advanced & MLC

Phase IV

Week 8: Model Formats & Runtimes

Week 9: LLM Serving Systems

Phase V

Week 10: Distributed Training & Capstone