From LLM to VLA

112-day curriculum — from deep learning foundations to Vision-Language-Action models for robotics

📅 16 weeks ⏱ 280 hours 🎯 7 phases 🤖 112 daily lessons

Phases

Phase I (Days 1–9)
DL Foundations: Backprop → Information Theory
Phase II (Days 10–30)
Attention & Transformers: The Architecture Revolution
Phase III (Days 31–44)
LLM Engineering: Training, Alignment & Deployment
Phase IV (Days 45–58)
Vision Transformers: From ViT to Video Understanding
Phase V (Days 59–70)
Vision-Language Models: CLIP → LLaVA → Fine-tuning
Phase VI (Days 71–91)
RL, Diffusion & Imitation Learning for Robotics
Phase VII (Days 92–112)
VLA Architectures: RT-1 → π₀ → Deployment

Weekly Schedule

Phase I

Week 1: DL Foundations

Phase I–II

Week 2: Attention & Transformers

Phase II

Week 3: Variants & GPT

Phase II

Week 4: Scaling & Decoders

Phase II–III

Week 5: LLM Training

Phase III

Week 6: LLM Engineering

Phase III–IV

Week 7: Vision Transformers

Phase IV

Week 8: 3D Vision & Video

Phase IV–V

Week 9: VLMs — CLIP to LLaVA

Phase V

Week 10: VLM Practice

Phase VI

Week 11: RL & Diffusion

Phase VI

Week 12: Imitation Learning

Phase VI

Week 13: Data & Evaluation

Phase VII

Week 14: VLA Architectures

Phase VII

Week 15: Training & Transfer

Phase VII

Week 16: Deployment & Capstone