Week 6: LLM Engineering

Days 36–42 · 17.5 hours

This week covers the engineering side of LLMs: evaluation, quantization, in-context learning, long context, RAG, tool use, and applying LLMs to robotics. It culminates in Day 1 of the Phase III Capstone.

Daily Lessons

Day	Topic	Focus
36	LLM Evaluation	Perplexity, MMLU, HumanEval, LLM-as-judge
37	Quantization & Inference	INT4/INT8, GPTQ, AWQ, vLLM
38	In-Context Learning	Zero/few-shot, mesa-optimization
39	Long Context & Reasoning	RoPE scaling, ring attention, o1-style reasoning
40	RAG & Tool Use	Retrieval-augmented generation, function calling
41	LLM for Robotics	SayCan, Code as Policies, fleet planning
42	Phase III Capstone Day 1	Fine-tune robotics assistant + RAG

Key Concepts

Evaluation: How to measure whether an LLM is actually good, covering benchmarks, benchmark contamination, and Chatbot Arena
Quantization: Compress 7B-parameter models to run on consumer hardware with minimal quality loss
In-context learning: One of the most surprising emergent abilities, in which the model learns a task from examples in the prompt without any weight updates
Long context & reasoning: Scaling context windows and using chain-of-thought for complex tasks
RAG: Augment LLMs with external knowledge retrieved at inference time, without fine-tuning
LLMs for robotics: From language understanding to actions in the physical world
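
To make the quantization point concrete, here is a rough back-of-the-envelope sketch of how much memory the weights of a 7B-parameter model need at different precisions. The function name `weight_memory_gb` is hypothetical, and the estimate covers weights only: it ignores the KV cache, activations, and the small overhead of quantization scales and zero-points, so real footprints will be somewhat larger.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Assumption: weights only; no KV cache, activations, or
    quantization metadata are included.
    """
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1e9


PARAMS_7B = 7e9  # a nominal 7B-parameter model

for name, bits in [("FP16", 16), ("INT8", 8), ("INT4", 4)]:
    print(f"{name}: {weight_memory_gb(PARAMS_7B, bits):.1f} GB")
# FP16: 14.0 GB
# INT8: 7.0 GB
# INT4: 3.5 GB
```

Under these assumptions, INT4 weights come to roughly a quarter of the FP16 size, which is why 4-bit methods such as GPTQ and AWQ are commonly used to fit 7B models on consumer GPUs.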
