Phase III — LLMs: Training & Alignment | Week 7 | 2.5 hours "Before moving forward, prove you understand what's behind you."
Compile your Phase III capstone into a structured deliverable:
Phase III Capstone: Robotics Assistant
======================================
1. Architecture Overview
- System diagram (LLM + LoRA + RAG + tools)
- Component responsibilities and data flow
- Design decisions and trade-offs
2. Training Summary
- Dataset: N examples, M categories
- LoRA config: r=?, α=?, target modules
- Training: epochs, lr, final loss
- Trainable params: X / Y total (Z%)
3. RAG Configuration
- Knowledge base: N documents, chunking strategy
- Embedding model and dimensions
- Retrieval: top-k=?, similarity threshold
- Vector store implementation
4. Evaluation Results
┌──────────────────┬──────────┬──────────┬──────────┐
│ Metric           │ Base     │ LoRA     │ LoRA+RAG │
├──────────────────┼──────────┼──────────┼──────────┤
│ Knowledge recall │          │          │          │
│ Diagnosis recall │          │          │          │
│ Command accuracy │          │          │          │
│ Hallucination %  │          │          │          │
│ Avg latency (ms) │          │          │          │
└──────────────────┴──────────┴──────────┴──────────┘
5. Error Analysis
- Top failure modes with examples
- Which component (LoRA vs RAG) addresses each failure
- Remaining gaps and proposed solutions
6. Lessons Learned
- What surprised you?
- What would you do differently?
- How does this connect to VLA training?
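The "Trainable params: X / Y total (Z%)" entry in the Training Summary can be worked out from the LoRA shapes. The sketch below assumes each adapted weight matrix of shape d_out × d_in gets two low-rank factors, B (d_out × r) and A (r × d_in), so it adds r · (d_out + d_in) parameters. The helper names, the toy shapes, and the assumed total are hypothetical and are not taken from the capstone code.

```python
def lora_param_count(shapes, r):
    """Count LoRA parameters: each (d_out, d_in) matrix adds r * (d_out + d_in)."""
    return sum(r * (d_out + d_in) for d_out, d_in in shapes)


def trainable_params_line(trainable, total):
    """Format the 'Trainable params' entry of the training summary."""
    return f"{trainable:,} / {total:,} total ({trainable / total:.2%})"


# Hypothetical toy model: 4 layers, 4 adapted 512x512 projections per layer.
toy_shapes = [(512, 512)] * (4 * 4)
toy_trainable = lora_param_count(toy_shapes, r=8)
toy_total = 50_000_000  # assumed parameter count, for illustration only

print(trainable_params_line(toy_trainable, toy_total))
```

For your own report, replace the toy shapes with the real shapes of the modules you adapted, and use whatever total parameter count you report as Y.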
"""
Day 44 Capstone: Generate final report and deliverable.
"""
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class CapstoneReport:
    title: str = "Phase III Capstone: Robotics Assistant"
    date: str = ""
    architecture: str = ""
    # default_factory gives every report its own dict/list instances.
    training_summary: dict = field(default_factory=dict)
    rag_config: dict = field(default_factory=dict)
    eval_results: dict = field(default_factory=dict)
    error_analysis: list = field(default_factory=list)
    lessons: list = field(default_factory=list)

    def __post_init__(self):
        # Stamp today's date unless the caller supplied one.
        if not self.date:
            self.date = datetime.now().strftime("%Y-%m-%d")

    def to_markdown(self) -> str:
        lines = [
            f"# {self.title}",
            f"*Generated: {self.date}*\n",
            "## 1. Architecture",
            self.architecture or "*[Fill in system diagram]*\n",
            "## 2. Training Summary",
        ]
        for key, value in self.training_summary.items():
            lines.append(f"- **{key}:** {value}")

        lines.append("\n## 3. RAG Configuration")
        for key, value in self.rag_config.items():
            lines.append(f"- **{key}:** {value}")

        lines.append("\n## 4. Evaluation Results")
        if self.eval_results:
            configs = list(self.eval_results.keys())
            # Collect every metric that appears in any configuration.
            metrics = set()
            for config_scores in self.eval_results.values():
                metrics.update(config_scores.keys())
            header = "| Metric | " + " | ".join(configs) + " |"
            sep = "|" + "---|" * (len(configs) + 1)
            lines.extend([header, sep])
            for metric in sorted(metrics):
                row = f"| {metric} |"
                for config in configs:
                    val = self.eval_results[config].get(metric, "-")
                    # Floats are treated as fractions and shown as percentages.
                    if isinstance(val, float):
                        row += f" {val:.1%} |"
                    else:
                        row += f" {val} |"
                lines.append(row)

        lines.append("\n## 5. Error Analysis")
        for i, err in enumerate(self.error_analysis, 1):
            lines.append(f"{i}. {err}")

        lines.append("\n## 6. Lessons Learned")
        for lesson in self.lessons:
            lines.append(f"- {lesson}")
        return "\n".join(lines)


# Example report
if __name__ == "__main__":
    report = CapstoneReport(
        architecture="LLM (TinyLlama 1.1B) + LoRA adapter (r=16) + "
                     "TF-IDF RAG over 5 technical documents + "
                     "rule-based command parser with safety validator.",
        training_summary={
            "Base model": "TinyLlama 1.1B Chat",
            "Dataset": "6 robotics instruction pairs",
            "LoRA config": "r=16, α=32, target=q/k/v/o_proj",
            "Training": "3 epochs, lr=2e-4, cosine schedule",
            "Trainable params": "~4M / 1.1B (0.36%)",
        },
        rag_config={
            "Documents": "5 technical spec documents",
            "Chunking": "Full document (small docs)",
            "Embedding": "TF-IDF bag-of-words",
            "Retrieval": "Top-3, cosine similarity",
        },
        eval_results={
            "Base": {"knowledge": 0.30, "diagnosis": 0.20, "command": 0.50},
            "LoRA": {"knowledge": 0.55, "diagnosis": 0.50, "command": 0.70},
            "LoRA+RAG": {"knowledge": 0.80, "diagnosis": 0.65, "command": 0.75},
        },
        error_analysis=[
            "Reasoning questions remain weak across all configs; need CoT",
            "RAG retrieval misses when question phrasing differs from docs",
            "Command parser fails on ambiguous multi-step instructions",
        ],
        lessons=[
            "Data quality >> quantity for SFT",
            "RAG fixes knowledge gaps that fine-tuning can't address cheaply",
            "Safety validation layer is non-negotiable for robotics",
            "Evaluation design is as important as model training",
        ],
    )
    print(report.to_markdown())
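The `eval_results` argument expects one dict per configuration, mapping metric names to fractions between 0 and 1. One way to build it from per-question outcomes is sketched below. The record format `(config, metric, passed)` and the name `aggregate_scores` are assumptions made for illustration, not part of the capstone code, and the sample outcomes are made up.

```python
from collections import defaultdict


def aggregate_scores(records):
    """Turn (config, metric, passed) records into {config: {metric: pass rate}}."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [passed, total]
    for config, metric, passed in records:
        tally = counts[config][metric]
        tally[0] += int(passed)
        tally[1] += 1
    return {
        config: {metric: p / n for metric, (p, n) in metrics.items()}
        for config, metrics in counts.items()
    }


# Made-up outcomes, for illustration only.
sample_records = [
    ("Base", "command", True),
    ("Base", "command", False),
    ("LoRA", "command", True),
    ("LoRA", "command", True),
]
sample_results = aggregate_scores(sample_records)
print(sample_results)
```

The returned dict can be passed as `eval_results=`. Because `to_markdown` renders every float as a percentage, a metric such as average latency in milliseconds is better passed as an integer or a string.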
Answer each question in 3-5 sentences with equations or code where appropriate. Score yourself honestly: each question is worth 1 point, minimum 4/6 to proceed.
Describe the 3-stage modern LLM training pipeline. For each stage, state: (a) the training objective, (b) the data type and typical size, (c) what capability it provides.
Write the LoRA weight update equation. Explain each term, state typical values for rank $r$ and scaling $\alpha$, and calculate the parameter savings for a 4096×4096 weight matrix with $r=16$.
Compare DPO and RLHF. Write the DPO loss function, explain why it doesn't need a reward model, and state when you would choose RLHF over DPO.
Explain how in-context learning works. Why do demonstrations with randomly assigned labels still help? How does ICL relate to data compression?
Explain quantization from FP16 to INT4. What is the absmax quantization formula? What is GPTQ's key innovation? What does AWQ do differently, and why can it outperform GPTQ?
Explain speculative decoding. Why does it preserve the large model's output distribution (and, under greedy decoding, reproduce its output exactly)? Under what conditions does it fail to provide a speedup?
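After attempting the quantization question, you can check your formula numerically. The sketch below shows symmetric absmax quantization to a signed 4-bit range, assuming a single per-tensor scale and the integer range [-7, 7]. The function names are illustrative, and practical INT4 schemes typically use one scale per group of weights rather than one per tensor.

```python
def absmax_quantize(values, bits=4):
    """Symmetric absmax quantization: q = round(x / scale), scale = max|x| / qmax."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax > 0 else 1.0
    # Clamp after rounding so floating-point noise cannot leave the range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map integer codes back to approximate real values."""
    return [q * scale for q in quantized]


weights = [0.7, -0.3, 0.12, 0.0]
q, s = absmax_quantize(weights)
restored = dequantize(q, s)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, s, max_error)
```

The rounding error per weight is at most half a quantization step (`s / 2`), which is why a single outlier that inflates `absmax` hurts the precision of every other weight sharing that scale.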
| Score | Assessment | Action |
|---|---|---|
| 6/6 | Excellent — ready for Phase IV | Proceed to Vision |
| 5/6 | Strong — minor gaps | Review the weak topic, then proceed |
| 4/6 | Adequate — some gaps | Spend 30 min reviewing weak areas, then proceed |
| 3/6 or below | Needs review | Re-read Days 31-42, redo exercises before proceeding |
Phase III taught us to teach LLMs. Phase IV will teach us to give them eyes. Vision Transformers (ViT) take the architecture we've studied (attention, transformer blocks, scaling) and apply it to images. The key insight: an image can be treated as a sequence of patches, just as a sentence is a sequence of tokens. Same architecture, different modality. This is the path to VLAs.
Day 45: ViT — Image as Tokens begins Phase IV: Vision. We'll learn how to split images into patches, embed them as tokens, and process them through the same transformer architecture we've been studying. The multimodal journey begins.
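As a preview of the "image as tokens" idea, here is a minimal sketch of patch splitting in pure Python. It assumes a single-channel image stored as a list of rows and a patch size that divides both dimensions. The name `patchify` and the toy image are illustrative only. In a ViT, each flattened patch would then go through a learned linear projection and receive a position embedding, and neither step is shown here.

```python
def patchify(image, patch_size):
    """Split an H x W grid into a row-major sequence of flattened P x P patches."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patch = [
                image[row][col]
                for row in range(top, top + patch_size)
                for col in range(left, left + patch_size)
            ]
            patches.append(patch)
    return patches


# Toy 4x4 single-channel "image" with pixel values 0..15.
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
tokens = patchify(toy_image, patch_size=2)
print(len(tokens), tokens[0])
```

The sequence length is (H / P) · (W / P), so a 224×224 image with 16×16 patches yields 196 patch tokens.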