← Back to Optimization

Optimization for Robotics Engineers

From gradient descent to pose-graph SLAM — the math behind Ceres, MPC, and Nav2

Total: ~50–60 hours across 6 weeks

Expanded: ~280 hours across 16 weeks (includes Lie Groups & Game Theory)


Why This Track Exists

Every major OKS subsystem runs an optimizer:

Subsystem Optimizer What it solves
Navigation estimator Ceres Solver (LM) Sensorbar line-fit, pose correction
Control (MPC path) QP / NLP solver Trajectory tracking with constraints
SLAM / localization Pose-graph optimizer (g2o) Loop closure, map consistency
Path planner Graph search + cost minimization Shortest/safest path through costmap
Calibration Nonlinear least-squares Camera-lidar-wheel extrinsics

If you can't read an optimizer's cost function, you can't debug why the robot drives wrong.


Dependency Graph

01-foundations
│  (gradients, Hessians, convexity)
│
├──→ 02-unconstrained-optimization
│      (gradient descent, Newton, line search, trust region)
│      │
│      ├──→ 03-least-squares  ← HIGHEST OKS VALUE
│      │      (Gauss-Newton, Levenberg-Marquardt, Ceres Solver)
│      │
│      └──→ 07-numerical-methods
│             (convergence, conditioning, sparsity)
│
├──→ 04-constrained-optimization
│      (Lagrange, KKT, duality, penalty/barrier)
│      │
│      └──→ 05-convex-optimization
│             (LP, QP, SDP, interior-point, CVXPy)
│
└──→ 06-graph-optimization  ← SECOND HIGHEST OKS VALUE
       (factor graphs, pose-graph SLAM, g2o, GTSAM)

Critical path for OKS work: 01 → 02 → 03 (least-squares / Ceres) — this is 60% of the RCA value.


Study Materials

learn/optimization/
├── 00-learning-plan.md                    ← YOU ARE HERE
├── 01-foundations.md                       ← Gradients, Hessians, convexity, optimality conditions
├── 02-unconstrained-optimization.md        ← GD, Newton, line search, trust region, convergence
├── 03-least-squares.md                     ← Linear LS, NLLS, Gauss-Newton, LM, Ceres deep dive
├── 04-constrained-optimization.md          ← Lagrange multipliers, KKT, duality, penalty methods
├── 05-convex-optimization.md               ← LP, QP, SOCP, SDP, interior-point, CVXPy
├── 06-graph-optimization.md                ← Factor graphs, pose-graph SLAM, g2o, GTSAM, Ceres
├── 07-numerical-methods.md                 ← Conditioning, convergence rates, sparsity, Cholesky
├── 08-matrix-essentials.md                 ← Matrix Lie groups, exponential map, adjoint
├── 09-lie-groups.md                        ← SO(3)/SE(3) geometry, manifold optimization
├── 10-game-theory.md                       ← Nash equilibria, mechanism design, multi-agent
├── exercises/
│   ├── 01-gradients-convexity.md           ← Hand-derive gradients, classify convexity
│   ├── 02-gradient-descent.md              ← Implement GD/Newton from scratch in Python
│   ├── 03-least-squares-ceres.md           ← Fit curves with scipy + Ceres, reproduce sensorbar fit
│   ├── 04-constrained-kkt.md               ← Solve KKT by hand, verify with solver
│   ├── 05-convex-cvxpy.md                  ← Model LP/QP problems in CVXPy
│   ├── 06-pose-graph.md                    ← Build a mini pose-graph optimizer
│   ├── 07-08-09/                           ← Lie groups & manifold exercises
│   └── 10/                                 ← Game theory exercises
└── code/
    ├── lie_groups/                          ← SO(3)/SE(3) Python implementations
    └── game_theory/                         ← Nash, minimax, auction modules

Week-by-Week Schedule

Week 1 — Foundations + Unconstrained (10 hrs)

Read: 01-foundations.md + 02-unconstrained-optimization.md Do: Exercises 01 + 02

Checkpoint — answer without looking: - [ ] What are the first-order and second-order necessary conditions for a minimum? - [ ] When is a function convex? Give the Hessian condition. - [ ] Gradient descent step: $x_{k+1} = x_k - \alpha \nabla f(x_k)$. What happens if $\alpha$ is too large? - [ ] Newton's method converges quadratically near a minimum. Why? What's the downside? - [ ] What is the Armijo condition for line search?

Week 2 — Least Squares (10 hrs) 🎯

Read: 03-least-squares.md Do: Exercise 03 (the Ceres one is critical)

Checkpoint: - [ ] Derive the normal equations for linear least squares - [ ] Gauss-Newton approximates the Hessian as $J^T J$. Why is this valid for small residuals? - [ ] LM adds $\lambda I$ to $J^T J$. What happens as $\lambda → 0$? As $\lambda → ∞$? - [ ] In Ceres, what is a CostFunction, a LossFunction, and a ParameterBlock? - [ ] The sensorbar uses CauchyLoss(c=40.0). What does this do to outliers?

Week 3 — Constrained Optimization (8 hrs)

Read: 04-constrained-optimization.md Do: Exercise 04

Checkpoint: - [ ] State the KKT conditions for $\min f(x)$ s.t. $g(x) \leq 0$, $h(x) = 0$ - [ ] What is complementary slackness? Give a geometric interpretation. - [ ] What is strong duality? When does it hold? - [ ] How does a penalty method convert a constrained problem to unconstrained?

Week 4 — Convex Optimization (8 hrs)

Read: 05-convex-optimization.md Do: Exercise 05

Checkpoint: - [ ] Classify: LP ⊂ QP ⊂ SOCP ⊂ SDP ⊂ general convex. Why does this hierarchy matter? - [ ] Write the standard-form LP. What does "vertex of the feasible polytope" mean? - [ ] Interior-point methods use a barrier function. What is it? Why $O(\sqrt{n})$ iterations? - [ ] Model a simple robot resource allocation as a QP in CVXPy.

Week 5 — Graph Optimization + SLAM (10 hrs) 🎯

Read: 06-graph-optimization.md Do: Exercise 06

Checkpoint: - [ ] What is a factor graph? How does it differ from a Bayesian network? - [ ] In pose-graph SLAM, what are the nodes and edges? What is the cost function? - [ ] How does Gauss-Newton apply to the pose-graph problem? - [ ] What is the role of the information matrix (inverse covariance) in the edge cost? - [ ] How do robust kernels (Huber, Cauchy) handle outlier loop closures?

Week 6 — Numerical Methods + Integration (6 hrs)

Read: 07-numerical-methods.md Do: Review and integrate all exercises

Checkpoint: - [ ] What is the condition number of a matrix? Why does it matter for optimization? - [ ] Sparse vs dense Cholesky: when does sparsity win? (SLAM problems!) - [ ] What is the difference between convergence rate: linear, superlinear, quadratic? - [ ] Given a 10000-variable sparse least-squares problem, which solver would you pick?


Key External Resources

See RESOURCES.md — Optimization section for the full curated list.

Start with these three: 1. Boyd & Vandenberghe — Convex Optimization (free PDF) — the bible, chapters 1–5 + 9–11 2. Kochenderfer & Wheeler — Algorithms for Optimization (free PDF) — practical, code-first 3. Ceres Solver documentation — the actual tool you debug against daily

Video courses (pick one): - Stanford EE364A (Boyd) — rigorous, theory-heavy, gold standard - CMU Convex Optimization (Tibshirani) — slightly more applied - Optimization Methods for ML (KIT) — engineering-oriented


OKS Relevance Map

Study note Directly used in OKS for...
01 Foundations Reading any cost function in the codebase
02 Unconstrained Understanding why GD is used in ML tuning but Newton in estimator
03 Least squares Ceres sensorbar fit, calibration, every NLLS in the stack
04 Constrained Understanding MPC constraint formulation
05 Convex QP in MPC, understanding solver guarantees
06 Graph optimization Pose-graph SLAM, loop closure, localization
07 Numerical methods Why Ceres uses sparse Cholesky, conditioning issues
08 Matrix essentials Matrix decompositions, SVD, pseudoinverse in calibration
09 Lie groups SE(3) pose-graph SLAM, manifold optimization, IMU preintegration
10 Game theory Fleet coordination, congestion-aware routing, auction-based allocation

Phase VI — Lie Groups & Manifold Optimization (Weeks 11–13)

Week 11 — Group Theory & Rotations (17.5 hrs)

Read: 09-lie-groups.md (Sections 1–4) Code: code/lie_groups/lie_groups.py

Day Topic Key concepts
71 Group Axioms Closure, associativity, identity, inverse; SO(2), SO(3)
72 SO(2) Rotations 2D rotation matrix, angle wrap, composition
73 SO(3) & Rodrigues Axis-angle, Rodrigues formula, exp map
74 Quaternions Unit quaternions ↔ SO(3), Hamilton product
75 Quaternion Ops SLERP, interpolation, avoiding gimbal lock
76 Rotation Representations Euler angles, rotation vectors, comparison chart
77 Week 11 Review Synthesis + self-assessment

Week 12 — Rigid Motions & Lie Algebra (17.5 hrs)

Read: 09-lie-groups.md (Sections 5–8)

Day Topic Key concepts
78 SE(2) Rigid Motions Homogeneous coords, composition, inverse
79 SE(3) Rigid Body 4×4 transformation, adjoint action
80 Exponential Map se(3) → SE(3), hat/vee operators
81 Lie Algebra Tangent space, generators, bracket
82 Adjoint & BCH Adjoint representation, BCH formula
83 Jacobians on Manifolds Left/right Jacobians, perturbation models
84 Week 12 Review Synthesis + self-assessment

Week 13 — Manifold Optimization & Capstone (17.5 hrs)

Read: 09-lie-groups.md (Sections 9–10)

Day Topic Key concepts
85 Manifold Gauss-Newton Retraction, tangent-space linearization
86 Pose-Graph SLAM SE(2) Graph formulation, anchoring, sparsity
87 Pose-Graph SLAM SE(3) 3D extension, robust kernels (Huber, Cauchy)
88 Ceres & GTSAM Local parameterization, factor graphs
89 IMU Preintegration Forster et al., bias correction
90 Calibration & Interpolation Hand-eye AX=XB, SLERP, trajectory interp
91 Capstone: Mini VIO SE(3) + IMU + landmarks, sliding-window GN

Phase VII — Game Theory for Optimization & Robotics (Weeks 14–16)

Week 14 — Normal Form & Zero-Sum Games (17.5 hrs)

Read: 10-game-theory.md (Sections 1–4) Code: code/game_theory/game_theory.py

Day Topic Key concepts
92 Normal Form Games Payoff matrices, dominant strategies, IESDS
93 Pure Nash Equilibria Best response, multiple equilibria, coordination
94 Mixed Nash Equilibria Support enumeration, indifference principle
95 Equilibrium Selection Pareto, risk dominance, focal points
96 Zero-Sum & Minimax Minimax theorem, saddle points
97 Zero-Sum LP LP formulation, dual interpretation
98 Week 14 Review Synthesis + self-assessment

Week 15 — Cooperative & Dynamic Games (17.5 hrs)

Read: 10-game-theory.md (Sections 5–7)

Day Topic Key concepts
99 Cooperative Games & Core Coalitional games, core stability
100 Shapley Value Axioms, computation, allocation fairness
101 Nash Bargaining Bargaining solution, axiomatic approach
102 Dynamic Games Extensive form, backward induction
103 Repeated Games Folk theorem, trigger strategies
104 Mechanism Design VCG, incentive compatibility, revelation principle
105 Week 15 Review Synthesis + self-assessment

Week 16 — Algorithmic GT & Capstone (17.5 hrs)

Read: 10-game-theory.md (Sections 8–10)

Day Topic Key concepts
106 Computing Nash Lemke-Howson, PPAD complexity
107 Fictitious Play Empirical strategy learning, convergence
108 No-Regret Learning Regret matching, multiplicative weights
109 Congestion Games Potential games, Braess's paradox, PoA
110 Multi-Robot & Adversarial Auctions, pursuit-evasion, GAN minimax
111 Robust Optimization Minimax as game against nature, H∞ control
112 Capstone: Multi-Robot Delivery Auction + congestion + robust rerouting