MicroGPT Visualized

Building a GPT from scratch — an interactive visual guide

Step 2

Autograd

The same multilayer perceptron (MLP), but now trained with automatic differentiation — no more hand-written backward pass.

  - 2.1 What Changes
  - 2.2 The Value Class: Wrapping Numbers
  - 2.3 Recording Operations: Add and Multiply
  - 2.4 More Operations and Syntactic Sugar
  - 2.5 Backward: Walking the Graph
  - 2.6 Parameters as Values
  - 2.7 The New Training Loop
  - 2.8 Same Results, Less Code
The big idea: Every operation records how it was computed. The chain rule can then be applied automatically by walking backward through the computation graph.
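That idea can be sketched in a few dozen lines. The `Value` class below is a minimal illustration (not the guide's exact implementation): each arithmetic operation stores its inputs and a closure that knows how to push gradients back to them, and `backward()` applies the chain rule by visiting nodes in reverse topological order.

```python
class Value:
    """A scalar that remembers how it was computed, enabling reverse-mode autodiff."""

    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # closure that propagates grad to children
        self._prev = set(_children)    # the inputs that produced this node

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(out)/d(self) = 1 and d(out)/d(other) = 1, so grads pass through
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # product rule: each input's grad is scaled by the other input's value
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order so each node runs after everything it feeds into.
        topo, visited = [], set()

        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)

        build(self)
        self.grad = 1.0  # seed: d(self)/d(self) = 1
        for v in reversed(topo):
            v._backward()


# Example: f = a*b + a, so df/da = b + 1 and df/db = a
a, b = Value(2.0), Value(3.0)
f = a * b + a
f.backward()
print(a.grad, b.grad)  # → 4.0 2.0
```

Note that gradients accumulate with `+=` rather than being assigned: a node used twice (like `a` above) receives contributions from every path through the graph, which is exactly what the multivariable chain rule requires.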
← Step 1: Gradient Descent | Step 3: Attention →