Inside AI Models
← All articles
FundamentalsDeep Learning

How Gradient Descent Actually Works

Jun 22, 2026 · 2 min read

Every trained neural network — from a tiny classifier to a frontier LLM — was fit by the same simple idea: gradient descent. Here's the whole thing, without the mystery.

The picture

Imagine a hilly landscape. The height at each point is your loss — how wrong the model is. You want the lowest valley. You can't see the whole map, but at any point you can feel which way is downhill. So you take a small step that way, and repeat.

The "which way is downhill" is the gradient: the derivative of the loss with respect to each parameter.

θθηθL\theta \leftarrow \theta - \eta \, \nabla_\theta \mathcal{L}
  • θ\theta — the parameters (weights).
  • θL\nabla_\theta \mathcal{L} — the gradient (the uphill direction).
  • η\eta — the learning rate (how big a step you take).

We subtract the gradient because the gradient points uphill and we want to go down.

Twelve lines that fit a line

Let's minimize a simple loss by hand — fitting y = w·x to some data:

import numpy as np
 
x = np.array([1, 2, 3, 4], dtype=float)
y = np.array([2, 4, 6, 8], dtype=float)   # true relationship: y = 2x
w = 0.0
lr = 0.01
 
for step in range(200):
    pred = w * x
    loss = ((pred - y) ** 2).mean()       # mean squared error
    grad = (2 * x * (pred - y)).mean()    # d loss / d w
    w -= lr * grad                        # the update
print(round(w, 3))                        # ≈ 2.0

It starts at w = 0, feels the slope, and walks straight to w = 2. That's gradient descent. A real network just does this for millions of parameters at once, with the gradients computed automatically by backpropagation.

The learning rate is everything

  • Too small → training crawls; you never reach the valley.
  • Too large → you overshoot and bounce around (or diverge).
  • Just right → steady, fast descent.

Most "my model won't train" problems trace back to this one number. When in doubt, lower it and watch the loss curve.

Related articles