Chapter 3 · Part 2

The forward process

We've added noise by hand and we know exactly what noise is. Now let's make it systematic. The forward process is the fixed recipe that takes a clean image all the way to static — the same dial you scrubbed in Chapter 1, but now we'll see the rule behind it.

Two ideas do all the work. First, noise is added on a schedule: a little at the start, more later, following a curve that's decided in advance and never learned. Second — and this is the magic trick — you don't have to walk the schedule step by step. There's a closed-form shortcut that drops you at any timestep in one calculation.

Scroll to move along the schedule and watch the image jump straight to that step.

At t = 0 the schedule value ᾱ is 1: all signal, no noise. The image is untouched.


A schedule, not a free-for-all

Think of the forward process as a chain of T tiny steps (often T = 1000). Each step takes the previous image and mixes in a small amount of fresh Gaussian noise. How much is set by a number βₜ, and the full list of βₜ values is the variance schedule. Early steps use a tiny βₜ (barely any noise); later steps use larger ones. Because each step depends only on the one before it, the whole thing is a Markov chain.
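To make the chain concrete, here is a minimal sketch of walking it one step at a time. It rests on two assumptions the text doesn't spell out: a linear βₜ schedule running from 1e-4 to 0.02, and the usual variance-preserving step xₜ = √(1 − βₜ) · xₜ₋₁ + √βₜ · ε. The names forward_step and run_chain are made up for illustration.

```python
import numpy as np

# Assumed linear variance schedule (the text does not fix one here).
T = 1000
betas = np.linspace(1e-4, 0.02, T)

def forward_step(x_prev, beta, rng):
    """One Markov step: shrink the previous image slightly, add fresh noise."""
    eps = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta) * x_prev + np.sqrt(beta) * eps

def run_chain(x0, num_steps, rng):
    """Walk the chain step by step, from x0 to the image after num_steps."""
    x = x0
    for beta in betas[:num_steps]:
        x = forward_step(x, beta, rng)
    return x

rng = np.random.default_rng(0)
x0 = np.ones((64, 64))             # stand-in "image": every pixel equals 1
x_early = run_chain(x0, 10, rng)   # after 10 steps: still very close to x0
x_final = run_chain(x0, T, rng)    # after all T steps: close to pure static
```

Notice that every noisy image costs a full Python loop over the earlier steps. That loop is the cost the rest of this part removes.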

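If you had to run all 1000 steps every time you wanted a noisy image, training would crawl. So we define αₜ = 1 − βₜ, the fraction of signal each step keeps, and take its running product ᾱₜ = α₁ · α₂ · … · αₜ ("alpha-bar"). That one number summarizes everything the chain has done up to step t: ᾱₜ is how much of the original signal's variance survives, and 1 − ᾱₜ is the variance of the noise that has piled up. It starts at 1 and slides down toward 0, which is exactly the teal curve in the visual.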
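One way to convince yourself that ᾱₜ really does summarize the chain is to compute it and compare it against a brute-force simulation. The sketch below again assumes a linear βₜ schedule (an illustrative choice, not something the text prescribes) and tracks many copies of a single pixel with value 1.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)    # assumed linear schedule
alphas = 1.0 - betas                  # signal kept by each individual step

# Running product, with a leading 1 so that alpha_bar[t] means "after t steps"
# and alpha_bar[0] == 1 (no noise added yet).
alpha_bar = np.concatenate([[1.0], np.cumprod(alphas)])

# Brute force: push 100,000 copies of the pixel x0 = 1 through t steps.
rng = np.random.default_rng(0)
t = 300
x = np.ones(100_000)
for beta in betas[:t]:
    x = np.sqrt(1.0 - beta) * x + np.sqrt(beta) * rng.standard_normal(x.shape)

predicted_mean = np.sqrt(alpha_bar[t])   # how much signal should be left
predicted_var = 1.0 - alpha_bar[t]       # how much noise should have piled up

print("mean:", x.mean(), "predicted:", predicted_mean)
print("var: ", x.var(), "predicted:", predicted_var)
```

The simulated mean and variance should land close to the predicted values, up to sampling error.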

The shortcut that makes it practical

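With ᾱₜ in hand, the entire chain collapses into a single equation. Any noisy image xₜ is just a weighted blend of the original image and one patch of noise:

xₜ = √ᾱₜ · x₀ + √(1 − ᾱₜ) · ε,   where ε ~ N(0, I)

Here x₀ is the clean image and ε is standard Gaussian noise of the same shape. As ᾱₜ falls from 1 to 0, the weight shifts from the image to the noise.
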
This shortcut is the reason diffusion training is feasible at all:

schedule.py — jump to any timestep in one step
import numpy as np

# Cosine schedule: ᾱ slides from ~1 down to ~0 across T steps.
T = 1000
t_norm = np.linspace(0, 1, T)
alpha_bar = np.cos(t_norm * np.pi / 2) ** 2

def q_sample(x0, t):
  """Noise x0 straight to timestep t — no loop over earlier steps."""
  a = alpha_bar[t]
  eps = np.random.randn(*x0.shape)
  xt = np.sqrt(a) * x0 + np.sqrt(1 - a) * eps
  return xt, eps                         # eps is the training target

During training we pick a random timestep for each image, jump straight to it with q_sample, and ask the network to predict the eps we used. No simulation, no waiting — just one blend per training example.
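In practice this happens for a whole batch at once. The sketch below shows one way to build such a batch with the cosine schedule from the listing above. Treat the details as assumptions: make_training_batch and mse_loss are hypothetical names, the mean-squared-error loss is a common choice rather than something fixed by the text, and the "network" is a placeholder that always predicts zero.

```python
import numpy as np

T = 1000
alpha_bar = np.cos(np.linspace(0, 1, T) * np.pi / 2) ** 2   # same schedule as above

def make_training_batch(x0_batch, rng):
    """Draw one random timestep per image and noise the whole batch in one go."""
    batch_size = x0_batch.shape[0]
    t = rng.integers(0, T, size=batch_size)      # one timestep per image
    a = alpha_bar[t].reshape(-1, 1, 1)           # broadcast over height and width
    eps = rng.standard_normal(x0_batch.shape)    # the noise we will ask for later
    xt = np.sqrt(a) * x0_batch + np.sqrt(1.0 - a) * eps
    return xt, t, eps

def mse_loss(eps_pred, eps):
    """Mean squared error between predicted and true noise (assumed loss)."""
    return np.mean((eps_pred - eps) ** 2)

rng = np.random.default_rng(0)
x0_batch = rng.uniform(-1, 1, size=(8, 32, 32))  # placeholder images in [-1, 1]
xt, t, eps = make_training_batch(x0_batch, rng)

# Placeholder for the network: always guesses "no noise".
eps_pred = np.zeros_like(xt)
loss = mse_loss(eps_pred, eps)
```

The reshape to (batch, 1, 1) is what lets each image use its own ᾱₜ while the blend is still a single array expression.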

Where we're headed

The forward process is now fully pinned down: a fixed schedule, a closed-form jump, and — every single time — a known noise patch ε that produced the result. That ε is a free, exact answer key.
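A quick way to see why ε counts as an exact answer key: if you know it, you can undo the blend and get the original image back. This is only an illustration on a random placeholder image with an arbitrarily chosen timestep, reusing the cosine schedule from earlier.

```python
import numpy as np

T = 1000
alpha_bar = np.cos(np.linspace(0, 1, T) * np.pi / 2) ** 2   # schedule from earlier

rng = np.random.default_rng(0)
x0 = rng.uniform(-1, 1, size=(16, 16))   # placeholder image
t = 400                                  # arbitrary mid-schedule timestep
a = alpha_bar[t]

# Forward jump, keeping hold of the noise that was used.
eps = rng.standard_normal(x0.shape)
xt = np.sqrt(a) * x0 + np.sqrt(1 - a) * eps

# Undo the blend using the known eps.
x0_recovered = (xt - np.sqrt(1 - a) * eps) / np.sqrt(a)

print(np.allclose(x0_recovered, x0))
```

Very close to the end of the schedule ᾱₜ is almost 0, so the division becomes numerically fragile. That is why the sketch picks a timestep from the middle.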

Next we put it to work: training a network to look at a noisy xₜ and predict the noise — the one piece of learning this whole system needs.