Chapter 6 · Part 2

Features: edges & gradients

In Chapter 5 we slid a kernel across an image and a vertical edge lit up. That wasn't a trick — it points at something deep about images: most pixels are boring.

A patch of blue sky is thousands of nearly-identical numbers. Knowing one of them lets you guess the rest. The places that actually carry information are where the numbers change — the boundary between sky and hill, the rim of the sun. Those are edges, and they're the raw material of almost every classical vision feature.

Scroll to strip the picture down to just its changes.

The original image — mostly smooth regions that repeat themselves.

scroll↓

What is a gradient?

At every pixel we can ask: which way does brightness increase fastest, and how fast? The answer is a little arrow called the gradient. We get it with the two Sobel kernels from Chapter 5 — one measures the horizontal change Gₓ, the other the vertical change G_y:

Magnitude √(Gₓ² + G_y²) — how strong the edge is. This is the brightness of the edge map.
Direction atan2(G_y, Gₓ) — which way the edge faces. This is where each arrow points.

Why features beat raw pixels

Imagine describing a face to a computer. Raw pixels give you "the pixel at (102, 56) is brightness 173" — fragile and almost meaningless. Edges and gradients give you "there's a strong horizontal edge here, two dark blobs above a curved edge" — the beginnings of eyes above a mouth.

This is the insight classical computer vision was built on:

Edges mark object boundaries.
Corners (where edges meet) are stable points to track between frames.
Gradient histograms over small regions describe local texture and shape — the basis of features like HOG and SIFT (next chapter).

gradients.py — edges from two convolutions

import numpy as np
from scipy.ndimage import convolve

sobel_x = np.array([[-1,0,1],[-2,0,2],[-1,0,1]])
sobel_y = sobel_x.T

gx = convolve(img.astype(float), sobel_x)
gy = convolve(img.astype(float), sobel_y)

magnitude = np.hypot(gx, gy)          # edge strength  -> the edge map
direction = np.arctan2(gy, gx)        # edge orientation

The catch

Edges and gradients are a huge step up from raw pixels — fewer numbers, far more meaning. But notice that we designed these detectors by hand. We picked the Sobel weights. We decided edges and corners were the things worth measuring.

What if the best features for telling a cat from a dog aren't edges at all, but something no human would think to write down? That question — hand-designed vs. learned features — is exactly where Part 3 begins, and it's the doorway to neural networks.