Chapter 2 · Part 1
Colour & channels
In Chapter 1 every pixel was a single number — its brightness. That's enough for a grayscale image, but the world isn't grey. To store colour, one number per pixel isn't enough. We need three.
Scroll through the photo below. It looks like one picture, but watch it come apart into the layers it's secretly made of.
A colour photo — it looks like a single picture.
Three numbers per pixel
A colour pixel stores three values, not one:
- R — how much red, 0–255.
- G — how much green, 0–255.
- B — how much blue, 0–255.
So the bright sun pixel above isn't "yellow" to the computer — it's
(255, 215, 70). Yellow is just a lot of red, a lot of green, and a little
blue. Every colour you can see on a screen is some mix of these three.
Each channel, on its own, is exactly the kind of grayscale grid from Chapter 1. A colour image is just three of those grids stacked together.
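To make "three grids stacked together" concrete, here is a minimal sketch using NumPy. The 2x2 grids and most of their values are made up for illustration; only the top-left pixel borrows the sun colour (255, 215, 70) from above.

```python
import numpy as np

# Three hypothetical 2x2 grayscale grids, one per channel (values 0-255).
red = np.array([[255, 0], [0, 255]], dtype=np.uint8)
green = np.array([[215, 0], [0, 255]], dtype=np.uint8)
blue = np.array([[70, 0], [255, 255]], dtype=np.uint8)

# Stack the grids along a new last axis:
# three (height, width) grids become one (height, width, 3) image.
img = np.stack([red, green, blue], axis=-1)

print(img.shape)            # (2, 2, 3)
print(img[0, 0].tolist())   # [255, 215, 70] -> the sun pixel: R, G, B
print(img[1, 1].tolist())   # [255, 255, 255] -> all three at full
```

Each grid on its own is still an ordinary 2D array; stacking only adds the third axis that holds R, G and B side by side for every pixel.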
Why adding the layers works
Screens make colour with light, and light adds. Start from black and add some red, some green, some blue, and the colours pile up toward white. That's why the three tinted layers in the animation merge back into the original photo when they overlap — you're literally adding the red, green and blue light back together.
- Red + Green = Yellow
- All three at full = White
- All three at zero = Black
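The mixing rules above can be checked with plain addition. This is a rough sketch, not a model of how a real display works: it treats each channel value as an amount of light and simply adds them. The two-pixel image and the `tinted_layer` helper are hypothetical, chosen only to mimic the tinted layers in the animation.

```python
import numpy as np

# Pure colours as (R, G, B) light levels.
red = np.array([255, 0, 0], dtype=np.uint8)
green = np.array([0, 255, 0], dtype=np.uint8)
blue = np.array([0, 0, 255], dtype=np.uint8)

print((red + green).tolist())          # [255, 255, 0] -> yellow
print((red + green + blue).tolist())   # [255, 255, 255] -> white

# A tiny hypothetical 1x2 image: the sun pixel and a sky-blue pixel.
img = np.array([[[255, 215, 70], [90, 160, 230]]], dtype=np.uint8)

def tinted_layer(image, channel):
    """Keep one channel and set the other two to zero."""
    layer = np.zeros_like(image)
    layer[:, :, channel] = image[:, :, channel]
    return layer

# One red-tinted, one green-tinted and one blue-tinted copy of the image.
layers = [tinted_layer(img, c) for c in range(3)]

# Adding the three tinted layers gives back the original image.
rebuilt = layers[0] + layers[1] + layers[2]
print(np.array_equal(rebuilt, img))    # True
```

The addition is safe here because, at every position, only one of the three layers is non-zero. In general, adding `uint8` arrays whose sum exceeds 255 wraps around instead of stopping at white, so real blending code usually converts to a wider type first.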
In code
A grayscale image was a 2D array. A colour image is a 3D array — height, then
width, then a length-3 stack of [R, G, B] per pixel:
from PIL import Image
import numpy as np
img = np.array(Image.open("sky.png").convert("RGB"))
print(img.shape) # (140, 140, 3) -> height, width, channels
print(img[20, 100]) # [255 215 70] -> one pixel: R, G, B
red = img[:, :, 0] # just the red layer (a 2D grayscale grid)
green = img[:, :, 1] # just the green layer
blue = img[:, :, 2] # just the blue layer
print(red.shape) # (140, 140)
That extra 3 on the end of the shape is a big deal. An image is no longer a
flat grid — it's a small stack of grids. In the next chapter
we'll give that stack its proper name and see why thinking of images this way is
exactly what neural networks need.
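One detail of the shape is easy to trip over: the array puts height first, while Pillow reports an image's size as (width, height) and addresses pixels as (x, y). The sketch below uses a made-up 4x2 image, not the chapter's sky.png, to show the two orderings side by side.

```python
from PIL import Image
import numpy as np

# A hypothetical blank image, 4 pixels wide and 2 pixels tall.
pil_img = Image.new("RGB", (4, 2), (0, 0, 0))

# Pillow addresses pixels as (x, y): column 3, row 1.
pil_img.putpixel((3, 1), (255, 215, 70))

arr = np.array(pil_img)

print(pil_img.size)          # (4, 2) -> width, height
print(arr.shape)             # (2, 4, 3) -> height, width, channels
print(arr[1, 3].tolist())    # [255, 215, 70] -> row 1, column 3
```

So `arr[row, column]` in NumPy corresponds to the pixel at `(x=column, y=row)` in Pillow.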