Chapter 3 · Part 2
The CPU: a few brilliant generalists
The CPU (central processing unit) is the chip in everything — your laptop, phone, the server answering this page. Its design philosophy is the brilliant mathematician from last chapter: a small number of very powerful, very flexible cores, each optimized to rip through one stream of instructions as fast as possible.
A CPU has just a few cores — but each is large and powerful.
Built for latency and flexibility
Look at where a CPU spends its transistors and you see its priorities. Only a small fraction goes to actual arithmetic; a huge share goes to making a single thread fast and smart:
- Large caches keep frequently-used data nanoseconds away instead of fetching from slow main memory.
- Branch prediction guesses which way an `if` will go and runs ahead, so the pipeline doesn't stall.
- Out-of-order execution reorders instructions on the fly to keep the core busy.
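To make the branch-prediction idea concrete, here is a minimal sketch of a 2-bit saturating counter, a classic textbook scheme. It is a toy model, not how any particular CPU works; real predictors are far more elaborate. The function name `misprediction_rate` and both test patterns are made up for this illustration.

```python
import random


def misprediction_rate(outcomes):
    """Run one 2-bit saturating counter over a sequence of branch outcomes.

    outcomes: list of booleans, True meaning the branch was taken.
    Returns the fraction of branches the toy predictor guessed wrong.
    """
    counter = 2  # states 0-1 predict "not taken", states 2-3 predict "taken"
    wrong = 0
    for taken in outcomes:
        predicted_taken = counter >= 2
        if predicted_taken != taken:
            wrong += 1
        # Nudge the counter toward what actually happened, clamped to 0..3.
        counter = min(counter + 1, 3) if taken else max(counter - 1, 0)
    return wrong / len(outcomes) if outcomes else 0.0


# A typical loop branch: taken 9 times, then falls through once, repeated.
loop_pattern = ([True] * 9 + [False]) * 100

# A data-dependent branch on coin-flip input (seeded so runs are repeatable).
rng = random.Random(0)
random_pattern = [rng.random() < 0.5 for _ in range(1000)]

loop_rate = misprediction_rate(loop_pattern)
random_rate = misprediction_rate(random_pattern)

print(f"loop branch:   {loop_rate:.1%} mispredicted")  # 10.0%
print(f"random branch: {random_rate:.1%} mispredicted")
```

The regular loop branch is guessed wrong only once per pass, at the exit. The coin-flip branch has no pattern to learn, so the predictor is wrong about half the time, and on real hardware each wrong guess means throwing away the work done while running ahead.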
All of that machinery exists to do one thing after another, quickly and correctly, no matter how unpredictable the code. A modern CPU has maybe 4 to 64 cores — powerful, but nowhere near the thousands you'd want for our matrix mountain.
Great at everything except bulk matrix math
This makes the CPU the right tool for the messy, branchy, decision-heavy code that makes up most software: operating systems, databases, the logic gluing an AI app together, and running smaller models where simplicity beats raw throughput. It can absolutely do matrix multiplication; it's just that with only a few dozen cores, it's the brilliant mathematician trying to do a million sums single-handedly.
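A rough sketch of where the "million sums" come from: multiplying two 100 × 100 matrices takes 100³ = 1,000,000 multiply-adds, and each output cell is an independent dot product. The helper names `matmul` and `ops_per_worker` are made up for this example, and the worker counts are illustrative. The arithmetic assumes the work splits perfectly, ignoring memory traffic and the several multiply-adds per cycle that real cores manage with vector instructions.

```python
def matmul(a, b):
    """Naive matrix multiply on lists of lists; also counts multiply-adds."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    multiply_adds = 0
    for i in range(rows):
        for j in range(cols):
            # Each output cell is its own dot product: one row of `a` times
            # one column of `b`. It never depends on any other output cell.
            total = 0
            for p in range(inner):
                total += a[i][p] * b[p][j]
                multiply_adds += 1
            out[i][j] = total
    return out, multiply_adds


def ops_per_worker(n, workers):
    """Multiply-adds each worker handles for an n x n product, split evenly."""
    total = n ** 3
    return (total + workers - 1) // workers  # ceiling division


product, ops = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)  # [[19, 22], [43, 50]]
print(ops)      # 8

# 100 x 100 matrices: one million multiply-adds to share out.
for workers in (1, 16, 64, 10_000):
    print(f"{workers:>6} workers -> {ops_per_worker(100, workers):>9,} sums each")
```

With 64 workers, each one still has 15,625 sums to get through. With 10,000 simple workers, one per output cell, each has only 100.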
Where we're headed
If the bottleneck is "not enough workers for a giant pile of independent sums," the fix is obvious: build a chip that trades a few brilliant cores for thousands of simple ones. That chip was invented for video games — and accidentally became the engine of the AI boom. Next: the GPU.