Chapter 6 · Part 3

Why it makes things up

Everything in this course has pointed at one mechanism: the model reads the text so far and predicts a plausible next token, over and over. Notice the word that's missing from that sentence: true. Nowhere in the loop does anything check whether the text is correct. The loop only scores how likely each continuation is, which amounts to how right it sounds.
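To make that loop concrete, here is a minimal sketch. It is not a real language model: the word table and its counts are invented for illustration, and a real model predicts from the whole context rather than only the previous word. What carries over is the shape of the loop.

```python
import random

# Toy "model": for each previous word, the words that followed it in some
# imaginary training text, with made-up counts. Not a real language model.
NEXT_WORD_COUNTS = {
    "<start>": {"water": 5},
    "water": {"boils": 4, "freezes": 1},
    "boils": {"at": 5},
    "freezes": {"at": 5},
    "at": {"100": 3, "0": 2},
    "100": {"degrees": 5},
    "0": {"degrees": 5},
    "degrees": {"<end>": 5},
}

def generate(rng, max_tokens=10):
    """Repeatedly sample a plausible next word. No step here checks facts."""
    word, output = "<start>", []
    for _ in range(max_tokens):
        candidates = NEXT_WORD_COUNTS[word]
        # Pick the next word in proportion to how often it followed this one.
        word = rng.choices(list(candidates), weights=list(candidates.values()))[0]
        if word == "<end>":
            break
        output.append(word)
    return " ".join(output)

rng = random.Random(0)
for _ in range(5):
    print(generate(rng))
```

Every sentence this produces is fluent, but nothing stops it from pairing "freezes" with "100" or "boils" with "0". Each step looked fine on its own, and no step asked whether the whole was true.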

That's the whole story behind hallucinations. When you ask a question that presupposes an answer exists, the model fills the blank with the most fluent, plausible-looking completion, whether or not such a fact exists. Ask who wrote a book that was never written, and the author-shaped gap still gets a real-sounding name.

Scroll through three prompts and watch the same confident machinery handle a fact it knows and a fact it doesn't.

Ask something common — 'water boils at…' — and the most plausible next token happens to be the correct one. ✓


Plausible, not true

A language model has no database to look things up in. Its knowledge is smeared across billions of weights as statistical patterns of language. When a fact appeared often and consistently in training (water boiling at 100 °C, who wrote Pride and Prejudice), the most probable completion lines up with reality, and you get a correct answer.

But when the fact is rare, missing, or invented, there's no probability spike for "the truth." The model still has to emit something, so it produces the answer that best fits the shape of the prompt: a plausible name, a believable date, a real-looking citation. It's not lying — it has no concept of a lie. It's doing exactly what it always does.
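A small numeric sketch of the difference between a probability spike and no spike. The scores and the surnames below are made up for illustration; they are not taken from any real model.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores.values())
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Made-up scores for illustration only.
# A well-attested fact: one candidate towers over the rest.
common_fact = softmax({"100": 9.0, "90": 3.0, "212": 4.0, "50": 1.0})
# A rare or invented fact: several name-shaped candidates score about the same.
rare_fact = softmax({"Harris": 2.1, "Bennett": 2.0, "Clarke": 1.9, "Morgan": 1.8})

for label, probs in [("common", common_fact), ("rare", rare_fact)]:
    best = max(probs, key=probs.get)
    print(f"{label}: emits {best!r} with p={probs[best]:.2f}")
```

In the first case nearly all the probability sits on one token. In the second the winner holds well under half of it, yet a name still comes out, and the reader sees only the name, not the flat distribution behind it.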

Confidence is not knowledge

The most dangerous part is the tone. A hallucinated answer arrives in the same calm, authoritative voice as a correct one, because both are just high-probability text. Nothing in the output reliably signals "I'm not sure about this"; that confident register is itself a learned pattern, not a measure of how well-grounded the answer is.

This is why hallucinations are most common with specifics: exact quotes, dates, statistics, legal citations, API names, people's biographies. The general shape is easy to fake; the precise detail is what it doesn't actually know.

What actually helps

You can't make the mechanism check facts, but you can change what it's working with:

Give it the source. Paste the document, or use retrieval (RAG) to put the real text into the context window. Now the most plausible answer is the one the text supports, so long as the text you supplied is itself accurate.
Let it use tools. A calculator, a search call, or a database lookup replaces a guess with a real result.
Ask for sources and verify them — and treat any specific name, number, or citation as a claim to check, not a fact.
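As a rough sketch of the tool idea, the snippet below sends plain arithmetic to a small calculator and leaves everything else to the model. The names `calculator`, `answer`, and `guess_with_model` are hypothetical, and the routing is deliberately simplified: in real systems the model itself usually decides when to call a tool.

```python
import ast
import operator

# Arithmetic operators the toy calculator is allowed to evaluate.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def calculator(expression):
    """Safely evaluate +, -, *, / on numbers by walking the parsed expression."""
    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")
    return evaluate(ast.parse(expression, mode="eval").body)

def answer(question, guess_with_model):
    """Use the tool when the question is plain arithmetic, else fall back to the model."""
    try:
        return str(calculator(question))
    except (ValueError, SyntaxError):
        return guess_with_model(question)

# The lambda stands in for a model call; it is a placeholder, not a real API.
print(answer("1234 * 5678", guess_with_model=lambda q: "a plausible guess"))
print(answer("Who wrote Pride and Prejudice?", guess_with_model=lambda q: "a plausible guess"))
```

The first call is computed rather than predicted, so the digits are right by construction. The second still falls through to the model, which is where the other two habits (supplying the source, checking the claim) come in.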

ground.py — why retrieval cuts hallucinations

# Ungrounded: the model fills the blank from patterns alone.
answer = model.generate(question)            # may sound right, may be invented

# Grounded: put the real facts in the context first.
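docs    = search(knowledge_base, question)   # fetch relevant passages (a list of strings)
context = "\n\n".join(docs)                  # join them into one block of text
answer  = model.generate(f"{context}\n\nQuestion: {question}")  # now "plausible" ≈ "supported"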
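For a version you can actually run, here is a toy take on the grounded half. It makes several assumptions: the three-sentence knowledge base is a stand-in, `search` ranks passages by shared words (real retrieval typically uses embeddings or a search index), and `build_prompt` is a hypothetical helper. The model call itself is left out, since it depends on which model you use.

```python
import re

# A tiny stand-in knowledge base. A real system would hold far more text.
KNOWLEDGE_BASE = [
    "Pride and Prejudice is a novel by Jane Austen, first published in 1813.",
    "At sea-level pressure, water boils at 100 degrees Celsius.",
    "The context window is the amount of text a model can read at once.",
]

def words(text):
    """Lowercase a string and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def search(knowledge_base, question, top_k=1):
    """Rank passages by how many words they share with the question."""
    ranked = sorted(
        knowledge_base,
        key=lambda doc: len(words(doc) & words(question)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(docs, question):
    """Put the retrieved text ahead of the question so the model can read it."""
    context = "\n".join(docs)
    return f"Answer using only the text below.\n\n{context}\n\nQuestion: {question}"

question = "Who wrote Pride and Prejudice?"
prompt = build_prompt(search(KNOWLEDGE_BASE, question), question)
print(prompt)  # this string is what would be handed to the model
```

The answer is now sitting in the context window, so the most likely continuation is also the supported one. If retrieval fetches the wrong passage, or the knowledge base is wrong, the model will ground itself just as confidently in that.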

You now know how it works

That's the whole machine. Strip away the mystique and ChatGPT is one idea, repeated:

It reads text as tokens, not words.
At each step it produces a probability for every token in its vocabulary.
Temperature decides how boldly it samples from them.
It can only see what fits in its context window.
And because it optimizes for plausible rather than true, it can be confidently wrong.
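As a quick refresher on the middle of that list, here is a minimal sketch of probabilities plus temperature, assuming the common approach of dividing the scores by the temperature before normalising. The three scores are invented for illustration.

```python
import math
import random

def apply_temperature(logits, temperature):
    """Divide scores by the temperature, then normalise into probabilities."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# Made-up scores for three candidate next tokens.
logits = {"Austen": 4.0, "Bronte": 2.0, "Dickens": 1.0}

rng = random.Random(0)
for temperature in (0.2, 1.0, 2.0):
    probs = apply_temperature(logits, temperature)
    # Sample one token in proportion to its probability.
    token = rng.choices(list(probs), weights=list(probs.values()))[0]
    rounded = {tok: round(p, 2) for tok, p in probs.items()}
    print(f"T={temperature}: {rounded} -> sampled {token!r}")
```

Low temperature piles almost everything onto the top candidate; high temperature spreads the odds out. Neither setting adds a truth check. It only changes how adventurous the guess is.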

Put together, that's autocomplete taken to an astonishing extreme: powerful, useful, and worth understanding well enough to trust it only where it has earned that trust.

Thanks for reading. If this clicked, the other course takes the same visual approach to how AI sees images.