Part 1: What is Behind the Text Box? Understanding the "Internet Simulator"

April 29, 2026

When you open ChatGPT or Gemini, you see a simple blank box. You type a question, press Enter, and suddenly, words appear as if a "digital ghost" is thinking and typing back to you. It feels like magic, but to use it effectively, we need to pull back the curtain and see what is actually happening.

1. It is not "Thinking"—It is "Predicting"

The most important mental model to have is this: AI is a giant "Autocomplete" system.

Think about when you're texting on your phone and it suggests the next word. AI works on the same principle, just at a vastly larger scale. When you give it a prompt, it isn't "thinking" in the human sense; it is calculating probabilities to guess which word should come next.

  • Example: If you type "Emily buys three...", the AI looks at its vast memory and calculates that the next word is most likely "apples" (say, an 80% chance) or "oranges" (15%).
  • The "Biased Coin": Because it uses probabilities, it's like flipping a biased coin every time it picks a word. This is why the same question can get different answers each time—a property called stochasticity (the model is stochastic, not deterministic).
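The "biased coin" idea above can be sketched in a few lines of Python. The probabilities here are made up purely for illustration (they are not from any real model), but the sampling mechanism is the same in spirit: higher-probability words win more often, yet not every time.

```python
import random

# Toy next-word distribution for the prompt "Emily buys three..."
# (made-up probabilities for illustration, not from a real model)
next_word_probs = {"apples": 0.80, "oranges": 0.15, "books": 0.05}

def sample_next_word(probs, rng=random):
    """Pick the next word like flipping a biased coin:
    likely words win more often, but not always."""
    words = list(probs)
    weights = [probs[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

# Run the "coin flip" a few times: the result can differ on each run,
# which is why the same prompt can yield different completions.
for _ in range(5):
    print(sample_next_word(next_word_probs))
```

Because each pick is random, running this script twice will usually print different sequences, which is exactly the stochastic behavior described above.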

Token: AI doesn't see words like we do. It breaks text into small "chunks" called Tokens. A token can be a word, a part of a word, or even just a space. Think of them as the "atoms" of language that the AI uses to build sentences.
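To make the token idea concrete, here is a toy tokenizer. Real systems learn their vocabulary from data (for example, with Byte-Pair Encoding); this hand-picked vocabulary and greedy matching are only a sketch of the "chunking" behavior.

```python
# A toy "tokenizer": real tokenizers learn their vocabulary from data;
# this hand-picked vocabulary is only illustrative.
VOCAB = ["straw", "berry", "un", "believ", "able", " ", "is"]

def tokenize(text, vocab=VOCAB):
    """Greedily match the longest known chunk at each position,
    falling back to single characters for anything unknown."""
    tokens = []
    i = 0
    while i < len(text):
        match = max(
            (v for v in vocab if text.startswith(v, i)),
            key=len,
            default=text[i],  # unknown text: fall back to one character
        )
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("strawberry is unbelievable"))
# "strawberry" becomes two chunks: "straw" + "berry"
```

Notice that the model never "sees" the word strawberry as a whole; it sees two separate atoms, which hints at why letter-counting questions can trip it up.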

2. The "Know-it-all Professor" Stage (Base Model)

How did the AI get so good at guessing the next word? It spent months "reading" almost the entire public internet.

Imagine a "Professor" who has read every Wikipedia page, every news article, and every public forum, but has never been taught how to be an assistant. This is what experts call a Base Model.

  • If you ask this "Base Model" Professor: "What is the capital of France?", it might not answer you. Instead, it might write: "And what is the capital of Germany?".
  • Why? Because on the internet, questions are often followed by other questions (like in a quiz or a school test). The Base Model is simply simulating the patterns it saw online.

3. How the "Library" was Built (Pre-training)

The stage where the AI reads the entire internet to build its foundation is called Pre-training.

This isn't just a random data dump. Tech companies filter the data "aggressively" to make it useful:

  • Cleaning the "Trash": They remove malware, spam, and low-quality websites.
  • Safety First: They use algorithms to find and remove PII (Personally Identifiable Information) like social security numbers or private addresses.
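One filtering step from the list above can be sketched with a simple pattern match. Real pre-training pipelines use far more sophisticated PII detectors and many more rules; this regex for SSN-like strings is only an illustrative toy.

```python
import re

# Toy sketch of one PII-filtering step: masking SSN-like patterns.
# Real pipelines use far more sophisticated detectors; this regex
# is only illustrative.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scrub_pii(text):
    """Replace anything shaped like a social security number."""
    return SSN_PATTERN.sub("[REDACTED]", text)

print(scrub_pii("Contact John, SSN 123-45-6789, for details."))
# → Contact John, SSN [REDACTED], for details.
```

The same idea scales up: run many such detectors over trillions of tokens before the model ever sees the text.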

Pre-training: This is the "basic knowledge" phase. The AI reads trillions of tokens from the web to learn the statistical patterns of human language before it is ever taught to follow specific instructions.



Summary of Part 1: AI does not understand the world like we do; it understands numbers and patterns. It is a powerful "Internet Simulator" that has learned to mimic how humans write.

Recognizing that you are talking to a language predictor helps you stay grounded: treat AI as a brilliant first-draft generator, but remember that it is always just "guessing" the next best word based on what it read in its massive digital library.

In Part 2, we will look at Tokenization—why this "Next-Token Predictor" can solve PhD-level physics problems but sometimes can't count the letters in the word "strawberry".