AI models don't process text character by character or word by word — they process tokens. A token is roughly a word fragment: a common word like 'the' is a single token, while a longer or rarer word may be split into several.
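A common rule of thumb is that one token corresponds to about 4 characters (or roughly 0.75 words) of English text. This is only a heuristic, not a real tokenizer — production models use learned subword tokenizers — but it's enough for a quick back-of-the-envelope estimate:

```python
def estimate_tokens(text: str) -> int:
    """Estimate token count using the ~4 characters/token heuristic.

    Real tokenizers split text into learned subword pieces; this is
    just a rough approximation for budgeting purposes.
    """
    return max(1, round(len(text) / 4))

sentence = "AI models process text as tokens, not characters."
print(estimate_tokens(sentence))  # ~12 tokens for this 49-character sentence
```

The same heuristic is where figures like "1 million tokens ≈ 750,000 words" come from.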
Why it matters for you: AI pricing, limits, and speed are all measured in tokens. When a service says you have a '1 million token context window,' that's roughly 750,000 words — about 10 average novels.
The context window is how much text the AI can 'see' at once — including your entire conversation history, any documents you've shared, and the AI's own previous responses.
Think of it like working memory: once text falls outside the context window, the AI effectively 'forgets' it.
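The working-memory analogy can be sketched as a simple message buffer: when the conversation's estimated token count exceeds the window, the oldest messages fall out first. (This drop-oldest policy is a hypothetical simplification — real services use their own trimming and summarization strategies.)

```python
def trim_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the most recent messages whose combined estimated token
    count fits in the context window; older messages are 'forgotten'."""
    def est(text: str) -> int:        # crude ~4 chars/token heuristic
        return max(1, len(text) // 4)

    kept, total = [], 0
    for msg in reversed(messages):    # walk newest -> oldest
        cost = est(msg)
        if total + cost > max_tokens:
            break                     # everything older falls outside the window
        kept.append(msg)
        total += cost
    return list(reversed(kept))

history = ["msg-%d: %s" % (i, "x" * 40) for i in range(10)]
print(trim_to_window(history, max_tokens=33))  # only the most recent messages survive
```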
| Model | Approximate context window |
|---|---|
| GPT-3.5 (original) | ~4,000 tokens (~3,000 words) |
| GPT-4 (2023) | 8,000–128,000 tokens |
| Claude 3.7 Sonnet | 200,000 tokens (~150,000 words) |
| Gemini 1.5 Pro | 1,000,000 tokens (~750,000 words) |
What this means practically: Long conversations may 'lose' early context. Pasting a large document means the model can reference all of it — provided it fits within the window. Context limits are why AI sometimes seems to 'forget' something you told it earlier in a long conversation.
Temperature controls how creative or predictable the AI's responses are: low values make the output more deterministic and repetitive, while high values make it more varied and surprising.
Most chat interfaces don't expose temperature directly. But understanding it explains why asking an AI the same question twice can produce different answers — at normal temperatures, there's intentional variation in the output.
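Under the hood, temperature divides the model's raw scores (logits) before they're turned into probabilities, then the next token is sampled from that distribution. A minimal sketch, using made-up logits for three candidate next words:

```python
import math
import random

def sample_with_temperature(logits: dict[str, float], temperature: float) -> str:
    """Sample a next token: scale logits by 1/temperature, apply softmax,
    then draw from the resulting distribution. Low temperature sharpens
    the distribution (predictable); high temperature flattens it (varied)."""
    scaled = {tok: l / temperature for tok, l in logits.items()}
    m = max(scaled.values())                      # subtract max for numerical stability
    weights = {tok: math.exp(s - m) for tok, s in scaled.items()}
    total = sum(weights.values())
    r = random.random() * total
    for tok, w in weights.items():
        r -= w
        if r <= 0:
            return tok
    return tok                                    # fallback for floating-point edge cases

logits = {"blue": 2.0, "cloudy": 1.0, "green": 0.1}   # hypothetical scores
print(sample_with_temperature(logits, temperature=0.2))  # almost always "blue"
print(sample_with_temperature(logits, temperature=2.0))  # noticeably more varied
```

This is why re-running the same prompt gives different answers: at any temperature above zero, the model is sampling, not picking a single fixed output.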
You don't need to tune these settings manually in most tools. But knowing they exist shapes smarter use:

- For precise factual tasks: keep responses concise and verify
- For creative tasks: re-run the prompt if the first output isn't what you wanted
- For long documents: paste them early in a conversation to keep them in context