Tokens, Context Windows, and Temperature: What They Mean for You (Topic 2) in Module 2 – AI-Basics (BG)


What Is a Token?

AI models don't process text character by character or word by word — they process tokens. A token is roughly a word fragment:

  • 'ChatGPT' = 1 token
  • 'understanding' = 1 token
  • 'unbelievably' = 2–3 tokens
  • A typical English word ≈ 1.3 tokens
  • 1,000 tokens ≈ ~750 words
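The rules of thumb above can be turned into a quick back-of-the-envelope estimator. (`estimate_tokens` and `estimate_words` are illustrative helpers invented for this sketch; for exact counts you'd use a real tokenizer such as OpenAI's tiktoken library.)

```python
def estimate_tokens(word_count: int) -> int:
    # Rule of thumb from above: a typical English word ≈ 1.3 tokens.
    return round(word_count * 1.3)

def estimate_words(token_count: int) -> int:
    # Inverse rule of thumb: 1,000 tokens ≈ ~750 words.
    return round(token_count * 0.75)

# A 1-million-token context window is roughly 750,000 words.
print(estimate_words(1_000_000))  # → 750000
```

These estimates are for English text; code, other languages, and unusual punctuation can tokenize very differently.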

Why it matters for you: AI pricing, limits, and speed are all measured in tokens. When a service says you have a '1 million token context window,' that's roughly 750,000 words — about 10 average novels.

What Is a Context Window?

The context window is how much text the AI can 'see' at once — including your entire conversation history, any documents you've shared, and the AI's own previous responses.

Think of it like working memory: once text falls outside the context window, the AI effectively 'forgets' it.

Model                  Approximate context window
GPT-3.5 (original)     ~4,000 tokens (~3,000 words)
GPT-4 (2023)           8,000–128,000 tokens
Claude 3.7 Sonnet      200,000 tokens (~150,000 words)
Gemini 1.5 Pro         1,000,000 tokens (~750,000 words)

What this means practically: Long conversations may 'lose' early context. Pasting a large document means the model can reference all of it. Context limits are why AI sometimes seems to 'forget' something you told it earlier in a long conversation.
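That "forgetting" behavior can be sketched in a few lines. This is a hypothetical illustration of how a chat application might keep a conversation inside a fixed token budget (`trim_history` and the `count_tokens` callback are invented for this sketch, not any provider's actual API): the oldest messages are dropped first, which is exactly why early context disappears in long chats.

```python
def trim_history(messages, max_tokens, count_tokens):
    """Keep the most recent messages whose combined token
    count fits within max_tokens; drop the oldest first."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest to oldest
        cost = count_tokens(msg)
        if total + cost > max_tokens:
            break  # this message (and everything older) falls out of the window
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

# Toy example: count one "token" per whitespace-separated word.
history = ["remember my name is Ana", "what's the weather", "tell me a joke"]
print(trim_history(history, max_tokens=7, count_tokens=lambda m: len(m.split())))
```

In the toy example the budget of 7 word-tokens fits the last two messages but not the first, so the model never "sees" the user's name again.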

What Is Temperature?

Temperature controls how creative or predictable the AI's responses are:

  • Low temperature (0–0.3): AI picks the most statistically likely response. More consistent, factual, predictable. Good for: data extraction, code, factual Q&A.
  • High temperature (0.7–1.0): AI samples from a wider range of possible next tokens. More varied, creative, sometimes surprising. Good for: brainstorming, creative writing, generating options.

Most chat interfaces don't expose temperature directly. But understanding it explains why asking an AI the same question twice can produce different answers — at normal temperatures, there's intentional variation in the output.
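The mechanics behind those two bullet points fit in one small function. This is a minimal sketch of temperature-scaled softmax, the standard way temperature is applied to a model's next-token scores (the example logits are made up; real models produce one score per vocabulary entry):

```python
import math

def temperature_softmax(logits, temperature):
    """Convert next-token logits to probabilities. Lower temperature
    sharpens the distribution; higher temperature flattens it."""
    if temperature <= 0:
        # Temperature 0 degenerates to greedy decoding: all mass on the argmax.
        probs = [0.0] * len(logits)
        probs[logits.index(max(logits))] = 1.0
        return probs
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
print(temperature_softmax(logits, 0.2))  # low temp: top token dominates
print(temperature_softmax(logits, 1.0))  # normal temp: real chance of variety
```

With the same three scores, temperature 0.2 gives the top token roughly 99% of the probability, while temperature 1.0 leaves it around 66% — which is why re-running the same prompt at normal temperature can produce a different answer.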

The Practical Takeaway

You don't need to tune these settings manually in most tools. But knowing they exist shapes smarter use:

  • For precise factual tasks: keep responses concise and verify the output
  • For creative tasks: re-run the prompt if the first output isn't what you wanted
  • For long documents: paste them early in the conversation to keep them in context
