AI language models like GPT-5, Claude, and Gemini do not process raw text directly. Instead, they break text into smaller units called tokens through a process called tokenization. Most modern large language models (LLMs) use Byte Pair Encoding (BPE), which splits text into subword units based on frequency patterns learned during training.
How BPE tokenization works: The algorithm starts with individual characters, then iteratively merges the most frequent adjacent pairs to build a vocabulary of common subword units. Common words like "the" become a single token, while rare words get split into multiple tokens. For example, "tokenization" might become ["token", "ization"]: two tokens instead of one.
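The merge loop described above can be sketched in a few lines of Python. This is a toy illustration on a made-up three-word corpus, not the vocabulary or merge rules of any real model; the corpus and the number of merge rounds are arbitrary choices for the example.

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across the corpus and return the top one."""
    pairs = Counter()
    for symbols, freq in words.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[tuple(out)] = freq
    return merged

# Toy corpus: each word as a tuple of characters, mapped to its frequency.
corpus = {
    tuple("the"): 5,
    tuple("then"): 2,
    tuple("token"): 3,
}
for _ in range(3):  # three merge rounds
    pair = most_frequent_pair(corpus)
    corpus = merge_pair(corpus, pair)
```

After a couple of rounds the frequent word "the" collapses into a single symbol, while rarer words remain split across several subword units, which is exactly the behavior the paragraph describes.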
Why token counts matter for costs: AI providers charge per token for both input (your prompt) and output (the model's response). Output tokens typically cost 3-4x more than input tokens. Understanding your token usage helps you choose the right model and optimize prompts to reduce API costs. TokenCalc uses the official tiktoken library to provide exact counts for OpenAI models and close estimates for Claude and Gemini.
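The cost arithmetic above is straightforward to sketch. The prices below are purely illustrative placeholders (real rates vary by model and change over time; check your provider's pricing page), chosen only to show the input/output asymmetry.

```python
# Hypothetical per-1M-token prices in USD (illustrative assumptions,
# NOT real provider rates). Output is assumed 4x the input rate.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 12.00

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one API call from its token counts."""
    return (input_tokens / 1_000_000 * INPUT_PRICE_PER_M
            + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_M)

# A 10,000-token prompt with a 2,000-token response:
cost = estimate_cost(10_000, 2_000)
print(f"${cost:.4f}")  # → $0.0540
```

Note that the 2,000 output tokens contribute almost as much to the total as the 10,000 input tokens, which is why trimming verbose responses often saves more than trimming prompts.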