OpenAI API Pricing (February 2026)
Current pricing for all OpenAI models. Prices are per 1 million tokens. Use our calculator to get exact costs for your prompts.
| Model | Input Price | Output Price | Context Window |
|---|---|---|---|
| GPT-5 (Flagship) | $15.00 / 1M | $45.00 / 1M | 256K tokens |
| GPT-5.2 | $18.00 / 1M | $54.00 / 1M | 256K tokens |
| GPT-4o (Recommended) | $2.50 / 1M | $10.00 / 1M | 128K tokens |
| GPT-4o-mini (Best Value) | $0.15 / 1M | $0.60 / 1M | 128K tokens |
| o3 | $10.00 / 1M | $40.00 / 1M | 200K tokens |
| o3-mini | $1.10 / 1M | $4.40 / 1M | 200K tokens |
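The table above translates directly into a small cost helper. The sketch below is illustrative only; the `PRICES` dict and `request_cost` function are our own names, not part of any OpenAI SDK:

```python
# Prices from the table above, as (input, output) dollars per 1M tokens.
PRICES = {
    "gpt-5": (15.00, 45.00),
    "gpt-5.2": (18.00, 54.00),
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "o3": (10.00, 40.00),
    "o3-mini": (1.10, 4.40),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed per-1M-token rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# A 2,000-token prompt with a 500-token reply on GPT-4o:
print(f"${request_cost('gpt-4o', 2000, 500):.4f}")  # $0.0100
```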
Pro Tip: Prompt Caching
OpenAI offers a 50% discount on cached input tokens. If your system prompt stays the same across requests, subsequent calls automatically bill the cached prefix at the lower rate.
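To see what the discount is worth, here is a rough sketch assuming a flat 50% rate on cached input tokens at GPT-4o's listed input price (the function and constants are our own, for illustration):

```python
INPUT_PRICE = 2.50       # $ per 1M input tokens (GPT-4o, from the table above)
CACHED_DISCOUNT = 0.50   # 50% off cached input tokens

def input_cost(total_tokens: int, cached_tokens: int) -> float:
    """Input cost when `cached_tokens` of the prompt hit the prefix cache."""
    fresh = total_tokens - cached_tokens
    effective = fresh + cached_tokens * CACHED_DISCOUNT
    return effective * INPUT_PRICE / 1_000_000

# 10,000-token prompt where an 8,000-token system prefix is cached:
print(f"cached: ${input_cost(10_000, 8_000):.4f}")    # $0.0150
print(f"uncached: ${input_cost(10_000, 0):.4f}")      # $0.0250
```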
How to Count OpenAI Tokens
OpenAI uses the tiktoken library for tokenization. Here are the encoding types for each model:
Encoding by Model
- GPT-5, GPT-4o, GPT-4o-mini: o200k_base encoding
- o3, o3-mini: o200k_base encoding
- GPT-3.5-turbo: cl100k_base encoding
Python Code Example
```python
import tiktoken

# For GPT-5, GPT-4o, GPT-4o-mini
enc = tiktoken.encoding_for_model("gpt-4o")
tokens = enc.encode("Your text here")
print(f"Token count: {len(tokens)}")

# Calculate cost at GPT-4o's input rate ($2.50 / 1M tokens)
input_tokens = len(tokens)
cost = (input_tokens / 1_000_000) * 2.50
print(f"Input cost: ${cost:.6f}")
```
Token Estimation Rules
- 1 token = approximately 4 characters in English
- 1 token = approximately 0.75 words
- 100 words = approximately 133 tokens
- Code typically uses more tokens than natural language
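When tiktoken isn't available, the rules of thumb above can be coded up directly. This is a rough estimator only, accurate for typical English prose, not an exact count:

```python
def estimate_tokens_from_chars(text: str) -> int:
    """Rough estimate: ~4 characters per token in English."""
    return round(len(text) / 4)

def estimate_tokens_from_words(word_count: int) -> int:
    """Rough estimate: ~0.75 words per token (so 100 words -> ~133 tokens)."""
    return round(word_count / 0.75)

print(estimate_tokens_from_words(100))  # 133
```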
How to Reduce OpenAI API Costs
1. Choose the Right Model
Don't use GPT-5 for everything. Match model capability to task complexity:
- Simple tasks (classification, extraction): GPT-4o-mini ($0.15/1M)
- Standard tasks (chat, generation): GPT-4o ($2.50/1M)
- Complex reasoning: GPT-5 or o3 only when needed
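The gap between tiers is easier to feel in dollars. A quick illustration pricing the same 1M-input-token classification workload at each tier (input tokens only, for simplicity; rates from the table above):

```python
workload = 1_000_000  # input tokens for a simple classification job
rates = {"gpt-4o-mini": 0.15, "gpt-4o": 2.50, "gpt-5": 15.00}  # $ / 1M input

costs = {model: workload / 1_000_000 * price for model, price in rates.items()}
for model, dollars in costs.items():
    print(f"{model}: ${dollars:.2f}")
# GPT-4o-mini handles the same workload for 1/100th of GPT-5's price.
```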
2. Optimize Your Prompts
- Remove unnecessary whitespace and formatting
- Use concise system prompts; they're sent with every request
- Avoid repeating instructions in every message
3. Use Prompt Caching
OpenAI automatically caches prompt prefixes. Keep your system prompt consistent to get 50% off cached input tokens.
4. Batch Requests
Use the Batch API for non-time-sensitive workloads to get 50% discount on all tokens.
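Here's what that discount looks like on a concrete job; a back-of-the-envelope sketch (our own helper, not an OpenAI API), assuming a flat 50% off both input and output tokens:

```python
def batch_cost(input_tokens, output_tokens, input_price, output_price,
               discount=0.50):
    """Dollar cost for a job, with the Batch API discount applied."""
    realtime = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return realtime * (1 - discount)

# 5M input + 1M output tokens on GPT-4o-mini ($0.15 in / $0.60 out per 1M):
realtime = batch_cost(5_000_000, 1_000_000, 0.15, 0.60, discount=0.0)
batched = batch_cost(5_000_000, 1_000_000, 0.15, 0.60)
print(f"realtime: ${realtime:.2f}  batched: ${batched:.2f}")  # $1.35 vs $0.68
```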
Frequently Asked Questions
How much does ChatGPT API cost per 1000 tokens?
ChatGPT API costs vary by model. GPT-4o costs $2.50 per 1M input tokens ($0.0025 per 1K) and $10 per 1M output tokens. GPT-4o-mini is much cheaper at $0.15 per 1M input tokens ($0.00015 per 1K). GPT-5 costs $15 per 1M input tokens. For the most cost-effective option, use GPT-4o-mini for simple tasks.
How many tokens can GPT-5 process?
GPT-5 has a context window of 256,000 tokens, allowing it to process approximately 192,000 words or 400+ pages of text in a single request. This is a significant upgrade from GPT-4's 128K context window.
What's the difference between input and output tokens?
Input tokens are the tokens in your prompt (what you send to the API), while output tokens are in the model's response. OpenAI charges different rates for each; output tokens are typically 3-4x more expensive because generating text requires more computation.
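The asymmetry shows up clearly in numbers: two requests with the same total token count but opposite input/output splits, priced at GPT-4o's listed rates (function name is ours, purely illustrative):

```python
GPT4O_INPUT, GPT4O_OUTPUT = 2.50, 10.00  # $ / 1M tokens, from the table above

def gpt4o_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * GPT4O_INPUT + output_tokens * GPT4O_OUTPUT) / 1_000_000

# Same 1,000 total tokens, opposite splits:
print(f"input-heavy:  ${gpt4o_cost(900, 100):.6f}")  # $0.003250
print(f"output-heavy: ${gpt4o_cost(100, 900):.6f}")  # $0.009250
```

Summarization prompts (long input, short output) are therefore much cheaper per token than open-ended generation.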
How do I count tokens in Python?
Use the official tiktoken library: `import tiktoken; enc = tiktoken.encoding_for_model('gpt-4o'); tokens = enc.encode('your text'); print(len(tokens))`. TokenCalc uses this same library for 100% accurate counts.