
LLM Cost Calculator

Compare costs across 300+ AI models from OpenAI, Anthropic, Google, DeepSeek, and more. Your prompts stay on your machine.

Your prompts never leave your browser · 300+ AI models · 10+ providers · 6 pro tools · 100% free

Complete LLM Pricing Comparison (February 2026)

All prices per 1 million tokens. Sorted by input price from cheapest to most expensive.

Model                        Provider    Input    Output   Context
DeepSeek V3 (Cheapest)       DeepSeek    $0.07    $0.28    64K
Gemini 3 Flash (Best Value)  Google      $0.075   $0.30    2M
GPT-4o-mini                  OpenAI      $0.15    $0.60    128K
Claude Haiku 4.5             Anthropic   $0.25    $1.25    200K
o3-mini                      OpenAI      $1.10    $4.40    200K
Gemini 3 Pro                 Google      $1.25    $5.00    2M
DeepSeek R1                  DeepSeek    $2.00    $8.00    128K
GPT-4o                       OpenAI      $2.50    $10.00   128K
Claude Sonnet 4.5            Anthropic   $3.00    $15.00   200K
o3                           OpenAI      $10.00   $40.00   200K
Claude Opus 4.5              Anthropic   $12.00   $60.00   200K
GPT-5                        OpenAI      $15.00   $45.00   256K
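The rates above translate to per-request dollar amounts with simple arithmetic: tokens divided by one million, times the per-1M price. A minimal sketch (prices copied from the table; the dictionary keys are illustrative labels, not official API model IDs):

```python
# Estimate request cost from per-1M-token prices (rates from the table above).
PRICES = {  # model label: (input $/1M tokens, output $/1M tokens)
    "deepseek-v3": (0.07, 0.28),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-sonnet-4.5": (3.00, 15.00),
    "gpt-5": (15.00, 45.00),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request: tokens / 1M * price per 1M."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1_000_000 * in_price
            + output_tokens / 1_000_000 * out_price)

# 10K input + 1K output on Claude Sonnet 4.5:
# 0.01 * $3.00 + 0.001 * $15.00 = $0.045
print(f"${request_cost('claude-sonnet-4.5', 10_000, 1_000):.4f}")
```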

Best LLM by Use Case

Budget Tasks

  • DeepSeek V3. Cheapest overall
  • Gemini 3 Flash. Best context size
  • GPT-4o-mini. Most reliable

Coding

  • Claude Opus 4.5. Best overall
  • Claude Sonnet 4.5. Best value
  • DeepSeek R1. Budget option

Long Documents

  • Gemini 3 Pro. 2M context
  • GPT-5. 256K context
  • Claude Opus. 200K context

Chat / Conversation

  • GPT-4o. Best quality
  • Claude Sonnet. Balanced
  • Gemini Flash. Budget

How to Reduce Your LLM Costs

#1: Right-Size Your Model

Use smaller models for simple tasks. GPT-4o-mini or Claude Haiku can handle 80% of tasks at 10-100x lower cost than flagship models.
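The spread is easy to quantify. A rough sketch using the table's rates for GPT-4o-mini and GPT-5, with a made-up workload of 1,000 requests a day:

```python
# Compare monthly cost of the same traffic on a mini vs. a flagship model.
# Rates from the pricing table; the workload numbers are hypothetical.
MINI = (0.15, 0.60)        # GPT-4o-mini: $/1M input, $/1M output
FLAGSHIP = (15.00, 45.00)  # GPT-5

def monthly_cost(rates, requests_per_day=1_000, in_tok=2_000,
                 out_tok=500, days=30):
    in_price, out_price = rates
    tokens_in = requests_per_day * in_tok * days    # 60M tokens/month
    tokens_out = requests_per_day * out_tok * days  # 15M tokens/month
    return tokens_in / 1e6 * in_price + tokens_out / 1e6 * out_price

mini, flag = monthly_cost(MINI), monthly_cost(FLAGSHIP)
print(f"mini: ${mini:.2f}/mo, flagship: ${flag:.2f}/mo, ratio: {flag/mini:.0f}x")
```

At these (assumed) volumes the mini model costs $18/month versus $1,575/month for the flagship, squarely inside the 10-100x range.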

#2: Use Prompt Caching

Anthropic's Claude models offer a 90% discount on cached input tokens (cache reads); OpenAI offers 50%. If your requests share a repeated system prompt, enable caching.
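The saving from caching a shared system prompt can be estimated as follows (a sketch assuming the stated discount applies to every repeat read, and ignoring any cache-write surcharge a provider may add):

```python
def cached_input_cost(system_tokens, user_tokens, price_per_m,
                      cache_discount, calls):
    """Input cost in USD when the system prompt is cached after call #1.

    cache_discount: fraction off the input price for cache reads
    (0.90 for Claude, 0.50 for OpenAI, per the text above).
    """
    first = (system_tokens + user_tokens) * price_per_m / 1e6
    cached_read = system_tokens * price_per_m * (1 - cache_discount) / 1e6
    rest = (calls - 1) * (cached_read + user_tokens * price_per_m / 1e6)
    return first + rest

# 5K-token system prompt, 200-token user messages, 1,000 calls,
# Claude Sonnet input rate $3/1M (workload numbers are illustrative):
uncached = 1_000 * (5_000 + 200) * 3.00 / 1e6
cached = cached_input_cost(5_000, 200, 3.00, 0.90, 1_000)
print(f"uncached ${uncached:.2f} vs cached ${cached:.2f}")
```

Under these assumptions the input bill drops from $15.60 to about $2.11, because the 5K-token prompt is billed at 10% of list price on every call after the first.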

#3: Optimize Prompts

Remove unnecessary whitespace, be concise, and don't repeat instructions. Every token costs money.
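A minimal sketch of the idea, using the common rough rule of thumb of ~4 characters per token (actual counts depend on each model's tokenizer; use the provider's tokenizer for billing-accurate numbers):

```python
import re

def tighten(prompt: str) -> str:
    """Collapse runs of spaces/tabs and excess blank lines in a prompt."""
    return re.sub(r"[ \t]+", " ", re.sub(r"\n{3,}", "\n\n", prompt)).strip()

def rough_tokens(text: str) -> int:
    """Very rough token estimate (~4 chars/token); not billing-accurate."""
    return max(1, len(text) // 4)

before = "Summarize   the   following    report.\n\n\n\n\nBe concise.   "
after = tighten(before)
print(rough_tokens(before), "->", rough_tokens(after))
```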

Frequently Asked Questions

What is the cheapest AI model in 2026?

DeepSeek V3 ($0.07 per 1M input tokens) and Gemini 3 Flash ($0.075/1M) are the cheapest. For reliable quality, GPT-4o-mini ($0.15/1M) and Claude Haiku 4.5 ($0.25/1M) are excellent budget options.

How do I reduce my LLM API costs?

1) Use smaller models for simple tasks. 2) Enable prompt caching. 3) Optimize prompts. 4) Use batch APIs for non-urgent work. 5) Consider open-source models for high volume.

GPT-5 vs Claude Opus: which is better?

Both are flagship models. Claude Opus 4.5 is stronger for coding and cheaper on input ($12 vs $15 per 1M tokens); GPT-5 is cheaper on output ($45 vs $60 per 1M) and has a larger context window (256K vs 200K).
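At those rates the trade-off has a clean break-even: 12·I + 60·O = 15·I + 45·O gives O = I/5, so Opus is cheaper whenever output is under a fifth of input and GPT-5 is cheaper above that. A quick check (token counts are illustrative):

```python
def cost(in_price, out_price, in_tok, out_tok):
    """USD cost per request at $/1M-token rates."""
    return (in_tok * in_price + out_tok * out_price) / 1e6

OPUS, GPT5 = (12.00, 60.00), (15.00, 45.00)

# Break-even: 12*I + 60*O = 15*I + 45*O  =>  O = I / 5.
for out_tok in (1_000, 2_000, 4_000):  # with 10K input tokens
    o, g = cost(*OPUS, 10_000, out_tok), cost(*GPT5, 10_000, out_tok)
    print(f"out={out_tok}: Opus ${o:.3f} vs GPT-5 ${g:.3f}")
```

With 10K input tokens, Opus wins at 1K output, the two tie at 2K (exactly I/5), and GPT-5 wins at 4K.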

Which LLM has the largest context window?

Gemini 3 Pro and Flash have the largest at 2 million tokens; GPT-5 offers 256K and Claude models 200K. For very long documents, Gemini is the best choice.

Model-Specific Calculators