How much does Gemini API cost?

Gemini pricing varies by model. Gemini 3 Flash is extremely affordable at $0.075/1M input tokens, while Gemini 3 Pro and Gemini 2.5 Pro each cost $1.25/1M input tokens. All models offer context caching for additional discounts, and Google also offers a free tier with limited usage.

What is Gemini's context window size?

Gemini 3 Pro and Flash support up to 2 million tokens in their context window, the largest of any major AI model. This is equivalent to approximately 1.5 million words or 3,000+ pages of text, enabling analysis of entire codebases or book-length documents.
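The word and page figures can be reproduced with two common rules of thumb. Both are assumptions for illustration, not numbers published by Google: roughly 0.75 English words per token and roughly 500 words per page. A minimal sketch of the arithmetic:

```python
# Rules of thumb (assumptions, not official figures). Real ratios vary
# with language, tokenizer, and page layout.
WORDS_PER_TOKEN = 0.75
WORDS_PER_PAGE = 500


def context_capacity(tokens: int) -> tuple[int, int]:
    """Return the approximate (words, pages) that fit in `tokens` tokens."""
    words = int(tokens * WORDS_PER_TOKEN)
    pages = words // WORDS_PER_PAGE
    return words, pages


words, pages = context_capacity(2_000_000)
print(f"{words:,} words, about {pages:,} pages")
```

Code and non-English text usually tokenize less efficiently than English prose, so treat the result as an upper-end estimate.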

Does Google Gemini have a free tier?

Yes, Google offers a free tier for the Gemini API through AI Studio. The free tier limits requests per minute and per day: the Gemini 3 Flash free tier allows 15 requests per minute and 1,500 per day. For production usage, you'll need to set up billing.

How does Gemini context caching work?

Gemini context caching stores parts of your prompt for reuse. Cached tokens cost 75% less than regular tokens. Unlike Claude's automatic caching, you need to explicitly create a cached context using the API. Caches have a minimum TTL of 5 minutes and cost $0.001/1K tokens/hour for storage.
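Whether caching pays off depends on how often the cached prefix is reused, because storage is billed by the hour. The sketch below uses only the figures quoted above (75% discount, $0.001 per 1K tokens per hour). It assumes, as a simplification, that every request reading the cache is billed at the discounted rate and it ignores any one-time cache creation charge. The function names are hypothetical, not part of any Gemini SDK.

```python
CACHE_DISCOUNT = 0.75            # cached tokens cost 75% less
STORAGE_PER_1K_PER_HOUR = 0.001  # USD per 1K cached tokens per hour


def cost_without_cache(prefix_tokens: int, requests: int,
                       input_price_per_m: float) -> float:
    """Cost in USD of resending the same prefix on every request."""
    return requests * prefix_tokens / 1_000_000 * input_price_per_m


def cost_with_cache(prefix_tokens: int, requests: int, hours: float,
                    input_price_per_m: float) -> float:
    """Cost in USD of discounted cache reads plus hourly storage."""
    discounted_price = input_price_per_m * (1 - CACHE_DISCOUNT)
    reads = requests * prefix_tokens / 1_000_000 * discounted_price
    storage = prefix_tokens / 1_000 * STORAGE_PER_1K_PER_HOUR * hours
    return reads + storage


def break_even_requests_per_hour(input_price_per_m: float) -> float:
    """Requests per hour at which per-request savings equal storage cost."""
    saving_per_token = input_price_per_m * CACHE_DISCOUNT / 1_000_000
    storage_per_token_hour = STORAGE_PER_1K_PER_HOUR / 1_000
    return storage_per_token_hour / saving_per_token


# Example: a 100,000-token prefix reused 100 times within one hour
# at the $1.25/1M input price.
print(f"no cache: ${cost_without_cache(100_000, 100, 1.25):.3f}")
print(f"cached:   ${cost_with_cache(100_000, 100, 1, 1.25):.3f}")
print(f"break-even at $0.075/1M: "
      f"{break_even_requests_per_hour(0.075):.1f} requests/hour")
```

Under these assumptions the break-even point is independent of prefix size: at $1.25/1M input, caching pays for itself at just over one reuse per hour, while at $0.075/1M it takes roughly 18 reuses per hour.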

Gemini Token Calculator: Gemini 3 Pro and Flash Pricing

Google Gemini API Pricing (February 2026)

Current pricing for all Gemini models. Google offers extremely competitive pricing, especially for Flash models. Prices shown are per 1 million tokens.

Model	Input Price	Output Price	Context Window
Gemini 3 Pro (Flagship)	$1.25 / 1M	$5.00 / 1M	2M tokens
Gemini 3 Flash (Best Value)	$0.075 / 1M	$0.30 / 1M	2M tokens
Gemini 2.5 Pro	$1.25 / 1M	$5.00 / 1M	1M tokens
Gemini 2.5 Flash	$0.075 / 1M	$0.30 / 1M	1M tokens
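To turn the table into a per-request estimate, multiply each token count by its per-million rate. The sketch below hard-codes the prices listed above. The dictionary keys are illustrative labels chosen for this example, not official API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Prices copied from the table above; keys are illustrative labels.
PRICES = {
    "gemini-3-pro": Price(1.25, 5.00),
    "gemini-3-flash": Price(0.075, 0.30),
    "gemini-2.5-pro": Price(1.25, 5.00),
    "gemini-2.5-flash": Price(0.075, 0.30),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for a single request."""
    price = PRICES[model]
    total = (input_tokens * price.input_per_m
             + output_tokens * price.output_per_m)
    return total / 1_000_000


# Example: 10,000 input tokens and 1,000 output tokens per request.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 10_000, 1_000):.5f}")
```

Multiply the per-request figure by expected request volume for a monthly estimate. Context caching and free-tier usage are not modeled here.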

2,000,000-token context window, the largest of any major AI model, equivalent to roughly 1.5 million words or 3,000+ pages.

Free Tier Available

Google offers a generous free tier through AI Studio:

Gemini 3 Flash: 15 requests/minute, 1,500 requests/day
Gemini 3 Pro: 2 requests/minute, 50 requests/day
Perfect for prototyping and small projects
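A quick way to judge whether a prototype fits these limits is to compare its peak and daily request volume against them. The sketch below encodes the limits listed above. The dictionary keys and helper names are hypothetical and not part of any Google SDK.

```python
# Free-tier limits from the list above: requests per minute (rpm)
# and requests per day (rpd). Keys are illustrative labels.
FREE_TIER = {
    "gemini-3-flash": {"rpm": 15, "rpd": 1_500},
    "gemini-3-pro": {"rpm": 2, "rpd": 50},
}


def fits_free_tier(model: str, peak_rpm: int, daily_requests: int) -> bool:
    """True if the workload stays within both free-tier limits."""
    limits = FREE_TIER[model]
    return peak_rpm <= limits["rpm"] and daily_requests <= limits["rpd"]


def min_seconds_between_requests(model: str) -> float:
    """Even spacing between requests that respects the per-minute limit."""
    return 60 / FREE_TIER[model]["rpm"]


def minutes_until_daily_cap(model: str) -> float:
    """Minutes of sustained max-rate traffic before the daily cap is hit."""
    limits = FREE_TIER[model]
    return limits["rpd"] / limits["rpm"]


print(fits_free_tier("gemini-3-flash", peak_rpm=10, daily_requests=1_000))
print(min_seconds_between_requests("gemini-3-pro"))
print(minutes_until_daily_cap("gemini-3-flash"))
```

Note that running at the per-minute maximum exhausts the daily quota quickly (100 minutes for Flash, 25 minutes for Pro), so the daily limit is usually the binding constraint.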

Why Gemini 3 Flash is So Cheap

Gemini 3 Flash at $0.075/1M tokens is one of the cheapest production-ready AI models available. Here's how it compares:

Model	Input Price	Price Ratio
Gemini 3 Flash	$0.075 / 1M	1x (baseline)
DeepSeek V3	$0.07 / 1M	0.9x (slightly cheaper)
GPT-4o-mini	$0.15 / 1M	2x more expensive
Claude Haiku 4.5	$0.25 / 1M	3.3x more expensive
GPT-4o	$2.50 / 1M	33x more expensive
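The ratios in the table are each model's input price divided by the Gemini 3 Flash input price. A short sketch that recomputes them from the listed prices:

```python
# Input prices in USD per 1M tokens, copied from the table above.
INPUT_PRICES = {
    "Gemini 3 Flash": 0.075,
    "DeepSeek V3": 0.07,
    "GPT-4o-mini": 0.15,
    "Claude Haiku 4.5": 0.25,
    "GPT-4o": 2.50,
}


def price_ratio(model: str, baseline: str = "Gemini 3 Flash") -> float:
    """Return how many times more expensive `model` is than `baseline`."""
    return INPUT_PRICES[model] / INPUT_PRICES[baseline]


for name in INPUT_PRICES:
    print(f"{name}: {price_ratio(name):.1f}x")
```

The comparison covers input tokens only. Output-heavy workloads would need the same calculation on output prices.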

Pro Tip: Context Caching

Gemini offers context caching with a 75% discount on cached tokens. For prompts you reuse, create a cached context through the API. Storage costs $0.001/1K tokens/hour, with a minimum TTL of 5 minutes.

When to Use Gemini

Gemini 3 Pro: Best For

Complex reasoning and analysis tasks
Long document processing (2M context)
Code generation and review
Multimodal tasks (image, video, audio)

Gemini 3 Flash: Best For

High-volume, cost-sensitive workloads
Simple classification and extraction
Real-time applications requiring speed
Prototyping and development

Frequently Asked Questions

What is the cheapest AI model in 2026?

Gemini 3 Flash at $0.075/1M tokens and DeepSeek V3 at $0.07/1M are the cheapest production-ready AI models in 2026. Both offer excellent quality for their price point.

Does Gemini have a free tier?

Yes! Google AI Studio offers a free tier. Gemini 3 Flash: 15 requests/minute, 1,500/day. Gemini 3 Pro: 2 requests/minute, 50/day. Perfect for prototyping.

How big is Gemini's context window?

Gemini 3 models support up to 2 million tokens, the largest of any major AI model. This is approximately 1.5 million words or 3,000+ pages of text.

Gemini vs GPT-4o: which is the better value?

On cost, Gemini wins. By input price, Gemini 3 Flash is about 33x cheaper than GPT-4o ($0.075 vs $2.50 per 1M tokens) and Gemini 3 Pro is 2x cheaper ($1.25 vs $2.50). GPT-4o may offer slightly better quality for some tasks, but Gemini provides excellent value.
