LLM Cost Calculator

Free AI cost calculator to estimate costs for ChatGPT (GPT-5, GPT-5 mini), Claude (Opus 4.5, Sonnet 4.5, Haiku), Gemini, and other language models. Compare pricing across OpenAI, Anthropic, Google, and other AI providers, and optimize your AI API spending.

Models are Added Regularly

Related: See the LLM Context Window Comparison for help matching document sizes to model context windows.

Token Usage

Cost Breakdown

Select a model and enter token counts to see the estimated cost.

Compare AI Model Pricing - ChatGPT vs Claude vs Gemini

Quick price comparison of popular AI models (per 1 million tokens) from OpenAI, Anthropic, Google, Meta, DeepSeek, and more.

Model                            | Provider | Input ($/1M) | Output ($/1M) | Features
Qwen3 Max (<32k input)           | Alibaba  | $1.20        | $6.00         | Caching, Batch
Qwen3 Max (>32k input)           | Alibaba  | $2.40        | $12.00        | Batch, Tiered
Qwen3 Omni Flash (Text)          | Alibaba  | $0.43        | $1.66         | -
Qwen Flash                       | Alibaba  | $0.05        | $0.40         | Tiered
Qwen Long Latest                 | Alibaba  | $0.072       | $0.287        | -
Qwen Omni Turbo (Text)           | Alibaba  | $0.07        | $0.27         | -
Qwen Plus Latest (Non Thinking)  | Alibaba  | $0.40        | $1.20         | Tiered
Qwen Plus Latest (Thinking)      | Alibaba  | $0.40        | $4.00         | Tiered
Qwen Turbo (Non Thinking)        | Alibaba  | $0.05        | $0.20         | Batch
Qwen Turbo (Thinking)            | Alibaba  | $0.05        | $0.50         | Batch

The "Hidden" API Costs

Price per million tokens is only half the story. As a developer, here is what actually inflates your bill.

The "Thinking" Tax

Be very careful with reasoning models like GPT-5.2, Gemini 3 Pro, or DeepSeek-V3.2 (Thinking). They generate hidden "Chain of Thought" tokens that you never see in the final response, but you are billed for them.

A simple 500-token output might actually cost you 3,000 tokens in backend processing. While you can limit this via API parameters, doing so usually breaks the model's logic. Always buffer your budget by 4x for reasoning tasks.
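A minimal sketch of that 4x buffer in practice. The buffer factor and the $10/1M output rate below are illustrative assumptions, not provider quotes:

```python
# Rough budgeting helper for reasoning models: hidden chain-of-thought
# tokens are billed even though they never appear in the response.
def reasoning_cost_estimate(visible_output_tokens: int,
                            output_price_per_m: float,
                            buffer: float = 4.0) -> float:
    """Estimate billed output cost when hidden reasoning tokens
    inflate the visible output count by `buffer`."""
    billed_tokens = visible_output_tokens * buffer
    return billed_tokens / 1_000_000 * output_price_per_m

# A visible 500-token answer, buffered 4x at $10/1M output: ~$0.02
cost = reasoning_cost_estimate(500, 10.0)
```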

Batch API is Your Best Friend

If you are running background jobs (like summarizing daily logs), stop paying full price. I consistently use the Batch API for 50% discounts. The 24-hour SLA sounds scary, but in practice, my logs show jobs usually finish in under 90 minutes. It is the single easiest way to cut your bill in half without changing models.
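A quick back-of-the-envelope comparison. The 50% figure matches commonly advertised batch discounts, but verify against your provider's batch pricing:

```python
# Standard vs Batch API spend for a background workload.
def batch_savings(total_tokens_m: float, price_per_m: float,
                  batch_discount: float = 0.50) -> tuple:
    """Return (standard_cost, batch_cost) for a token volume given
    in millions of tokens and a per-million price."""
    standard = total_tokens_m * price_per_m
    batch = standard * (1 - batch_discount)
    return standard, batch

# e.g. 200M tokens/month at $1.20/1M: $240 standard vs $120 batched
standard, batched = batch_savings(200, 1.20)
```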

Stop Overpaying for Logic

For 90% of tasks (classification, extraction, regex), Gemini Flash and GPT-5 mini or nano are indistinguishable from their 'Pro' counterparts. You should only be paying the premium rates ($5.00+/1M) for complex creative writing or deep architectural coding with GPT-5 or Gemini 3 Pro. For everything else, the budget models have effectively won.

How LLM Pricing Works

What Are Tokens?

Tokens are the basic units LLMs use to process text. In English, 1 token ≈ 4 characters or ~0.75 words. The word "ChatGPT" is 2 tokens, while "AI" is 1 token. You pay based on tokens processed.

Use our Token Counter to count tokens in your text.
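The ~4-characters-per-token rule above can be coded as a quick heuristic. This is a rough sketch for ballpark estimates only; use a real tokenizer (e.g. tiktoken) when you need exact counts:

```python
def rough_token_count(text: str) -> int:
    """Rule-of-thumb estimate: ~4 English characters per token.
    Accurate enough for budgeting, not for billing."""
    return max(1, round(len(text) / 4))

# Matches the examples above: "ChatGPT" -> 2 tokens, "AI" -> 1 token
```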

Input vs Output Tokens

Input tokens are your prompts and context. Output tokens are the AI's responses. Output typically costs 2-5x more because generating text requires more computation than reading it.
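With separate input and output rates, a per-request cost works out as follows (the rates in the example are illustrative, not quotes from any provider):

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float,
                 output_price_per_m: float) -> float:
    """Per-request cost: input and output tokens are billed at
    separate per-million rates."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)

# Illustrative rates: $0.40/1M input, $1.20/1M output.
# 10,000 input + 2,000 output tokens -> $0.0064
cost = request_cost(10_000, 2_000, 0.40, 1.20)
```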

Save with Caching & Batch

Prompt caching saves 75-90% on repeated prompts. Batch API offers 50% off for non-urgent requests. Combine both to dramatically reduce your AI costs.
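Stacking the two discounts can be sketched like this. The 90% cache and 50% batch figures are the upper-end numbers quoted on this page; real rates vary by provider and model:

```python
# Apply a prompt-cache discount to the cached share of input tokens,
# then an optional batch discount on top.
def discounted_input_cost(tokens: int, price_per_m: float,
                          cached_fraction: float = 0.0,
                          cache_discount: float = 0.90,
                          use_batch: bool = False,
                          batch_discount: float = 0.50) -> float:
    full_price = tokens * (1 - cached_fraction)
    cached = tokens * cached_fraction * (1 - cache_discount)
    cost = (full_price + cached) / 1_000_000 * price_per_m
    return cost * (1 - batch_discount) if use_batch else cost

# 1M input tokens at $1.00/1M with 80% cache hits: $0.28
# ...and batched on top of that: $0.14
```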

6 Ways to Reduce Your LLM API Costs

1. Count Your Tokens First

Use our Token Counter to accurately measure input/output tokens before making API calls. Knowing exact counts helps you choose the right model and avoid surprises.

2. Choose the Right Model

Use smaller, cheaper models (GPT-5 mini, Claude Haiku) for simple tasks. Reserve expensive models for complex reasoning.

3. Optimize Your Prompts

Shorter, clearer prompts use fewer input tokens. Remove unnecessary context and be specific about what you need.

4. Use Prompt Caching

If you send the same system prompt repeatedly, enable caching. Anthropic and OpenAI offer massive discounts on cached tokens.

5. Batch Non-Urgent Requests

For tasks that don't need instant responses, use Batch API. Get 50% off by allowing requests to complete within 24 hours.

6. Set Max Token Limits

Always set max_tokens in your API calls. This prevents unexpectedly long (and expensive) responses.
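A useful side effect: once max_tokens is set, you can bound the worst-case output cost of any single call. A minimal sketch, with an illustrative $6.00/1M rate:

```python
def worst_case_output_cost(max_tokens: int,
                           output_price_per_m: float) -> float:
    """Upper bound on one response's output cost once max_tokens is
    set: the API cannot bill more output tokens than the cap allows."""
    return max_tokens / 1_000_000 * output_price_per_m

# Capping at 1,000 tokens on a $6.00/1M model bounds each call at $0.006
cap = worst_case_output_cost(1_000, 6.00)
```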

Frequently Asked Questions

What is an LLM token?

A token is a unit of text that language models process. In English, one token is roughly 4 characters or about 0.75 words. For example, "ChatGPT" is 2 tokens, while "AI" is 1 token. API pricing is based on the number of tokens processed.

How is LLM API pricing calculated?

LLM API pricing is typically calculated per million tokens, with separate rates for input (prompt) and output (completion) tokens. Output tokens are usually more expensive because they require more computation to generate.

What is prompt caching and how does it save money?

Prompt caching stores frequently used prompts so they don't need to be reprocessed. Cached input tokens are significantly cheaper (often 75-90% less) than regular input tokens. This is ideal for applications with repetitive system prompts or instructions.

What is batch API pricing?

Batch API allows you to send multiple requests at once for non-time-sensitive tasks. Most providers offer a 50% discount on batch requests because they can process them during off-peak hours. Results are typically available within 24 hours.

Which LLM is the most cost-effective?

Cost-effectiveness depends on your use case. For simple tasks like classification or summarization, smaller models like GPT-5 mini or Claude Haiku offer excellent value. For complex reasoning or coding, larger models may be more cost-effective despite higher per-token costs because they require fewer attempts to get accurate results.

How accurate are these price estimates?

Our calculator uses official API pricing from each provider. Prices are updated regularly, but we recommend checking the official pricing pages before making budget decisions. Actual costs may vary slightly based on your usage patterns and any negotiated enterprise rates.