Free tool · no signup

LLM API Cost Calculator

Work out what an LLM API workload will cost. Enter input tokens, output tokens and monthly volume, then compare price per request and per month across current Claude, GPT and Gemini models.

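With the example inputs behind the table below (its figures correspond to 2,000 input tokens and 500 output tokens per request at 10,000 requests per month), GPT-4.1 nano is the cheapest at $4.00/mo, about 113× less than Claude Fable 5 ($450.00/mo).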

Model              Provider   In $/M  Out $/M  Per request  Per month
GPT-4.1 nano       OpenAI     $0.1    $0.4     $0.00040     $4.00
GPT-4.1 mini       OpenAI     $0.4    $1.6     $0.00160     $16.00
Gemini 2.5 Flash   Google     $0.3    $2.5     $0.00185     $18.50
Claude Haiku 4.5   Anthropic  $1      $5       $0.00450     $45.00
Gemini 2.5 Pro     Google     $1.25   $10      $0.00750     $75.00
GPT-4.1            OpenAI     $2      $8       $0.00800     $80.00
GPT-5.4            OpenAI     $2.5    $15      $0.0125      $125.00
Claude Sonnet 4.6  Anthropic  $3      $15      $0.0135      $135.00
Claude Opus 4.8    Anthropic  $5      $25      $0.0225      $225.00
GPT-5.5            OpenAI     $5      $30      $0.0250      $250.00
Claude Fable 5     Anthropic  $10     $50      $0.0450      $450.00

Estimates use list prices per 1M tokens, last verified June 2026. Cached input, batch and volume discounts are not applied. Always confirm against the provider:

Anthropic pricing · OpenAI pricing · Google pricing

This LLM API cost calculator estimates and compares what a workload will cost across current Claude, GPT and Gemini models. Enter the input tokens, output tokens and monthly request volume, and see the price per request and per month for each model side by side.

It’s a fast way to sanity-check an AI feature’s unit economics before you build, and to spot when a cheaper model would do the job for a fraction of the price.

How LLM API pricing works

Almost every provider bills per token, with separate rates for input (your prompt) and output (the model’s response). Cost per request is (input tokens ÷ 1,000,000 × input price) + (output tokens ÷ 1,000,000 × output price), where both prices are quoted per 1M tokens. Output is more expensive than input because generating text is more compute-intensive than reading it; in the table above, the output rate is between four and roughly eight times the input rate.
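The per-request formula can be sketched in a few lines of Python. The function and parameter names are illustrative, not part of the calculator itself, and the 2,000 / 500 token sizes are the values implied by the table's per-request figures.

```python
def cost_per_request(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request, given list prices per 1M tokens."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# GPT-4.1 nano row from the table: $0.10 input, $0.40 output per 1M tokens.
nano = cost_per_request(2_000, 500, 0.10, 0.40)
print(f"${nano:.5f}")  # $0.00040
```

The same call with the Claude Haiku 4.5 rates ($1 and $5) gives $0.00450, matching that row of the table.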

On reasoning models, the output figure also includes the model’s internal thinking tokens, so verbose reasoning can quietly inflate a bill.
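To see how much thinking tokens can matter, here is a rough sketch that bills them at the output rate. The 3,000-token reasoning figure is a made-up illustration, not a measured value, and the function name is hypothetical; the $3 / $15 rates are the Claude Sonnet 4.6 row from the table.

```python
def cost_with_reasoning(input_tokens: int, visible_output_tokens: int,
                        reasoning_tokens: int,
                        input_price_per_m: float, output_price_per_m: float) -> float:
    """Per-request USD cost when reasoning tokens are billed as output."""
    billed_output = visible_output_tokens + reasoning_tokens
    total = input_tokens * input_price_per_m + billed_output * output_price_per_m
    return total / 1_000_000


# Same visible response, with and without hidden reasoning (assumed 3,000 tokens).
plain = cost_with_reasoning(2_000, 500, 0, 3.0, 15.0)
thinking = cost_with_reasoning(2_000, 500, 3_000, 3.0, 15.0)
print(f"no reasoning: ${plain:.4f}, with reasoning: ${thinking:.4f}")
```

Under these assumptions the request goes from $0.0135 to $0.0585, even though the visible reply is the same length.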

How to estimate your monthly cost

Start with a realistic average prompt and response size, not the maximum. Multiply the per-request cost by your expected monthly volume. If your prompts vary a lot, run the calculator twice — once for a typical request and once for a heavy one — to bracket the range.
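Bracketing the range might look like the sketch below. The "heavy" request size (8,000 input and 2,000 output tokens) is an assumed example, and the rates are the Claude Haiku 4.5 row from the table; substitute your own measurements.

```python
def monthly_cost(input_tokens: int, output_tokens: int, requests_per_month: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Monthly USD cost: per-request cost multiplied by request volume."""
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request * requests_per_month


# Typical request vs. an assumed heavy one, both at 10,000 requests per month.
typical = monthly_cost(2_000, 500, 10_000, 1.00, 5.00)
heavy = monthly_cost(8_000, 2_000, 10_000, 1.00, 5.00)
print(f"${typical:.2f} to ${heavy:.2f} per month")  # $45.00 to $180.00 per month
```

The real bill should land somewhere between the two figures, closer to the typical end if heavy requests are rare.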

Tips to reduce API costs

Prompt caching can cut the cost of repeated context (system prompts, long documents) by up to around 90%, depending on the provider. Batch APIs typically offer roughly 50% off for non-urgent work. Choosing a smaller model for simple tasks such as classification, extraction and short replies is often the single biggest saving. Always confirm current rates on each provider’s pricing page, since prices change regularly.
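A rough way to size these savings is sketched below. It assumes cached input is billed at 10% of the normal input rate and that a batch discount takes 50% off the whole request; both are simplifications of the approximate figures above, real discount rules (including any cache-write surcharge) differ by provider, and the function names are illustrative.

```python
def cost_with_caching(cached_input_tokens: int, fresh_input_tokens: int,
                      output_tokens: int,
                      input_price_per_m: float, output_price_per_m: float,
                      cache_discount: float = 0.90) -> float:
    """Per-request USD cost when part of the prompt is read from cache.

    cache_discount is an assumed fraction taken off the input price for
    cached tokens; cache-write fees are ignored in this sketch.
    """
    cached_price = input_price_per_m * (1 - cache_discount)
    total = (cached_input_tokens * cached_price
             + fresh_input_tokens * input_price_per_m
             + output_tokens * output_price_per_m)
    return total / 1_000_000


def cost_with_batch(base_cost: float, batch_discount: float = 0.50) -> float:
    """Apply an assumed flat batch discount to a request cost."""
    return base_cost * (1 - batch_discount)


# Claude Sonnet 4.6 rates from the table ($3 in, $15 out per 1M tokens).
# Assume 1,500 of the 2,000 input tokens are a repeated system prompt.
base = cost_with_caching(0, 2_000, 500, 3.0, 15.0)
cached = cost_with_caching(1_500, 500, 500, 3.0, 15.0)
batched = cost_with_batch(base)
print(f"base ${base:.5f}, cached ${cached:.5f}, batch ${batched:.5f}")
```

With these assumptions the request drops from $0.01350 to $0.00945 with caching, or to $0.00675 with batching. Caching only discounts the repeated input, so its benefit shrinks when output dominates the cost.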

