LLM TCO Calculator

Estimate the monthly total cost of ownership for a single LLM workload. Adjust the five inputs to see how token volume, model price, and request rate compound.

Most teams underestimate LLM cost in production by 3–5× because they price the model at the demo workload, not the production workload. Use the five inputs below to surface the gap before procurement signs the order form.

Inputs

Requests per day (req/day)

Enter the production request rate, not your dev or pilot rate. To estimate it, multiply the pilot rate by your expected concurrency.

Average input tokens per request (tokens)

Includes system prompt + retrieved context + user message.
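As a rough illustration of how the three components add up, here is a minimal Python sketch. The variable names and token counts are hypothetical placeholders, not measurements from a real workload.

```python
# Hypothetical per-request averages; replace with figures measured
# from production logs.
system_prompt_tokens = 1_500
retrieved_context_tokens = 4_000
user_message_tokens = 500

# The calculator's input-token field is the sum of all three.
avg_input_tokens = (
    system_prompt_tokens + retrieved_context_tokens + user_message_tokens
)
print(avg_input_tokens)  # 6000
```

In this made-up case the user message is under a tenth of the input, which is why pricing on the user message alone tends to understate input spend.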

Average output tokens per request (tokens)

Input price (USD per million tokens)

Sonnet 4.6 input is $3/M. Opus 4.7 is $15/M. Haiku 4.5 is $1/M.

Output price (USD per million tokens)

Computed monthly cost

Input spend

requests_per_day * avg_input_tokens * input_price_per_M / 1_000_000 * 30

5,400 USD/mo

Output spend

requests_per_day * avg_output_tokens * output_price_per_M / 1_000_000 * 30

9,000 USD/mo

Total monthly spend

monthly_input_usd + monthly_output_usd

14,400 USD/mo
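The three formulas above can be sketched as a small Python function. The `Workload` class and `monthly_cost` function are illustrative names, and the example inputs are hypothetical: they were chosen only because they reproduce the figures shown, and may not match the calculator's actual defaults.

```python
from dataclasses import dataclass

DAYS_PER_MONTH = 30  # the formulas above assume a 30-day month


@dataclass
class Workload:
    requests_per_day: float
    avg_input_tokens: float
    avg_output_tokens: float
    input_price_per_m: float   # USD per million input tokens
    output_price_per_m: float  # USD per million output tokens


def monthly_cost(w: Workload) -> dict[str, float]:
    """Return input, output, and total monthly spend in USD."""
    monthly_input_usd = (
        w.requests_per_day * w.avg_input_tokens * w.input_price_per_m
        / 1_000_000 * DAYS_PER_MONTH
    )
    monthly_output_usd = (
        w.requests_per_day * w.avg_output_tokens * w.output_price_per_m
        / 1_000_000 * DAYS_PER_MONTH
    )
    return {
        "input": monthly_input_usd,
        "output": monthly_output_usd,
        "total": monthly_input_usd + monthly_output_usd,
    }


# Hypothetical inputs that happen to reproduce the figures shown above.
example = Workload(
    requests_per_day=10_000,
    avg_input_tokens=6_000,
    avg_output_tokens=2_000,
    input_price_per_m=3.0,
    output_price_per_m=15.0,
)
costs = monthly_cost(example)
print(costs)  # {'input': 5400.0, 'output': 9000.0, 'total': 14400.0}
```

Every term is a straight product, so doubling the request rate doubles the bill, and doubling both the request rate and the average token counts quadruples it.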

Spend tier

Mid workload

Your workload sits in the predictable-spend range. Self-serve commitments on the Anthropic or Google direct APIs are likely the cleanest option.

Caveat

This estimate excludes vector store costs, embedding costs, evals, observability, and the human-review burden on outputs. For the all-in budget, add a 20–40% overhead line.
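To make the overhead line concrete, here is a minimal Python sketch that applies the 20–40% range to the 14,400 USD/mo total shown above. The function name `all_in_budget` is illustrative, and the sketch assumes the overhead is applied as a flat percentage of model spend.

```python
def all_in_budget(
    monthly_total_usd: float,
    overhead_low: float = 0.20,
    overhead_high: float = 0.40,
) -> tuple[float, float]:
    """Return the (low, high) all-in monthly budget in USD.

    Assumes overhead is a flat percentage on top of model spend.
    """
    return (
        monthly_total_usd * (1 + overhead_low),
        monthly_total_usd * (1 + overhead_high),
    )


low, high = all_in_budget(14_400)
print(f"{low:,.0f} to {high:,.0f} USD/mo")  # 17,280 to 20,160 USD/mo
```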