LLM TCO Calculator
Estimate the monthly total cost of ownership for a single LLM workload. Adjust the four inputs to see how token volume, model price, and request rate compound.
Most teams underestimate LLM cost in production by 3–5× because they price the model at the demo workload, not the production workload. Use the four inputs below to surface the gap before procurement signs the order form.
Inputs
Production request rate, not your dev / pilot rate. Multiply pilot rate by your expected concurrency.
Includes system prompt + retrieved context + user message.
Input price per million tokens: Sonnet 4.6 is $3/M, Opus 4.7 is $15/M, Haiku 4.5 is $1/M.
Computed monthly cost
monthly_input_usd = requests_per_day * avg_input_tokens * input_price_per_M / 1_000_000 * 30
monthly_output_usd = requests_per_day * avg_output_tokens * output_price_per_M / 1_000_000 * 30
Total = monthly_input_usd + monthly_output_usd
Spend tier
Mid workload
Your workload sits in the predictable-spend range. Self-serve commitments on Anthropic or Google direct API are likely the cleanest option.
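The monthly cost formulas can be reproduced outside the calculator. The Python sketch below is one way to do it, using the same variable names and the same flat 30-day month. The `Workload` class, the `monthly_cost` function, and every number in the example (including the $15/M output price) are illustrative assumptions, not values taken from this page.

```python
from dataclasses import dataclass

DAYS_PER_MONTH = 30  # the calculator's formulas use a flat 30-day month


@dataclass
class Workload:
    """Hypothetical container for the calculator's inputs."""
    requests_per_day: float
    avg_input_tokens: float    # system prompt + retrieved context + user message
    avg_output_tokens: float
    input_price_per_m: float   # USD per million input tokens
    output_price_per_m: float  # USD per million output tokens


def monthly_cost(w: Workload) -> dict[str, float]:
    """Apply the calculator's formulas: daily tokens * price per million * 30."""
    monthly_input_usd = (
        w.requests_per_day * w.avg_input_tokens * w.input_price_per_m
        / 1_000_000 * DAYS_PER_MONTH
    )
    monthly_output_usd = (
        w.requests_per_day * w.avg_output_tokens * w.output_price_per_m
        / 1_000_000 * DAYS_PER_MONTH
    )
    return {
        "monthly_input_usd": monthly_input_usd,
        "monthly_output_usd": monthly_output_usd,
        "monthly_total_usd": monthly_input_usd + monthly_output_usd,
    }


# Illustrative workload only; substitute your own production figures.
example = Workload(
    requests_per_day=50_000,
    avg_input_tokens=4_000,
    avg_output_tokens=500,
    input_price_per_m=3.0,
    output_price_per_m=15.0,  # assumed output price, not stated on this page
)
costs = monthly_cost(example)
for name, usd in costs.items():
    print(f"{name}: ${usd:,.2f}")
```

Under these assumed numbers the input side dominates the bill even though the output price per token is higher, because each request carries far more input tokens than output tokens.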
Caveat
This estimate excludes vector store costs, embedding costs, evals, observability, and the human-review burden on outputs. For the all-in budget, add a 20–40% overhead line.
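The overhead line can be expressed as a budget range rather than a single figure. The sketch below is a minimal illustration that applies the 20–40% band to a model-only monthly cost. The function name `all_in_budget` and the $10,000 input are hypothetical.

```python
def all_in_budget(
    model_cost_usd: float,
    overhead_low: float = 0.20,   # lower bound of the 20-40% overhead band
    overhead_high: float = 0.40,  # upper bound of the 20-40% overhead band
) -> tuple[float, float]:
    """Return (low, high) all-in monthly budget for a model-only cost."""
    if model_cost_usd < 0:
        raise ValueError("model_cost_usd must be non-negative")
    low = model_cost_usd * (1 + overhead_low)
    high = model_cost_usd * (1 + overhead_high)
    return low, high


# Illustrative model-only cost; use the calculator's monthly total instead.
low, high = all_in_budget(10_000.0)
print(f"All-in budget: ${low:,.0f} to ${high:,.0f} per month")
```

Reporting both ends of the range keeps the uncertainty visible to whoever approves the budget, instead of hiding it inside a single padded number.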