Claude Sonnet 5 Pricing: What $2/$10 per Million Tokens Means for Your Margins
Claude Sonnet 5 launches at $2 per million input tokens and $10 per million output tokens (introductory, through August 31, 2026), then moves to $3/$15. But a new tokenizer means your real cost depends on tokens per task, not the sticker rate.
Jul 1, 2026 · 4 min read
Key takeaways
Introductory pricing is $2/M input and $10/M output through Aug 31, 2026, rising to $3/$15 after.
That is roughly 60% cheaper per token than Opus 4.8 ($5/$25) during the intro window, and about 40% cheaper at standard pricing.
Sonnet 5 uses an updated tokenizer: the same text can map to 1.0 to 1.35x as many tokens, so effective cost per task can rise even when the per-token rate looks flat.
An 'effort level' setting lets you trade cost for performance, turning one model into a range of price points.
Prompt caching (up to 90% off) and batch processing (50% off) remain the biggest levers on your bill.
What did Anthropic actually change on price?
Claude Sonnet 5 shipped on June 30, 2026 at introductory API pricing of $2 per million input tokens and $10 per million output tokens, held through August 31, 2026. After that it moves to $3 per million input and $15 per million output. For reference, Opus 4.8 sits at $5 input and $25 output per million tokens. So on paper Sonnet 5 is the cheaper way to run agentic work: during the intro window it is about 60% cheaper per token than Opus on both input and output.
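The percentages above follow directly from the quoted rates. A minimal sketch of the arithmetic, where the dictionary keys and the function name are illustrative labels rather than official model identifiers:

```python
# Per-million-token rates quoted in this article, as (input, output) in USD.
# The keys are illustrative labels, not official model identifiers.
RATES = {
    "sonnet5_intro": (2.0, 10.0),
    "sonnet5_standard": (3.0, 15.0),
    "opus48": (5.0, 25.0),
}


def percent_cheaper(rate: float, baseline: float) -> float:
    """Return how much cheaper `rate` is than `baseline`, as a percentage."""
    return (1 - rate / baseline) * 100


base_in, base_out = RATES["opus48"]
for name in ("sonnet5_intro", "sonnet5_standard"):
    inp, out = RATES[name]
    print(
        f"{name}: input {percent_cheaper(inp, base_in):.0f}% cheaper, "
        f"output {percent_cheaper(out, base_out):.0f}% cheaper"
    )
```

Because input and output rates scale by the same factor, the discount is identical on both sides in each pricing window.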
Why the sticker price is not your real cost
Here is the detail most launch coverage skips: Sonnet 5 ships with an updated tokenizer. Anthropic notes the same input can map to roughly 1.0 to 1.35x as many tokens, depending on content type. That matters because you are billed per token, not per word. If your workload lands at the high end of that range, a '$2 per million' input rate can behave more like $2.70 for the same source text.
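To see how the tokenizer multiplier moves the effective rate, here is a small sketch. The function name is hypothetical, and the middle multiplier is just a sample point inside the range Anthropic cites; your own multiplier has to be measured on your content.

```python
def effective_rate(sticker_rate: float, token_multiplier: float) -> float:
    """Effective USD cost for text that used to occupy one million tokens.

    `token_multiplier` is how many new-tokenizer tokens the same text
    produces per old-tokenizer token (1.0 means no change).
    """
    return sticker_rate * token_multiplier


INTRO_INPUT_RATE = 2.0  # USD per million input tokens, from the article

# Sample points across the cited 1.0 to 1.35x range (1.2 is an arbitrary midpoint).
for multiplier in (1.0, 1.2, 1.35):
    rate = effective_rate(INTRO_INPUT_RATE, multiplier)
    print(f"{multiplier:.2f}x tokens -> ${rate:.2f} effective input rate")
```

The same multiplication applies to the output rate if your generated text also tokenizes into more tokens.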
Anthropic says it set the introductory pricing so the move from Sonnet 4.6 is 'roughly cost-neutral.' The translation for operators: model the change per task, not per token. Take a representative job, count the tokens it actually consumes on Sonnet 5, and compare the total, not the headline rate.
How does Sonnet 5 compare to Opus 4.8 on cost?
Per million tokens, Sonnet 5's introductory pricing ($2/$10) is about 60% cheaper than Opus 4.8 ($5/$25) on both input and output. At standard pricing ($3/$15) the gap narrows to roughly 40% cheaper. But raw token price is only half the story. Sonnet 5 exposes an effort level you can dial up or down. Lower effort finishes cheaper; higher effort spends more tokens and can approach Opus-class results on some tasks. So 'the price of Sonnet 5' is really a curve, and the right question is which effort level clears your quality bar at the lowest cost per completed task.
What should founders model before switching?
Three things drive your actual bill: tokens per task (now shaped by the new tokenizer), the effort level you run at, and how much you cache. Prompt caching can cut the cost of repeated context by up to 90%, and batch processing takes 50% off non-urgent jobs. For an agent that re-reads the same system prompt and tools on every step, caching usually swings the economics more than the per-token rate does.
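A rough way to model those levers together is sketched below. Everything here is an assumption for illustration: the Step type, the task_cost function, and the token counts are hypothetical, cached tokens are simply billed at a discounted input rate, and any extra charge for writing to the cache on the first step is ignored. Whether the caching and batch discounts can be combined on your account is something to confirm against the official pricing page.

```python
from dataclasses import dataclass


@dataclass
class Step:
    cached_input_tokens: int  # repeated context (system prompt, tools) read from cache
    fresh_input_tokens: int   # new input billed at the full input rate
    output_tokens: int


def task_cost(steps, input_rate, output_rate, cache_discount=0.0, batch_discount=0.0):
    """Return the USD cost of one task; rates are USD per million tokens."""
    total = 0.0
    for s in steps:
        cached = s.cached_input_tokens * input_rate * (1 - cache_discount)
        fresh = s.fresh_input_tokens * input_rate
        output = s.output_tokens * output_rate
        total += (cached + fresh + output) / 1_000_000
    return total * (1 - batch_discount)


# Hypothetical agent run: 10 steps, each re-reading 8,000 tokens of shared context.
steps = [Step(8_000, 500, 300) for _ in range(10)]

no_cache = task_cost(steps, 2.0, 10.0)
with_cache = task_cost(steps, 2.0, 10.0, cache_discount=0.90)
batched = task_cost(steps, 2.0, 10.0, batch_discount=0.50)

print(f"no caching:  ${no_cache:.3f}")
print(f"90% caching: ${with_cache:.3f}")
print(f"50% batch:   ${batched:.3f}")
```

With these made-up token counts, caching cuts the task from about $0.20 to about $0.056, a bigger swing than the difference between the introductory and standard rates.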
A simple test: hold quality constant, then compare cost per successful task across Sonnet 4.6, Sonnet 5 at a couple of effort levels, and Opus 4.8. The cheapest per-token option is not always the cheapest per outcome.
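One way to run that comparison is to divide the average cost of an attempt by the fraction of attempts that meet your quality bar. The sketch below assumes failed attempts are retried until one succeeds; the option names and all numbers are placeholders, not measurements of any real model.

```python
def cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected USD spend per successful task, assuming failures are retried."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Placeholder numbers: (average USD per attempt, share of attempts that pass your bar).
# Replace these with values measured on your own representative tasks.
options = {
    "cheaper_per_token": (0.030, 0.50),
    "pricier_per_token": (0.050, 0.95),
}

results = {
    name: cost_per_success(cost, rate) for name, (cost, rate) in options.items()
}
best = min(results, key=results.get)

for name, value in results.items():
    print(f"{name}: ${value:.4f} per successful task")
print(f"cheapest per outcome: {best}")
```

In this made-up case the option with the higher cost per attempt wins, because it needs far fewer retries to produce an acceptable result.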
The takeaway: Sonnet 5's headline price is attractive, but your margin depends on tokens per task, effort level, and caching, so model those together rather than the sticker rate alone. You can sketch this comparison in Calcaas to see where your workload lands.
Frequently asked questions
How much does Claude Sonnet 5 cost?
Introductory pricing is $2 per million input tokens and $10 per million output tokens through August 31, 2026. After that it moves to standard pricing of $3 per million input and $15 per million output.
Is Claude Sonnet 5 cheaper than Opus 4.8?
Yes on a per-token basis. Opus 4.8 is $5 input and $25 output per million tokens. Sonnet 5's introductory pricing is about 60% cheaper on both sides, and about 40% cheaper at standard pricing.
Why might my costs rise even at the same token price?
Sonnet 5 uses an updated tokenizer, so the same text can map to roughly 1.0 to 1.35x as many tokens. Because billing is per token, effective cost per task can increase even when the per-token rate looks flat.
How can I lower Claude Sonnet 5 costs?
Use prompt caching for up to 90% savings on repeated context and batch processing for 50% savings on non-urgent work. Also tune the effort level so you are not paying for more reasoning than the task needs.
What is the 'effort level' in Sonnet 5?
It is a setting that trades cost for capability. Lower effort uses fewer tokens and costs less; higher effort spends more and can match Opus-class performance on some tasks.