Model ReleaseMay 30, 2026·8 min read

Anthropic kept the Opus list price flat for the third release running. The headline is the new Fast mode: $10/$50 at 2.5x speed.

Claude Opus 4.8 shipped on May 28 with standard pricing untouched at $5 input and $25 output per million tokens, the same number Anthropic has held since Opus 4.5 in late 2025. The interesting line on the rate card is the new Fast mode tier at $10/$50, 2.5x the speed of the standard endpoint and exactly a third of what the previous Opus Fast mode charged. SWE-bench Pro climbs five points, USAMO 2026 jumps 27, and the model is four times less likely to ignore flaws in its own code. Here is the full price card, where the benchmarks actually moved, and the one bench Anthropic quietly let slide.

Claude Opus 4.8 announcement hero image from Anthropic

Image source: Anthropic

The three things to take away

Standard pricing is unchanged at $5/$25 per 1M, the same rate Anthropic has held for three Opus releases.
New Fast mode tier: $10/$50 at 2.5x speed, a third of what Opus 4.7's Fast tier charged.
Five benches up, GPQA Diamond down 0.6 points. SWE-bench Pro climbs nearly five.

Fast mode is the only number that moved

Read the rate card top to bottom and one row stands out. Standard, Batch, and cache are all carried straight over from Opus 4.7. Fast mode is new on this generation, and Anthropic priced it at a third of the old Opus Fast tier. On 4.6 and 4.7 you paid $30/$150 to run Opus at higher throughput. On 4.8 you pay $10/$50 for the same 2.5x speed envelope. Anthropic frames Fast as a research preview, gated by waitlist on the API. In Claude Code it is already live: type /fast and the session switches harness mid-flight without billing the upgrade.

Tier	Input / 1M	Output / 1M	Notes
Standard	$5.00	$25.00	Same as 4.7. 1M context, 128K max output.
Fast mode (new)	$10.00	$50.00	2.5x speed. 3x cheaper than 4.7 Fast ($30/$150). API in preview, live in Claude Code.
Cache read	$0.50	n/a	90% off standard input on cache hits.
Cache write (5 min)	$6.25	n/a	1.25x input. Default cache TTL.
Cache write (1 hr)	$10.00	n/a	2x input. For longer-lived prefixes.
Batch API	$2.50	$12.50	50% off standard for async jobs.

The Fast number lands at a specific spot. It is exactly twice the standard rate, which matches the speed ratio, so on a per-second basis you are paying the same. That is a real shift from the 4.7 Fast tier, where the speed premium was effectively 6x. Read the new tier as Anthropic admitting that the old Fast price was punitive, and that anyone running an interactive coding agent already pays speed-bias rates without needing to be charged six times over for it.

Where the benchmarks moved, and where they didn't

Anthropic published Opus 4.8 against six headline benches. Five improve on 4.7. One ticks down. The big movers are SWE-bench Pro, USAMO 2026, and what Anthropic calls the "flaw-flagging" rate on generated code.

Benchmark	Opus 4.8	Opus 4.7	Delta
SWE-bench Verified	88.6	87.6	+1.0
SWE-bench Pro	69.2	64.3	+4.9
Terminal-Bench 2.1	74.6	n/p	new bench
GPQA Diamond	93.6	94.2	-0.6
USAMO 2026	96.7	69.3	+27.4
Online-Mind2Web	84.0*	n/p	customer figure

A couple of asterisks belong on that table. The GPQA Diamond regression is small but real, so if your workload is dense PhD-level reasoning rather than agent loops, Artificial Analysis shows Opus 4.7 still gets the extra half-point. The 84 on Online-Mind2Web also traces back to a testimonial from a Browserbase agent lead in the launch deck, not Anthropic's own benchmark table, so treat it as a directional read on browser agents rather than an apples-to-apples score against other labs.

The flaw-flagging stat is the one that translates to a bill. Anthropic says Opus 4.8 is roughly 4x less likely than 4.7 to let a flaw in code it generated pass without comment. If you run review-pass agents on Opus output, the practical effect is fewer cycles spent re-prompting the model to actually catch what it wrote. Cheaper sessions without the rate card moving.

Real workload costs, with Fast mode in the table

List rates only, no batch and no cache. Fast mode adds a second Opus 4.8 column so you can read the speed-vs-cost trade straight off the page. The competitor set is the same stack we used last week on the Qwen3.7 Max breakdown, minus Gemini and plus GPT-5.5 Pro, since the Pro tier is the only obvious peer for Opus 4.8 Fast in raw dollars per output token.

Workload	Opus 4.8 Std	Opus 4.8 Fast	GPT-5.5	GPT-5.5 Pro
Agent loop turn (50K in / 10K out)	$0.500	$1.000	$0.550	$3.300
Single chat reply (8K in / 1.5K out)	$0.078	$0.155	$0.085	$0.510
Repo-wide audit (400K in / 20K out)	$2.500	$5.000	$2.600	$15.600
Steady-state spend (1B/mo, 70/30 blend)	$11,000	$22,000	$12,500	$75,000

Standard Opus 4.8 lands a hair cheaper than GPT-5.5 across the board because OpenAI charges $30 on output and Anthropic charges $25. GPT-5.5 Pro at $30/$180 is in a different conversation: on the repo-wide audit row it runs you a bit over six times the Opus 4.8 Standard bill, and more than triple the Fast tier. Pro can still take the single-prompt reasoning bench, but the cost surface for any kind of repeated agentic loop tilts hard the other way.

When Fast mode pays for itself

Fast doubles your bill in exchange for 2.5x throughput. The math is dead simple: if your wall-clock cost (engineer time, queueing, downstream pipelines) exceeds your token cost by even a small margin, Fast is the cheaper option once you factor it in. A senior engineer waiting an extra 90 seconds on every agent loop costs more than the $5 doubling on a long whole-repo pass.

Where Fast doesn't pay is async batch work. If you are running an overnight ETL-style pipeline that doesn't care whether it finishes at 2am or 3am, the Standard tier or the Batch API ($2.50/$12.50, 50% off) is the only reasonable choice. Fast through the front door is double the rate; Batch through the back door is half. That is a 4x decision at the same final-quality output.

The waitlist matters here. Fast is in research preview on the API, so unless you are already on Anthropic's priority list, you are running Standard from your own code today. The path most teams will take in the next week or two is to test Fast inside Claude Code (where /fast works now), confirm the throughput delta is real on their prompts, and then plug it into agents once the API waitlist clears.

The bench Anthropic skipped

Three numbers are missing from the launch deck and worth saying out loud. There is no Mythos comparison: Opus 4.8 is positioned as the public-API flagship, while the Mythos preview model sits at $25/$125 for partner use only, and Anthropic is not inviting that head-to-head on the announcement page. There is no MCP-Atlas score, which Qwen3.7 Max led on last week. And there is no AA Intelligence Index print on launch day; Artificial Analysis usually files the number within a few days, but it was not in the press kit.

None of those is dishonest, but each one is a board where Opus 4.8 either ties or loses against a current rival. Read the launch as Anthropic showing the boards where it gained and quietly omitting the ones where it didn't.

Should you migrate from Opus 4.7?

For coding agents, yes. SWE-bench Pro up five points and the 4x flaw-flag improvement both land in the workloads tokencost readers actually pay for. The migration is a model-ID swap from claude-opus-4-7 to claude-opus-4-8, with the same tokenizer and the same prompt-cache header surface. Nothing else in your code needs to move.

Research and pure-reasoning use is the case for holding on Opus 4.7 a little longer. GPQA Diamond ticked down, and the USAMO jump, while real, is on a math contest that most product workloads do not resemble. If your Opus 4.7 bills have been running the size you expected, the upgrade is optional rather than urgent. There is no price reason to switch, only a capability reason.

Sources

Anthropic: Introducing Claude Opus 4.8 - Launch announcement, May 28 2026
Anthropic: Claude pricing - Standard $5/$25, Fast $10/$50, cache and batch rates
Anthropic: Opus 4.8 system card - Full benchmark table and safety evaluations
Anthropic: Models overview - 1M context, 128K max output, 300K output beta header
VentureBeat: Opus 4.8 launch coverage - 3x cheaper Fast mode framing, alignment improvements
9to5Mac: Opus 4.8 launch - SWE-bench Pro 69.2 vs 64.3, 4x flaw-flag improvement
LLM Stats: Opus 4.8 launch report - USAMO 2026 96.7 vs 69.3, GPQA Diamond 93.6

Compare all model prices Calculate your API cost