TokenBlog

Model releases, pricing breakdowns, and practical guides for developers.

GuideJuly 19, 2026

Latest·8 min read

Every major LLM API now knocks about 90% off the tokens you cache. The bill you don't see…

Cached input reads have quietly standardized near a tenth of the base price across OpenAI, Anthropic, and Google, and DeepSeek goes further at under 1%. The real spread is on the write side: Anthropic…

ComparisonJuly 18, 2026

American teams now route a third of their tokens to Chinese models to save money. The plot…

US companies have made up more than 30% of Chinese-model token traffic on OpenRouter every week since February, and the reason is the…

Model ReleaseJuly 17, 2026

For two years the story out of China was that prices only fall. Moonshot just tripled its…

Kimi K3 lists at $3/$15 per million tokens against K2.6's $0.95/$4.00, a 3.16x input and 3.75x output increase under three months apart. An…

Model ReleaseJuly 16, 2026

A Kuaishou team just priced 94 percent of Opus 4.8's coding score at a seventh of the…

Kwaipilot's KAT-Coder-Pro v2.5 lists at $0.74/$2.96 per million tokens, with a cheaper Air tier at $0.15/$0.60. On SWE-Bench Pro it scores…

ComparisonJuly 15, 2026

Claude runs five times apart from its cheapest tier to its priciest. The capability gap is…

Anthropic's lineup runs $1/$5 for Haiku 4.5, $3/$15 for Sonnet 5, and $5/$25 for Opus 4.8 - a clean 5x from floor to ceiling. Priced by the…

ComparisonJuly 13, 2026

Grok 4.5 sells the cheapest token, Fable 5 sells the smartest one, and neither is the…

GPT-5.6 Sol ($5/$30), Claude Fable 5 ($10/$50), and Grok 4.5 ($2/$6) span five times over on price but bunch near the top on capability…

GuideJuly 12, 2026

Claude Fable 5 is back on subscriptions, but not the way it left. Included access is gone…

Anthropic pulled Fable 5 from Pro, Max, and Team plans in June on capacity, then brought it back behind metered usage credits at its full…

Model ReleaseJuly 11, 2026

The first model Meta ever charged for is priced to win agent work, not coding. At $1.25…

Meta opened its first paid API on July 9 with Muse Spark 1.1 at $1.25 input and $4.25 output per million tokens, roughly a quarter of what…

Model ReleaseJuly 10, 2026

Grok 4.5 undercuts Opus 4.8 by two-thirds and burns fewer tokens getting there. The…

xAI shipped Grok 4.5 on July 8 at $2 input and $6 output per million tokens. Against Opus 4.8, Fable 5, and GPT-5.6 Sol it runs 68 to 84…

Model ReleaseJuly 4, 2026

Meituan's LongCat-2.0 costs 75 cents a million and tops the open coder benchmarks. You…

LongCat-2.0 went public June 30 at $0.75 input and $2.95 output per million tokens, with a launch promo cutting that to $0.30/$1.20. It…

IndustryJuly 3, 2026

OpenAI built a chip to make inference cheaper. That is a different thing from making your…

Jalapeño, OpenAI's first custom inference processor with Broadcom, landed June 24 alongside Google's Ironwood TPUs and AWS Trainium 3. The…

Model ReleaseJuly 2, 2026

OpenAI split GPT-5.6 into three tiers and priced the flagship exactly where GPT-5.5…

GPT-5.6 arrived June 26 as Sol, Terra, and Luna at $5/$30, $2.50/$15, and $1/$6 per million tokens. Sol, the flagship, costs to the cent…

Model ReleaseJuly 1, 2026

Claude Sonnet 5 lands level with Opus 4.8 on an independent benchmark and asks 60% less…

Anthropic shipped Sonnet 5 on June 30 at $2/$10 per million tokens, an introductory rate that runs through August 31 before rising to…

GuideJune 30, 2026

DeepSeek V4 will charge double during its busiest hours. Whether you ever pay the…

From the mid-July V4 launch, DeepSeek doubles its API rates during two daily windows - 9am to 12pm and 2pm to 6pm Beijing time. It is the…

IndustryJune 29, 2026

The US gated its two newest flagships in a day. The models you can still call are open…

On June 26 the US held OpenAI's GPT-5.6 to government-approved partners and let Anthropic re-open Claude Mythos 5 to a short list of trusted…

Model ReleaseJune 28, 2026

Doubao 2.1 Pro undercuts Claude Opus by 80%. The price is real, but it is quoted in yuan…

ByteDance launched Seed 2.1 (Doubao 2.1) on June 24 with Pro and Turbo tiers priced only in yuan: roughly $0.88 input and $4.41 output per…

Model ReleaseJune 26, 2026

Nex-N2-Pro scores like a frontier coder and bills like a budget one. The free weights are…

A little-known lab, Nex AGI, shipped a 397B open-weight model that posts 80.8 on SWE-Bench Verified and costs about a tenth of GPT-5.5…

IndustryJune 25, 2026

Google promised Gemini 3.5 Pro in June. It is late, has no price, and no benchmarks.

Google teased Gemini 3.5 Pro at I/O on May 19 with a 2M context window and Deep Think, then went quiet. No rate card, no benchmark sheet…

GuideJune 24, 2026

Claude won't bill you for a refusal. A fallback credit stops you paying twice.

Claude Fable 5 has been dark for twelve days, but the billing rules it shipped with outlast it. Anthropic does not charge you for a refused…

ComparisonJune 22, 2026

Three frontier coders now cost under a dollar a million. The cheapest sticker is not the…

DeepSeek V4-Pro, Qwen3.7 Plus, and MiniMax M3 all undercut $0.60/M input. Sort by input price and Qwen wins; run an actual coding month and…

ComparisonJune 21, 2026

GLM-5.2 costs a fifth of GPT-5.5 and says it codes better. One of those claims you can…

Z.ai's open-weight GLM-5.2 undercuts GPT-5.5 by roughly 5x per token and self-reports a coding-benchmark win on top. The price is on the…

Model ReleaseJune 20, 2026

GPT-5.6 is probably days away. Here is what the leaks confirm and what your bill might do.

A gpt-5.6 ID leaked from OpenAI's own Codex backend, the chief scientist reportedly called it a meaningful improvement, and the release-date…

IndustryJune 19, 2026

Anthropic charges double and keeps winning. Now OpenAI might blink on price.

OpenAI and Anthropic filed to go public a week apart, then OpenAI was reported to be weighing token price cuts to slow Anthropic down. The…

IndustryJune 18, 2026

Claude Fable 5 lasted three days. Every alternative is cheaper.

Anthropic disabled Claude Fable 5 and Mythos 5 on June 12, 2026 under a US export-control order, three days after launch. Fable 5 was the…

ComparisonJune 17, 2026

GLM 5.2 and Kimi K2.7 Code shipped five days apart. You still cannot benchmark one against…

Two Chinese labs put out open-weight coding models in the same week. Kimi K2.7 Code lists at $0.95/$4.00 per million tokens, GLM 5.2 at…