TokenCost

TokenBlog

Model releases, pricing breakdowns, and practical guides for developers.

Voice AI APIs in 2026: what Gemini TTS, Voxtral TTS, and OpenAI TTS actually cost per hour
Comparison · April 20, 2026 · 7 min read

Per-token pricing makes TTS costs hard to reason about. We converted Gemini 3.1 Flash TTS, Mistral Voxtral TTS, OpenAI TTS-1, and ElevenLabs to cost per hour of audio at a speaking rate of 150 words per minute. API-native options range from $0.74 to $1.81 per hour, versus $8 to $12 per hour for ElevenLabs overage.

Gemini 3 Flash: $0.50 per million tokens, thinking on by default, and it actually beats Pro on agentic tasks
Model Release · April 19, 2026 · 7 min read

Tokenmaxxing can inflate your LLM API bill by 10x. On Gemini and GPT-5.4, it's worse.
Guide · April 18, 2026 · 7 min read

Claude Opus 4.7: $5 per million tokens - and what that actually means now
Model Release · April 17, 2026 · 8 min read

Claude Code Routines: what each automated run actually costs
Guide · April 16, 2026 · 7 min read

GPT-5.4 computer use: what a real agent task actually costs
Guide · April 14, 2026 · 8 min read

DeepSeek V4: $0.30 per million tokens for a 1 trillion parameter model
Model Release · April 13, 2026 · 8 min read

Chatbot Arena April 2026: Claude leads everything, Grok 4.20 has the cheapest output
Comparison · April 12, 2026 · 7 min read

OpenAI's new $100 ChatGPT Pro: what you actually get on Codex, and when the API wins anyway
Comparison · April 11, 2026 · 7 min read

GPT-5.5 "Spud": release date, pricing forecast, and what we actually know right now
Model Release · April 11, 2026 · 7 min read

LLM API pricing in April 2026: from $0.05 to $125 per million tokens
Comparison · April 10, 2026 · 9 min read

Meta Muse Spark: no API pricing, no open weights, and one area where it's best in the world
Industry · April 9, 2026 · 8 min read

Project Glasswing and Claude Mythos Preview: the AI that found a 27-year-old bug for under $50
Industry · April 8, 2026 · 9 min read

Is Claude Code getting worse? The data says something did change on March 8.
Research · April 8, 2026 · 8 min read

Claude Max subscribers using OpenClaw now pay API rates. Here's the math.
Industry · April 8, 2026 · 7 min read

Microsoft MAI models: what MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 actually cost
Model Release · April 7, 2026 · 7 min read

Gemini 3.1 Pro: $2 input, tied for #1 on benchmarks, and 20% cheaper than GPT-5.4
Model Release · April 7, 2026 · 8 min read

Qwen3.5-Omni: the pricing, the audio benchmarks, and whether the architecture hype is real
Model Release · April 6, 2026 · 7 min read

Qwen3.6-Plus: $0.28 per million input tokens, and the benchmark comparison Alibaba chose not to lead with
Model Release · April 6, 2026 · 7 min read

Claude Haiku 3 retires April 19: it's not just a model ID swap
Guide · April 5, 2026 · 7 min read

Gemma 4 is out: $0.14 per million tokens for a 31B model scoring 89% on AIME
Model Release · April 5, 2026 · 8 min read

Reasoning models in 2026: $0.55 to $20 per million tokens, and when each tier makes sense
Comparison · April 5, 2026 · 7 min read

Gemini Flex and Priority inference: how Google's new tiers work and what they cost
Guide · April 4, 2026 · 8 min read

Google Gemini API billing caps are live: what developers need to know
Industry · April 1, 2026 · 7 min read

OpenAI Deep Research API: what it costs, and why o3-deep-research is 5x pricier than o3
Guide · April 1, 2026 · 7 min read

OpenAI killed Sora. The math explains why.
Industry · March 31, 2026 · 9 min read

ARC-AGI-3: the benchmark no AI can crack, and what running it costs
Research · March 31, 2026 · 8 min read

Gemini 2.0 Flash is deprecated: what migration actually costs you
Guide · March 30, 2026 · 8 min read

Kimi K2.5 vs GPT-5.4: the model Cursor built on, and what it actually costs
Comparison · March 29, 2026 · 9 min read

OpenAI Codex pricing: API costs, container billing, and how it stacks up against Claude Code
Comparison · March 28, 2026 · 8 min read

How much does Claude Code actually cost per session?
Guide · March 28, 2026 · 8 min read

Claude Mythos pricing: what Anthropic's leaked new model will cost developers
Model Release · March 27, 2026 · 7 min read

Grok 4.20 Beta: $2 per million tokens, 2M context, and the lowest hallucination rate measured so far
Model Release · March 26, 2026 · 8 min read

Llama 4 Scout vs Maverick: API pricing, self-hosting costs, and which one to use
Comparison · March 26, 2026 · 9 min read

DeepSeek V3.2 vs GPT-5.4: Is the 30x price gap worth it?
Comparison · March 25, 2026 · 8 min read

Qwen3.5 Small: the 9B model that beats gpt-oss-120B on four benchmarks
Model Release · March 24, 2026 · 7 min read

Anthropic drops the 2x long-context surcharge: what Claude now costs at 1M tokens
Industry · March 24, 2026 · 7 min read

Xiaomi MiMo-V2-Pro: the trillion-parameter model that fooled everyone into thinking it was DeepSeek
Model Release · March 23, 2026 · 9 min read

Gemini 3.1 Flash-Lite: $0.25 per million tokens, 1M context, and benchmark scores that beat Claude Haiku
Model Release · March 23, 2026 · 8 min read

Mistral Small 4: $0.15 per million input tokens for a multimodal MoE model
Model Release · March 23, 2026 · 7 min read

GPT-5.4 Mini vs Nano: pricing, benchmarks, and which one to use
Comparison · March 23, 2026 · 9 min read

Claude 5 release date: what Anthropic has actually said
Industry · March 20, 2026 · 8 min read

How to cut your LLM API bill by 60% without changing models
Guide · March 20, 2026 · 10 min read

The AI Price Index: How LLM costs dropped 300x in three years
Research · March 20, 2026 · 12 min read

GLM-5 Turbo: the first model built for OpenClaw. Is it worth $1.20 per million tokens?
Model Release · March 16, 2026 · 6 min read

NVIDIA Nemotron 3 Super: Pricing, Benchmarks & What 12B Active Parameters Actually Gets You
Model Release · March 13, 2026 · 8 min read

Gemini Embedding 2: Pricing, Limits, and How It Compares to OpenAI
Model Release · March 11, 2026 · 6 min read

GPT-6 Release Date: What We Actually Know Right Now
Model Release · March 10, 2026 · 7 min read

Anthropic Built a Marketplace. No Commission, Complicated Timing.
Industry · March 9, 2026 · 5 min read

OpenAI GPT-5.4: Pricing, Benchmarks & What Developers Actually Need to Know
Model Release · March 5, 2026 · 8 min read

GPT-5.4 vs Claude Opus 4.6 vs Gemini 3.1 Pro: Which One Should You Actually Use?
Comparison · March 6, 2026 · 10 min read