The AI Price Index: How LLM costs dropped 300x in three years
In March 2023, GPT-4 cost $30 per million input tokens. Today, you can get GPT-4-level quality for around $0.10 - less with caching. We tracked every major price change across OpenAI, Anthropic, Google, and DeepSeek to build this index. Here's the full timeline.

TL;DR
- GPT-4 launched at $30/1M input tokens in March 2023. By July 2024, GPT-4o mini delivered GPT-4-class performance for $0.15/1M. That's a 200x drop in 16 months.
- The cheapest capable model today (Gemini 2.0 Flash) costs $0.10/1M input - about 300x less than GPT-4's launch price.
- DeepSeek's cache-hit pricing ($0.07/1M) is over 400x cheaper than what GPT-4 cost three years ago.
- Frontier models (the best available at any given time) have gotten about 12x cheaper: GPT-4 at $30/1M vs. GPT-5.4 at $2.50/1M.
Why track this?
We run a pricing comparison table that covers 400+ models. We update it every six hours. After doing this for a while, we started noticing something: the numbers kept getting smaller. Fast.
So we went back and collected every major pricing change from every major provider since March 2023. What came out of it is something like a consumer price index, but for AI tokens. We think it's useful for anyone trying to plan an AI budget, negotiate with a provider, or just figure out whether to wait another six months before building something.
All prices below are per 1 million tokens. Input prices unless otherwise noted. Sources are linked at the bottom.
Chart: OpenAI flagship model input price per 1M tokens - from $30.00 to $2.50 in three years, with GPT-4o mini hitting $0.15 for GPT-4-class quality.
OpenAI: from $30 to $0.15 in 16 months
OpenAI set the starting point for this whole market. Their pricing moves over the past three years tell most of the story.
| Date | Model | Input / 1M | Output / 1M | vs. GPT-4 |
|---|---|---|---|---|
| Mar 2023 | GPT-4 (8K) | $30.00 | $60.00 | baseline |
| Mar 2023 | GPT-3.5 Turbo | $2.00 | $2.00 | 15x cheaper |
| Nov 2023 | GPT-4 Turbo | $10.00 | $30.00 | 3x cheaper |
| Jan 2024 | GPT-3.5 Turbo (price cut) | $0.50 | $1.50 | 60x cheaper |
| May 2024 | GPT-4o | $5.00 | $15.00 | 6x cheaper |
| Jul 2024 | GPT-4o mini | $0.15 | $0.60 | 200x cheaper |
| Apr 2025 | GPT-4.1 | $2.00 | $8.00 | 15x cheaper |
| Mar 2026 | GPT-5.4 | $2.50 | $15.00 | 12x cheaper |
The GPT-4o mini launch in July 2024 was the turning point. For the first time, you could get GPT-4-level quality for under a dollar per million tokens. That was 16 months after GPT-4 launched at $30.
There's an interesting split in how prices moved. The frontier model (the best thing OpenAI sells at any given moment) went from $30 to $2.50 - about 12x. But the price of "good enough" dropped much faster. GPT-4o mini matches or beats the original GPT-4 on most benchmarks, and it costs 200x less.
That's the pattern worth paying attention to: the floor drops faster than the ceiling. If you're building something where last year's best model is fine, you're getting a better deal every quarter.
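If you want to reproduce the "vs. GPT-4" column yourself, the arithmetic is just GPT-4's $30 baseline divided by each model's input price. A minimal sketch in Python, using the prices from the table above:

```python
# Sketch: deriving the "vs. GPT-4" multipliers in the table above.
# Prices are USD per 1M input tokens, taken straight from the table.
GPT4_BASELINE = 30.00

models = {
    "GPT-4 Turbo (Nov 2023)": 10.00,
    "GPT-3.5 Turbo (Jan 2024)": 0.50,
    "GPT-4o (May 2024)": 5.00,
    "GPT-4o mini (Jul 2024)": 0.15,
    "GPT-4.1 (Apr 2025)": 2.00,
    "GPT-5.4 (Mar 2026)": 2.50,
}

for name, price in models.items():
    multiplier = GPT4_BASELINE / price
    print(f"{name}: ${price:.2f}/1M -> {multiplier:.0f}x cheaper than GPT-4")
```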
Anthropic: the Opus price cut nobody expected
Anthropic took a different path. Their premium tier stayed expensive for longer, then they cut it all at once.
| Date | Model | Input / 1M | Output / 1M |
|---|---|---|---|
| Jul 2023 | Claude 2 | $8.00 | $24.00 |
| Jul 2023 | Claude Instant 1.x | $0.80 | $2.40 |
| Mar 2024 | Claude 3 Opus | $15.00 | $75.00 |
| Mar 2024 | Claude 3 Sonnet | $3.00 | $15.00 |
| Mar 2024 | Claude 3 Haiku | $0.25 | $1.25 |
| Jun 2024 | Claude 3.5 Sonnet | $3.00 | $15.00 |
| May 2025 | Claude Opus 4 | $15.00 | $75.00 |
| Nov 2025 | Claude Opus 4.5 (-67%) | $5.00 | $25.00 |
| 2026 | Claude Opus 4.6 | $5.00 | $25.00 |
The big moment for Anthropic was Claude Opus 4.5. For over a year, Opus had been priced at $15/$75 - by far the most expensive mainstream model. Then with 4.5, they dropped it to $5/$25. A 67% cut, overnight.
Meanwhile, Sonnet held steady at $3/$15 through five generations (3, 3.5, 4, 4.5, 4.6). Same price, increasingly better model each time. If you've been on Sonnet since mid-2024, you've effectively gotten three free upgrades.
Haiku tells a similar story: $0.25/$1.25 at launch in March 2024, rising to $1.00/$5.00 for the 4.5 version, but with far better capability. The cost per unit of capability keeps falling even when the sticker price goes up.
Google: the quiet price leader
Google doesn't get as much press for pricing moves, but they've consistently been the cheapest option for capable models.
| Date | Model | Input / 1M | Output / 1M |
|---|---|---|---|
| Dec 2023 | Gemini 1.0 Pro | $0.125 | $0.375 |
| May 2024 | Gemini 1.5 Pro | $3.50 | $10.50 |
| May 2024 | Gemini 1.5 Flash | $0.35 | $0.53 |
| Aug 2024 | Gemini 1.5 Flash (-78%) | $0.075 | $0.30 |
| Oct 2024 | Gemini 1.5 Pro (-64%) | $1.25 | $5.00 |
| 2025 | Gemini 2.0 Flash | $0.10 | $0.40 |
| 2026 | Gemini 2.5 Pro | $1.25 | $10.00 |
Google came out swinging with Gemini 1.0 Pro in December 2023, priced well below both OpenAI and Anthropic. At the time, GPT-4 Turbo was $10. Google was offering a capable model for a fraction of that.
The Flash line has been the real story though. It launched at $0.35/1M in May 2024, then Google cut it by 78% just three months later to $0.075/1M. By the time Gemini 2.0 Flash hit at $0.10/1M, the "300x cheaper than GPT-4" math checked out. And each version keeps getting better while the price barely moves.
DeepSeek: the price shock that changed everything
If Google was the quiet price leader, DeepSeek was the loud one. Their pricing in late 2024 forced every other provider to rethink their margins.
| Date | Model | Input / 1M | Output / 1M |
|---|---|---|---|
| Dec 2024 | DeepSeek V3 | $0.14 | $0.28 |
| Jan 2025 | DeepSeek R1 | $0.55 | $2.19 |
| Late 2025 | DeepSeek V3.2 | $0.27 | $1.10 |
| Late 2025 | V3.2 (cache hit) | $0.07 | $1.10 |
When DeepSeek V3 landed in December 2024 at $0.14/1M, the reaction in the industry was somewhere between disbelief and panic. This was a frontier-class model priced at roughly 1/200th of what GPT-4 cost at launch. And it was good. Not "good for the price" good, but genuinely competitive on benchmarks.
Then came R1 in January 2025 - a reasoning model at $0.55/$2.19. For context, OpenAI's o1-preview (their first reasoning model) had launched just four months earlier at $15/$60. DeepSeek was offering comparable capability at a roughly 96% discount.
The cache-hit pricing of $0.07/1M input is striking. That's over 400x cheaper than GPT-4's launch price. Even accounting for the fact that cache-hit pricing only applies to repeated prompts, it shows where the floor can go.
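In practice, the blended price depends on your cache hit rate, since only repeated prompt prefixes hit the cache. A rough sketch of that math, using the V3.2 prices from the table above (the 400x figure assumes every input token is a cache hit):

```python
# Sketch: effective DeepSeek V3.2 input price as a function of cache hit rate.
# $0.07/1M on cache hits, $0.27/1M on misses (prices from the table above).
CACHE_HIT_PRICE = 0.07   # USD per 1M input tokens, cached
CACHE_MISS_PRICE = 0.27  # USD per 1M input tokens, uncached

def effective_input_price(hit_rate: float) -> float:
    """Blended input price per 1M tokens for a given cache hit rate (0.0-1.0)."""
    return hit_rate * CACHE_HIT_PRICE + (1 - hit_rate) * CACHE_MISS_PRICE

for rate in (0.0, 0.5, 0.9):
    print(f"{rate:.0%} cache hits -> ${effective_input_price(rate):.3f}/1M input")
# Even at 90% hits ($0.090/1M), you're still ~333x below GPT-4's launch price.
```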
The race to the bottom: cheapest capable model over time
Chart: input price per 1M tokens over time; each bar is the cheapest "good enough" option at that point.
Mistral and the 90% overnight cut
Worth a quick mention: Mistral slashed their Small model pricing by 90% in September 2024, going from $1/$3 to $0.10/$0.30. Their Large model went from $3/$9 to $2/$6 around the same time.
European providers like Mistral don't get as much pricing attention, but they've been part of the same downward pressure. When everyone is racing to the bottom, staying expensive is not an option. You can compare all of these on our full pricing table.
The full timeline: 14 moments that shaped the price
If you want the condensed version of three years of pricing moves, here it is:
1. Mar 2023 - GPT-4 launches at $30/$60 per 1M tokens, setting the baseline.
2. Nov 2023 - GPT-4 Turbo cuts the flagship price to $10.
3. Dec 2023 - Gemini 1.0 Pro debuts at $0.125, undercutting everyone.
4. Jan 2024 - GPT-3.5 Turbo drops to $0.50.
5. Mar 2024 - The Claude 3 family lands: Opus at $15, Haiku at $0.25.
6. May 2024 - GPT-4o arrives at $5.
7. Jul 2024 - GPT-4o mini delivers GPT-4-class quality for $0.15.
8. Aug 2024 - Gemini 1.5 Flash is cut 78% to $0.075.
9. Sep 2024 - Mistral slashes Small by 90% to $0.10.
10. Oct 2024 - Gemini 1.5 Pro is cut 64% to $1.25.
11. Dec 2024 - DeepSeek V3 ships at $0.14 and rattles the market.
12. Jan 2025 - DeepSeek R1 prices reasoning at $0.55, against o1-preview's $15.
13. Nov 2025 - Claude Opus 4.5 cuts the premium tier 67% to $5.
14. Mar 2026 - GPT-5.4 lands at $2.50, about 12x below GPT-4's launch price.
Three reasons prices keep falling
It's not one thing. Three forces are working together, and they're all accelerating.
1. Competition
In early 2023, OpenAI had the market to itself. By late 2024, there were at least six providers with frontier-class models (OpenAI, Anthropic, Google, DeepSeek, Mistral, and Meta via API partners). DeepSeek's entry in particular forced across-the-board repricing. When someone offers comparable quality at 1/100th of the price, margins evaporate.
2. Cheaper inference hardware
The shift from H100 to B200 GPUs, better inference optimization (speculative decoding, quantization), and data-center economies of scale have all reduced the actual cost of running these models. NVIDIA's own estimates suggest inference cost per token dropped roughly 10x between the A100 and B200 generations.
3. More efficient architectures
Mixture of Experts (MoE) architectures, used by DeepSeek, Mixtral, and others, activate only a fraction of the model's total parameters for each token. DeepSeek V3 has 671B total parameters but activates only 37B per token. Better training recipes and distillation techniques mean smaller models can match what bigger ones did a year ago.
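A back-of-the-envelope way to see why this matters for cost, assuming (as a simplification) that per-token compute scales with active rather than total parameters:

```python
# Sketch: why MoE cuts inference cost. Only the routed experts run per token.
# The DeepSeek V3 figures are from the paragraph above; the "compute scales
# with active parameters" assumption is a first-order simplification.
total_params = 671e9   # total parameters
active_params = 37e9   # parameters activated per token

print(f"Active per token: {active_params / total_params:.1%}")        # ~5.5%
print(f"Dense-vs-MoE compute ratio: {total_params / active_params:.0f}x")  # ~18x
# A dense 671B model would do roughly 18x more work per token, so serving
# cost per token can fall by a similar order of magnitude.
```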
The combined effect is something like Moore's Law for AI inference: the cost of a given level of capability halves roughly every six to eight months. Unlike Moore's Law though, there's no obvious physical limit slowing this down yet.
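If you take the halving heuristic at face value, projecting a price forward is one line of arithmetic. A sketch, with the 6-8 month range treated as an assumption rather than a law:

```python
# Sketch: projecting a price under a "halves every 6-8 months" assumption.
# This is the article's heuristic, not a guarantee - treat outputs as rough.
def projected_price(price_today: float, months: float, halving_months: float) -> float:
    """Price after `months`, if cost halves every `halving_months` months."""
    return price_today * 0.5 ** (months / halving_months)

today = 0.10  # Gemini 2.0 Flash input price, USD per 1M tokens
for h in (6, 8):
    future = projected_price(today, months=24, halving_months=h)
    print(f"Halving every {h} months -> ${future:.4f}/1M in 24 months")
# 6-month halving: $0.0063/1M; 8-month halving: $0.0125/1M
```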
Where the money is going
Prices are crashing, but total spending is going up. Way up.
Gartner projects worldwide AI spending will hit $2.52 trillion in 2026, a 44% increase over 2025. According to their research, 80%+ of enterprises will have used generative AI APIs or deployed GenAI-enabled applications by 2026, up from under 5% in early 2023.
That's the classic technology adoption pattern: lower prices unlock more use cases, which drives more total spending even as the unit cost drops. A company that couldn't justify $30/1M tokens for a chatbot can absolutely justify $0.15/1M tokens. And once they're in, they find more things to do with it.
For anyone planning an AI budget right now, the practical takeaway is: whatever you think it'll cost, it'll probably cost less by the time you ship. Plan for today's prices, but know that your margins will likely improve as you scale. Use our cost calculator to estimate your current costs, and check back as prices update.
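A minimal sketch of that estimate, with made-up traffic numbers you'd replace with your own:

```python
# Sketch: back-of-the-envelope monthly budget. The request volume and token
# counts below are hypothetical placeholders - substitute your own workload.
def monthly_cost(requests: int, in_tokens: int, out_tokens: int,
                 in_price: float, out_price: float) -> float:
    """USD per month; prices are per 1M tokens."""
    return requests * (in_tokens * in_price + out_tokens * out_price) / 1e6

# Example: 1M requests/month, ~800 input + ~300 output tokens each,
# at GPT-4o mini's $0.15/$0.60 pricing from the table above.
print(f"${monthly_cost(1_000_000, 800, 300, 0.15, 0.60):,.2f}/month")
# -> $300/month; the same traffic at GPT-4's 2023 prices would run ~$42,000.
```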
What happens next
We don't have a crystal ball. But looking at the trajectory, a few things seem likely.
Frontier models will keep getting cheaper, but slowly. The 12x drop from GPT-4 to GPT-5.4 happened over three years. Don't expect the bleeding edge to be cheap anytime soon - providers need margins to fund training runs that cost hundreds of millions.
The "good enough" tier will keep getting better and cheaper. This is where the 200x-300x drops happen. Last year's frontier model becomes this year's budget option. If your application can tolerate being one generation behind, you save an order of magnitude.
Prompt caching and batch processing will matter more. Both OpenAI and Anthropic offer 50-90% discounts on cached or batched requests. As base prices fall, these optimizations become the main lever for cutting costs further. We wrote more about this in our GPT-5.4 pricing breakdown.
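Those discounts compound multiplicatively when a provider lets you combine them - though whether caching and batching actually stack varies by provider, so treat this sketch as arithmetic, not a guarantee:

```python
# Sketch: how caching and batching discounts combine on a base price.
# The 90% cache-read and 50% batch figures are illustrative picks from the
# 50-90% range cited above - check your provider's current rates, and note
# that not every provider allows stacking the two.
def discounted_price(base: float, cache_discount: float = 0.0,
                     batch_discount: float = 0.0) -> float:
    """Effective price per 1M tokens after multiplicative discounts."""
    return base * (1 - cache_discount) * (1 - batch_discount)

base = 2.50  # GPT-5.4 input price, per the OpenAI table above
print(f"cache only: ${discounted_price(base, cache_discount=0.9):.3f}/1M")
print(f"batch only: ${discounted_price(base, batch_discount=0.5):.3f}/1M")
print(f"both (if stackable): ${discounted_price(base, 0.9, 0.5):.3f}/1M")
```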
The bottom line
Three years ago, using GPT-4 for anything at scale was a serious budget line item. Today, you can get that same level of capability for the cost of a rounding error. The price of "good enough" AI dropped 200-300x. The price of the best available dropped 12x. Both numbers are still falling.
We'll keep updating this index as prices change. If you want to track current prices across all 400+ models, check our live pricing table, updated every six hours.
Sources
1. OpenAI API Pricing - openai.com/api/pricing
2. OpenAI DevDay November 2023 Announcements - openai.com
3. Anthropic Claude Pricing - docs.anthropic.com
4. Google Gemini API Pricing - ai.google.dev
5. DeepSeek API Pricing - api-docs.deepseek.com
6. CNBC: "OpenAI announces GPT-4 Turbo and cuts prices" (Nov 2023) - cnbc.com
7. VentureBeat: "DeepSeek V3.2 cuts API pricing in half" - venturebeat.com
8. Gartner: "AI Spending to Total $2.5 Trillion in 2026" - gartner.com
9. Gartner: "80% of Enterprises Will Have Used GenAI APIs by 2026" - gartner.com
10. Google Developers Blog: Gemini Price Reductions - developers.googleblog.com
11. Mistral AI Pricing - docs.mistral.ai
12. Fradkin et al., "The Emerging Market for Intelligence" (Dec 2025) - andreyfradkin.com