Claude Haiku 3 retires April 19: it's not just a model ID swap
claude-3-haiku-20240307 hits end of life in 14 days. The replacement quadruples the per-token cost, and switching model strings will break your API calls in four different ways. Here is what to fix before the deadline.

- `claude-3-haiku-20240307` retires April 19, 2026 - after that, API requests fail outright, no redirect
- Replacement is `claude-haiku-4-5-20251001` at $1.00/$5.00 per million tokens (was $0.25/$1.25)
- Swapping the model string is not enough - four API-level breaking changes need handling first
- If you have no Anthropic lock-in: GPT-4o mini ($0.15/$0.60) and Gemini 2.5 Flash-Lite ($0.10/$0.40) are both cheaper than Haiku 3 was
What is being retired
Anthropic deprecated claude-3-haiku-20240307 on February 19, 2026. It is the last Claude 3 model still running on the production API - Sonnet 3 and Opus 3 exited earlier. The retirement date is April 19.
| Model | Deprecated | Retires | Replace with |
|---|---|---|---|
| claude-3-haiku-20240307 | Feb 19, 2026 | Apr 19, 2026 | claude-haiku-4-5-20251001 |
There is a date discrepancy in Anthropic's own docs: the deprecation history table shows April 20, while the models overview warning says April 19. Plan for April 19. There is no documented grace period.
The pricing options
Haiku 4.5 costs roughly four times what Haiku 3 did on both input and output. That is not rounding error. The third-party options below are cheaper than Haiku 3 was.
| Model | Input / 1M | Output / 1M | Context | Max output | Status |
|---|---|---|---|---|---|
| claude-3-haiku-20240307 | $0.25 | $1.25 | 200K | 4,096 | Retiring |
| claude-haiku-4-5-20251001 | $1.00 | $5.00 | 200K | 64,000 | Current |
| gpt-4o-mini | $0.15 | $0.60 | 128K | 16,384 | Current |
| gemini-2.5-flash-lite | $0.10 | $0.40 | 1M | 65,536 | Current |
Haiku 4.5 batch pricing: $0.50/$2.50 per 1M tokens (50% off, async delivery up to 24h). Sources: Anthropic, OpenAI, Google
What migration costs per month
Baseline: 50M input tokens and 50M output tokens per month - a reasonable volume for a production classification or summarization pipeline.
| Path | Input | Output | Monthly | vs Haiku 3 |
|---|---|---|---|---|
| Haiku 3 (retiring) | $12.50 | $62.50 | $75.00 | baseline |
| Haiku 4.5, standard pricing | $50.00 | $250.00 | $300.00 | +300% |
| Haiku 4.5, batch (50% off) | $25.00 | $125.00 | $150.00 | +100% |
| GPT-4o mini | $7.50 | $30.00 | $37.50 | -50% |
| Gemini 2.5 Flash-Lite | $5.00 | $20.00 | $25.00 | -67% |
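The table above is straightforward arithmetic over the per-million rates. A minimal sketch, using the volumes and rates quoted above:

```javascript
// Monthly cost = (input tokens / 1M) * input rate + (output tokens / 1M) * output rate
function monthlyCost(inputTokens, outputTokens, inputRatePerM, outputRatePerM) {
  return (inputTokens / 1e6) * inputRatePerM + (outputTokens / 1e6) * outputRatePerM;
}

const VOLUME = 50e6; // 50M input + 50M output per month

monthlyCost(VOLUME, VOLUME, 0.25, 1.25); // Haiku 3 (retiring):   75
monthlyCost(VOLUME, VOLUME, 1.0, 5.0);   // Haiku 4.5 standard:  300
monthlyCost(VOLUME, VOLUME, 0.5, 2.5);   // Haiku 4.5 batch:     150
monthlyCost(VOLUME, VOLUME, 0.1, 0.4);   // Gemini Flash-Lite:    25
```

Plugging in your own monthly volumes is the fastest way to see whether the batch discount is enough for your budget.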
Haiku 4.5 with batch pricing is where most Anthropic-committed teams will land. Batch processing halves the price, but your workload needs to tolerate async delivery - results come back within 24 hours, not immediately.
Prompt caching can bring the effective input cost down further. Cache hits on Haiku 4.5 cost $0.10/1M - 10% of the standard input rate. A 100K-token system prompt sent 1,000 times a day: $100/day in input costs without caching, $10/day after the first request.
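That caching arithmetic, as a sketch (the $1.00 standard rate and $0.10 cache-hit rate are the figures quoted above; cache-write surcharges on the first request are ignored here):

```javascript
// Daily input cost for a large, repeatedly-sent system prompt
function dailyInputCost(promptTokens, requestsPerDay, ratePerM) {
  return ((promptTokens * requestsPerDay) / 1e6) * ratePerM;
}

dailyInputCost(100_000, 1000, 1.0); // uncached Haiku 4.5 input: $100/day
dailyInputCost(100_000, 1000, 0.1); // cache hits at $0.10/1M:    $10/day
```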
Four things that break if you just swap the model string
Haiku 4.5 is a Claude 4-generation model. The API changed between Claude 3 and Claude 4. These are documented breaking changes in Anthropic's migration guide - not edge cases.
1. temperature and top_p cannot both be set
Claude 3 accepted both sampling parameters. Haiku 4.5 requires picking one. If your code sets both, the request fails. The fix is one line: remove top_p.
```javascript
// Fails with claude-haiku-4-5
client.messages.create({
  model: "claude-haiku-4-5-20251001",
  temperature: 0.7,
  top_p: 0.9, // remove this
  ...
});
```
2. Two new stop reasons
Haiku 4.5 can return refusal and model_context_window_exceeded as stop reasons. If your code switches on stop_reason and has a default/else branch for unknown values, add explicit handling. A silently ignored refusal produces empty or partial output with no obvious error.
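One way to make the handling explicit is to enumerate every stop reason and throw on anything unrecognized instead of falling through silently. A sketch, where `handleStopReason` is a hypothetical helper and the response shape follows the Messages API:

```javascript
// Enumerate stop reasons explicitly; unknown values throw instead of
// being silently swallowed by a default branch.
function handleStopReason(response) {
  switch (response.stop_reason) {
    case "end_turn":
    case "stop_sequence":
      return { ok: true, text: response.content?.[0]?.text ?? "" };
    case "tool_use":
      return { ok: true, toolUse: true };
    case "max_tokens":
      return { ok: false, reason: "truncated: raise max_tokens or chunk input" };
    case "refusal": // new in Claude 4.x
      return { ok: false, reason: "model refused: surface to caller" };
    case "model_context_window_exceeded": // new in Claude 4.x
      return { ok: false, reason: "input too large: trim history and retry" };
    default:
      throw new Error(`unknown stop_reason: ${response.stop_reason}`);
  }
}
```

The point of the `throw` in the default branch is that the next API generation's new stop reasons fail loudly in staging rather than silently in production.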
3. Trailing newlines in tool parameters are preserved
Claude 3 stripped trailing newlines from tool parameter values. Haiku 4.5 passes them through. Parsers expecting trimmed strings will now receive values like "value\n" instead of "value". Add explicit .trim() calls or update parsers before migrating.
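Rather than auditing every downstream parser, one defensive option is to trim string values once at the tool-handler boundary. A sketch, where `normalizeToolInput` is a hypothetical helper:

```javascript
// Trim whitespace from every string parameter before it reaches tool
// handlers, restoring the stripping behavior Claude 3 applied implicitly.
function normalizeToolInput(input) {
  const out = {};
  for (const [key, value] of Object.entries(input)) {
    out[key] = typeof value === "string" ? value.trim() : value;
  }
  return out;
}

normalizeToolInput({ city: "Paris\n", limit: 5 }); // { city: "Paris", limit: 5 }
```

Note this is a shallow pass: nested objects in tool parameters would need a recursive version.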
4. Rate limits are separate
Haiku 4.5 has its own rate limit tier, not shared with Haiku 3. If you had elevated quota approved on Haiku 3, it does not carry over automatically. Check the Anthropic Console and request matching limits on Haiku 4.5 before flipping production traffic.
What you get for quadruple the price
You are paying $0.75 more per million input tokens and $3.75 more on output. The question is whether the capabilities justify that jump for your workload.
| Feature | Haiku 3 | Haiku 4.5 |
|---|---|---|
| Max output tokens | 4,096 | 64,000 |
| GPQA Diamond | 40.4% | 67.2% |
| SWE-bench Verified | - | 73.3% |
| Knowledge cutoff | Aug 2023 | Feb 2025 |
| Extended thinking | No | Yes |
| Computer use | No | Yes |
| Context window | 200K | 200K |
The 4k output limit on Haiku 3 forced workarounds for anything generating long content. 64k removes that ceiling entirely. The GPQA Diamond jump from 40% to 67% reflects real improvement on domain reasoning - and Artificial Analysis gives Haiku 4.5 an intelligence index well above its price tier median.
For classification and extraction workloads - the kind of thing Haiku 3 ran most commonly - the quality gap is smaller. Summarization pipelines are similar. These are the use cases where GPT-4o mini or Gemini Flash-Lite make the most financial sense.
Picking a replacement
You need to stay on Anthropic (compliance, tooling, existing evals)
Go to Haiku 4.5 with batch processing at $0.50/$2.50. That is double the cost of Haiku 3, not quadruple. You also get the 64k output limit, better reasoning, and a knowledge cutoff 18 months more recent. Handle the four breaking changes before you flip the switch.
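Batch submission means wrapping each request in an envelope of `custom_id` plus the same `params` a normal `messages.create` call takes. A minimal sketch of building that payload - the `doc-${i}` ID scheme and the summarization prompt are illustrative, and actual submission (e.g. via the SDK's Message Batches endpoint, with an API key) is not shown:

```javascript
// Build a Message Batches payload: one entry per document, each pairing a
// unique custom_id with ordinary messages.create params.
function buildBatchPayload(documents) {
  return {
    requests: documents.map((doc, i) => ({
      custom_id: `doc-${i}`, // illustrative; use your own stable IDs
      params: {
        model: "claude-haiku-4-5-20251001",
        max_tokens: 1024,
        messages: [{ role: "user", content: `Summarize:\n${doc}` }],
      },
    })),
  };
}
```

Results come back keyed by `custom_id`, so stable IDs are what let you join outputs back to your source records after the async delivery window.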
Cost is the priority and you have no provider lock-in
Test Gemini 2.5 Flash-Lite at $0.10/$0.40. It costs 60% less than Haiku 3 and has a 1M context window. For classification and structured extraction, quality is competitive. If it misses your bar, GPT-4o mini at $0.15/$0.60 is the next step up.
You need agentic coding or computer use
Haiku 4.5 at standard pricing is worth it here. 73.3% on SWE-bench Verified is competitive with models priced much higher. Haiku 3 could not do any of this - you are paying for an entirely new capability set, not a marginal quality bump.
On Google Cloud Vertex AI, you have four more months
Claude Haiku 3 on Vertex AI follows a different schedule. Google deprecated it on February 23, 2026, but the shutdown date is August 23, 2026. The Vertex model ID is claude-3-haiku@20240307.
If you are calling Claude through the Vertex AI API, the April 19 deadline does not apply. Plan for August 23 instead - and expect the same API breaking changes when you get there, since you will still be migrating to a Claude 4.x model.
Migration checklist
1. Search your codebase for `claude-3-haiku-20240307`. Check config files, environment files, and hardcoded strings in tests.
2. At each callsite, check whether both `temperature` and `top_p` are set. Remove `top_p` before migrating.
3. Add handling for `refusal` and `model_context_window_exceeded` in any code that reads `stop_reason`.
4. If your tool handlers parse model output strings, add `.trim()` or update parsers to handle trailing newlines that were previously stripped.
5. Check rate limits for Haiku 4.5 in the Anthropic Console. Elevated quota on Haiku 3 does not transfer - request new limits before migration.
6. Run your actual prompts against Haiku 4.5 in staging before switching production. The model style is more concise - some prompts will need adjusting to get equivalent output length.
FAQ
When exactly does Claude Haiku 3 retire?
Anthropic's own docs are inconsistent: the deprecation history table shows April 20, 2026, while the models overview warning says April 19. Plan for April 19 - after that date, API requests return errors with no fallback.
Does the API redirect to a newer model or fail outright?
Requests fail. Anthropic's documentation says retired models return errors, not redirects. There is no automatic fallback to claude-haiku-4-5.
Is this just a model string change?
No. Haiku 4.5 is a Claude 4-generation model with API-level breaking changes: temperature and top_p cannot both be set, two new stop reasons were added, trailing newlines in tool parameters are now preserved, and rate limits are tracked separately.
What is the cheapest Anthropic replacement?
Claude Haiku 4.5 with batch processing at $0.50/$2.50 per million tokens - double Haiku 3's price, not four times. Batch delivery takes up to 24 hours.
What is the cheapest replacement across all providers?
Gemini 2.5 Flash-Lite at $0.10/$0.40 per million tokens - 60% less than Haiku 3 on input. GPT-4o mini at $0.15/$0.60 is also cheaper. Both work well for classification and extraction workloads.
What is the Vertex AI shutdown date?
August 23, 2026 - four months after the Anthropic API retirement. The Vertex model ID is claude-3-haiku@20240307.