Cheapest LLM Models in 2026
All 60 LLM API models ranked from cheapest to most expensive by average token price. Find the most budget-friendly AI model for your project.
| Rank | Model | Provider | Input ($/1M) | Output ($/1M) | Avg ($/1M) | Context |
|---|---|---|---|---|---|---|
| #1 | Llama 3.3 70B | Meta | $0.18 | $0.18 | $0.18 | 131K |
| #2 | Gemini 2.0 Flash-Lite | Google | $0.075 | $0.3 | $0.19 | 1.0M |
| #3 | Llama 4 Scout | Meta | $0.08 | $0.3 | $0.19 | 1.0M |
| #4 | Mistral Small | Mistral | $0.1 | $0.3 | $0.20 | 128K |
| #5 | GPT-5 Nano | OpenAI | $0.05 | $0.4 | $0.23 | 128K |
| #6 | GPT-4.1 Nano | OpenAI | $0.1 | $0.4 | $0.25 | 1.0M |
| #7 | Gemini 2.5 Flash-Lite | Google | $0.1 | $0.4 | $0.25 | 1.0M |
| #8 | Gemini 2.0 Flash | Google | $0.1 | $0.4 | $0.25 | 1.0M |
| #9 | Grok 4.1 Fast | xAI | $0.2 | $0.5 | $0.35 | 131K |
| #10 | Grok 4.1 Fast Reasoning | xAI | $0.2 | $0.5 | $0.35 | 131K |
| #11 | DeepSeek V3.2 (Chat) | DeepSeek | $0.28 | $0.42 | $0.35 | 128K |
| #12 | DeepSeek V3.2 (Reasoner) | DeepSeek | $0.28 | $0.42 | $0.35 | 128K |
| #13 | GPT-4o Mini | OpenAI | $0.15 | $0.6 | $0.38 | 128K |
| #14 | Nemotron 3 Super 120B | NVIDIA | $0.3 | $0.8 | $0.55 | 1.0M |
| #15 | Llama 4 Maverick | Meta | $0.27 | $0.85 | $0.56 | 1.0M |
| #16 | MiniMax M2.5 | MiniMax | $0.3 | $1.2 | $0.75 | 128K |
| #17 | Gemini 3.1 Flash-Lite | Google | $0.25 | $1.5 | $0.88 | 1.0M |
| #18 | GPT-4.1 Mini | OpenAI | $0.4 | $1.6 | $1.00 | 1.0M |
| #19 | Mistral Large 3 | Mistral | $0.5 | $1.5 | $1.00 | 262K |
| #20 | GPT-5 Mini | OpenAI | $0.25 | $2 | $1.13 | 400K |
| #21 | Qwen 3.5 27B | Alibaba | $0.3 | $2.4 | $1.35 | 128K |
| #22 | Gemini 2.5 Flash | Google | $0.3 | $2.5 | $1.40 | 1.0M |
| #23 | Nova 2.0 Lite | Amazon | $0.3 | $2.5 | $1.40 | 128K |
| #24 | Gemini 3 Flash | Google | $0.5 | $3 | $1.75 | 1.0M |
| #25 | Gemini 3 Flash Reasoning | Google | $0.5 | $3 | $1.75 | 1.0M |
| #26 | Kimi K2.5 | Moonshot | $0.6 | $3 | $1.80 | 128K |
| #27 | Qwen 3.5 397B | Alibaba | $0.6 | $3.6 | $2.10 | 128K |
| #28 | GLM-5 | Zhipu | $1 | $3.2 | $2.10 | 128K |
| #29 | Claude Haiku 3.5 | Anthropic | $0.8 | $4 | $2.40 | 200K |
| #30 | o4 Mini | OpenAI | $1.1 | $4.4 | $2.75 | 200K |
| #31 | o3 Mini | OpenAI | $1.1 | $4.4 | $2.75 | 200K |
| #32 | Claude Haiku 4.5 | Anthropic | $1 | $5 | $3.00 | 200K |
| #33 | Claude 4.5 Haiku Reasoning | Anthropic | $1 | $5 | $3.00 | 200K |
| #34 | DeepSeek R1 | DeepSeek | $1.35 | $5.4 | $3.38 | 128K |
| #35 | Grok 4-20 | xAI | $2 | $6 | $4.00 | 131K |
| #36 | GPT-4.1 | OpenAI | $2 | $8 | $5.00 | 1.0M |
| #37 | o3 | OpenAI | $2 | $8 | $5.00 | 200K |
| #38 | GPT-5.1 | OpenAI | $1.25 | $10 | $5.63 | 400K |
| #39 | GPT-5 | OpenAI | $1.25 | $10 | $5.63 | 400K |
| #40 | GPT-5 Medium | OpenAI | $1.25 | $10 | $5.63 | 400K |
| #41 | Gemini 2.5 Pro | Google | $1.25 | $10 | $5.63 | 1.0M |
| #42 | Nova 2.0 Pro Reasoning | Amazon | $1.25 | $10 | $5.63 | 128K |
| #43 | GPT-4o | OpenAI | $2.5 | $10 | $6.25 | 128K |
| #44 | Command A | Cohere | $2.5 | $10 | $6.25 | 128K |
| #45 | Gemini 3.1 Pro | Google | $2 | $12 | $7.00 | 1.0M |
| #46 | Gemini 3 Pro | Google | $2 | $12 | $7.00 | 1.0M |
| #47 | GPT-5.2 | OpenAI | $1.75 | $14 | $7.88 | 400K |
| #48 | GPT-5.3 Codex | OpenAI | $1.75 | $14 | $7.88 | 400K |
| #49 | GPT-5.4 | OpenAI | $2.5 | $15 | $8.75 | 1.1M |
| #50 | Claude Sonnet 4.6 Adaptive | Anthropic | $3 | $15 | $9.00 | 200K |
| #51 | Claude Sonnet 4.6 | Anthropic | $3 | $15 | $9.00 | 200K |
| #52 | Claude Sonnet 4.5 | Anthropic | $3 | $15 | $9.00 | 200K |
| #53 | Claude Sonnet 4 | Anthropic | $3 | $15 | $9.00 | 200K |
| #54 | Grok 4 | xAI | $3 | $15 | $9.00 | 2.0M |
| #55 | Sonar Pro | Perplexity | $3 | $15 | $9.00 | 128K |
| #56 | Claude Opus 4.6 Adaptive | Anthropic | $5 | $25 | $15.00 | 200K |
| #57 | Claude Opus 4.6 | Anthropic | $5 | $25 | $15.00 | 200K |
| #58 | Claude Opus 4.5 | Anthropic | $5 | $25 | $15.00 | 200K |
| #59 | o1 | OpenAI | $15 | $60 | $37.50 | 200K |
| #60 | GPT-5.4 Pro | OpenAI | $30 | $180 | $105.00 | 1.1M |
How We Rank the Cheapest LLMs
Models are ranked by their average price per 1 million tokens, calculated as (input price + output price) / 2. This gives a balanced view of overall cost since most workloads use both input and output tokens.
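The ranking formula can be sketched in a few lines of Python. This is a minimal illustration using a handful of prices copied from the table above; the model names and figures are sample data, not a complete dataset:

```python
# Rank models by average price per 1M tokens: (input + output) / 2.
# Sample (input, output) prices in USD per 1M tokens, from the table above.
models = {
    "Llama 3.3 70B": (0.18, 0.18),
    "GPT-5 Nano": (0.05, 0.40),
    "Claude Sonnet 4.5": (3.00, 15.00),
}

def avg_price(input_price: float, output_price: float) -> float:
    """Average of input and output price per 1M tokens."""
    return (input_price + output_price) / 2

# Sort cheapest-first by average price.
ranked = sorted(models.items(), key=lambda kv: avg_price(*kv[1]))
for rank, (name, (inp, out)) in enumerate(ranked, start=1):
    print(f"#{rank} {name}: ${avg_price(inp, out):.2f}/1M avg")
```

Note that the simple average weights input and output tokens equally; if your workload is input-heavy (e.g. long-context summarization), a weighted average will rank models differently.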
Keep in mind that the cheapest model isn't always the best choice. Consider quality benchmarks, context window size, and output speed when making your decision. Use our leaderboard to compare quality alongside cost.