Google Gemini API billing caps are live: what developers need to know
As of April 1, 2026, Google enforces mandatory monthly spend caps on all Gemini API billing accounts. Tier 1 accounts hit a $250/month wall. Here's how the system works, what triggers a tier upgrade, and what your options are if the cap is a problem.

Photo by Matthieu Beaumont on Unsplash
Starting today, Google caps Gemini API spending at the account level - $250/month for Tier 1, $2,000 for Tier 2. There is a 10-minute window after you hit the ceiling where requests keep processing and billing, and you are on the hook for the overage. Tier 3 requires either $1,000 in real paid charges or a direct call to Google Cloud sales - the sales route is usually faster if you need scale now.
Why this is happening now
Google was the last major AI provider without spending controls, and two billing disasters forced the issue. In August 2025, a pricing bug caused Gemini 2.5 Flash to charge for "Native Image Generation" tokens that were never actually generated. One developer reported $70,000+ in erroneous charges. Another watched their bill jump $200 in 20 minutes after they had already deleted their API keys. Google issued credits rather than refunds - and developers who filed bank disputes had their entire Google account (Cloud, Play, YouTube) suspended and were required to upload government ID for reinstatement.
Then in February 2026, a three-developer startup in Mexico had their Gemini API key stolen. Their normal monthly spend: $180. In 48 hours, the stolen key ran up $82,314.44. Researchers at Truffle Security had separately found 2,863 live Gemini API keys exposed in public repositories - the keys were originally designed as non-secret identifiers, which made this far too easy.
Anthropic and OpenAI offered developer-configurable limits years before any of this. Google did not. The caps are overdue. The implementation, though, is more complicated than what competitors built - the developer forum thread shows just how confused the rollout has been, with multiple conflicting explanations from different Google sources.
The four billing tiers
Your tier determines your monthly ceiling. The caps are mandatory and cannot be disabled:
| Tier | Monthly cap | How to qualify |
|---|---|---|
| Free | No cap (rate-limited) | Active project, no billing required |
| Tier 1 | $250/month | Link a billing account |
| Tier 2 | $2,000/month | $250 real spend + 30 days |
| Tier 3 | $20,000-$100,000+/month | $1,000 real spend + 30 days, or enterprise sales |
Free trial credits and promotional credits do not count toward tier qualification. Only real charges to a payment method count.
The 10-minute billing gap
When your account hits a tier cap, Google's system takes up to 10 minutes to detect it and block requests. During that window, API calls keep processing and keep billing. Google's documentation is explicit: "users are responsible for overages incurred during that period."
For a heavy Gemini 3.1 Pro workload ($2.00 input / $12.00 output per million tokens), 10 minutes of uncapped high-throughput traffic is real money. The practical fix: set a project-level cap in AI Studio at around 80% of your tier ceiling. That way the project cap triggers first and cuts off requests before the billing account cap is reached.
Prepay balance exhaustion works differently: when your prepaid credits hit $0, all API keys across all your projects stop immediately. No delay. So if you are running close to empty on prepaid, you want a buffer there too.
New accounts vs. existing accounts
If you signed up after March 23, 2026: prepaid billing only. Minimum purchase is $10, maximum balance is $5,000, credits expire after 12 months. The $300 Google Cloud Welcome credit does not apply to the Gemini API.
Existing accounts were auto-assigned to prepaid or postpaid based on account history. Postpaid billing going forward is Tier 3 only. If you were previously postpaid on Tier 1 or 2, expect a switch to prepaid at your next billing cycle.
How to upgrade your tier
Free to Tier 1
Open AI Studio, link a billing account, and purchase at least $10 in prepay credits. Tier 1 activates immediately.
Tier 1 to Tier 2
Spend $250 or more in real billed charges (across any Google Cloud services on the account) and wait 30 days from your first real payment. The upgrade triggers automatically, typically within 10 minutes of qualifying.
Tier 2 to Tier 3
Either spend $1,000+ in real charges and wait 30 days (automatic upgrade in 24-48 hours), or contact Google Cloud sales directly. The sales path has no spending prerequisite and usually resolves in days to a few weeks - faster than waiting through the spending threshold if you need scale now.
There is also an override request form in AI Studio billing settings if you need more than your current tier cap and cannot wait. Turnaround is not guaranteed.
Current Gemini pricing
Standard (non-batch) rates for prompts under 200K tokens. All models offer a 50% batch discount. Gemini 3.1 Pro is still in preview - not generally available:
| Model | Input / 1M tokens | Output / 1M tokens | Context |
|---|---|---|---|
| Gemini 3.1 Pro Preview | $2.00 | $12.00 | 200K standard |
| Gemini 2.5 Pro | $1.25 | $10.00 | 200K standard |
| Gemini 2.5 Flash | $0.30 | $2.50 | 200K standard |
| Gemini 3.1 Flash-Lite Preview | $0.25 | $1.50 | 200K standard |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | 200K standard |
Prompts over 200K tokens are charged at 2x the standard input rate for most models. Audio input costs more than text.
What a $250 cap actually buys
Assuming roughly equal input and output volume (1:1 ratio), here is where the Tier 1 ceiling lands on different Gemini models:
Gemini 3.1 Pro Preview
~18M combined tokens
$2.00 input + $12.00 output = $14/MTok blended
Gemini 2.5 Flash
~178M combined tokens
$0.30 input + $2.50 output = $2.80/MTok blended
Gemini 2.5 Flash-Lite
~1B combined tokens
$0.10 input + $0.40 output = $0.50/MTok blended
Gemini 2.5 Flash (batch API, 50% off)
~357M combined tokens
$0.15 input + $1.25 output = $1.40/MTok blended - batch requests are async, results in minutes
If you are hitting the $250 wall on Gemini 3.1 Pro, Gemini 2.5 Flash at the same tier costs about 80% less per token - 178M combined tokens vs 18M for the same monthly budget. Running the batch API takes it further: ~357M tokens for $250.
Alternatives without mandatory caps
Anthropic and OpenAI both have configurable spending limits you set yourself. Soft limits send a notification while the API keeps running. Hard limits stop requests when reached - but you choose the number. Neither imposes a mandatory tier system. xAI has no documented account-level spend caps on the Grok API.
| Provider | Model | Input / 1M | Output / 1M | Spend caps |
|---|---|---|---|---|
| Gemini 3.1 Pro | $2.00 | $12.00 | Mandatory tier caps | |
| Anthropic | Claude Sonnet 4.6 | $3.00 | $15.00 | Configurable by developer |
| Anthropic | Claude Haiku 4.5 | $1.00 | $5.00 | Configurable by developer |
| OpenAI | GPT-5.2 | $1.75 | $14.00 | Configurable by developer |
| xAI | Grok 4.1 Fast | $0.20 | $0.50 | None documented |
Some developers hitting the new caps are routing traffic through API aggregators or setting up multi-provider fallbacks. Whether that complexity is worth it depends on how predictable your usage is.
What to do today
Check your current monthly Gemini API spend and figure out which tier you are on. If you are consistently under $250, the caps probably will not touch you. If you are anywhere near the ceiling, set a project-level cap in AI Studio at around 80% of your tier limit - that absorbs the 10-minute enforcement gap and keeps you from being surprised mid-month.
If the cap is limiting you and you are on Gemini 3.1 Pro, check whether a cheaper Gemini model handles your workload. Flash at $0.30 input / $2.50 output costs about 80% less per token than 3.1 Pro. Use the cost calculator to model the difference before committing to a tier upgrade wait.
For production workloads that cannot absorb a mid-month pause, the fastest path to higher limits is the Google Cloud sales route for Tier 3 - it bypasses the spending threshold entirely. Alternatively, the pricing page has current rates for Anthropic, OpenAI, and xAI if you want to compare before switching providers.
Sources
- Google AI - Gemini API billing documentation
- Google AI - Gemini API pricing
- Google Blog - More control over Gemini API costs (March 16, 2026)
- The Register - Dev stunned by $82K Gemini API bill after key theft
- Google AI Developers Forum - Gemini API usage tier updates and billing caps
- PPC.land - Google finally adds Gemini API spend caps after billing chaos