Gemini Deep Research: what each AI research report actually costs
Google launched a developer API for Deep Research on April 21. The official estimate is $1-3 per task. That number is accurate but incomplete - it bakes in 50-70% cache hits and doesn't surface that Google Search queries alone add $1.12 per standard run. Here's where the money actually goes.

TL;DR
- Two agents launched April 21: `deep-research-preview-04-2026` (standard, ~$2/task) and `deep-research-max-preview-04-2026` (Max, ~$5/task).
- Runs on Gemini 3.1 Pro at standard rates ($2.00/1M input, $12.00/1M output). No markup on the agent layer.
- Google Search grounding is on by default: 80 queries per standard task, 160 per Max, at $14/1K. That adds $1.12-$2.24 per task in search costs alone.
- Implicit caching covers 50-70% of input tokens per task - the main reason inference stays affordable inside the agentic loop.
- Available via the Interactions API (not `generate_content`), async only, paid tiers only.
What Google actually shipped
Deep Research was already live in the Gemini app and NotebookLM before April 21. What changed is the developer API - the Interactions API in public preview. You can now call these agents directly from your code, not just from Google surfaces.
Both versions run on Gemini 3.1 Pro. The model scores 85.9% on BrowseComp - a benchmark measuring an agent's ability to find hard-to-locate information through persistent web browsing. That's 1.9 points above Claude Opus 4.6 (84.0%) and well above GPT-5.2 (65.8%). For a research agent where the entire value proposition is finding things on the web, that matters.
The two agents differ in how hard they push. Standard runs about 80 web searches, processes roughly 250K input tokens and 60K output tokens, and typically completes in 5-15 minutes. Max runs 160 searches, processes around 900K input tokens and 80K output tokens, and is designed for batch jobs where you want maximum source coverage.
Cost per task: the full breakdown
Google publishes official estimates on their pricing page. We've broken out the search cost separately because it's the component most likely to catch developers off guard.
| Component | Deep Research | Deep Research Max |
|---|---|---|
| Input tokens (cumulative) | ~250K | ~900K |
| Output tokens (cumulative) | ~60K | ~80K |
| Implicit cache hit rate | 50-70% | 50-70% |
| Google Search queries | ~80 | ~160 |
| Search cost ($14/1K queries) | ~$1.12 | ~$2.24 |
| Inference cost (w/ 60% cache) | ~$0.95 | ~$1.80 |
| Total per-task estimate | $1–3 | $3–7 |
Token estimates and cost ranges from Google AI Studio pricing docs (April 21, 2026). Inference calculated at Gemini 3.1 Pro standard rates with 60% implicit cache hit rate. All numbers are Google's official estimates - no independent developer measurements exist yet (the API launched three days ago).
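To make the table concrete, here's a minimal sketch of the arithmetic using Google's published rates and token estimates, assuming the 60% cache-hit midpoint and (for simplicity) standard ≤200K-tier rates throughout:

```python
# Per-task cost estimate from Google's published numbers (April 2026).
# Assumes a 60% implicit cache hit rate and <=200K tier rates throughout.
INPUT_RATE = 2.00 / 1_000_000    # $ per input token
OUTPUT_RATE = 12.00 / 1_000_000  # $ per output token
CACHED_RATE = 0.20 / 1_000_000   # $ per cached input token
SEARCH_RATE = 14.00 / 1_000      # $ per Google Search query

def task_cost(input_tokens, output_tokens, queries, cache_hit=0.60):
    uncached = input_tokens * (1 - cache_hit)
    cached = input_tokens * cache_hit
    inference = (uncached * INPUT_RATE + cached * CACHED_RATE
                 + output_tokens * OUTPUT_RATE)
    search = queries * SEARCH_RATE
    return inference, search

for name, (inp, out, q) in {
    "Deep Research":     (250_000, 60_000, 80),
    "Deep Research Max": (900_000, 80_000, 160),
}.items():
    inference, search = task_cost(inp, out, q)
    print(f"{name}: inference ~${inference:.2f}, search ~${search:.2f}, "
          f"total ~${inference + search:.2f}")
```

Standard comes out at ~$0.95 inference plus $1.12 search; Max at ~$1.79 plus $2.24 - both inside Google's published ranges.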
Base model rates
The inference pricing is standard Gemini 3.1 Pro - no premium for using the Deep Research agent wrapper. The 200K threshold matters: individual calls in the agentic loop start small but can grow as the model accumulates context across search iterations.
| Tier | Input /1M | Output /1M | Cached input /1M |
|---|---|---|---|
| Prompts ≤200K tokens | $2.00 | $12.00 | $0.20 |
| Prompts >200K tokens | $4.00 | $18.00 | $0.40 |
| Batch / Flex (async) | $1.00 | $6.00 | $0.10 |
Source: Google AI Studio pricing. Storage for explicit caches: $4.50/1M tokens per hour. No free tier on Gemini 3.1 Pro.
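For readers budgeting individual calls inside the loop, the tier logic looks like this (an illustrative helper, not part of the SDK):

```python
# Sketch of the Gemini 3.1 Pro rate card in $/1M tokens.
def rates(prompt_tokens: int, batch: bool = False) -> tuple[float, float, float]:
    """Return (input, output, cached input) rates for a single call."""
    if batch:                    # Batch / Flex async tier
        return 1.00, 6.00, 0.10
    if prompt_tokens > 200_000:  # long-context tier
        return 4.00, 18.00, 0.40
    return 2.00, 12.00, 0.20

# Early loop iterations sit in the cheap tier...
print(rates(50_000))   # (2.0, 12.0, 0.2)
# ...but late iterations of a Max run can cross the 200K threshold.
print(rates(450_000))  # (4.0, 18.0, 0.4)
```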
Why search queries cost more than inference
With 60% caching, the inference cost for a standard Deep Research task works out to roughly $0.95: about 100K uncached input tokens plus 150K cached, plus 60K output. That's not a large number at Gemini 3.1 Pro prices.
The 80 Google Search queries cost $1.12 at $14 per thousand. That's more than the inference. And it's outside developer control - the agent decides how many searches to run based on task complexity, and the current API preview doesn't expose a per-task query cap.
There are two ways to eliminate the search cost entirely. First, disable Google Search grounding and point the agent at private data via MCP servers - useful if you're doing research on internal documents. Second, use Deep Research through the consumer Gemini subscription products, where search is bundled into the subscription price.
One thing to track: the monthly free Search quota is 5,000 queries shared across all Gemini 3 models. At 80 queries per Deep Research task, you burn through that in 62 tasks. After that it's $14/1K for every query.
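A quick way to budget that quota, assuming the standard agent's ~80 queries per task and nothing else drawing on the shared pool:

```python
# Monthly Google Search spend for n Deep Research tasks, given the
# 5,000-query free quota shared across all Gemini 3 models.
FREE_QUOTA = 5_000
QUERIES_PER_TASK = 80        # standard agent estimate
SEARCH_RATE = 14.00 / 1_000  # $ per query past the free quota

def monthly_search_cost(tasks: int) -> float:
    billable = max(0, tasks * QUERIES_PER_TASK - FREE_QUOTA)
    return billable * SEARCH_RATE

print(monthly_search_cost(62))   # $0.00  - 4,960 queries, still inside quota
print(monthly_search_cost(100))  # $42.00 - 3,000 billable queries
```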
vs. OpenAI Deep Research
We covered OpenAI Deep Research API pricing when it launched. The per-task comparison is less straightforward than model rates suggest.
| Agent | Input /1M | Output /1M | Search /task | Typical /task |
|---|---|---|---|---|
| Gemini Deep Research | $2.00 | $12.00 | ~$1.12 | ~$2 |
| Gemini Deep Research Max | $2.00 | $12.00 | ~$2.24 | ~$5 |
| OpenAI o3-deep-research | $10.00 | $40.00 | ~$0.20 | ~$1.45 |
| OpenAI o4-mini-deep-research | $2.00 | $8.00 | ~$0.20 | ~$0.41 |
Gemini: Google Search at $14/1K, ~80-160 queries/task. OpenAI: Web Search tool at $10/1K, ~10-30 queries/task. Typical task costs are estimates using official token consumption data.
The gap that doesn't show up in model rates: Gemini runs search far more aggressively. Where o4-mini-deep-research uses 15-20 queries per task at $10/1K, Gemini Deep Research uses 80 at $14/1K. The search line item alone is $1.12 versus roughly $0.15-0.20. That's why o4-mini-deep-research ends up cheaper end-to-end at around $0.41 per task despite matching Gemini on input token price.
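Here's that end-to-end arithmetic side by side. The Gemini inference figure is the ~$0.95 estimate above; o4-mini's ~$0.21 is backed out of its ~$0.41 typical total minus ~$0.20 search (the helper is a sketch for this post, not from either SDK):

```python
# Per-task total: inference estimate plus the search line item.
def end_to_end(inference_usd: float, queries: int, usd_per_1k: float) -> float:
    return inference_usd + queries * usd_per_1k / 1_000

print(f"Gemini Deep Research:  ~${end_to_end(0.95, 80, 14.00):.2f}")  # ~$2.07
print(f"o4-mini-deep-research: ~${end_to_end(0.21, 20, 10.00):.2f}")  # ~$0.41
```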
Coverage is the argument for Gemini. More searches typically means more sources, which matters for genuinely open-ended research tasks. Scoring 85.9% on BrowseComp is evidence the model uses those queries well. Whether that thoroughness justifies the search premium depends on what you're building.
Using the Interactions API
The Interactions API is separate from the standard generate_content endpoint. It's async-only - you submit a task, get an interaction ID, and poll until it completes. Most tasks finish in under 20 minutes. The API cap is 60 minutes.
```python
import time

from google import genai

client = genai.Client(api_key="YOUR_KEY")

# Submit the research task. background=True returns immediately with an
# interaction ID instead of blocking for the full research run.
interaction = client.interactions.create(
    input="What are the latest LLM API pricing changes in Q2 2026?",
    agent="deep-research-preview-04-2026",
    background=True,
)

# Poll until complete. Most tasks finish in under 20 minutes; the API caps
# runs at 60. Sleep between polls rather than spinning on the endpoint
# (production code should also bail out on failure states).
while interaction.state != "COMPLETED":
    time.sleep(30)
    interaction = client.interactions.get(interaction.id)

print(interaction.output.text)
```

Swap in `deep-research-max-preview-04-2026` for the Max version. MCP servers can be passed via the `tools` parameter to point the agent at private data sources, and omitting the Search tool entirely cuts the per-query costs to zero - a sketch of that configuration follows.
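The exact `tools` schema isn't spelled out in the preview docs yet, so treat the payload below as an assumption about the API's shape rather than confirmed surface:

```python
# Illustrative sketch: research over a private corpus via MCP, with no
# Google Search tool attached. The exact shape of the `tools` payload is
# an assumption - check the Interactions API docs for the current schema.
interaction = client.interactions.create(
    input="Summarize open risks across our internal incident postmortems.",
    agent="deep-research-preview-04-2026",
    background=True,
    # Hypothetical MCP server URL, for illustration only.
    tools=[{"mcp_server": {"url": "https://mcp.internal.example/postmortems"}}],
    # No Search tool listed: the $14/1K per-query line item drops to zero.
)
```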
When the search premium pays off
- Tasks where coverage justifies cost. Due diligence, competitive analysis, technical literature review. If a researcher would spend 3 hours on this manually, spending $2-5 on a 15-minute agent run is an easy trade.
- Private corpus research. Disable web search, connect your own documents via MCP. You keep the multi-step reasoning across sources - without the $1.12+ per task in search fees. This is probably the highest-value API use case right now.
- High volume runs favor o4-mini-deep-research. At roughly $0.41/task - one-fifth the cost of Gemini Deep Research standard - OpenAI's cheaper option fits workloads where you need acceptable research quality at scale rather than maximum web coverage.
- Cost predictability is still limited. Query volume isn't developer-controlled in the current preview. Complex tasks can trigger more searches than simple ones, and there's no per-task query cap exposed yet. Budget with some headroom.
The search overhead is the real number to watch
The underlying model costs are modest. Gemini 3.1 Pro at $2.00/1M input is not expensive, and implicit caching keeps inference under a dollar per standard task. Google's $1-3 estimate is accurate.
What the headline estimate doesn't surface is that search queries are the largest line item at current defaults. At 80 queries per task and $14/1K, search costs more than inference on most standard runs. If you're building something that runs hundreds of research tasks per month, that's the cost to optimize - either by switching to private data via MCP, or by comparing per-task total against o4-mini-deep-research before committing to Gemini.
Sources
- Gemini Deep Research API docs - Google AI for Developers
- Google AI Studio pricing - Gemini Deep Research Agent section
- Deep Research Max announcement - Google AI Blog (April 21, 2026)
- Gemini 3.1 Pro benchmark table - Google DeepMind
- BrowseComp benchmark paper - Jason Wei et al.
- Gemini context caching documentation
Ankit Aglawe
April 24, 2026 · 8 min read