OpenAI Deep Research API: what it costs, and why o3-deep-research is 5x pricier than o3
o3-deep-research runs at $10 per million input tokens; regular o3, built on the same base model, costs $2. The difference is not the model. It is the workflow. Here is what you are actually paying for, what a query costs in practice, and when the premium makes sense.

- Two models: o3-deep-research at $10/$40 per million tokens, o4-mini-deep-research at $2/$8
- Web search costs $10 per 1,000 calls and is mandatory -- it fires 10-30 times per query whether you want it to or not
- A typical research query costs $0.41 with the mini model or $1.45 with o3. Both run on the Responses API -- not Chat Completions -- and take 2-3 minutes per query
What the deep research models do
When you call o3-deep-research, you are not getting a faster o3 or a model with different weights. You are getting o3 wrapped in an autonomous research workflow. The model receives your query, plans a research strategy, fires off 10-30 web searches automatically, reads the results, reasons across them, and produces a long-form report -- all without you orchestrating any of it.
That is why the latency looks strange. o3-deep-research averages around 118 seconds end-to-end. o4-mini-deep-research is actually slower at 183 seconds, probably because the smaller model runs more search iterations to compensate for less raw reasoning capacity per token. Regular o3 completes in about 20 seconds. If your product needs to feel interactive, deep research is not the right tool.
The other structural difference is the API itself. Deep research models use the Responses API -- not Chat Completions. That is not a minor detail. It is a different integration path, and OpenAI recommends webhooks over polling because a job can run for minutes. If your codebase is built around chat.completions.create(), switching requires real plumbing work.
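To make the integration difference concrete, here is a minimal sketch of what a deep research request looks like. Parameter names (`background`, the `web_search_preview` tool type) and model identifiers follow OpenAI's documentation at the time of writing; verify them against the current Responses API reference before building on this.

```python
# Sketch of a Deep Research request on the Responses API (not Chat Completions).
# We only assemble the kwargs here; the actual call would be
# client.responses.create(**request) on an openai.OpenAI() client.

def build_deep_research_request(query: str, use_mini: bool = True) -> dict:
    """Build kwargs for a deep research job.

    These jobs can run for minutes, so background mode is the recommended
    pattern: the call returns immediately and you collect the result via
    a webhook (preferred) or by polling the response ID.
    """
    return {
        "model": "o4-mini-deep-research" if use_mini else "o3-deep-research",
        "input": query,
        "background": True,  # don't block; retrieve the result later
        # Web search is mandatory for deep research models, so the tool
        # must be declared even though you cannot actually turn it off.
        "tools": [{"type": "web_search_preview"}],
    }

request = build_deep_research_request(
    "Summarize the competitive landscape for vector databases."
)
# Then: response = client.responses.create(**request)
```

The point of the wrapper is the shape, not the names: your existing `chat.completions.create()` call sites do not translate one-to-one, which is the plumbing work mentioned above.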
Pricing comparison
Here is something worth pausing on: o4-mini-deep-research costs $2/$8 per million tokens -- exactly the same token rate as regular o3. You get deep research capability at standard o3 pricing, with a smaller base model underneath.
| Model | Input / 1M | Output / 1M | Cache read / 1M | Web search | Avg latency |
|---|---|---|---|---|---|
| o3-deep-research | $10.00 | $40.00 | $2.50 | $10/1K (mandatory) | ~118s |
| o4-mini-deep-research | $2.00 | $8.00 | $0.50 | $10/1K (mandatory) | ~183s |
| o3 (regular) | $2.00 | $8.00 | $0.50 | $10/1K (optional) | ~20s |
| o4 Mini (regular) | $1.10 | $4.40 | $0.275 | $10/1K (optional) | ~12s |
Sources: OpenRouter (reflects official OpenAI pricing) · tokencost.app/pricing
The mandatory web search cost
Web search on regular o3 and o4-mini is optional -- you enable it, you pay $10 per 1,000 calls, you control when it fires. On the deep research variants it is always on and you cannot disable it.
In practice a single research query triggers 10-30 web searches. At $10 per 1,000 calls that is $0.10-$0.30 per query in search costs alone. On a $0.41 total query with o4-mini-deep-research, that search overhead is a meaningful slice. On a $1.45 query with o3-deep-research it is less noticeable -- but it is still there and worth planning for at scale.
The searches also inflate your input token count because results get injected back into the context. That is part of why token usage runs higher on deep research queries than on the same question asked to regular o3 -- the model is reading web pages as it works, not reasoning purely from training data.
What a query actually costs
Deep research queries run heavier than typical completion calls. A realistic task -- competitive analysis, market research, literature survey -- tends to involve around 50K input tokens (your prompt plus injected search results) and 20K output tokens for the report, with roughly 15 web searches.
| Model | Input cost | Output cost | Search (15 calls) | Total |
|---|---|---|---|---|
| o3-deep-research | $0.50 | $0.80 | $0.15 | ~$1.45 |
| o4-mini-deep-research | $0.10 | $0.16 | $0.15 | ~$0.41 |
| o3 (regular, no search) | $0.10 | $0.16 | -- | ~$0.26 |
Assumes 50K input tokens, 20K output tokens, 15 web searches. A quick factual lookup with 5K input and 5 searches costs under $0.07 on o4-mini-deep-research.
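The table above can be reproduced with a small estimator. Rates mirror the pricing table earlier in the article; plug in your own token and search counts to budget for your workload.

```python
# Per-query cost estimator. Token rates are USD per 1M tokens, taken from
# the pricing table above; web search is $10 per 1,000 calls.
RATES = {
    "o3-deep-research":      {"in": 10.00, "out": 40.00},
    "o4-mini-deep-research": {"in": 2.00,  "out": 8.00},
    "o3":                    {"in": 2.00,  "out": 8.00},
}
SEARCH_PER_CALL = 10.00 / 1_000  # $0.01 per web search

def query_cost(model: str, input_tokens: int, output_tokens: int,
               searches: int = 0) -> float:
    """Estimated dollar cost of one query (cache reads ignored)."""
    r = RATES[model]
    tokens = input_tokens / 1e6 * r["in"] + output_tokens / 1e6 * r["out"]
    return tokens + searches * SEARCH_PER_CALL

# The table's scenario: 50K input, 20K output, 15 searches.
o3_dr = round(query_cost("o3-deep-research", 50_000, 20_000, 15), 2)       # 1.45
mini_dr = round(query_cost("o4-mini-deep-research", 50_000, 20_000, 15), 2)  # 0.41
o3_plain = round(query_cost("o3", 50_000, 20_000, 0), 2)                   # 0.26
```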
Worth noting: regular o3 at $0.26 is cheaper than o4-mini-deep-research at $0.41, even though their token rates are identical -- the entire $0.15 gap is the mandatory search calls. You are paying about 58% more for the automated research workflow. Whether that automation is worth $0.15 per query is a product decision, not a technical one.
To put a product number on it: say you are running a market intelligence tool that generates competitor reports on demand. At 500 reports per month, o4-mini-deep-research runs about $205 in model costs ($0.41 x 500). If you charge $15 per report, that is roughly a 36:1 revenue-to-inference ratio. Swap in o3-deep-research at $1.45/query and the same volume costs $725/month -- still viable, but the margins look meaningfully different, especially if your users are on a flat subscription.
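The margin math from the example above, written out so you can swap in your own volumes and price point (the $15/report price and 500-report volume are this article's hypothetical scenario, not a benchmark):

```python
# Monthly inference spend for the market-intelligence example.
def monthly_model_cost(cost_per_query: float, queries_per_month: int = 500) -> float:
    return cost_per_query * queries_per_month

mini_monthly = monthly_model_cost(0.41)  # ~$205/month on o4-mini-deep-research
o3_monthly = monthly_model_cost(1.45)    # ~$725/month on o3-deep-research

# Revenue-to-inference ratio at a hypothetical $15 per report.
ratio = 15.00 / 0.41  # ~36.6 : 1
```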
Do the models perform differently?
The deep research models do not have separate benchmark scores -- they inherit base model performance. o3 scores 82.7% on GPQA Diamond and 20.0% on Humanity's Last Exam. o4-mini scores 78.4% on GPQA and 17.5% on HLE. The gap is consistent: o3 is the stronger reasoner, and that matters for complex multi-step research where you are synthesizing across conflicting sources.
For most research tasks -- competitive analysis, market research, standard literature summaries -- o4-mini-deep-research is probably sufficient. The quality difference between the two shows up most on hard scientific questions and tasks requiring nuanced cross-domain reasoning. At 3.5x the per-query cost, o3-deep-research needs to be noticeably better on your specific task to justify it.
How it compares to Perplexity Sonar Deep Research
Perplexity's Sonar Deep Research API runs at $2/$8 per million tokens plus citation tokens at $2/M, reasoning tokens at $3/M, and a $10/1K request fee. A typical medium-context query on Perplexity runs around $1.19 -- more expensive than o4-mini-deep-research at $0.41.
Worth knowing: Perplexity's model is tuned specifically for web research with clean citation formatting built in. OpenAI's deep research models are general reasoners with search attached. If you are building a product already on OpenAI's stack, o4-mini-deep-research is cheaper and requires fewer integration changes. If research quality and citation display are the core product, Perplexity is worth testing.
When the deep research API makes sense
The honest use case: you need current information plus multi-source synthesis, and you do not want to build a RAG pipeline yourself. A deep research call handles the search, the reading, and the synthesis. You pay roughly $0.15-$1.20 more per query vs. doing it manually with regular o3.
At the product level, it works if you charge per research report. A SaaS tool billing $5-$20 per research output can absorb $0.41-$1.45 in model costs. Consumer apps with free tiers cannot. The 2-3 minute latency is also a real constraint -- it rules out anything that needs to feel responsive.
It does not make sense for closed-domain tasks. If your data is already in context -- internal docs, structured databases, uploaded files -- the mandatory web search adds cost without adding value. Regular o3 or o4-mini is the right tool there.
Quick decision guide
Start with o4-mini-deep-research at $2/$8 per million tokens. It is the same token rate as regular o3, handles most research tasks well, and a typical query runs around $0.41. Upgrade to o3-deep-research ($10/$40) only if your task requires deep multi-domain reasoning and you have tested that the quality difference shows up in your specific outputs. Neither model works for closed-domain tasks or anything latency-sensitive. And budget for the mandatory $10/1K web search calls -- they add $0.10-$0.30 per query and cannot be turned off.
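The decision guide condenses to a few rules of thumb. This is a toy helper reflecting the heuristics above, not an official recommendation; the thresholds are judgment calls.

```python
# Toy model-selection helper encoding the decision guide above.
def pick_model(needs_web_data: bool,
               latency_sensitive: bool,
               hard_cross_domain: bool) -> str:
    """Rough heuristic: which model to reach for first."""
    if latency_sensitive or not needs_web_data:
        # Closed-domain or interactive work: skip deep research entirely.
        return "o3"
    if hard_cross_domain:
        # Deep multi-domain reasoning, and you've verified the quality gap
        # shows up in your outputs: pay the ~$1.45/query premium.
        return "o3-deep-research"
    # Default: same token rate as regular o3, ~$0.41 per typical query.
    return "o4-mini-deep-research"
```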
Sources
- o3-deep-research pricing - OpenRouter, reflecting official OpenAI rates
- o4-mini-deep-research pricing - OpenRouter
- OpenAI Deep Research API launch announcement - The Decoder, June 26, 2025
- Perplexity Sonar Deep Research pricing - docs.perplexity.ai