OpenAI Deep Research API: what it costs, and why o3-deep-research is 5x pricier than o3
o3-deep-research runs at $10 per million input tokens; regular o3, built on the same base model, costs $2. The difference is not the model. It is the workflow. Here is what you are actually paying for, what a query costs in practice, and when the premium makes sense.

- Two models: o3-deep-research at $10/$40 per million tokens, o4-mini-deep-research at $2/$8
- Web search costs $10 per 1,000 calls and is mandatory -- it fires 10-30 times per query whether you want it to or not
- A typical research query costs $0.41 with the mini model or $1.45 with o3. Both run on the Responses API -- not Chat Completions -- and take 2-3 minutes per query
What the deep research models do
When you call o3-deep-research, you are not getting a faster o3 or a model with different weights. You are getting o3 wrapped in an autonomous research workflow. The model receives your query, plans a research strategy, fires off 10-30 web searches automatically, reads the results, reasons across them, and produces a long-form report -- all without you orchestrating any of it.
That is why the latency looks strange. o3-deep-research averages around 118 seconds end-to-end. o4-mini-deep-research is actually slower at 183 seconds, probably because the smaller model runs more search iterations to compensate for less raw reasoning capacity per token. Regular o3 completes in about 20 seconds. If your product needs to feel interactive, deep research is not the right tool.
The other structural difference is the API itself. Deep research models use the Responses API -- not Chat Completions. That is not a minor detail. It is a different integration path, and OpenAI recommends webhooks over polling because a job can run for minutes. If your codebase is built around chat.completions.create(), switching requires real plumbing work.
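To make the integration difference concrete, here is a minimal sketch of what a deep research request looks like. Parameter names (`background`, the `web_search_preview` tool type) and model identifiers follow OpenAI's documentation at the time of writing; verify them against the current Responses API reference before building on this.

```python
# Sketch of a Deep Research request on the Responses API (not Chat Completions).
# We only assemble the kwargs here; the actual call would be
# client.responses.create(**request) on an openai.OpenAI() client.

def build_deep_research_request(query: str, use_mini: bool = True) -> dict:
    """Build kwargs for a deep research job.

    These jobs can run for minutes, so background mode is the recommended
    pattern: the call returns immediately and you collect the result via
    a webhook (preferred) or by polling the response ID.
    """
    return {
        "model": "o4-mini-deep-research" if use_mini else "o3-deep-research",
        "input": query,
        "background": True,  # don't block; retrieve the result later
        # Web search is mandatory for deep research models, so the tool
        # must be declared even though you cannot actually turn it off.
        "tools": [{"type": "web_search_preview"}],
    }

request = build_deep_research_request(
    "Summarize the competitive landscape for vector databases."
)
# Then: response = client.responses.create(**request)
```

The point of the wrapper is the shape, not the names: your existing `chat.completions.create()` call sites do not translate one-to-one, which is the plumbing work mentioned above.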
Pricing comparison
Here is something worth pausing on: o4-mini-deep-research costs $2/$8 per million tokens -- exactly the same token rate as regular o3. You get deep research capability at standard o3 pricing, with a smaller base model underneath.
| Model | Input / 1M | Output / 1M | Cache read / 1M | Web search | Avg latency |
|---|---|---|---|---|---|
| o3-deep-research | $10.00 | $40.00 | $2.50 | $10/1K (mandatory) | ~118s |
| o4-mini-deep-research | $2.00 | $8.00 | $0.50 | $10/1K (mandatory) | ~183s |
| o3 (regular) | $2.00 | $8.00 | $0.50 | $10/1K (optional) | ~20s |
| o4 Mini (regular) | $1.10 | $4.40 | $0.275 | $10/1K (optional) | ~12s |
Sources: OpenRouter (reflects official OpenAI pricing) · tokencost.app/pricing
The mandatory web search cost
Web search on regular o3 and o4-mini is optional -- you enable it, you pay $10 per 1,000 calls, you control when it fires. On the deep research variants it is always on and you cannot disable it.
In practice a single research query triggers 10-30 web searches. At $10 per 1,000 calls that is $0.10-$0.30 per query in search costs alone. On a $0.41 total query with o4-mini-deep-research, that search overhead is a meaningful slice. On a $1.45 query with o3-deep-research it is less noticeable -- but it is still there and worth planning for at scale.
The searches also inflate your input token count because results get injected back into the context. That is part of why token usage runs higher on deep research queries than on the same question asked to regular o3 -- the model is reading web pages as it works, not reasoning purely from training data.
What a query actually costs
Deep research queries run heavier than typical completion calls. A realistic task -- competitive analysis, market research, literature survey -- tends to involve around 50K input tokens (your prompt plus injected search results) and 20K output tokens for the report, with roughly 15 web searches.
| Model | Input cost | Output cost | Search (15 calls) | Total |
|---|---|---|---|---|
| o3-deep-research | $0.50 | $0.80 | $0.15 | ~$1.45 |
| o4-mini-deep-research | $0.10 | $0.16 | $0.15 | ~$0.41 |
| o3 (regular, no search) | $0.10 | $0.16 | -- | ~$0.26 |
Assumes 50K input tokens, 20K output tokens, 15 web searches. A quick factual lookup with 5K input and 5 searches costs under $0.07 on o4-mini-deep-research.
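The table above can be reproduced with a small estimator. Rates mirror the pricing table earlier in the article; plug in your own token and search counts to budget for your workload.

```python
# Per-query cost estimator. Token rates are USD per 1M tokens, taken from
# the pricing table above; web search is $10 per 1,000 calls.
RATES = {
    "o3-deep-research":      {"in": 10.00, "out": 40.00},
    "o4-mini-deep-research": {"in": 2.00,  "out": 8.00},
    "o3":                    {"in": 2.00,  "out": 8.00},
}
SEARCH_PER_CALL = 10.00 / 1_000  # $0.01 per web search

def query_cost(model: str, input_tokens: int, output_tokens: int,
               searches: int = 0) -> float:
    """Estimated dollar cost of one query (cache reads ignored)."""
    r = RATES[model]
    tokens = input_tokens / 1e6 * r["in"] + output_tokens / 1e6 * r["out"]
    return tokens + searches * SEARCH_PER_CALL

# The table's scenario: 50K input, 20K output, 15 searches.
o3_dr = round(query_cost("o3-deep-research", 50_000, 20_000, 15), 2)       # 1.45
mini_dr = round(query_cost("o4-mini-deep-research", 50_000, 20_000, 15), 2)  # 0.41
o3_plain = round(query_cost("o3", 50_000, 20_000, 0), 2)                   # 0.26
```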
Worth noting: regular o3 at $0.26 is cheaper than o4-mini-deep-research at $0.41, even though their token rates are identical -- the entire $0.15 gap is the mandatory search calls. You are paying about 58% more for the automated research workflow. Whether that automation is worth $0.15 per query is a product decision, not a technical one.
To put a product number on it: say you are running a market intelligence tool that generates competitor reports on demand. At 500 reports per month, o4-mini-deep-research runs about $205 in model costs ($0.41 x 500). If you charge $15 per report, that is roughly a 36:1 revenue-to-inference ratio. Swap in o3-deep-research at $1.45/query and the same volume costs $725/month -- still viable, but the margins look meaningfully different, especially if your users are on a flat subscription.
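The margin math from the example above, written out so you can swap in your own volumes and price point (the $15/report price and 500-report volume are this article's hypothetical scenario, not a benchmark):

```python
# Monthly inference spend for the market-intelligence example.
def monthly_model_cost(cost_per_query: float, queries_per_month: int = 500) -> float:
    return cost_per_query * queries_per_month

mini_monthly = monthly_model_cost(0.41)  # ~$205/month on o4-mini-deep-research
o3_monthly = monthly_model_cost(1.45)    # ~$725/month on o3-deep-research

# Revenue-to-inference ratio at a hypothetical $15 per report.
ratio = 15.00 / 0.41  # ~36.6 : 1
```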
Do the models perform differently?
The deep research models do not have separate benchmark scores -- they inherit base model performance. o3 scores 82.7% on GPQA Diamond and 20.0% on Humanity's Last Exam. o4-mini scores 78.4% on GPQA and 17.5% on HLE. The gap is consistent: o3 is the stronger reasoner, and that matters for complex multi-step research where you are synthesizing across conflicting sources.
For most research tasks -- competitive analysis, market research, standard literature summaries -- o4-mini-deep-research is probably sufficient. The quality difference between the two shows up most on hard scientific questions and tasks requiring nuanced cross-domain reasoning. At 3.5x the per-query cost, o3-deep-research needs to be noticeably better on your specific task to justify it.
How it compares to Perplexity Sonar Deep Research
Perplexity's Sonar Deep Research API runs at $2/$8 per million tokens plus citation tokens at $2/M, reasoning tokens at $3/M, and a $10/1K request fee. A typical medium-context query on Perplexity runs around $1.19 -- more expensive than o4-mini-deep-research at $0.41.
Worth knowing: Perplexity's model is tuned specifically for web research with clean citation formatting built in. OpenAI's deep research models are general reasoners with search attached. If you are building a product already on OpenAI's stack, o4-mini-deep-research is cheaper and requires fewer integration changes. If research quality and citation display are the core product, Perplexity is worth testing.
When the deep research API makes sense
The honest use case: you need current information plus multi-source synthesis, and you do not want to build a RAG pipeline yourself. A deep research call handles the search, the reading, and the synthesis. You pay roughly $0.15-$1.20 more per query vs. doing it manually with regular o3.
At the product level, it works if you charge per research report. A SaaS tool billing $5-$20 per research output can absorb $0.41-$1.45 in model costs. Consumer apps with free tiers cannot. The 2-3 minute latency is also a real constraint -- it rules out anything that needs to feel responsive.
It does not make sense for closed-domain tasks. If your data is already in context -- internal docs, structured databases, uploaded files -- the mandatory web search adds cost without adding value. Regular o3 or o4-mini is the right tool there.
Quick decision guide
Start with o4-mini-deep-research at $2/$8 per million tokens. It is the same token rate as regular o3, handles most research tasks well, and a typical query runs around $0.41. Upgrade to o3-deep-research ($10/$40) only if your task requires deep multi-domain reasoning and you have tested that the quality difference shows up in your specific outputs. Neither model works for closed-domain tasks or anything latency-sensitive. And budget for the mandatory $10/1K web search calls -- they add $0.10-$0.30 per query and cannot be turned off.
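The decision guide condenses to a few rules of thumb. This is a toy helper reflecting the heuristics above, not an official recommendation; the thresholds are judgment calls.

```python
# Toy model-selection helper encoding the decision guide above.
def pick_model(needs_web_data: bool,
               latency_sensitive: bool,
               hard_cross_domain: bool) -> str:
    """Rough heuristic: which model to reach for first."""
    if latency_sensitive or not needs_web_data:
        # Closed-domain or interactive work: skip deep research entirely.
        return "o3"
    if hard_cross_domain:
        # Deep multi-domain reasoning, and you've verified the quality gap
        # shows up in your outputs: pay the ~$1.45/query premium.
        return "o3-deep-research"
    # Default: same token rate as regular o3, ~$0.41 per typical query.
    return "o4-mini-deep-research"
```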
Sources
- o3-deep-research pricing - OpenRouter, reflecting official OpenAI rates
- o4-mini-deep-research pricing - OpenRouter
- OpenAI Deep Research API launch announcement - The Decoder, June 26, 2025
- Perplexity Sonar Deep Research pricing - docs.perplexity.ai