Model Release · May 18, 2026 · 10 min read

SubQ 1M-Preview shipped with a 12M-token context headline, no public rate card, and an $8-vs-$2,600 cost story that does not reconcile.

Subquadratic announced SubQ on May 5, 2026, with $29M in seed funding and a launch post that put two numbers on the screen: 95% accuracy on RULER 128K for $8, and 94% on the same benchmark on Claude Opus for around $2,600. The company is selling a 300x long-context cost reduction. We can read every page on subq.ai, every press piece, and the math still does not land. The price card is empty. The Opus comparison was already wrong by March. And the model anyone can actually call is 1M, not 12M.

The 12M number sells the headline. The 1M number is what you can call.

Subquadratic has two SKUs. The research model handles 12 million tokens of context and is gated to a small set of partners, with no waitlist signup available. The production model is SubQ 1M-Preview, accessible by waitlist through an OpenAI-compatible REST API with tool-use support. That product tops out at 1 million tokens of context. Every diagram, every press hit, and every social post leads with the 12M figure, but the model you can actually join a waitlist for is the 1M one.

That distinction matters when you are sizing a deal. A 12M-token single-shot prompt is a different system than a 1M-token agent loop. The launch language slides between the two. We use 1M throughout the rest of this piece because it is the surface area developers can actually buy.

No rate card exists. The two numbers that do are not the same number.

The most-quoted claim from Subquadratic is "roughly 1/5 the cost of Claude Opus or GPT-5.5." That phrase appears in the launch blog, the CEO interview with SiliconANGLE, and most of the press coverage. The second-most-quoted claim is the RULER 128K headline: $8 on SubQ to hit 95% accuracy, versus around $2,600 on Claude Opus. That ratio is roughly 300x, not 5x. Both numbers come from the same launch post.

Working backward from public Opus 4.7 pricing ($5 input, $25 output per million tokens, no long-context surcharge since March 13, 2026), the $2,600 figure only resolves if you assume one of three things. Either Subquadratic ran the full RULER suite at roughly 500 million tokens against Opus, or it used Opus 4.1 pricing ($15/$75 per million, the deprecated tier), or it included multi-pass reasoning that other models in the comparison did not run. Subquadratic has not disclosed which. Independent reviewers at DataCamp and FelloAI flagged the same reconciliation problem on the day of launch.
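
The first two scenarios can be checked with a few lines of arithmetic. This is a minimal sketch under one simplifying assumption of ours: the whole $2,600 bill is treated as input tokens, which is roughly right for a retrieval-style benchmark but is not something Subquadratic has stated. The prices are the ones quoted in this article.

```python
# Back out the input-token volume implied by the $2,600 Opus figure.
# Assumption (ours, not Subquadratic's): the bill is entirely input tokens.

CLAIMED_OPUS_COST = 2600.0  # dollars, from the launch post
CLAIMED_SUBQ_COST = 8.0     # dollars, from the launch post

# Input prices in dollars per million tokens, as quoted in this article.
INPUT_PRICE_PER_M = {
    "Opus 4.7": 5.0,
    "Opus 4.1 (deprecated tier)": 15.0,
}


def implied_input_tokens_m(total_cost: float, price_per_m: float) -> float:
    """Millions of input tokens that total_cost buys at price_per_m."""
    return total_cost / price_per_m


headline_ratio = CLAIMED_OPUS_COST / CLAIMED_SUBQ_COST
print(f"Headline ratio: {headline_ratio:.0f}x")

for name, price in INPUT_PRICE_PER_M.items():
    tokens_m = implied_input_tokens_m(CLAIMED_OPUS_COST, price)
    print(f"{name}: about {tokens_m:.0f}M input tokens")
```

At Opus 4.7 pricing the figure implies about 520M input tokens; at the older $15 rate it implies about 173M. Neither volume is disclosed in the launch post, so the sketch only shows what each assumption would require.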

Third parties have filled the pricing vacuum with estimates. Independent trackers have circulated rates around $0.50 input and $1.50 output per million tokens, but no Subquadratic source confirms those figures. Those rates also do not produce a 300x ratio against current Opus pricing on any reasonable workload shape. Until the rate card lands, the $8 number is a marketing artifact, not a benchmark cost.
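
To see why no workload shape gets to 300x, here is a rough sketch that prices a few input/output mixes at both sets of rates. The SubQ rates are the unconfirmed third-party estimates, and the workload shapes are our own illustrative picks, not anything from the launch post.

```python
# Per-million-token rates in dollars: (input, output).
OPUS_47 = (5.00, 25.00)       # public rate card, as quoted in this article
SUBQ_ESTIMATE = (0.50, 1.50)  # third-party estimate, NOT confirmed by Subquadratic


def run_cost(rates, input_tokens, output_tokens):
    """Dollar cost of one run at flat per-million-token rates."""
    in_rate, out_rate = rates
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def cost_ratio(input_tokens, output_tokens):
    """How many times more the run costs on Opus 4.7 than at the SubQ estimate."""
    opus = run_cost(OPUS_47, input_tokens, output_tokens)
    subq = run_cost(SUBQ_ESTIMATE, input_tokens, output_tokens)
    return opus / subq


# Illustrative workload shapes (hypothetical): (input tokens, output tokens).
SHAPES = {
    "input only": (128_000, 0),
    "long-context read": (1_000_000, 10_000),
    "balanced": (100_000, 100_000),
    "output only": (0, 100_000),
}

ratios = {label: cost_ratio(*shape) for label, shape in SHAPES.items()}
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.1f}x")
```

Because both price lists are flat, the ratio for any mix is pinned between the input-only ratio (10x) and the output-only ratio (about 16.7x). A 300x result would need something the rates themselves do not contain.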

What 1M context actually costs across the published rate cards

Here is the same long-context workload (1M tokens of input, 10K tokens of output, the shape of a whole-repo analysis or a multi-document brief) priced against every frontier model with a public rate. SubQ is shown at the third-party-circulated estimate, flagged as such.

Model	Input ($ per 1M tokens)	Output ($ per 1M tokens)	Long-context note	Total for 1M in / 10K out
GPT-5.5	$10.00	$45.00	2x input / 1.5x output above 272K	$10.45
Claude Opus 4.7	$5.00	$25.00	Surcharge dropped March 13	$5.25
Gemini 3.1 Pro	$4.00	$18.00	2x input / 1.5x output above 200K	$4.18
Claude Sonnet 4.6	$3.00	$15.00	Flat, 1M context	$3.15
DeepSeek V4-Pro (post-May 31)	$1.74	$3.48	Flat, 1M context	$1.77
SubQ 1M-Preview (third-party estimate)	~$0.50*	~$1.50*	Flat, 1M context (claimed)	~$0.52*
DeepSeek V4-Pro (promo, until May 31)	$0.435	$0.87	Flat, 1M context	$0.44

*SubQ rates are third-party circulated estimates, not Subquadratic figures. The company has not published a rate card.

Two reads from this table. First, SubQ at the rumored $0.52 per 1M-context run is slightly more expensive than DeepSeek V4-Pro at promo pricing ($0.44), and only about 3.4x cheaper than V4-Pro's post-May-31 list ($1.77). Second, the frontier comparison Subquadratic leads with (Opus, GPT-5.5) stopped being a fair fight in March, when Anthropic dropped the long-context surcharge that used to double Opus input above 200K. The $2,600 number quietly assumes a pricing regime that no longer exists.
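
The totals in the table can be reproduced from the per-million rates. The sketch below uses the long-context tier rates exactly as listed in the table and rounds half-up to the cent; the SubQ entry is the unconfirmed third-party estimate, and the short dictionary keys are our own labels.

```python
from decimal import Decimal, ROUND_HALF_UP

# (input $/1M, output $/1M) at the long-context tier, copied from the table.
RATES = {
    "GPT-5.5": (Decimal("10.00"), Decimal("45.00")),
    "Claude Opus 4.7": (Decimal("5.00"), Decimal("25.00")),
    "Gemini 3.1 Pro": (Decimal("4.00"), Decimal("18.00")),
    "Claude Sonnet 4.6": (Decimal("3.00"), Decimal("15.00")),
    "DeepSeek V4-Pro (list)": (Decimal("1.74"), Decimal("3.48")),
    "SubQ 1M-Preview (estimate)": (Decimal("0.50"), Decimal("1.50")),  # unconfirmed
    "DeepSeek V4-Pro (promo)": (Decimal("0.435"), Decimal("0.87")),
}

MILLION = Decimal(1_000_000)


def workload_cost(in_rate, out_rate, input_tokens=1_000_000, output_tokens=10_000):
    """Dollar cost of one run, rounded half-up to the cent."""
    raw = (Decimal(input_tokens) * in_rate + Decimal(output_tokens) * out_rate) / MILLION
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


totals = {name: workload_cost(*rates) for name, rates in RATES.items()}

# Print most expensive first, matching the table order.
for name, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:<28} ${total}")
```

Changing `input_tokens` and `output_tokens` shows how the ranking shifts for other workload shapes, as long as the run stays inside the long-context tier these rates apply to.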

What the SSA architecture actually changes

Subquadratic's technical pitch is SSA, which they describe as a content-dependent sparse attention scheme that picks the most relevant tokens at each step rather than running all-pairs attention across the full context. The claimed complexity is O(n · k), where k is the number of tokens selected per step; for a fixed k that is linear in n, against the standard O(n²) transformer cost. Per Subquadratic's own benchmarks, SSA runs roughly 52x faster than FlashAttention at 1M tokens, with smaller speedups at shorter contexts. The numbers are internal measurements; no independent reproduction has been published.
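
The difference between the two complexity classes is easiest to see as a count of query-key pairs. The sketch below is a back-of-envelope illustration only: the value of k is a hypothetical placeholder, since Subquadratic has not published it, and a pair count is not a wall-clock measurement, so this neither reproduces nor tests the 52x figure.

```python
def dense_pairs(n: int) -> int:
    """Query-key pairs scored by all-pairs attention: n * n."""
    return n * n


def sparse_pairs(n: int, k: int) -> int:
    """Pairs scored when each of n queries attends to at most k selected tokens."""
    return n * min(k, n)


K_SELECTED = 4096  # hypothetical value; the real k is not public

for n in (128_000, 1_000_000, 12_000_000):
    ratio = dense_pairs(n) / sparse_pairs(n, K_SELECTED)
    print(f"n={n:>10,}  dense/sparse pair ratio = {ratio:,.1f}x")
```

With k fixed, the ratio is simply n / k, so the advantage grows linearly with context length. That matches the claim that speedups are smaller at shorter contexts, though the cost of selecting the k tokens is ignored here.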

Two caveats are stacked into that paragraph. The model itself, per Subquadratic's own admission to DataCamp, is not trained from scratch; it is a sparse-attention fine-tune of an existing open-source base. And the academic precedent for subquadratic attention at frontier scale (Mamba, RWKV, DeepSeek Sparse Attention) is mostly a ledger of architectures that ran efficiently but lost a quality tier relative to dense transformers. SSA may or may not break that pattern. No third-party paper has been published.

If SSA holds up, the right comparison is not against the price of attention compute alone but against the price of an entire inference run. Faster attention at 1M tokens lowers GPU minutes, which lowers the floor price a provider can offer. That is the long-term economic story SubQ is reaching for. The near-term commercial story still requires shipping the model to enough developers to put a rate card against the claim.

The benchmark sheet has a 17-point gap between research and production

Subquadratic published three benchmark numbers for the production model: RULER 128K at 95.6% (95.0% in the DataCamp restatement), MRCR v2 (8-needle, 1M context) at 65.9%, and SWE-Bench Verified at 81.8%. The same MRCR v2 benchmark is reported at 83% on the research model. That is a 17-point quality gap between the model anyone can call (1M production) and the model the launch headlines reference (12M research). DataCamp called the gap unexplained, which it is. The Opus and GPT-5.5 columns below come from Subquadratic's comparison chart; no third-party run has reproduced them.

Benchmark	SubQ (production)	SubQ (research)	Opus 4.7	GPT-5.5
RULER 128K	95.6	n/p	94.8	n/p
MRCR v2 (8-needle, 1M)	65.9	83	32.2	74.0
SWE-Bench Verified	81.8	n/p	87.6	88.7

One careful note. Subquadratic's launch post claims SWE-Bench Verified 81.8% beats Opus 4.6 (80.8%), which is technically true. The current Anthropic flagship is Opus 4.7 at 87.6%, and GPT-5.5 lands at 88.7%. SubQ's production model is closer to a strong mid-tier (Kimi K2.6 at 80.2, Qwen3.6 Plus at 78.8) than to a frontier replacement on Verified. None of these scores have been third-party reproduced.
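
The gaps discussed in this section can be recomputed directly from the published scores. This is a small sketch that copies the numbers from the table above; `None` stands in for "n/p", and the dictionary keys are our own labels rather than anything from the source chart.

```python
# Scores in percent, copied from the benchmark table; None means not published.
SCORES = {
    "RULER 128K": {"subq_prod": 95.6, "subq_research": None, "opus_4_7": 94.8, "gpt_5_5": None},
    "MRCR v2 (8-needle, 1M)": {"subq_prod": 65.9, "subq_research": 83.0, "opus_4_7": 32.2, "gpt_5_5": 74.0},
    "SWE-Bench Verified": {"subq_prod": 81.8, "subq_research": None, "opus_4_7": 87.6, "gpt_5_5": 88.7},
}


def gap(a, b):
    """a minus b in percentage points, or None if either score is unpublished."""
    if a is None or b is None:
        return None
    return round(a - b, 1)


for bench, row in SCORES.items():
    prod = row["subq_prod"]
    print(bench)
    print("  research minus production:", gap(row["subq_research"], prod))
    print("  production minus Opus 4.7:", gap(prod, row["opus_4_7"]))
    print("  production minus GPT-5.5: ", gap(prod, row["gpt_5_5"]))
```

A positive value means the first model in the pair scores higher. None of the inputs have been independently reproduced, so the differences inherit that uncertainty.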

The Magic.dev parallel everyone is whispering about

In August 2024, a startup called Magic.dev raised $320M (taking cumulative funding to roughly $465M) on a similar pitch: a research model called LTM-2-mini with a 100-million-token context window. The model never shipped as a public product, the benchmarks were never reproduced, and the company quietly pivoted. Subquadratic has now raised $29M on a 12-million-token claim, with the same combination of headline benchmark, no model card, and gated access. VentureBeat's coverage of the launch surfaced this comparison explicitly. Byteiota titled its review "Breakthrough or Theranos."

That history does not mean SubQ is the same story. It does mean the burden of proof is on Subquadratic, and the proof has to be third-party reproducible benchmarks, a public rate card, and a model card with architecture details that an outside researcher can attack. None of those exist as of May 18, 2026.

What to actually do with this

Watchlist, not a swap. If you are running long-context workloads today (whole-repo coding, multi-document research, transcript analysis past 200K), the cheapest published option is DeepSeek V4-Pro at promo pricing through May 31, then post-promo at four times that. The next cheapest published options are Claude Sonnet 4.6 at a flat $3 input across 1M, the closed-frontier mid-tier with the cleanest long-context economics, and Gemini 3.1 Pro at $4 input above 200K, with quality holding up. Opus 4.7 at $5 flat is the quality ceiling.

Sign up for the SubQ waitlist if you want a seat when the rate card lands. Do not architect around it yet. The two specific things to wait for: a public per-token price (so the "1/5 of Opus" claim can be checked against a real number), and at least one independent benchmark run on the production 1M model (so the MRCR v2 gap between 65.9% and 83% can be explained). Until both exist, the comparison workloads we model on the pricing page and the calculator use the rates that have rate cards behind them.

For background on the long-context pricing shifts SubQ is reacting to, see the Anthropic flat-pricing piece and the DeepSeek V4-Pro May 31 cliff.

Sources

Subquadratic: Introducing SubQ - May 5, 2026 launch post with RULER, MRCR v2, and SWE-Bench numbers
VentureBeat: Researchers demand independent proof - Breakdown of reproducibility concerns and Magic.dev parallel
DataCamp: SubQ AI explained - 17-point research-vs-production MRCR gap, base-model speculation
FelloAI: SubQ LLM review - Conflicting Opus benchmark numbers, sparse-attention finetune analysis
SiliconANGLE: Subquadratic launches with $29M - Funding details, $500M valuation, investor list
The Decoder: Anthropic drops long-context surcharge - March 13, 2026 surcharge removal that reset the Opus comparison
Anthropic: Claude pricing - Opus 4.7 $5/$25, Sonnet 4.6 $3/$15, 1M context flat
DeepSeek: API pricing - V4-Pro promo $0.435/$0.87 through May 31, list $1.74/$3.48
Google: Gemini API pricing - Gemini 3.1 Pro $2/$12 under 200K, $4/$18 above
OpenAI: API pricing - GPT-5.5 $5/$30 under 272K, 2x input / 1.5x output above
Byteiota: Breakthrough or Theranos - Social-media reaction summary and skeptical takes
