Model Release · June 2, 2026 · 8 min read

MiniMax M3 claims GPT-5.5-class coding for a tenth of the price. The benchmarks are self-reported and the cheap tier stops at 512K.

M3 went live on June 1 at $0.60 input and $2.40 output per million tokens, with a 50% launch promo cutting that to $0.30/$1.20 for the first week. It carries a 1M context window, takes image and video input, and ships on a new sparse-attention design MiniMax says runs long context for about a twentieth of the prior compute. The launch deck puts it level with GPT-5.5 on SWE-Bench Pro and ahead of Opus 4.7 on autonomous browsing. Two things to hold onto before you wire it in: every one of those numbers is MiniMax's own, and the cheap rate only holds below 512K input.


The cheap rate is real, and so is the asterisk

M3 lists at $0.60 input and $2.40 output per million tokens for requests under 512K, and a seven-day launch promo halves both numbers to $0.30/$1.20, which is the exact rate M2.7 charges as its permanent price. For that you get a 1M context window, native image and video input, and a sparse-attention design MiniMax says cuts per-token compute at full context to about a twentieth of the previous generation.

The asterisk runs two ways. The headline benchmark scores (SWE-Bench Pro 59.0, BrowseComp 83.5) are MiniMax's own, with no independent Intelligence Index score at launch, and the cheap rate doubles the moment a request crosses 512K input. Both shape the real bill.

The rate card, tier by tier

M3 prices on two axes at once: input length and processing priority. The number most people will see is the standard tier under 512K input, and right now it is running at a 50% discount. The fine print is the 512K line. Send a request past it and both input and output rates double, which matters for exactly the long-document and whole-repo jobs the 1M window is meant to attract.

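| Tier | Input / 1M tokens | Output / 1M tokens | Cache read / 1M tokens |
| --- | --- | --- | --- |
| Standard, ≤512K (promo, 7-day) | $0.30 | $1.20 | $0.06 |
| Standard, ≤512K (list) | $0.60 | $2.40 | $0.12 |
| Standard, >512K input | $1.20 | $4.80 | $0.24 |
| Priority, ≤512K (list) | $0.90 | $3.60 | $0.18 |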

One detail the table cannot show: the promo is a clock, not a coupon. MiniMax states a seven-day window from launch and does not print a calendar end date, so build your budget on the $0.60/$2.40 list rate and treat the discount as a one-week trial credit. We have watched DeepSeek turn a launch promo into a permanent cut and watched others let theirs lapse on schedule, so the honest planning number is the one that survives the week.
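
The rate card above reduces to a small lookup. The sketch below is one way to estimate a single standard-tier request, under assumptions that are ours rather than MiniMax's: "512K" is read as 512,000 tokens, the whole request is billed at the higher rate once input crosses the line (which is how the flat >512K row reads), the promo applies only to the ≤512K tier because that is the only promo price listed, and cache reads and priority processing are ignored. The function and constant names are illustrative.

```python
from dataclasses import dataclass

# Assumption: "512K" is read as 512,000 tokens, not 524,288.
TIER_LIMIT_TOKENS = 512_000


@dataclass(frozen=True)
class Rate:
    """USD per 1M tokens."""
    input_per_m: float
    output_per_m: float


# Standard-tier rates copied from the rate card.
STANDARD_PROMO = Rate(0.30, 1.20)
STANDARD_LIST = Rate(0.60, 2.40)
STANDARD_LONG = Rate(1.20, 4.80)


def m3_request_cost(input_tokens: int, output_tokens: int, promo: bool = False) -> float:
    """Estimate the standard-tier cost of one request in USD.

    Assumes the entire request moves to the >512K rate once input
    crosses the limit, and ignores cache reads and priority pricing.
    """
    if input_tokens > TIER_LIMIT_TOKENS:
        rate = STANDARD_LONG  # no promo price is listed for this tier
    elif promo:
        rate = STANDARD_PROMO
    else:
        rate = STANDARD_LIST
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000
```

Under these assumptions a 100K-in / 20K-out request comes to about $0.11 at list and about $0.05 on promo, while a 600K-in / 10K-out request comes to about $0.77.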

The benchmarks, and who measured them

The pitch is coding and agentic work, and the board MiniMax leads with is SWE-Bench Pro. On the company's own runs M3 lands at 59.0, which it frames as past GPT-5.5 and Gemini 3.1 Pro and within reach of Claude Opus 4.7. The browsing and tool-use numbers are stronger still relative to the field. Here is what the launch deck reported.

| Benchmark | MiniMax M3 | What MiniMax says it means |
| --- | --- | --- |
| SWE-Bench Pro | 59.0 | Past GPT-5.5 and Gemini 3.1 Pro, approaching Opus 4.7 |
| BrowseComp | 83.5 | Above Opus 4.7 at 79.3 on autonomous browsing |
| Terminal-Bench 2.1 | 66.0 | Shell and tool-use agent tasks |
| MCP Atlas | 74.2 | Model Context Protocol tool routing |
| SVG-Bench | 63.7 | Vector-graphics generation, above Opus 4.7 |
| SWE-fficiency | 34.8 | Token cost to close a SWE-Bench task |

Every figure above came from MiniMax. At launch there was no Artificial Analysis Intelligence Index entry and no LMArena placement for M3, and TechTimes ran the release under a headline calling the frontier claims unverified. That does not make the numbers wrong. It does mean the safe move is to run M3 against your own eval set before you let the SWE-Bench Pro line drive a migration. The competitor scores MiniMax printed beside its own (GPT-5.5 at 58.6, Gemini 3.1 Pro at 54.2) were also read off MiniMax charts rather than re-run, so the deltas are directional at best.

What it costs against the field

We ran three workloads at list rates. The browsing-agent and monthly-volume rows stay under 512K input, so M3 bills at the standard tier. The full-codebase row crosses 512K, so M3 jumps to the higher tier and GPT-5.5 trips its own 272K surcharge to $10/$45. Numbers are dollars, rounded.

| Workload | M3 (list) | DeepSeek V4-Pro | GPT-5.5 | Opus 4.8 |
| --- | --- | --- | --- | --- |
| Browsing-agent task (100K in / 20K out) | $0.11 | $0.06 | $1.10 | $1.00 |
| Full-codebase pass (600K in / 10K out) | $0.77 | $0.27 | $6.45 | $3.25 |
| 400M tokens/mo (60/40 in/out) | $528 | $244 | $6,000 | $5,200 |

Against the US frontier the gap is enormous: M3 at list runs roughly 4x to 11x under GPT-5.5 and Opus 4.8 depending on the row, with the narrowest gap on the full-codebase pass against Opus 4.8. On the two rows that stay under 512K, the only tier the rate card lists a promo price for, the launch promo doubles that lead for a week. Against the cheap Chinese frontier the picture flips. DeepSeek V4-Pro is cheaper than M3 on all three workloads, and on the full-codebase pass its flat 1M pricing comes in at $0.27 against M3's $0.77, because V4-Pro never trips a 512K surcharge. M3's argument there is not price. It is the image-and-video input, which V4-Pro lacks, and the browsing scores.
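
The M3 and V4-Pro columns can be rebuilt from rates printed in this article. The sketch below is a rough reconstruction, not the calculator behind the table: it assumes V4-Pro bills a flat $0.435/$0.87 at any length, that M3 bills the whole request at $1.20/$4.80 once input crosses 512K, and that the monthly row is made of requests that each stay under 512K. The names and the `long_request` flag are illustrative.

```python
def cost(input_tokens: int, output_tokens: int, input_rate: float, output_rate: float) -> float:
    """Cost in USD given token counts and USD-per-1M-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# (input rate, output rate) in USD per 1M tokens, as printed in this article.
M3_LIST = (0.60, 2.40)
M3_LONG = (1.20, 4.80)   # applied when a single request exceeds 512K input
V4_PRO = (0.435, 0.87)   # assumed flat across the full 1M window

# (name, input tokens, output tokens, does a single request cross 512K?)
WORKLOADS = [
    ("Browsing-agent task", 100_000, 20_000, False),
    ("Full-codebase pass", 600_000, 10_000, True),
    ("400M tokens/mo, 60/40", 240_000_000, 160_000_000, False),
]


def compare(workloads):
    """Return (name, M3 cost, V4-Pro cost) for each workload."""
    rows = []
    for name, tokens_in, tokens_out, long_request in workloads:
        m3_rate = M3_LONG if long_request else M3_LIST
        rows.append((
            name,
            cost(tokens_in, tokens_out, *m3_rate),
            cost(tokens_in, tokens_out, *V4_PRO),
        ))
    return rows


for name, m3, v4 in compare(WORKLOADS):
    print(f"{name:<24} M3 ${m3:,.2f}   V4-Pro ${v4:,.2f}")
```

The results match the table's M3 and V4-Pro figures before rounding. The GPT-5.5 and Opus 4.8 columns are left out because their base rates are not printed here.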

Where it sits in the cheap-frontier bracket

There is now a dense cluster of sub-$3 input models that all claim near-frontier coding, and M3 walks straight into it. List rates and context, ranked by input price.

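| Model | Input / 1M tokens | Output / 1M tokens | Context | Multimodal in |
| --- | --- | --- | --- | --- |
| DeepSeek V4-Pro | $0.435 | $0.87 | 1M | Text only |
| MiniMax M3 (list) | $0.60 | $2.40 | 1M* | Image + video |
| Gemini 3.5 Flash | $1.50 | $9.00 | 1M | Image + audio |
| Gemini 3.1 Pro | $2.00 | $12.00 | 1M | Image + audio |
| Qwen3.7 Max | $2.50 | $7.50 | 1M | Text only |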

The asterisk on M3's context is the 512K pricing line again: you get the full million tokens, but once a request's input crosses 512K the doubled rate applies to the whole request, not just the tokens past the line. Read against this table, M3 is the only entry pairing image-and-video input with sub-$1 list input. If your workload is text-only and price-first, V4-Pro already owns that corner. If you need a model that can read a screenshot or a video frame without paying Gemini or Opus rates, M3 is the new floor.

The sparse-attention bet underneath the price

The reason M3 can price a 1M window this low is MiniMax Sparse Attention. Instead of every token attending to every other token, MSA selects blocks of the key-value cache and skips the rest, which is what lets MiniMax claim roughly a twentieth of the prior per-token compute at full context and a 15.6x speedup on long inputs. That design choice is also the most likely source of any quality gap the independent benches eventually surface, because block selection trades some recall for speed. MiniMax has said the weights will be posted to Hugging Face and GitHub, so the sparse-attention claims will be checkable rather than taken on faith, which is more than the closed frontier offers. Parameter counts were not disclosed at launch, and the numbers floating around secondary sites are contradictory, so we are not printing one.
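
To make the block-selection idea concrete, here is a toy single-query sketch in plain Python. It is not MiniMax's MSA, whose details were not published in the launch material. It is a generic top-k block scheme under our own assumptions: keys are split into fixed-size contiguous blocks, each block is summarized by its mean key, and the query attends only to the `top_k` best-scoring blocks. Every name and parameter is hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def block_sparse_attention(query, keys, values, block_size, top_k):
    """Attend to only the top_k key blocks for one query.

    Returns (output vector, sorted list of attended positions).
    Assumption: blocks are ranked by the query's dot product with
    each block's mean key. Real systems may rank blocks differently.
    """
    n, dim = len(keys), len(query)
    blocks = [range(start, min(start + block_size, n)) for start in range(0, n, block_size)]

    def mean_key(block):
        return [sum(keys[i][d] for i in block) / len(block) for d in range(dim)]

    # Cheap pass: one score per block instead of one per key.
    block_scores = [dot(query, mean_key(block)) for block in blocks]
    chosen = sorted(range(len(blocks)), key=lambda j: block_scores[j], reverse=True)[:top_k]
    positions = sorted(i for j in chosen for i in blocks[j])

    # Exact scaled dot-product attention, restricted to the selected positions.
    scale = 1.0 / math.sqrt(dim)
    weights = softmax([dot(query, keys[i]) * scale for i in positions])
    out_dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, positions)) for d in range(out_dim)]
    return output, positions
```

Setting `top_k` to the number of blocks recovers ordinary dense attention over every position. The recall trade-off appears when a relevant key sits in a block whose mean scores poorly: that key is skipped entirely.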

The discount buys a trial, not a verdict

M3 is the rare launch where the price alone justifies the experiment. A browsing or coding agent sitting on GPT-5.5 or Opus is paying roughly ten times the per-token rate for work M3 claims to match, and the promo stretches that gap wider for a week. Knock a few points off MiniMax's SWE-Bench Pro figure with your own evals and the dollar headroom still absorbs it, while the image-and-video input picks up screenshot and frame workflows the text-only cheap models cannot run at all.

What the price cannot settle is whether the benchmarks hold. Nobody outside MiniMax has scored M3 yet, so a migration driven by the launch deck is a bet on a vendor chart. The weights are headed to Hugging Face, which puts the real answer weeks out rather than never, and that is the sane moment to commit budget. Teams already on DeepSeek V4-Pro for text-only work have the least reason to move early, since M3 does not beat them on dollars and waiting costs nothing.

One number survives whatever you decide: the 1M window is real, but any request whose input crosses 512K is billed at double the rate. If your jobs run long, model the bill at the higher tier from the start, because the headline rate is not the one you will pay.
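
For budgeting a mixed workload, one rough approach is a blended rate. The sketch below assumes you can estimate `long_share`, the fraction of your input tokens sent in requests that cross 512K. That is a hypothetical planning parameter of ours, not something MiniMax reports, and the defaults are the list input rates from the rate card.

```python
def blended_input_rate(long_share: float, list_rate: float = 0.60, long_rate: float = 1.20) -> float:
    """Effective USD per 1M input tokens for a mix of short and long requests.

    long_share is the fraction of input tokens (not requests) that
    land in requests over 512K input. Hypothetical planning input.
    """
    if not 0.0 <= long_share <= 1.0:
        raise ValueError("long_share must be between 0 and 1")
    return (1.0 - long_share) * list_rate + long_share * long_rate
```

If a quarter of input tokens ride in long requests, the effective input rate under these assumptions is about $0.75 per million rather than $0.60. The same blend applies to output by passing `list_rate=2.40` and `long_rate=4.80`, with `long_share` taken over output tokens instead.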

Put M3 next to the rest of the field on the full pricing table or run your own token mix through the calculator before the promo clock runs out.
