How much does GLM 5.2 cost?

As of June 16, 2026, GLM 5.2 has no published per-token API price. The only way to use it is the GLM Coding Plan subscription: roughly $10/month for Lite (about 400 prompts a week), $30 for Pro (about 2,000 a week), and $80 for Max (about 8,000 a week). A standalone API and MIT-licensed open weights were promised for the week of June 16.

What is GLM 5.2's context window?

GLM 5.2 ships a 1 million token context window, up from 200K on GLM 5.1. Maximum output is 131,072 tokens. Some third-party listings still show 203K, which appears to be stale metadata carried over from GLM 5.1.

What benchmarks did GLM 5.2 publish?

None. Z.ai launched GLM 5.2 with no official benchmark scores at all: no SWE-bench, no LiveCodeBench, no AIME, no GPQA. The company said numbers would arrive with the standalone API. Any GLM 5.2 benchmark you see right now is either a third-party estimate or a figure carried over from GLM 5.1.

Will GLM 5.2 get a per-token API and open weights?

Yes. Z.ai said a standalone API, the Z.ai chatbot, and MIT-licensed open weights would arrive the week of June 16, 2026. GLM 5.1 listed near $0.98 input and $3.08 output per million tokens, so GLM 5.2's API rate will likely land in that range, between Kimi K2.7 Code and DeepSeek V4 Flash.

Model ReleaseJune 16, 2026·7 min read

GLM 5.2 has no token price and no benchmarks. The whole pitch is a $10 subscription.

Z.ai launched GLM 5.2 on June 13 with a 1M-token context and a coding-first pitch. What it did not launch with is a per-token API rate or a single published benchmark. The one thing you can actually buy today is the GLM Coding Plan, $10 to $80 a month. Here is what that subscription gets you, why the per-prompt math flatters it, and roughly where the metered API will land when it finally shows up.

Abstract dark technology background with flowing network lines

Photo by A Chosen Soul on Unsplash

Update · June 17, 2026

Z.ai has since closed the two gaps this post was written around. GLM 5.2 now has a per-token API rate, $1.40 input and $4.40 output per million, the same as GLM 5.1, plus MIT-licensed open weights on Hugging Face and a set of self-reported benchmarks (SWE-Bench Pro 62.1, Terminal-Bench 2.1 81.0). The subscription analysis below still holds, but for the metered numbers see our GLM 5.2 vs Kimi K2.7 Code comparison.

Most model launches lead with a rate card. GLM 5.2 led with a subscription tier list and a shrug. There is no dollar-per-million-tokens number for it anywhere on Z.ai's own docs, the model card is not on Hugging Face yet, and the announcement carried zero benchmark scores. For a tool whose entire job is comparing what models cost per token, that is an awkward model to write about. So let us write about the gap instead, because the gap is the story.

What Z.ai did ship is a usable product: GLM 5.2 is live right now inside the GLM Coding Plan, the same $10-to-$80 subscription that already powered 5.1. You point Claude Code or Cline at an Anthropic-compatible endpoint and it works. You just cannot meter it, and you cannot yet check whether the thing is any good against a public leaderboard.

The only price that exists: the Coding Plan

GLM 5.2 is included on every Coding Plan tier at no surcharge. The plan does not bill tokens. It bills "prompts," where one prompt is roughly 15 to 20 underlying model calls once the agent fans out across tool use and retries. The quotas are weekly, refreshed in rolling five-hour windows, so the real ceiling is how hard you can lean on it in a sitting, not a monthly token bucket.

Tier	Price / mo	Quota	Notes
Lite	~$10	~400 prompts / week	~120 per 5-hr window
Pro	~$30	~2,000 prompts / week	Faster responses, vision, web search
Max	~$80	~8,000 prompts / week	~1,100 per 5-hr window
Team	Per seat	Org-priced	Pooled quota

Figures from z.ai/subscribe. Prices are approximate and have shifted between billing cycles; the page lists the plan as "Powered by GLM-5.2 & GLM-5-Turbo." An earlier round of coverage quoted Lite at $18, which no longer matches the current tiers.

Why a flat $80 looks unbeatable, and where the catch hides

A subscription and a per-token API are not the same product, so any comparison is rough. But the order of magnitude is the whole point. Take a developer running a coding agent hard: call it 50M input and 10M output tokens in a month, the input-heavy mix you get when an agent keeps re-reading files. On the per-token coders, that month costs real money.

How you pay	Model	~50M in / 10M out
Flat	GLM 5.2 (Max plan)	$80
Metered	Kimi K2.7 Code	$88
Metered	Claude Opus 4.8	$500
Metered	GPT-5.5	$550
Metered	Claude Fable 5	$1,000

Read that the right way. GLM 5.2 on the Max plan does not cost $80 for 60M tokens. It costs $80 for as many prompts as you can fit inside 8,000 a week, and the token volume underneath those prompts is whatever it happens to be. If your usage maps cleanly onto the prompt cap, you are paying frontier-tier compute for the price of a metered budget model, which is the entire appeal.

The catch is the cap itself. Hit the rolling five-hour limit on the Lite plan and you wait, you do not overflow into pay-as-you-go. A per-token API has no ceiling and no floor: you pay for exactly what you burn. That is the trade Z.ai is asking you to make until the metered endpoint arrives, and for bursty agent work it can go either way.

A coding model that shipped with no coding numbers

This is the part that should make you slow down. GLM 5.2 is sold as coding-first, built to plan, execute, and iterate on engineering tasks on its own. And it launched without SWE-bench, without LiveCodeBench, without Terminal-Bench, without an AIME or GPQA line. MarkTechPost flagged the same gap on launch day. Z.ai said the scores would come with the standalone API. A coding model that withholds its coding benchmarks at launch is asking for the benefit of the doubt.

For now the only numbers on the table belong to the model it replaces. GLM 5.1 posted 95.3% on AIME, 86.2% on GPQA, and 58.4 on SWE-Bench Pro; the base GLM-5 hit 77.8 on SWE-bench Verified. Those are a floor, not a forecast, and 5.2's headline change is context length, not a claimed jump in reasoning. The one fresh figure circulating, a BridgeBench reasoning score that supposedly tops Fable 5, traces to a third party on social media, not to Z.ai. Treat it as a rumor.

Model	SWE-Bench Pro	Context
GLM 5.2	Not published	1M
GLM 5.1 (predecessor)	58.4	200K
Claude Opus 4.8	69.2	1M
Claude Fable 5	80.3	1M

GLM 5.1 and competitor figures are vendor-reported on differing scaffolds and do not line up cleanly. Standardized leaderboards run lower across the board. The GLM 5.2 row stays blank on purpose until Z.ai or an independent eval fills it.

Where the metered price will probably land

When the standalone API opens, the safest anchor is the last two GLM releases. GLM-5 listed around $0.60 input and $1.92 output per million; GLM 5.1 stepped up to roughly $0.98 and $3.08. A 5.2 that holds the 5.1 line, or nudges it for the bigger context, would slot it right between Kimi K2.7 Code and the budget floor.

Model	Input / 1M	Output / 1M
DeepSeek V4 Flash	$0.14	$0.28
Kimi K2.7 Code	$0.95	$4.00
GLM 5.1 (the anchor)	$0.98	$3.08
GLM 5.2 (expected)	~$1	~$3
Claude Opus 4.8	$5.00	$25.00

The GLM 5.2 row is a forecast off predecessor pricing, not a quote. We will swap in real numbers and update the pricing page the day Z.ai publishes them.

The actual upgrade: 1M tokens you can use

Strip away the missing numbers and one real change remains. GLM 5.2 takes the context window from 200K to a full million tokens, with output capped at 131,072. For repo-scale agentic work, that is the difference between feeding the model a handful of files and handing it most of a codebase. Z.ai is careful to call it a "usable" 1M, a quiet jab at windows that technically accept the tokens but quietly forget the middle of them.

One wrinkle to know: some routers and aggregators still show GLM 5.2 at 203K. That looks like metadata copied from the 5.1 listing rather than a real spec, since every editorial source and Z.ai's own messaging put it at a million. If you depend on the long window, confirm it on whatever endpoint you actually call before you architect around it.

Under the hood it is the GLM-5 foundation: a mixture-of-experts in the 744B-total, roughly 40B-active range, served through an Anthropic-compatible endpoint so the agent tools that already speak Claude work on day one. There are two thinking-effort presets, High and Max, with Max aimed at the gnarly multi-step jobs. Both run deliberate by default, so do not expect a fast-and-cheap mode here.

What to do until the metered API ships

If you already pay for the GLM Coding Plan, 5.2 is a free upgrade with a much bigger context, and the only sane move is to use it and watch whether it ships working diffs. The subscription was a good deal at 5.1 and it did not get more expensive. The risk is near zero because the price did not move.

If you are choosing a coder from scratch and you bill by the token, there is nothing to evaluate yet. No API, no rate, no public benchmark. Wait for the metered endpoint and the open weights, which Z.ai pointed at the week of June 16, then run it against Kimi K2.7 Code and DeepSeek V4 on your own repo. Until those land, GLM 5.2 is a subscription you can try and a price you can only guess at.

Sources

Z.ai launches GLM 5.2 - MarkTechPost, June 14, 2026 (1M context, two thinking levels, no benchmarks)
GLM Coding Plan - z.ai/subscribe (tier list, "Powered by GLM-5.2 & GLM-5-Turbo")
GLM 5.2 release: 1M context, coding-first - Codersera (plan quotas, open-weights timing)
GLM 5.1 pricing - OpenRouter ($0.98 input / $3.08 output, the forecast anchor)
GLM 5.2 open-source under MIT - Pandaily (license, architecture)

Compare all model prices Calculate your API cost