How much does Kimi K2.7 Code cost via API?

Kimi K2.7 Code costs $0.95 per million input tokens and $4.00 per million output tokens on Moonshot's direct API. A cache hit drops input to $0.19 per million. The input and output rates are identical to Kimi K2.6; only the cached-input rate moved, from $0.16 to $0.19.

Is Kimi K2.7 Code actually better than K2.6?

Moonshot reports a 21.8% gain on its own Kimi Code Bench v2 and roughly 30% fewer reasoning tokens than K2.6. Every benchmark it published is a proprietary Moonshot suite. As of June 2026 there are no independent third-party results on public benchmarks like SWE-bench Verified or Terminal-Bench, so the improvement is unverified outside Moonshot.

Is Kimi K2.7 Code open source?

Yes. Kimi K2.7 Code ships under a Modified MIT License with weights on Hugging Face at moonshotai/Kimi-K2.7-Code. It is a 1-trillion-parameter mixture-of-experts model with 32B active parameters and a 256K context window, and runs on vLLM, SGLang, or KTransformers.

Is Kimi K2.7 Code the cheapest coding model?

No. DeepSeek V4-Pro at $0.44 input and $0.87 output per million tokens undercuts K2.7 Code on both rates. K2.7 Code's argument is coding-agent quality at a low price, not the absolute cheapest per-token rate.

Model ReleaseJune 13, 2026·7 min read

Kimi K2.7 Code costs exactly what K2.6 did. The rest you take on faith.

Moonshot shipped K2.7 Code on June 12 at $0.95 in and $4 out per million tokens, the same rate card as the model it replaces. The pitch is a 21.8% coding gain and 30% fewer reasoning tokens. Every number behind that pitch is one Moonshot scored itself, on benchmarks only Moonshot runs.

Dark computer screen filled with dense rows of terminal data and code

Photo by Lukas on Unsplash

The first thing worth saying about Kimi K2.7 Code is the thing Moonshot did not change. Input stays at $0.95 per million tokens. Output stays at $4.00. Those are the exact figures K2.6 carried since April. If you already have a budget line for K2.6, you do not need to touch it.

One number did move. Cached input went from $0.16 to $0.19 per million, a 19% bump on the discount you get for repeated context. For an agent that re-sends the same system prompt thousands of times a day, that is a real if small tax. For everything else it rounds to nothing.

The rate card

Model	Input / 1M	Cached / 1M	Output / 1M	Context
Kimi K2.7 Code	$0.95	$0.19	$4.00	256K
Kimi K2.6	$0.95	$0.16	$4.00	256K

Source: platform.kimi.ai. Model ID: kimi-k2.7-code

Moonshot is grading its own homework

Here is the part to read carefully. The headline claim is +21.8% on something called Kimi Code Bench v2. The supporting numbers come from Program Bench, MLS Bench Lite, Kimi Claw 24/7 Bench, MCP Atlas, and MCP Mark Verified. You may not recognize those names. That is because they are all Moonshot's, and Moonshot is the only party that runs them.

Benchmark (all Moonshot-run)	K2.7 Code	K2.6
Kimi Code Bench v2	62.0	50.9
MLS Bench Lite	35.1	26.7
MCP Mark Verified	81.1	72.8
Program Bench	53.6	48.3

There is no SWE-bench Verified line. No Terminal-Bench, no LiveCodeBench, no GPQA. K2.6 reported all of those in April, which is what makes the absence here read as a decision rather than an oversight. VentureBeat put it bluntly: a claim of "21.8% better than our last model on our own eval" is true and unfalsifiable in the same breath. Nobody outside Moonshot can check it yet.

Watch for one specific trap in the coverage. At least one writeup framed "MCP Mark Verified, K2.7 at 81.1 vs Claude Opus 4.8 at 76.4" as a head-to-head win. It is not an independent comparison. That Opus figure sits inside Moonshot's own table. Treat every K2.7 number you see this week as a vendor number until Artificial Analysis or a public leaderboard says otherwise.

If the bill drops, it won't be the rate card

Since the per-token price is frozen, the only way K2.7 saves you money is by generating fewer tokens. Moonshot claims exactly that: about 30% less "thinking" on the way to an answer. On a model where output costs $4 against $0.95 for input, output is where the bill lives. Cutting it is worth more than any input discount.

Walk it through on one coding session that reads 2M tokens of context and writes 1M tokens of reasoning and code. On K2.6 that is $1.90 of input plus $4.00 of output, so $5.90. If K2.7 genuinely trims output by 30% to 700K tokens at the same rate, the same session runs $1.90 plus $2.80, or $4.70. That is roughly 20% off the session without the price ever changing.

The catch is the same as above. The 30% figure is Moonshot's, measured on Moonshot's tasks. Reasoning-token counts swing hard by workload, so the only honest way to know what K2.7 costs you is to run your own traffic through both and compare the output-token totals on your invoice.

Cheap, but not the cheapest

Against the frontier closed models, K2.7 looks like a steal. Against the rest of the open and Chinese field, it sits in the middle. DeepSeek V4-Pro is cheaper on both input and output. Qwen3.7 Max costs more but reads images. The point of K2.7 is not the floor price, it is coding-agent behavior at a fraction of what Opus or GPT-5.5 charge to write output.

Model	Input / 1M	Output / 1M	Weights
DeepSeek V4-Pro	$0.44	$0.87	Open
Kimi K2.7 Code	$0.95	$4.00	Open
Qwen3.7 Max	$2.50	$7.50	Closed
Claude Opus 4.8	$5.00	$25.00	Closed
GPT-5.5	$5.00	$30.00	Closed

List prices. DeepSeek V4-Pro reflects the 75% discount Moonshot's rival made permanent in May. Qwen3.7 Max has shown a 50% launch promo on some routers. See tokencost.app/pricing for live rates.

Run a month of coding through each one

Coding agents are input-heavy: they re-read files, diffs, and tool output far more than they write. Take a small team burning 50M input and 10M output tokens a month, an 80/20 split, and the spread across the field looks like this:

Model	Monthly cost	vs K2.7
DeepSeek V4-Pro	$30	0.3x
Kimi K2.7 Code	$88	1x
Qwen3.7 Max	$200	2.3x
Claude Opus 4.8	$500	5.7x
GPT-5.5	$550	6.3x

K2.7 lands at one-sixth of Opus and one-third of Qwen3.7 Max, then gets undercut three-to-one by DeepSeek. The interesting comparison is not K2.7 against the frontier, where the gap was already huge with K2.6. It is K2.7 against DeepSeek V4-Pro, and that one comes down to which model writes better code on your stack, not which has the lower sticker.

The model itself, and how to run it

Architecturally K2.7 Code is the K2 family carried forward: a 1-trillion-parameter mixture of experts with 32B active per token, 384 experts, and a 256K context window. The weights are on Hugging Face as moonshotai/Kimi-K2.7-Code under a Modified MIT License, servable through vLLM, SGLang, or KTransformers if you have the hardware for a 1T-parameter file.

On the API, the endpoint at platform.kimi.ai is OpenAI-compatible, so swapping in kimi-k2.7-code is a one-line model ID change from K2.6. Moonshot also sells hosted Kimi Code plans from $19 to $199 a month if you would rather not meter tokens at all.

Should you switch?

If you already run K2.6, the upgrade is free in every sense that matters: same rate card, same model ID shape, and a credible argument that it writes less to get to the same place. Test it on your own workload, compare the output-token totals, and keep it if the bill drops. There is no downside risk on price.

If you are picking a coder fresh, do not let the launch numbers decide it. None of them have been checked by anyone but Moonshot. Run K2.7 Code against DeepSeek V4-Pro on your actual repository, look at which one ships working diffs, and let the three-to-one price gap break the tie only when the quality is close.

Sources

Kimi K2.7 Code API pricing - platform.kimi.ai (input $0.95 / cached $0.19 / output $4.00, 256K context)
Kimi K2.7 Code model card - Hugging Face (specs, license, Moonshot benchmark table)
Moonshot releases Kimi K2.7 Code - MarkTechPost, June 12, 2026
Practitioners on the benchmark gaps - VentureBeat
Kimi K2.6 pricing - kimi.com (for the K2.6 comparison: $0.16 cached)

Compare all model prices Calculate your API cost