Kimi K2 — Moonshot AI’s Next-Gen Agentic Intelligence
Kimi K2 is Moonshot AI’s breakthrough trillion-parameter Mixture-of-Experts (MoE) model, built for deep reasoning, long-form context, and autonomous tool use.
Trusted by users from 10,000+ companies
Kimi K2 blends scale, reasoning prowess, and tool-enabled autonomy to deliver extraordinary performance.

Kimi K2 employs a mixture-of-experts design with 1 trillion total parameters, but activates only ~32 billion per token via expert routing. This provides massive capacity without incurring the full inference cost of a dense trillion-parameter model, enabling deep knowledge at a fraction of the compute.
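The routing idea can be sketched as follows. The expert count (384) comes from the feature list below; the number of experts activated per token, the toy hidden size, and the random router weights are illustrative assumptions, not Kimi K2's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS = 384   # total experts, as stated for Kimi K2
TOP_K = 8           # experts activated per token (illustrative assumption)
D_MODEL = 64        # toy hidden size for this sketch

# Router: a linear layer that scores every expert for a given token.
router_weights = rng.standard_normal((D_MODEL, NUM_EXPERTS))

def route(token_hidden):
    """Return indices and softmax weights of the top-k experts for one token."""
    logits = token_hidden @ router_weights       # one score per expert
    top_idx = np.argsort(logits)[-TOP_K:]        # keep only the k best-scoring experts
    top_logits = logits[top_idx]
    weights = np.exp(top_logits - top_logits.max())
    weights /= weights.sum()                     # normalize over the selected experts
    return top_idx, weights

token = rng.standard_normal(D_MODEL)
experts, weights = route(token)
print(len(experts), "of", NUM_EXPERTS, "experts run for this token")
```

Only the selected experts' feed-forward networks execute for each token, which is why the per-token compute tracks the ~32B active parameters rather than the full 1T.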

With a 128,000-token context window, Kimi K2 can process entire books, lengthy codebases, or multi-document conversations in a single pass. This makes it ideal for tasks that require sustained memory.
On key evaluations, Kimi K2 achieves 71.6% on SWE-bench (coding), 65.8% on agentic task benchmarks, and 53.7% on LiveCodeBench v6. Kimi K2 Thinking goes even further (44.9% on Humanity’s Last Exam).
Feed Kimi K2 long text, huge reports, or extended conversations, and it will keep every detail in context. Thanks to its long-window memory, you don’t have to break up your work into smaller chunks.

Moonshot AI Kimi K2 is engineered to be smart, efficient, and deeply capable.
Uses 384 “experts” but only activates a few at a time, giving huge knowledge without massive slowdowns.
Trained using the MuonClip optimizer, which prevents the training from becoming unstable even at massive scale.
Can read and work with super long documents (up to 128,000 tokens), so nothing gets cut off.
Released under a modified MIT license, meaning it’s free to use, adapt, and integrate for many kinds of projects.
Designed to call tools, run code, and make decisions — not just chat, but do.
Performs very well on coding benchmarks, making it a great helper for programming.
Trained with a massive dataset, so it handles many languages and domains smoothly.
Even with a trillion total parameters, it activates only about 32 billion per token, making it cheaper to run.
Built to plan, evaluate, and act on complex tasks with minimal human help — ideal for multi-step workflows.
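The plan-act-evaluate loop behind that tool-use capability can be sketched with a stubbed model. The tool name, the stub's behavior, and the message format are all illustrative assumptions; a real integration would replace `stub_model` with a call to the Kimi K2 API and expose real tools.

```python
# Hypothetical tool the model may call; name and signature are illustrative.
def get_word_count(text: str) -> int:
    return len(text.split())

TOOLS = {"get_word_count": get_word_count}

def stub_model(messages):
    """Stand-in for a Kimi K2 call: requests a tool once, then answers.
    A real agent would send `messages` to the model API here instead."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "get_word_count",
                "arguments": {"text": messages[0]["content"]}}
    result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"answer": f"The prompt contains {result} words."}

def agent_loop(prompt):
    messages = [{"role": "user", "content": prompt}]
    for _ in range(5):  # cap iterations to avoid runaway loops
        reply = stub_model(messages)
        if "tool" in reply:  # the model decided to act, not just chat
            fn = TOOLS[reply["tool"]]
            result = fn(**reply["arguments"])
            messages.append({"role": "tool", "content": str(result)})
        else:
            return reply["answer"]

print(agent_loop("Kimi K2 plans, evaluates, and acts"))
```

The loop alternates model turns with tool executions until the model returns a final answer, which is the basic shape of any agentic workflow built on tool-calling models.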
Still got questions? Learn more about how Kimi K2 operates.