Kimi K2 0905

moonshotai/kimi-k2-0905

Kimi K2 0905 is Moonshot AI's September 2025 K2 checkpoint, a refined release focused on agentic coding workflows with a context window of 256K tokens, available through AI Gateway via fireworks.

Tool Use

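import { streamText } from 'ai'

// The model string is the AI Gateway model ID for this checkpoint
const result = streamText({
  model: 'moonshotai/kimi-k2-0905',
  prompt: 'Why is the sky blue?'
})

// Print the response text as it streams in
for await (const textPart of result.textStream) {
  process.stdout.write(textPart)
}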

Playground

Try out Kimi K2 0905 by Moonshot AI. Usage is billed to your team at API rates. Free users get $5 of credits every 30 days, and you are considered a free user if you haven't made a payment.

About Kimi K2 0905

Kimi K2 0905 carries a date stamp (September 5, 2025), following Moonshot AI's convention of identifying checkpoints by release date. The 0905 release is a distinct checkpoint from the original K2, not a silent in-place update.

The 256K-token context window is the main structural change. Agentic coding sessions accumulate context quickly: task descriptions, file contents, tool outputs, reasoning steps, and error messages stack up over many turns. A 256K-token window can hold project-scale context without truncation, which matters when an agent reviews multiple files, runs tests, and iterates on fixes across a long session.
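
As a rough illustration of how accumulated context can be checked against the window, here is a minimal sketch. It assumes that "256K" means 256,000 tokens and that text averages about 4 characters per token; both are assumptions for illustration, not the model's real tokenizer or a documented limit. The names `estimateTokens` and `checkContextBudget` are hypothetical.

```typescript
// Assumption: "256K" is treated as 256,000 tokens; the exact limit may differ.
const CONTEXT_WINDOW_TOKENS = 256_000

// Assumption: roughly 4 characters per token, a crude heuristic and not
// the model's real tokenizer.
const CHARS_PER_TOKEN = 4

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN)
}

interface BudgetReport {
  usedTokens: number
  remainingTokens: number
  fits: boolean
}

// Sum the estimated tokens of everything the agent has accumulated and
// compare the total against the window, reserving room for the reply.
function checkContextBudget(parts: string[], reservedForOutput = 8_000): BudgetReport {
  const usedTokens = parts.reduce((sum, part) => sum + estimateTokens(part), 0)
  const remainingTokens = CONTEXT_WINDOW_TOKENS - reservedForOutput - usedTokens
  return { usedTokens, remainingTokens, fits: remainingTokens >= 0 }
}

// Placeholder strings standing in for a task description, a file, and tool output
const report = checkContextBudget([
  'Task: fix the failing date parser test',
  'contents of a source file',
  'output of a test run',
])
console.log(report)
```

A check like this could run before each agent turn to decide whether older tool outputs need to be summarized or dropped.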

For teams already using the base K2, the 0905 checkpoint is a drop-in upgrade. It brings training refinements and the extended context window. The API interface, tool-calling format, and integration patterns stay the same. Switch by updating the model string to moonshotai/kimi-k2-0905.
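
The model-string switch can be sketched as a small helper that changes only the `model` field and leaves every other setting alone. The base K2 model ID used here (`moonshotai/kimi-k2`) is an assumption for illustration, and `upgradeToK2_0905` is a hypothetical name.

```typescript
// Assumption: the base K2 model ID; check the ID your integration actually uses.
const BASE_K2_MODEL = 'moonshotai/kimi-k2'
const K2_0905_MODEL = 'moonshotai/kimi-k2-0905'

// Return a copy of the call settings with only the model string changed.
// Settings that target any other model are returned untouched.
function upgradeToK2_0905<T extends { model: string }>(settings: T): T {
  return settings.model === BASE_K2_MODEL
    ? { ...settings, model: K2_0905_MODEL }
    : settings
}

const before = { model: BASE_K2_MODEL, prompt: 'Refactor the date parser' }
const after = upgradeToK2_0905(before)
console.log(after.model) // moonshotai/kimi-k2-0905
```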

The narrower provider set is the main operational difference. This checkpoint is served through fireworks, while base K2 is served by a wider set of providers. Weigh that against the 0905 training improvements if you need the largest failover coverage.

Kimi K2 0905 is available through AI Gateway at $0.60 per million input tokens and $2.50 per million output tokens.
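
To make the pricing concrete, the following sketch estimates the cost of a single request from the per-million-token rates quoted above. It ignores cached-read pricing, and `estimateCostUsd` is a hypothetical helper name.

```typescript
// Rates quoted on this page, in US dollars per million tokens
const INPUT_USD_PER_MILLION = 0.6
const OUTPUT_USD_PER_MILLION = 2.5

interface Usage {
  inputTokens: number
  outputTokens: number
}

// Cost = (tokens / 1,000,000) * rate, summed over input and output
function estimateCostUsd(usage: Usage): number {
  const inputCost = (usage.inputTokens / 1_000_000) * INPUT_USD_PER_MILLION
  const outputCost = (usage.outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  return inputCost + outputCost
}

// A long agentic turn: 200,000 input tokens and 4,000 output tokens
const cost = estimateCostUsd({ inputTokens: 200_000, outputTokens: 4_000 })
console.log(cost.toFixed(4)) // 0.12 + 0.01 = 0.1300
```

Because agentic sessions resend accumulated context on every turn, input tokens usually dominate the bill.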

Providers

The AI Gateway supports routing requests across multiple AI providers. You can control provider preferences using the provider slugs listed below. For more, see the AI Gateway provider options documentation. By using an AI provider, you acknowledge that you have reviewed and agree to its terms, listed in the Legal section under the provider's name.
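
As a sketch of how a provider preference might be expressed, the helper below builds call settings that name a provider slug. It assumes that AI Gateway reads routing preferences from `providerOptions.gateway` and that `order` takes provider slugs by priority; confirm both against the provider options documentation. `buildGatewayCall` is a hypothetical name.

```typescript
// Provider slug as it appears in this page's text
const PROVIDER_SLUG = 'fireworks'

// Assumption: routing preferences live under providerOptions.gateway,
// with `order` listing provider slugs from most to least preferred.
function buildGatewayCall(prompt: string) {
  return {
    model: 'moonshotai/kimi-k2-0905',
    prompt,
    providerOptions: {
      gateway: {
        order: [PROVIDER_SLUG],
      },
    },
  }
}

const settings = buildGatewayCall('Summarize the failing tests')
console.log(JSON.stringify(settings, null, 2))
```

The resulting object has the shape that could be passed to `streamText` from the `ai` package.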

Provider	Context	Max Output	Input	Output	Cache Read	Cache Write	Release Date
fireworks (Legal: Terms • Privacy)	256K	128K	$0.60/M	$2.50/M	$0.3/M	—	09/05/2025

Metrics

Based exclusively on usage through AI Gateway.

Throughput (24 hours)

More models by Moonshot AI

Context	Latency	Throughput	Input	Output	Cache Read	Cache Write	Release Date
262K	0.9s	88tps	$0.95/M	$4.00/M	$0.16/M	—	04/20/2026
262K	0.3s	101tps	$0.50/M	$2.80/M	$0.1/M	—	01/26/2026
262K	0.9s	19tps	$0.60/M	$2.50/M	$0.15/M	—	11/06/2025
262K	0.7s	116tps	$1.15/M	$8.00/M	$0.15/M	—	11/06/2025
256K	0.7s		$1.15/M	$8.00/M	$0.15/M	—	09/05/2025
131K	1.0s	36tps	$0.57/M	$2.30/M			09/05/2025

What To Consider When Choosing a Provider

Zero Data Retention
AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication
AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
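
A minimal sketch of choosing between the two credential types follows. The environment variable names (`AI_GATEWAY_API_KEY`, `VERCEL_OIDC_TOKEN`) and the preference for the API key over the OIDC token are assumptions for illustration; `resolveGatewayCredential` is a hypothetical helper.

```typescript
type Env = Record<string, string | undefined>

interface Credential {
  kind: 'api-key' | 'oidc'
  token: string
}

// Assumption: the API key is read from AI_GATEWAY_API_KEY and the OIDC
// token from VERCEL_OIDC_TOKEN, with the API key taking precedence.
function resolveGatewayCredential(env: Env): Credential {
  const apiKey = env['AI_GATEWAY_API_KEY']
  if (apiKey) return { kind: 'api-key', token: apiKey }

  const oidcToken = env['VERCEL_OIDC_TOKEN']
  if (oidcToken) return { kind: 'oidc', token: oidcToken }

  throw new Error('No AI Gateway credential found: set an API key or OIDC token')
}

// The environment is passed in explicitly so the helper is easy to test
const credential = resolveGatewayCredential({ AI_GATEWAY_API_KEY: 'example-key' })
console.log(credential.kind) // api-key
```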

This checkpoint routes across fewer providers than the base K2. Monitor provider-level status during high-demand periods if you observe elevated latency.

When to Use Kimi K2 0905

Best For

Long agentic sessions:
Accumulated context (tool outputs, file contents, multi-turn history) pushes beyond the base K2 context window
September 5, 2025 training refinements:
Workloads targeting the newer checkpoint's agentic coding improvements
Full-codebase review:
Multi-file code review or generation where the context of 256K tokens enables a complete codebase view in one call
Drop-in upgrade:
Existing base K2 integrations seeking a direct upgrade to the newer checkpoint

Consider Alternatives When

Chain-of-thought traces:
Kimi K2 Thinking variants are designed for explicit reasoning output
Maximum routing redundancy:
Base Kimi K2 routes across a wider provider set than this checkpoint
Fastest K2 inference:
Kimi K2 Turbo is the speed-optimized variant
Shorter context needs:
Tasks that don't require the full 256K tokens benefit from base K2's broader failover pool
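
The guidance in the two lists above can be sketched as a small decision helper. The priority order of the checks is an assumption, and because this page does not state base K2's context window, the caller supplies it; the 128,000 figure in the usage line is a placeholder, not a documented limit. `chooseVariant` is a hypothetical name.

```typescript
type Variant = 'Kimi K2 0905' | 'Kimi K2' | 'Kimi K2 Thinking' | 'Kimi K2 Turbo'

interface Workload {
  needsReasoningTraces: boolean
  needsFastestInference: boolean
  needsMaxRoutingRedundancy: boolean
  expectedContextTokens: number
}

// Assumption: explicit reasoning and speed requirements are checked first,
// then context size, then routing redundancy.
function chooseVariant(workload: Workload, baseK2ContextTokens: number): Variant {
  if (workload.needsReasoningTraces) return 'Kimi K2 Thinking'
  if (workload.needsFastestInference) return 'Kimi K2 Turbo'

  // Base K2 is only an option when the session fits in its smaller window
  const fitsBaseWindow = workload.expectedContextTokens <= baseK2ContextTokens
  if (fitsBaseWindow && workload.needsMaxRoutingRedundancy) return 'Kimi K2'

  return 'Kimi K2 0905'
}

const pick = chooseVariant(
  {
    needsReasoningTraces: false,
    needsFastestInference: false,
    needsMaxRoutingRedundancy: true,
    expectedContextTokens: 200_000,
  },
  128_000, // placeholder for base K2's context window
)
console.log(pick) // Kimi K2 0905
```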

Conclusion

Kimi K2 0905 delivers September 5, 2025 training refinements for agentic coding alongside a context window of 256K tokens that accommodates the long histories of extended coding agent sessions. For teams running base K2 in agentic coding workflows, it's the checkpoint update with the larger context window. Switch by changing the model string to moonshotai/kimi-k2-0905 with no other integration changes.

FAQ

What was the focus of the 0905 checkpoint update?
Agentic coding. The checkpoint refines multi-step development tasks, tool use in coding workflows, and sustained context across long coding sessions.

Why does the context window of 256K tokens matter for agentic coding specifically?
Coding agents accumulate context rapidly: file contents, function signatures, test outputs, error logs, and multi-turn reasoning traces all consume tokens. A window of 256K tokens keeps a much larger project scope in context at once, which reduces the need for truncation workarounds.

How does switching from base K2 to kimi-k2-0905 work?
Update the model string in your API call to moonshotai/kimi-k2-0905. Authentication, tool-calling format, and the rest of the integration stay the same.

What providers serve this checkpoint through AI Gateway?
AI Gateway routes Kimi K2 0905 through fireworks, the only provider listed for this checkpoint. Where a model has multiple providers, failover between them is automatic.

Is the 0905 checkpoint open-weight?
Yes, in the same lineage as other open-weight K2-family models. Check Moonshot AI's Hugging Face repository for license terms specific to this checkpoint.

Does kimi-k2-0905 support tool calling?
Yes. Tool calling through the standard function-calling interface matches the agentic coding focus of the 0905 training refinements.

What if the context of 256K tokens is more than my tasks need?
If context length isn't a constraint, the base Kimi K2 routes across a wider provider set and may give more availability headroom for high-uptime production use.