MoonshotAI: Kimi K2.5

Moonshot AI · Budget · Context 262K

moonshotai/kimi-k2.5

Data as of:

LLM API list prices change frequently (new models and price cuts are common) and vary by tier, region, batch / cache usage and time. These are list prices captured at the time shown; always verify the current price with the provider before relying on it.

Price summary

Input $/1M $0.4

per 1M input tokens

Output $/1M $1.9

per 1M output tokens

Blended $/1M $0.775

0.75×input + 0.25×output (factual)

Cache read $/1M $0.09

per 1M cached-input tokens

Blended $/1M is a published convenience figure: 0.75 × input + 0.25 × output (a stated 3:1 input:output mix). It is descriptive arithmetic, not a value verdict.
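The blended figure can be reproduced with a few lines of arithmetic. The sketch below assumes only the list prices and the 3:1 input:output weighting stated above; the function name `blended_price` and its `input_share` parameter are illustrative, not part of any provider API.

```python
def blended_price(input_price: float, output_price: float,
                  input_share: float = 0.75) -> float:
    """Weighted average of input and output $/1M prices.

    input_share is the fraction of tokens assumed to be input
    (0.75 corresponds to the stated 3:1 input:output mix).
    """
    return input_share * input_price + (1.0 - input_share) * output_price


# List prices from the price summary (USD per 1M tokens).
INPUT_PRICE = 0.40
OUTPUT_PRICE = 1.90

blended = blended_price(INPUT_PRICE, OUTPUT_PRICE)
print(f"Blended $/1M: ${blended:.3f}")
```

Changing `input_share` shows how sensitive the blended figure is to the assumed mix; an output-heavy workload pulls it toward the output price.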

Specifications

Model            MoonshotAI: Kimi K2.5
Provider         Moonshot AI
Input $/1M       $0.4
Output $/1M      $1.9
In+Out $/1M      $2.3
Context          262K tokens
Max output       262K tokens
Cache read $/1M  $0.09
Modalities       text, image → text
Cross-checked    Differs

Capability

Capability score
MMLU-PRO
GPQA

Capability values are the per-model scores published by the Open LLM Leaderboard (Hugging Face), shown unedited and without any “best” verdict. The leaderboard evaluates open-weight models only and lags the newest releases, so many models (including closed/proprietary APIs) have no value and show “—”. Different benchmarks rank models differently; treat this as one signal among many. As of 2026-05-25. Source: Open LLM Leaderboard (Hugging Face) (Apache-2.0).

Try it / official references

External links open the provider's own pages; list prices and availability there are authoritative.

Estimated cost per use case

Use case          Input tokens  Output tokens  Cost (per 1,000 requests)
Chat / assistant  1,000         500            $1.35
RAG / Q&A         8,000         800            $4.72
Coding agent      6,000         2,000          $6.2
Summarization     12,000        600            $5.94

Each row is (input_tokens/1M)×input_price + (output_tokens/1M)×output_price, scaled to 1,000 requests. Assumptions are as shown in the table. Not a recommendation.
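The per-row formula can be checked with a short script. This is a minimal sketch that assumes the list prices and token counts shown on this page; the function name `cost_per_requests` and the `USE_CASES` mapping are illustrative, and cache-read pricing is deliberately ignored because the table does not use it.

```python
# List prices from this page (USD per 1M tokens); verify before relying on them.
INPUT_PRICE = 0.40
OUTPUT_PRICE = 1.90


def cost_per_requests(input_tokens: int, output_tokens: int,
                      requests: int = 1000,
                      input_price: float = INPUT_PRICE,
                      output_price: float = OUTPUT_PRICE) -> float:
    """Cost in USD for `requests` calls with the given per-request token counts."""
    per_request = (input_tokens / 1_000_000) * input_price \
        + (output_tokens / 1_000_000) * output_price
    return per_request * requests


# (input tokens, output tokens) per request, as assumed in the table.
USE_CASES = {
    "Chat / assistant": (1_000, 500),
    "RAG / Q&A": (8_000, 800),
    "Coding agent": (6_000, 2_000),
    "Summarization": (12_000, 600),
}

for name, (tokens_in, tokens_out) in USE_CASES.items():
    print(f"{name}: ${cost_per_requests(tokens_in, tokens_out):.2f}")
```

Substituting your own token counts or a different `requests` value gives an estimate for other workloads under the same list prices.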