Qwen2.5 72B Instruct
Qwen (Alibaba) · Budget · Context 131K
qwen/qwen-2.5-72b-instruct
Data as of:
LLM API list prices change frequently (new models and price cuts are common) and vary by tier, region, batch / cache usage and time. These are list prices captured at the time shown; always verify the current price with the provider before relying on it.
Price summary
per 1M input tokens
per 1M output tokens
0.75×input + 0.25×output (factual)
Blended $/1M is a published convenience figure: 0.75 × input + 0.25 × output (a stated 3:1 input:output mix). It is descriptive arithmetic, not a value verdict.
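As a minimal sketch of that arithmetic, the blended figure can be reproduced as below. The function name `blended_price` and the example prices are illustrative assumptions, not values taken from this page.

```python
def blended_price(input_price: float, output_price: float) -> float:
    """Return the blended $/1M tokens for a 3:1 input:output mix.

    Both prices are expected in dollars per 1M tokens.
    """
    return 0.75 * input_price + 0.25 * output_price


# Hypothetical list prices in $/1M tokens (placeholders, not this model's prices).
example = blended_price(input_price=1.00, output_price=2.00)
print(f"${example:.2f} per 1M blended tokens")
```

The weights 0.75 and 0.25 encode the stated 3:1 mix; a workload with a different input:output ratio would need different weights.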
Specifications
Capability
Capability values are the per-model scores published by the Open LLM Leaderboard (Hugging Face), shown unedited and with no “best” verdict. The leaderboard evaluates open-weight models only and lags the newest releases, so many models (including closed/proprietary APIs) have no value and show “—”. Different benchmarks rank models differently; treat this as one signal among many. As of 2026-05-25. Source: Open LLM Leaderboard (Hugging Face), Apache-2.0.
Official benchmark (maker-published)
These are the model maker's own published benchmark scores, reproduced as-is with the publisher source and an as-of date; they are not a Quanteta score and not a recommendation. They are raw percentages on the named benchmark and are NOT on the same scale as the open-weight leaderboard scores above; do not compare the two directly. The exact evaluation setting (e.g. 5-shot vs 0-shot chain-of-thought) is shown per value because it changes the number; only same-setting values are plotted together. Source: Qwen2.5 Technical Report (arXiv:2412.15115) / Qwen2.5-LLM blog: Qwen2.5-72B MMLU 86.1 (5-shot) (as of 2024-12-20).
Try it / official references
- OpenRouter model page (specs + try-it chat)
- Provider API documentation — Qwen (Alibaba)
- Provider playground — Qwen (Alibaba)
External links open the provider's own pages; list prices and availability there are authoritative.
Estimated cost per use case
| Use case | Input tokens (per request) | Output tokens (per request) | Cost (per 1,000 requests) |
|---|---|---|---|
| Chat / assistant | 1,000 | 500 | $0.56 |
| RAG / Q&A | 8,000 | 800 | $3.20 |
| Coding agent | 6,000 | 2,000 | $2.96 |
| Summarization | 12,000 | 600 | $4.56 |
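Each row is the per-request cost, (input_tokens / 1M) × input_price + (output_tokens / 1M) × output_price, multiplied by 1,000 requests. Prices are in dollars per 1M tokens, and the per-request token counts are the assumptions shown in the table. Not a recommendation.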
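The calculation can be sketched as follows. The function name `cost_per_1000_requests` is hypothetical, and the prices used ($0.36 input and $0.40 output per 1M tokens) are back-solved from the table rows rather than stated on this page, so treat them as an assumption and verify them with the provider.

```python
def cost_per_1000_requests(
    input_tokens: int,
    output_tokens: int,
    input_price: float,
    output_price: float,
) -> float:
    """Cost in dollars for 1,000 requests; prices are in $/1M tokens."""
    per_request = (
        (input_tokens / 1_000_000) * input_price
        + (output_tokens / 1_000_000) * output_price
    )
    return per_request * 1_000


# ASSUMPTION: prices inferred from the table rows, not quoted by the source.
INPUT_PRICE = 0.36   # $/1M input tokens
OUTPUT_PRICE = 0.40  # $/1M output tokens

# Per-request token counts, copied from the table.
USE_CASES = {
    "Chat / assistant": (1_000, 500),
    "RAG / Q&A": (8_000, 800),
    "Coding agent": (6_000, 2_000),
    "Summarization": (12_000, 600),
}

for name, (input_tokens, output_tokens) in USE_CASES.items():
    cost = cost_per_1000_requests(
        input_tokens, output_tokens, INPUT_PRICE, OUTPUT_PRICE
    )
    print(f"{name}: ${cost:.2f}")
```

With those inferred prices the sketch reproduces the four table values to the cent. Substituting your own token counts or current list prices gives an estimate for a different workload.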