Cumulus Labs
LLMs

Qwen 2.5 7B

+8.2%

~$QW25-7

Fast 7B-parameter LLM with strong instruction following. Optimized for high-throughput workloads where latency matters, such as real-time agentic pipelines.

35.0k stars · 9.1M HF downloads
#11 overall · 4 deploys
Cold Start: 1.2s (avg on Ion)
Avg Inference: 186ms (per request)
Active Replicas: 4 (right now)
Total Deployments: 4 (all time)
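
The two latency figures above can be combined into a rough per-request estimate. The sketch below assumes that a request hitting a scaled-to-zero deployment pays the average cold start plus the average inference time, and that a warm request pays only the inference time. The page does not state how the two figures combine, so treat the additive model and the helper name `estimateLatencyMs` as illustrative assumptions.

```javascript
// Figures copied from the stats above, in milliseconds.
const COLD_START_MS = 1200;   // "1.2s Cold Start, avg on Ion"
const AVG_INFERENCE_MS = 186; // "186ms Avg Inference, per request"

// Hypothetical helper (not part of cumulus-sdk): estimates latency for one
// request, assuming cold start and inference time simply add up.
function estimateLatencyMs(isCold) {
  return (isCold ? COLD_START_MS : 0) + AVG_INFERENCE_MS;
}

console.log(estimateLatencyMs(true));  // 1386 (first request after idle)
console.log(estimateLatencyMs(false)); // 186 (replica already warm)
```

Under these assumptions, the cold-start penalty dominates the first request, which matters mainly for workloads that frequently scale to zero.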

7-Day Trend

+8.2% this week
[Chart: daily trend, Mon through Sun]

Cold Start vs Competitors

Cold start comparison vs similar models. Lower is better.

~$QW25-7 (this model): 1.2s
~$QW3-8: 1.2s
~$QW30: 1.7s
~$QW32: 1.8s

Cost Estimate

Cost per 1K Inferences: $0.40
Est. Daily (1K req/day): $0.40
Est. Monthly (30K req): $12.00

No subscriptions. Buy credits, pay per inference. Scale to zero when idle.
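
The daily and monthly figures above follow from the per-1K rate by simple proportion. A minimal sketch of that arithmetic, assuming strictly linear pay-per-inference pricing with no minimums or volume discounts (the page does not say either way); the helper names `estimateCostCents` and `formatUsd` are hypothetical and not part of cumulus-sdk:

```javascript
// Rate from the page: $0.40 per 1,000 inferences, kept as integer cents
// so the arithmetic avoids floating-point rounding drift.
const CENTS_PER_1K_INFERENCES = 40;

// Hypothetical helper: estimated cost in cents for a given request count,
// assuming cost scales linearly with the number of inferences.
function estimateCostCents(requests) {
  return (requests * CENTS_PER_1K_INFERENCES) / 1000;
}

// Formats a cent amount as a dollar string with two decimals.
function formatUsd(cents) {
  return `$${(cents / 100).toFixed(2)}`;
}

console.log(formatUsd(estimateCostCents(1000)));      // $0.40  (1K req/day)
console.log(formatUsd(estimateCostCents(30 * 1000))); // $12.00 (30K req/month)
```

To estimate a different workload, pass its request count to `estimateCostCents`, for example `estimateCostCents(250000)` for 250K requests.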

Quick Deploy

cumulus-sdk
import cumulus from "cumulus-sdk"

// Deploy Qwen 2.5 7B on Ion
const client = await cumulus.deploy("qwen2-5-7b")

// Run inference
const result = await client.run({
  prompt: "Your prompt here",
  // model-specific params...
})

More in LLMs

Qwen 3 8B · ~$QW3-8 · 1.2s · +22.4%
Qwen 3 30B A3B · ~$QW30 · 1.7s · +16.4%
Qwen 2.5 32B · ~$QW32 · 1.8s · +11.3%
Qwen 2.5 14B · ~$QW14 · 1.4s · +9.8%

© 2026 Cumulus Compute Labs Corporation