Qwen 2.5 VL 7B
Efficient vision-language model at 7B scale. Handles images, documents, and video frames with strong multimodal reasoning. Runs on all 4 universal-a nodes.
Cold-start time comparison vs. similar models (lower is better).
No subscriptions. Buy credits, pay per inference. Scale to zero when idle.
```typescript
import cumulus from "cumulus-sdk"

// Deploy Qwen 2.5 VL 7B on Ion
const client = await cumulus.deploy("qwen2-5-vl-7b")

// Run inference
const result = await client.run({
  prompt: "Your prompt here",
  // model-specific params...
})
```
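For a vision-language model, the model-specific parameters would typically pair the text prompt with image input. A minimal sketch of assembling such a payload before passing it to `run` — the field names (`images`, `max_tokens`) and their shapes are assumptions for illustration, not a documented cumulus-sdk schema:

```typescript
// Hypothetical request shape for a multimodal prompt.
// `images` and `max_tokens` are assumed field names -- check the
// cumulus-sdk docs for the actual parameters Qwen 2.5 VL accepts.
interface VLRequest {
  prompt: string
  images: string[]    // image URLs or base64-encoded image data
  max_tokens?: number // cap on generated tokens
}

function buildRequest(prompt: string, imageUrls: string[]): VLRequest {
  // A vision task needs at least one image alongside the prompt.
  if (imageUrls.length === 0) {
    throw new Error("expected at least one image for a vision-language request")
  }
  return { prompt, images: imageUrls, max_tokens: 512 }
}

const req = buildRequest("Summarize this document.", [
  "https://example.com/invoice.png",
])
// req can then be passed to client.run(req)
```

Because inference is billed per call, validating the payload client-side (as above) avoids paying for requests the model would reject.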