~$QW14
Balanced 14B LLM with strong coding and reasoning performance. Fast enough for interactive applications, capable enough for complex tasks.
Cold-start comparison vs. similar models. Lower is better.
No subscriptions. Buy credits, pay per inference, and scale to zero when idle.
import cumulus from "cumulus-sdk"

// Deploy Qwen 2.5 14B on Ion
const client = await cumulus.deploy("qwen2-5-14b")

// Run inference
const result = await client.run({
  prompt: "Your prompt here",
  // model-specific params...
})