~$ QW32
High-capacity 32B-parameter language model. Best-in-class for complex instruction following, long-context tasks, and enterprise workflows. Runs on all 5 universal-b nodes.
[Chart] Cold start comparison vs. similar models (lower is better).
No subscriptions. Buy credits, pay per inference. Scale to zero when idle.
```javascript
import cumulus from "cumulus-sdk"

// Deploy Qwen 2.5 32B on Ion
const client = await cumulus.deploy("qwen2-5-32b")

// Run inference
const result = await client.run({
  prompt: "Your prompt here",
  // model-specific params...
})
```