~$QW30
A mixture-of-experts model with 30B total parameters but only 3B active per token. GPT-4o-level quality at a fraction of the inference cost. Runs on all 5 universal-b nodes.
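To make the "3B active out of 30B" idea concrete, here is a minimal sketch of top-k expert routing, the mechanism behind mixture-of-experts models. This is illustrative only, not Qwen3's actual implementation; the names `moeForward` and `Expert` are made up for this example. A router scores every expert per token, but only the k highest-scoring experts run, so the active parameter count stays a small fraction of the total.

```typescript
// Each expert is just a function over the token's hidden vector.
type Expert = (x: number[]) => number[]

// Hypothetical MoE forward pass: route to the top-k experts only.
function moeForward(
  x: number[],
  experts: Expert[],
  router: (x: number[]) => number[],
  k: number
): number[] {
  const scores = router(x)
  // Pick the indices of the k highest-scoring experts.
  const topK = scores
    .map((s, i) => [s, i] as const)
    .sort((a, b) => b[0] - a[0])
    .slice(0, k)
  // Softmax over the selected scores only.
  const maxS = Math.max(...topK.map(([s]) => s))
  const exps = topK.map(([s]) => Math.exp(s - maxS))
  const z = exps.reduce((a, b) => a + b, 0)
  // Weighted sum of the k active experts' outputs; the other
  // experts' parameters are never touched for this token.
  const out = new Array(x.length).fill(0)
  topK.forEach(([, i], j) => {
    const y = experts[i](x)
    const w = exps[j] / z
    for (let d = 0; d < out.length; d++) out[d] += w * y[d]
  })
  return out
}

// Toy demo: 4 experts, each scaling the input by a constant.
const experts: Expert[] = [2, 3, 5, 7].map(c => (x: number[]) => x.map(v => v * c))
const router = (_x: number[]) => [0.1, 9.0, 0.2, 9.0] // favors experts 1 and 3
const y = moeForward([1, 1], experts, router, 2)
// Only experts 1 and 3 executed; with equal scores they get weight
// 0.5 each, so y = [0.5*3 + 0.5*7, ...] = [5, 5]
```

With 2 of 4 experts active per token, only half the expert parameters do work on any given input; Qwen3-30B-A3B applies the same idea at scale, activating roughly a tenth of its weights per token.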
Cold-start time comparison vs. similar models (lower is better).
No subscriptions. Buy credits, pay per inference. Scale to zero when idle.
import cumulus from "cumulus-sdk"

// Deploy Qwen 3 30B A3B on Ion
const client = await cumulus.deploy("qwen3-30b-a3b")

// Run inference
const result = await client.run({
  prompt: "Your prompt here",
  // model-specific params...
})