~$QW3-14
A 14B-parameter reasoning model with an extended thinking mode. Its hybrid reasoning approach lets you trade compute for accuracy on demand. Runs on all four universal-a nodes.
Cold-start times compared against similar models. Lower is better.
No subscriptions. Buy credits, pay per inference. Scale to zero when idle.
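Pay-per-inference billing comes down to simple arithmetic: credits consumed scale with tokens processed, and idle time costs nothing. A minimal sketch, where the per-token rates and token counts are illustrative assumptions, not actual Cumulus pricing:

```javascript
// Hypothetical per-1K-token rates, in credits (not real Cumulus pricing).
const CREDITS_PER_1K_INPUT = 0.5;
const CREDITS_PER_1K_OUTPUT = 1.5;

// Cost of a single inference request under pay-per-inference billing.
function inferenceCost(inputTokens, outputTokens) {
  return (inputTokens / 1000) * CREDITS_PER_1K_INPUT
       + (outputTokens / 1000) * CREDITS_PER_1K_OUTPUT;
}

// One request: 2,000 input tokens, 500 output tokens.
console.log(inferenceCost(2000, 500)); // 1.75 credits

// Scale to zero: no requests while idle means zero spend.
console.log(inferenceCost(0, 0)); // 0 credits
```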
import cumulus from "cumulus-sdk"

// Deploy Qwen 3 14B on Ion
const client = await cumulus.deploy("qwen3-14b")

// Run inference
const result = await client.run({
  prompt: "Your prompt here",
  // model-specific params...
})