Text · Featured
NVIDIA: Nemotron 3 Ultra
NVIDIA Nemotron 3 Ultra is a 550B-parameter (55B active) open reasoning model built for long-running autonomous agents that handle orchestration and complex tasks across coding, deep research, and enterprise workflows. Its hybrid Mamba-Transformer MoE architecture combines three features: Latent MoE, which calls 4 experts at the inference cost of 1; Multi-Token Prediction, which reduces generation time on long sequences; and Token Budget support, which targets the best accuracy with the fewest reasoning tokens. The model supports a 1M-token context window (this listing specifies 262K) and is fully open under the NVIDIA Open Model License, with open weights, training data, and recipes.
Reasoning · Tools · Cache
Context 262K · Max output 65K
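As a rough illustration of how the two limits interact, the sketch below computes the largest output budget available for a given prompt size. It assumes that prompt and output tokens share the context window and that "262K" and "65K" mean 262,000 and 65,000 tokens; neither is confirmed by this listing, and `maxOutputFor` is a hypothetical helper, not part of any SDK.

```typescript
// Limits as listed; reading "262K" and "65K" as round thousands is an assumption.
const CONTEXT_WINDOW = 262_000;
const MAX_OUTPUT = 65_000;

// Hypothetical helper: largest output budget that still fits alongside the prompt,
// assuming prompt and output tokens share the same context window.
function maxOutputFor(promptTokens: number): number {
  const remaining = CONTEXT_WINDOW - promptTokens;
  return Math.max(0, Math.min(MAX_OUTPUT, remaining));
}

console.log(maxOutputFor(100_000)); // 65000 (capped by the max output limit)
console.log(maxOutputFor(250_000)); // 12000 (capped by the remaining context)
```

A check like this can run before a request is sent, so that very long prompts are trimmed or rejected early instead of failing at the API.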
Quickstart
// AI SDK with the OpenAI-compatible provider
import { streamText } from 'ai';
import { createOpenAICompatible } from '@ai-sdk/openai-compatible';

// The API key is read from the BLACKBOX_API_KEY environment variable
const blackbox = createOpenAICompatible({
  name: 'blackbox',
  apiKey: process.env.BLACKBOX_API_KEY!,
  baseURL: 'https://api.blackbox.ai/v1',
});

const result = streamText({
  model: blackbox('blackboxai/nvidia/nemotron-3-ultra'),
  prompt: 'Why is the sky blue?',
});

// Print the response as it streams in
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}

Specs & pricing
- Input $/M: $0.37
- Output $/M: $1.08
- Context window: 262K
- Max output: 65K
- Supported parameters: tool-use, reasoning, streaming
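To make the per-million-token rates concrete, here is a minimal sketch of a per-request cost estimate. `estimateCostUSD` is a hypothetical helper, not part of any SDK. It uses only the input and output rates listed above and ignores cached-token discounts and any other billing rules, which this listing does not describe.

```typescript
// Listed rates for this model, in USD per one million tokens.
const INPUT_USD_PER_M = 0.37;
const OUTPUT_USD_PER_M = 1.08;

// Hypothetical helper: estimated cost of a single request in USD.
function estimateCostUSD(inputTokens: number, outputTokens: number): number {
  const inputCost = (inputTokens / 1_000_000) * INPUT_USD_PER_M;
  const outputCost = (outputTokens / 1_000_000) * OUTPUT_USD_PER_M;
  return inputCost + outputCost;
}

// Example: a 200,000-token prompt with a 10,000-token response.
const cost = estimateCostUSD(200_000, 10_000);
console.log(cost.toFixed(4)); // "0.0848"
```

Output tokens cost roughly three times as much as input tokens at these rates, so long responses can dominate the bill even when the prompt is large.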
Related models
| Model | Type | Context | Input $/M |
|---|---|---|---|
| Nemotron 3 Nano 30B | Text | 262.1K | $0.05 |
| Nemotron Nano 12B VL | Text | 128K | $0.20 |
| Nemotron 3 Ultra 550B | Text | 1M | $0.37 |
| Blackbox Pro | Text | 400K | $1.75 |
| Claude Opus 4.7 | Text | 1M | $5 |