Pricing

Start free. Pay only for what you use.

More than 110 open-source models cost $0 to call, with no credit card required to start. Premium providers (OpenAI, Anthropic, Google) bill at each provider's published rate, with zero markup.

Free

$0/month

Evaluate and prototype with open-source models.

  • 200 free requests (no credit card required)
  • 110+ open-source models on NVIDIA NIM
  • Llama 3.1 70B, Mixtral, Nemotron, Codestral, Gemma 4
  • OpenAI-compatible and Anthropic-compatible endpoints
  • Automatic failover across providers
Get free key

Pay-as-you-go · Most popular

$0 + usage

Add a card to keep using the free models past the 200-request trial and to unlock premium providers.

  • All free-tier models — still $0
  • OpenAI, Anthropic, Google, Replicate, Runway
  • Charged at the provider's published per-token rate
  • Zero markup — you pay exactly what providers charge
  • $10/month default spending cap (configurable)
Add a card

Pro

$29/month

For developers shipping production AI features.

  • 2M tokens included
  • All 190+ models across every provider
  • Image generation (Flux, DALL·E, Imagen)
  • Video generation (Kling, Runway, Veo)
  • Priority routing
Upgrade to Pro

Enterprise

Custom

Custom token allowance, dedicated support, and compliance.

  • Custom token allowance
  • Custom model hosting
  • Dedicated account team
  • Priority support SLA
  • Custom DPA and compliance
Contact sales

Token prices

InferAll charges the provider's published rate with zero markup. Prices shown are per 1M tokens (input / output) as of June 2026; see each provider's website for the current rate.

Model                       Price per 1M tokens (input / output)
NVIDIA NIM (110+ models)    Free, no cap after card on file
OpenAI GPT-4o               $2.50 / $10.00
OpenAI GPT-4o-mini          $0.15 / $0.60
Anthropic Claude Sonnet 4   $3.00 / $15.00
Google Gemini 2.5 Flash     $0.15 / $0.60
Google Gemini 2.5 Pro       $1.25 / $10.00

See inferall.ai/live for the full model list and pricing.
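
To estimate a bill from the listed rates, multiply your input and output token counts by the per-1M prices. The sketch below is illustrative only: the model keys and the `estimate_cost` helper are hypothetical names, not InferAll API identifiers, and the rates are simply the June 2026 figures listed above.

```python
from decimal import Decimal

# (input, output) USD price per 1M tokens, copied from the list above.
# The dictionary keys are hypothetical labels chosen for this example.
PRICES_PER_1M = {
    "gpt-4o": (Decimal("2.50"), Decimal("10.00")),
    "gpt-4o-mini": (Decimal("0.15"), Decimal("0.60")),
    "claude-sonnet-4": (Decimal("3.00"), Decimal("15.00")),
    "gemini-2.5-flash": (Decimal("0.15"), Decimal("0.60")),
    "gemini-2.5-pro": (Decimal("1.25"), Decimal("10.00")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload at the listed per-1M-token rates."""
    in_rate, out_rate = PRICES_PER_1M[model]
    total = in_rate * input_tokens + out_rate * output_tokens
    return total / Decimal(1_000_000)


# 2M input tokens and 500K output tokens on GPT-4o-mini:
# 2 x $0.15 + 0.5 x $0.60 = $0.60
print(estimate_cost("gpt-4o-mini", 2_000_000, 500_000))
```

Decimal arithmetic is used here so that small per-token amounts do not pick up floating-point rounding error.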

Common questions

Is the free tier actually free?

Yes. The 110+ open-source models on NVIDIA NIM are $0 to call. No credit card required to start — you get 200 free requests to evaluate. Add a card to continue past the trial and unlock paid providers; the free models stay $0.

What does 'zero markup' mean?

When you use a premium provider (OpenAI, Anthropic, Google), you pay exactly what that provider charges per token. InferAll does not add a markup. For example, GPT-4o-mini input tokens listed by OpenAI at $0.15 per 1M cost $0.15 per 1M through InferAll.

How does the $10/month spending cap work?

By default, usage on premium providers is capped at $10/month to prevent unexpected bills. You can raise or remove this cap in your billing settings. The cap does not apply to free NVIDIA NIM models.
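
As a rough mental model, the cap behaves like the check below. This is a sketch under assumptions, not InferAll's actual billing logic: the `within_cap` function is hypothetical, and whether a request is blocked before or after the cap is crossed is assumed here rather than documented.

```python
from decimal import Decimal
from typing import Optional

DEFAULT_CAP_USD = Decimal("10.00")  # default monthly cap described above


def within_cap(
    spent_this_month: Decimal,
    next_charge: Decimal,
    cap: Optional[Decimal] = DEFAULT_CAP_USD,
    free_model: bool = False,
) -> bool:
    """Return True if a call fits under the monthly cap (assumed behavior)."""
    if free_model:
        return True  # the cap does not apply to free NVIDIA NIM models
    if cap is None:
        return True  # cap removed in billing settings
    # Assumption: a call is allowed only if it keeps the total at or under the cap.
    return spent_this_month + next_charge <= cap


print(within_cap(Decimal("9.50"), Decimal("0.40")))  # fits under $10
print(within_cap(Decimal("9.50"), Decimal("0.60")))  # would exceed $10
```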

What's the difference between the free tier and pay-as-you-go?

Both give you the same 110+ free NVIDIA NIM models at $0. Pay-as-you-go additionally unlocks premium providers (OpenAI, Anthropic, Google) billed per token. The only requirement is a card on file.

Get your free API key — no credit card required