Anthropic's Claude 4 family is available through InferAll, using the same Anthropic SDK you already have and a single InferAll key, with no separate Anthropic API key to manage.

Current Claude models available:

| Model | Input | Output | Best for |
|---|---|---|---|
| `claude-opus-4-8` | $15.00/M | $75.00/M | Hardest reasoning, agentic tasks |
| `claude-opus-4-7` | $15.00/M | $75.00/M | Complex analysis, long documents |
| `claude-opus-4-6` | $15.00/M | $75.00/M | High-stakes generation |
| `claude-sonnet-4-6` | $3.00/M | $15.00/M | Most tasks; the daily driver |
| `claude-sonnet-4-5-20250929` | $3.00/M | $15.00/M | Sonnet 4.5 milestone |
| `claude-haiku-4-5-20251001` | $0.80/M | $4.00/M | Fast, cheap, high-volume |

Prices are per million tokens. All models are billed at Anthropic's published list rates, with zero markup.

---

### With the Anthropic SDK

```python
import anthropic

client = anthropic.Anthropic(
    api_key="ifu_your_key_here",  # get one at inferall.ai/keys
    base_url="https://api.inferall.ai",
)

# Opus: for your hardest tasks
analysis = client.messages.create(
    model="claude-opus-4-8",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Analyze this contract for risk clauses: ..."}],
)
print(analysis.content[0].text)

# Sonnet 4.6: the everyday workhorse
message = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=512,
    messages=[{"role": "user", "content": "Write unit tests for this function: ..."}],
)
print(message.content[0].text)
```

The only changes from calling Anthropic directly: set `base_url="https://api.inferall.ai"` and pass your InferAll key as `api_key`. Your existing client code, tool use, streaming, and system prompts all work unchanged.

---

### TypeScript

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic({
  apiKey: process.env.INFERALL_API_KEY,
  baseURL: "https://api.inferall.ai",
});

const message = await client.messages.create({
  model: "claude-sonnet-4-6",
  max_tokens: 256,
  messages: [{ role: "user", content: "Explain this stack trace in plain English." }],
});

// Content blocks are a union type, so narrow to a text block before reading .text
const block = message.content[0];
if (block.type === "text") {
  console.log(block.text);
}
```

---

### Claude Code through one InferAll key

```bash
export ANTHROPIC_BASE_URL=https://api.inferall.ai
export ANTHROPIC_API_KEY=ifu_your_key_here
claude  # Claude Code now routes through InferAll
```

By default, Claude Code requests bill against your starter balance at Anthropic's published per-token rate (zero markup). You can also route specific calls to a cheaper open model when the task allows:

```bash
# In Claude Code, set model to a cheap NIM open model for high-volume turns:
# anthropic/claude-sonnet-4-6 for real Claude, meta/llama-3.1-70b-instruct for open
```

---

### OpenAI-compatible endpoint

If your stack uses the OpenAI SDK format, Claude models are reachable there too:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inferall.ai/v1",
    api_key="ifu_your_key_here",
)

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Refactor this code for clarity."}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```

---

### Why one key for Claude + everything else

The same `ifu_...` key that calls Claude also routes to GPT-4.1, Gemini 2.5 Flash, and 118+ open NVIDIA NIM models. When you want to benchmark Claude Sonnet against GPT-4.1-mini on a task, it's a single string change, with no credential juggling.

Sign-up funds a key with a $5 starter pack: usage credit you can spend on any model (Claude, GPT, Gemini, or NIM open models) at the provider's published rate with zero markup.

Get your key at [inferall.ai/keys](https://inferall.ai/keys).
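
To get a rough feel for how far the $5 starter pack goes, the per-token rates in the pricing table above can be turned into a per-request estimate. The sketch below is illustrative only: the rates are copied from that table and may change, the helper names `estimate_cost` and `requests_per_budget` are made up for this example, and the 2,000 input / 500 output token request size is a hypothetical assumption, not a measured average.

```python
# Rates in USD per million tokens (input, output), copied from the pricing
# table in this post. Treat them as a snapshot; check current rates before relying on them.
RATES_PER_MILLION = {
    "claude-opus-4-8": (15.00, 75.00),
    "claude-sonnet-4-6": (3.00, 15.00),
    "claude-haiku-4-5-20251001": (0.80, 4.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed per-token rates."""
    input_rate, output_rate = RATES_PER_MILLION[model]
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


def requests_per_budget(
    model: str, input_tokens: int, output_tokens: int, budget_usd: float = 5.00
) -> int:
    """Estimate how many identical requests fit inside a budget (whole requests only)."""
    return int(budget_usd // estimate_cost(model, input_tokens, output_tokens))


if __name__ == "__main__":
    # Hypothetical request size: 2,000 input tokens and 500 output tokens.
    for model in RATES_PER_MILLION:
        cost = estimate_cost(model, 2_000, 500)
        count = requests_per_budget(model, 2_000, 500)
        print(f"{model}: ${cost:.4f} per request, about {count} requests for $5")
```

Under those assumptions, the same $5 covers roughly 74 Opus requests, 370 Sonnet requests, or 1,388 Haiku requests, which is one way to decide which model a high-volume task should default to.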
