VS Code extension

Cline with free inference built in. Zero setup.

InferAll for VS Code is a Cline-based agent with InferAll's Anthropic-compatible gateway wired in. Sign up, activate with the $5 starter pack (open NIM models bill at $0 input/$0 output), and the extension handles the rest — no providers to configure, no separate bills.

Install

# 1. Install from the VS Code Marketplace (listing coming soon).
# 2. Open the InferAll sidebar in VS Code.
# 3. Sign in: 100 chat and 50 text requests per day on 118+ open NIM models, at $0.

Marketplace listing (coming soon)
Direct .vsix download

Why use it

Open NIM models at $0 input/$0 output

Sign up at InferAll, activate with the $5 starter pack, and start coding. NVIDIA NIM open models (Llama 3.1 70B, Mixtral, Nemotron, and more) bill at $0 input/$0 output against your starter balance, subject to your tier's daily request quota. That is enough to evaluate the workflow end-to-end on $5.

Anthropic-compatible

The InferAll gateway speaks Anthropic's API format. The extension slots into Claude Code-style flows and any tooling chain that already accepts Anthropic-format endpoints.
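
As a rough illustration of what "Anthropic-compatible" implies, the TypeScript sketch below builds a request in Anthropic's Messages format and points it at the gateway host. The header names and body fields follow Anthropic's public API. The `/v1/messages` path on api.inferall.ai, the use of `x-api-key` for the InferAll key, the model id `example-open-model`, and the helper name `buildMessagesRequest` are assumptions made for illustration, not documented InferAll values.

```typescript
// Sketch only: builds (but does not send) an Anthropic Messages-format request.
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface GatewayRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Host taken from this page; the path is Anthropic's and is ASSUMED to apply here.
const GATEWAY_BASE_URL = "https://api.inferall.ai";
const MESSAGES_PATH = "/v1/messages";

function buildMessagesRequest(
  apiKey: string,
  model: string,
  messages: Message[],
  maxTokens: number = 1024,
): GatewayRequest {
  return {
    url: GATEWAY_BASE_URL + MESSAGES_PATH,
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey, // ASSUMPTION: the gateway accepts the key in this header
      "anthropic-version": "2023-06-01",
    },
    // Anthropic's Messages format requires model, max_tokens, and messages.
    body: JSON.stringify({ model, max_tokens: maxTokens, messages }),
  };
}

// Hypothetical key and model id, for illustration only.
const exampleRequest = buildMessagesRequest(
  "YOUR_INFERALL_KEY",
  "example-open-model",
  [{ role: "user", content: "Explain this stack trace." }],
);
```

The resulting object can be handed to any HTTP client, for example `fetch(exampleRequest.url, { method: exampleRequest.method, headers: exampleRequest.headers, body: exampleRequest.body })`. Tooling that already emits this request shape should only need its base URL changed.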

Single key, single bill

One vendor relationship for procurement, one billing surface. Every gateway call hits api.inferall.ai — one endpoint to log, govern, and reason about.

Automatic provider failover

When a provider rate-limits or has an incident, the gateway transparently retries against another upstream that can serve the model class. Your editor session keeps moving.
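
The general idea of failover can be sketched in a few lines of TypeScript. This is a simplified, synchronous simulation of the concept and not InferAll's actual gateway logic: the `Upstream` type, the retry rule (HTTP 429 or any 5xx), and the upstream names are all assumptions chosen for illustration.

```typescript
// Sketch only: try upstreams in order until one returns a non-retryable status.
interface UpstreamResponse {
  status: number;
  body: string;
}

interface Upstream {
  name: string;
  send: (prompt: string) => UpstreamResponse;
}

// ASSUMPTION: rate limits (429) and server errors (5xx) trigger failover.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}

function sendWithFailover(
  upstreams: Upstream[],
  prompt: string,
): { servedBy: string; response: UpstreamResponse } {
  let lastStatus: number | undefined;
  for (const upstream of upstreams) {
    const response = upstream.send(prompt);
    if (!isRetryable(response.status)) {
      return { servedBy: upstream.name, response };
    }
    lastStatus = response.status; // remember why we moved on
  }
  throw new Error(`all upstreams failed (last status: ${lastStatus ?? "none"})`);
}

// Hypothetical upstreams: the first is rate-limited, the second succeeds.
const demoUpstreams: Upstream[] = [
  { name: "primary", send: () => ({ status: 429, body: "rate limited" }) },
  { name: "secondary", send: (prompt) => ({ status: 200, body: `echo: ${prompt}` }) },
];

const demoResult = sendWithFailover(demoUpstreams, "hello");
```

Here `demoResult.servedBy` is `"secondary"`: the caller sees a single successful response and never learns that the first upstream was rate-limited, which is the behavior the editor session relies on.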

Procurement

Designed for environments where AI procurement is hard

Many engineering teams work inside organizations where ad-hoc ChatGPT use is being restricted and where every new SaaS vendor triggers a security review. InferAll is intentionally positioned to fit that constraint: one vendor, one endpoint, one billing relationship, format-compatible with tooling that's already been approved.

We do not claim FedRAMP, SOC 2, HIPAA, FISMA, or any other certification — those require real audits, and we will only publish them when they exist. See /security for the current state.

Single-vendor relationship

One contract, one DPA, one endpoint — instead of separately evaluating Anthropic plus OpenAI plus Google plus the agent vendor.

Audit-log-ready

The extension scaffolds an inferall.auditLog.enabled VS Code setting for compliance buyers. (The local audit-log writer is on the near-term roadmap; the gateway already centralizes every call through one endpoint that can be logged today.)
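
For teams that want to check this setting from their own tooling, a minimal TypeScript sketch of reading it might look like the following. `ConfigReader` is a hypothetical stand-in that mirrors the `get(key, defaultValue)` shape of VS Code's `WorkspaceConfiguration`; inside an extension the reader would typically come from `vscode.workspace.getConfiguration("inferall")`. The default of `false` and the helper names are assumptions, not documented extension behavior.

```typescript
// Sketch only: read the inferall.auditLog.enabled flag through a minimal reader.
interface ConfigReader {
  get<T>(key: string, defaultValue: T): T;
}

// Key relative to the "inferall" configuration section.
const AUDIT_LOG_KEY = "auditLog.enabled";

function isAuditLogEnabled(config: ConfigReader): boolean {
  // ASSUMPTION: the setting defaults to off when it is not set.
  return config.get<boolean>(AUDIT_LOG_KEY, false);
}

// Hypothetical in-memory reader, standing in for VS Code's configuration object.
function readerFrom(values: Record<string, unknown>): ConfigReader {
  return {
    get<T>(key: string, defaultValue: T): T {
      return (key in values ? values[key] : defaultValue) as T;
    },
  };
}

const auditEnabled = isAuditLogEnabled(readerFrom({ "auditLog.enabled": true }));
const auditDefault = isAuditLogEnabled(readerFrom({}));
```

With these inputs, `auditEnabled` is `true` and `auditDefault` falls back to `false`.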

Anthropic-format means approved stacks work

If a team has already cleared Claude Code or Anthropic SDK-based tooling, InferAll drops into that same chain without onboarding a brand new vendor.

No fan-out to multiple vendors

Requests leave a single trust boundary. The gateway handles upstream provider selection so the client never talks directly to multiple AI vendors.

Screenshots

Screenshot — sidebar view (placeholder)

Screenshot — first-run sign-in (placeholder)

Already using Cline?

You can use upstream Cline + InferAll too

InferAll for VS Code is the canonical InferAll-branded experience — audit-log scaffolding, single-vendor positioning, no-key first run. If you already use upstream Cline, you can still route it through InferAll by setting the Anthropic base URL to https://api.inferall.ai in Cline's API settings. Both paths remain supported.
