Documentation

Quick start

npm install @kindlyrobotics/ai-gateway

# or with the InferAll SDK (coming soon)
# npm install @inferall/sdk

Base URL

https://api.inferall.ai

Every request requires an API key, sent in one of two headers: Authorization: Bearer kr_proj_... or x-api-key: kr_proj_...
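The two header options above can be sketched as a small helper. This is only an illustration: `AuthScheme` and `buildAuthHeaders` are hypothetical names and are not part of the published SDK, and the key value used in examples is a placeholder.

```typescript
// Hypothetical helper (not an SDK export): builds whichever of the two
// documented auth headers the caller prefers.
type AuthScheme = "bearer" | "x-api-key";

function buildAuthHeaders(
  apiKey: string,
  scheme: AuthScheme = "bearer",
): Record<string, string> {
  if (apiKey.trim() === "") {
    throw new Error("API key must not be empty");
  }
  if (scheme === "bearer") {
    // Authorization: Bearer kr_proj_...
    return { Authorization: `Bearer ${apiKey}` };
  }
  // x-api-key: kr_proj_...
  return { "x-api-key": apiKey };
}
```

Either result can be passed as the headers of any HTTP client; the docs do not state a preference between the two schemes.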

Endpoints

Method  Path                     Description
POST    /ai/v1/generate          Generate text, chat, images, or video
GET     /ai/v1/models            List all models with pricing
GET     /ai/v1/health            Health check
POST    /v1/messages             Anthropic-compatible (Claude Code)
POST    /ai/v1/keys              Create API key (requires JWT)
GET     /ai/v1/keys              List your keys (requires JWT)
GET     /ai/v1/usage             Usage summary (requires JWT)
POST    /ai/v1/billing/checkout  Stripe checkout session
GET     /ai/v1/billing/status    Billing status and spend
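For callers not using the SDK, a request to any of these endpoints might be assembled roughly as follows. The base URL and paths come from this page; `GatewayRequest` and `buildRequest` are hypothetical names, and the JSON body for /ai/v1/generate is an assumption that it mirrors the SDK's generate() options, which this page does not confirm.

```typescript
// Base URL and paths are taken from the docs; everything else is illustrative.
const BASE_URL = "https://api.inferall.ai";

interface GatewayRequest {
  url: string;
  method: "GET" | "POST";
  headers: Record<string, string>;
  body?: string;
}

// Hypothetical helper: assembles URL, auth header, and optional JSON body.
function buildRequest(
  method: "GET" | "POST",
  path: string,
  apiKey: string,
  payload?: unknown,
): GatewayRequest {
  const headers: Record<string, string> = { Authorization: `Bearer ${apiKey}` };
  const request: GatewayRequest = { url: `${BASE_URL}${path}`, method, headers };
  if (payload !== undefined) {
    headers["Content-Type"] = "application/json";
    request.body = JSON.stringify(payload);
  }
  return request;
}

// Listing models is a plain GET with no body.
const listModels = buildRequest("GET", "/ai/v1/models", "kr_proj_example");

// ASSUMPTION: the REST body uses the same fields as the SDK's generate() call.
const generateVideo = buildRequest("POST", "/ai/v1/generate", "kr_proj_example", {
  provider: "gemini",
  model: "veo-2.0-generate-001",
  operation: "video-generate",
  prompt: "Drone shot of a city",
});
```

The resulting object maps directly onto fetch: fetch(req.url, { method: req.method, headers: req.headers, body: req.body }).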

TypeScript SDK

import { KindlyAI } from '@kindlyrobotics/ai-gateway';

const ai = new KindlyAI({
  apiKey: process.env.INFERALL_API_KEY!,
});

// Text generation (free via NVIDIA Llama 405B)
const text = await ai.text("Explain quantum computing");

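// Chat with any provider
// (the role/content message shape below is an assumption, not confirmed by these docs)
const messages = [{ role: "user", content: "Hello" }];
const reply = await ai.chat(messages, {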
  provider: "anthropic",
  model: "claude-sonnet-4-20250514",
});

// Vision
const analysis = await ai.vision(imageBase64, "What is this?");

// Image generation
const image = await ai.imageGenerate("A sunset", {
  provider: "openai",
  model: "dall-e-3",
});

// Video generation
const video = await ai.generate({
  provider: "gemini",
  model: "veo-2.0-generate-001",
  operation: "video-generate",
  prompt: "Drone shot of a city",
});

// Streaming
const stream = await ai.chatStream(messages);

Python client

from lib.ai_gateway import KindlyAI

ai = KindlyAI(api_key="your_key")

# Text
response = ai.text("Summarize this data...")

# With system prompt
response = ai.text(
    "Analyze this:",
    system="You are a data analyst.",
    provider="openai",
    model="gpt-4o",
)

Claude Code integration

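The gateway exposes an Anthropic-compatible endpoint (POST /v1/messages), so Claude Code can use it as its API base URL. Requests are routed to free NVIDIA models by default.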

# Use Claude Code with free inference
export ANTHROPIC_BASE_URL=https://api.inferall.ai
export ANTHROPIC_API_KEY=your_inferall_key

# Run Claude Code normally — uses free NVIDIA models by default
claude

# Force a specific provider by adding a provider prefix to the model name
# anthropic/claude-sonnet-4-20250514  → actual Claude
# gemini/gemini-2.5-flash            → Google Gemini
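The prefix convention above can be illustrated with a small parser. This is a sketch of the naming scheme only, not the gateway's actual routing code: `ModelRoute`, `parseModelId`, and `DOCUMENTED_PREFIXES` are hypothetical names, and only the two prefixes shown on this page are recognized because no others are documented here.

```typescript
// Only the prefixes shown in the docs; other providers' prefixes are not documented here.
const DOCUMENTED_PREFIXES: readonly string[] = ["anthropic", "gemini"];

interface ModelRoute {
  provider: string | null; // null: no known prefix, so the default routing applies
  model: string;
}

// Hypothetical helper: splits "provider/model" when the prefix is a documented one.
function parseModelId(modelId: string): ModelRoute {
  const slash = modelId.indexOf("/");
  const prefix = slash > 0 ? modelId.slice(0, slash) : "";
  if (DOCUMENTED_PREFIXES.indexOf(prefix) !== -1) {
    return { provider: prefix, model: modelId.slice(slash + 1) };
  }
  // Unknown or missing prefix: leave the model name untouched.
  return { provider: null, model: modelId };
}
```

Checking the prefix against a known list, rather than splitting on every slash, avoids misreading model names that themselves contain a slash.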

Providers

OpenAI         GPT-4o, o1, DALL-E 3
Anthropic      Claude Sonnet/Opus/Haiku
Google Gemini  2.5 Flash/Pro, Veo, Imagen
NVIDIA NIM     186 free models (Llama, Mixtral)
Replicate      Flux, Stable Diffusion
Runway         Gen-4.5, video generation

Live model list

The current list of available models, with pricing, is served at https://api.inferall.ai/ai/v1/models