Documentation
Quick start
npm install @kindlyrobotics/ai-gateway # or with the InferAll SDK (coming soon) # npm install @inferall/sdk
Base URL
https://api.inferall.aiAll requests require an API key via Authorization: Bearer kr_proj_... or x-api-key: kr_proj_...
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /ai/v1/generate | Generate text, chat, images, or video |
| GET | /ai/v1/models | List all models with pricing |
| GET | /ai/v1/health | Health check |
| POST | /v1/messages | Anthropic-compatible (Claude Code) |
| POST | /ai/v1/keys | Create API key (requires JWT) |
| GET | /ai/v1/keys | List your keys (requires JWT) |
| GET | /ai/v1/usage | Usage summary (requires JWT) |
| POST | /ai/v1/billing/checkout | Stripe checkout session |
| GET | /ai/v1/billing/status | Billing status and spend |
TypeScript SDK
import { KindlyAI } from '@kindlyrobotics/ai-gateway';
const ai = new KindlyAI({
apiKey: process.env.INFERALL_API_KEY!,
});
// Text generation (free via NVIDIA Llama 405B)
const text = await ai.text("Explain quantum computing");
// Chat with any provider
const reply = await ai.chat(messages, {
provider: "anthropic",
model: "claude-sonnet-4-20250514",
});
// Vision
const analysis = await ai.vision(imageBase64, "What is this?");
// Image generation
const image = await ai.imageGenerate("A sunset", {
provider: "openai",
model: "dall-e-3",
});
// Video generation
const video = await ai.generate({
provider: "gemini",
model: "veo-2.0-generate-001",
operation: "video-generate",
prompt: "Drone shot of a city",
});
// Streaming
const stream = await ai.chatStream(messages);Python client
from lib.ai_gateway import KindlyAI
ai = KindlyAI(api_key="your_key")
# Text
response = ai.text("Summarize this data...")
# With system prompt
response = ai.text(
"Analyze this:",
system="You are a data analyst.",
provider="openai",
model="gpt-4o",
)Claude Code integration
Requests are routed to free NVIDIA models by default.
# Use Claude Code with free inference export ANTHROPIC_BASE_URL=https://api.inferall.ai export ANTHROPIC_API_KEY=your_inferall_key # Run Claude Code normally — uses free NVIDIA models by default claude # Force a specific provider with model prefix # anthropic/claude-sonnet-4-20250514 → actual Claude # gemini/gemini-2.5-flash → Google Gemini
Providers
OpenAIGPT-4o, o1, DALL-E 3
AnthropicClaude Sonnet/Opus/Haiku
Google Gemini2.5 Flash/Pro, Veo, Imagen
NVIDIA NIM186 free models (Llama, Mixtral)
ReplicateFlux, Stable Diffusion
RunwayGen-4.5, video generation
Live model list
View all available models with pricing at api.inferall.ai/ai/v1/models