Google's Gemma 4 31B (`google/gemma-4-31b-it`) is available free via NVIDIA NIM through InferAll. No credit card, no billing setup — create a key and call it now.
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inferall.ai/v1",
    api_key="ifu_your_key_here",  # get one at inferall.ai/keys
)

response = client.chat.completions.create(
    model="google/gemma-4-31b-it",
    messages=[{"role": "user", "content": "What are Gemma 4's key improvements over Gemma 3?"}],
    max_tokens=512,
)
print(response.choices[0].message.content)
```
---
### What is Gemma 4?
Gemma 4 is Google's fourth generation of open-weight foundation models. The 31B instruction-tuned variant (`gemma-4-31b-it`) targets reasoning, coding, and instruction following, and is positioned as a clear step up from the Gemma 3 family.
Like all Gemma models, it is released under Google's Gemma Terms of Use and is available for commercial use. Through NVIDIA NIM it is hosted without charge on NVIDIA's DGX Cloud infrastructure.
---
### TypeScript / Node.js
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.inferall.ai/v1",
  apiKey: process.env.INFERALL_API_KEY,
});

const response = await client.chat.completions.create({
  model: "google/gemma-4-31b-it",
  messages: [{ role: "user", content: "Write a Python function to parse JSON." }],
});
console.log(response.choices[0].message.content);
```
### Streaming
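```python
# Reuses the `client` created in the first example.
with client.chat.completions.create(
    model="google/gemma-4-31b-it",
    messages=[{"role": "user", "content": "Explain transformer attention in plain English."}],
    stream=True,
) as stream:
    for chunk in stream:
        # Some chunks may carry no choices or an empty delta; skip those.
        if chunk.choices:
            print(chunk.choices[0].delta.content or "", end="", flush=True)
```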
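When the full reply is also needed after streaming (for logging or post-processing, say), the chunks can be accumulated as they are printed. The sketch below is one way to do that. `collect_stream` is a hypothetical helper name, not part of the OpenAI SDK, and it assumes each chunk follows the OpenAI chat-completion chunk shape (`choices[0].delta.content`).

```python
def collect_stream(stream):
    """Print streamed text as it arrives and return the full reply.

    `stream` is any iterable of OpenAI-style chat completion chunks.
    `collect_stream` is a hypothetical helper, not an SDK function.
    """
    parts = []
    for chunk in stream:
        # Skip chunks that carry no choices (some servers send these).
        if not chunk.choices:
            continue
        piece = chunk.choices[0].delta.content
        if piece:  # delta.content can be None, e.g. on the final chunk
            print(piece, end="", flush=True)
            parts.append(piece)
    print()  # finish the line once the stream ends
    return "".join(parts)


# Usage with the `client` from the first example (not executed here):
# stream = client.chat.completions.create(
#     model="google/gemma-4-31b-it",
#     messages=[{"role": "user", "content": "Explain attention briefly."}],
#     stream=True,
# )
# full_text = collect_stream(stream)
```

Taking any iterable keeps the helper independent of the network call, so it can be exercised with stand-in chunk objects.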
### Claude Code / Cline / Cursor
```sh
export ANTHROPIC_BASE_URL=https://api.inferall.ai
export ANTHROPIC_API_KEY=ifu_your_key_here
```
Gemma 4 routes as the "sonnet" tier equivalent for Anthropic-compatible clients.
---
### Free Google models on InferAll
| Model | Size | Notes |
|---|---|---|
| `google/gemma-4-31b-it` | 31B | Newest Gemma generation |
| `google/gemma-3-12b-it` | 12B | Gemma 3, instruction-tuned |
| `google/gemma-3-4b-it` | 4B | Fast, compact Gemma 3 |
| `google/codegemma-7b` | 7B | Optimized for code |
| `google/gemma-3n-e4b-it` | E4B | Gemma 3 Nano efficient |
All are free on NVIDIA NIM. The [full model list](https://api.inferall.ai/ai/v1/models) is always live at the API.
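To check from code which Google models are currently exposed, the model list can be queried through the SDK. This is a minimal sketch that assumes InferAll answers the OpenAI-style model listing call under the same base URL used in the examples above; the link above uses a different path, so the route may differ. `google_model_ids` is a hypothetical helper name.

```python
def google_model_ids(client, prefix="google/"):
    """Return sorted model IDs that start with `prefix`.

    `client` is an OpenAI SDK client; `client.models.list()` is the SDK's
    model-listing call. Whether InferAll serves it at this base URL is an
    assumption. `google_model_ids` is a hypothetical helper.
    """
    return sorted(m.id for m in client.models.list() if m.id.startswith(prefix))


# Usage with the `client` from the first example (not executed here):
# for model_id in google_model_ids(client):
#     print(model_id)
```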
---
### Compare with other free models
```python
# Gemma 4 vs Llama 4 vs Nemotron: one prompt, three free models
# Reuses the `client` created in the first example.
models = [
    "google/gemma-4-31b-it",
    "meta/llama-4-maverick-17b-128e-instruct",
    "nvidia/nemotron-3-super-120b-a12b",
]

for model in models:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "What year was the transformer paper published?"}],
        max_tokens=50,
    )
    # Print the short model name followed by its answer.
    print(f"{model.split('/')[-1]}: {resp.choices[0].message.content.strip()}")
```
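A side-by-side run is more useful if one unavailable model does not abort the loop and if latency is recorded next to each answer. The following is a sketch under those assumptions. `compare_models` is a hypothetical helper, and catching a broad `Exception` is a simplification, since it is not stated here which errors the endpoint returns.

```python
import time


def compare_models(client, models, prompt, max_tokens=50):
    """Send one prompt to each model and return a list of result dicts.

    Each dict has `model`, `seconds`, and either `answer` or `error`.
    `compare_models` is a hypothetical helper, not an SDK function.
    """
    results = []
    for model in models:
        start = time.perf_counter()
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=max_tokens,
            )
            content = resp.choices[0].message.content or ""
            outcome = {"answer": content.strip()}
        except Exception as exc:  # broad on purpose: error types are unknown here
            outcome = {"error": str(exc)}
        elapsed = time.perf_counter() - start
        results.append({"model": model, "seconds": elapsed, **outcome})
    return results


# Usage with `client` and `models` from the examples above (not executed here):
# for row in compare_models(client, models, "What year was the transformer paper published?"):
#     print(row["model"], f"{row['seconds']:.2f}s", row.get("answer") or row.get("error"))
```

The measured time covers the whole request, including network latency, so it is only a rough guide to model speed.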
---
### Get started
Create a key at [inferall.ai/keys](https://inferall.ai/keys); no credit card is required. You get 200 free requests to evaluate. After that, adding a card unlocks the full free allowance (still $0) and paid providers at zero markup.
Google Gemma 4 31B — free API, no credit card
How to call Google's Gemma 4 31B for free using any OpenAI-compatible SDK. Hosted on NVIDIA NIM through InferAll. No billing setup, no credit card required.
InferAll Team