Your first call in 60 seconds.
From zero to a streaming chat completion. One key, every model: a drop-in OpenAI-compatible endpoint, 40+ models, and prices 10% below official pricing. Pick a language and copy.
1. Get your API key
Sign in at bentoo.ai/dashboard and click Create key. Keys start with btoo_ and are shown once — store yours in a password manager or a secret vault.
Set your environment
Export the key as BENTOO_API_KEY so the SDKs pick it up automatically:
# Add to ~/.zshrc or ~/.bashrc
export BENTOO_API_KEY="btoo_sk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export BENTOO_BASE_URL="https://api.bentoo.ai/v1"

# Windows PowerShell
$env:BENTOO_API_KEY = "btoo_sk_live_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
$env:BENTOO_BASE_URL = "https://api.bentoo.ai/v1"
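Before the first request it can help to fail fast on a missing or malformed key. A minimal sketch, assuming only what this page states (keys start with btoo_, and the base URL is https://api.bentoo.ai/v1). The helper name load_bentoo_config is hypothetical and not part of any Bentoo SDK:

```python
import os

DEFAULT_BASE_URL = "https://api.bentoo.ai/v1"  # value from the export example


def load_bentoo_config(env=None):
    """Return (api_key, base_url) from the environment, or raise ValueError.

    Hypothetical helper for illustration; the SDKs read these variables themselves.
    """
    env = os.environ if env is None else env
    api_key = env.get("BENTOO_API_KEY", "").strip()
    if not api_key:
        raise ValueError("BENTOO_API_KEY is not set")
    if not api_key.startswith("btoo_"):
        raise ValueError("BENTOO_API_KEY should start with 'btoo_'")
    # Drop a trailing slash so paths can be appended consistently.
    base_url = env.get("BENTOO_BASE_URL", DEFAULT_BASE_URL).rstrip("/")
    return api_key, base_url
```

Calling load_bentoo_config() at startup surfaces a bad shell profile immediately instead of at the first API call.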
2. Install the SDK
Bentoo AI is fully OpenAI SDK compatible — keep your existing code, just point the base URL at us. Or use our typed first-party SDKs for stricter ergonomics.
pip install bentoo   # first-party SDK
pip install openai   # or use the OpenAI SDK
npm install bentoo
# or
pnpm add bentoo
yarn add bentoo
go get github.com/bentoo-ai/bentoo-go@latest
cargo add bentoo
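If you keep the OpenAI SDK, the only change is where the client points. A sketch, assuming the OpenAI Python SDK (v1 or later), whose OpenAI constructor accepts api_key and base_url. The helper names are hypothetical, and since this page does not say whether the OpenAI SDK reads BENTOO_API_KEY on its own, both values are passed explicitly:

```python
import os

DEFAULT_BASE_URL = "https://api.bentoo.ai/v1"


def openai_client_kwargs(env=None):
    """Build constructor arguments that point an OpenAI client at Bentoo."""
    env = os.environ if env is None else env
    return {
        "api_key": env["BENTOO_API_KEY"],
        "base_url": env.get("BENTOO_BASE_URL", DEFAULT_BASE_URL),
    }


def make_client(env=None):
    """Create the client; imported lazily so this file loads without the SDK."""
    from openai import OpenAI  # requires `pip install openai`

    return OpenAI(**openai_client_kwargs(env))
```

A client from make_client() should then accept the same chat.completions.create arguments used in step 3.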
3. Make your first call
A minimal chat completion. Swap model for any of 40+ supported models — same payload shape across providers.
from bentoo import Bentoo

client = Bentoo()  # reads BENTOO_API_KEY from env

response = client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[
        {"role": "user", "content": "Write a haiku about TCP."}
    ],
    temperature=0.7,
)
print(response.choices[0].message.content)

import Bentoo from "bentoo";

const client = new Bentoo(); // reads BENTOO_API_KEY from env

const response = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [
    { role: "user", content: "Write a haiku about TCP." }
  ],
  temperature: 0.7,
});
console.log(response.choices[0].message.content);

curl https://api.bentoo.ai/v1/chat/completions \
  -H "Authorization: Bearer $BENTOO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [
      { "role": "user", "content": "Write a haiku about TCP." }
    ],
    "temperature": 0.7
  }'

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/bentoo-ai/bentoo-go"
)

func main() {
	client := bentoo.NewClient()
	resp, err := client.Chat.Completions.Create(
		context.Background(),
		bentoo.ChatRequest{
			Model: "claude-sonnet-4-6",
			Messages: []bentoo.Message{
				{Role: "user", Content: "Write a haiku about TCP."},
			},
		},
	)
	if err != nil {
		log.Fatal(err) // do not discard the error: resp is unusable if the call failed
	}
	fmt.Println(resp.Choices[0].Message.Content)
}
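For environments where installing an SDK is not an option, the curl call above maps directly onto the Python standard library. A minimal sketch: the endpoint, headers, and payload mirror the curl example, while the helper names build_chat_request and send are hypothetical:

```python
import json
import urllib.request

BASE_URL = "https://api.bentoo.ai/v1"


def build_chat_request(api_key, model, messages, temperature=0.7, base_url=BASE_URL):
    """Build (but do not send) the POST request that the curl example makes."""
    body = json.dumps(
        {"model": model, "messages": messages, "temperature": temperature}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and return the first choice's text.

    Needs network access and a real key, so it is not called here.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

Keeping construction separate from sending makes the request easy to inspect or log before any network traffic happens.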
4. Stream tokens as they arrive
For chat UIs and long completions, set stream: true. Tokens arrive over Server-Sent Events — first token usually in <400ms.
from bentoo import Bentoo

client = Bentoo()

for chunk in client.chat.completions.create(
    model="claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
    stream=True,
):
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

import Bentoo from "bentoo";

const client = new Bentoo(); // reads BENTOO_API_KEY from env

const stream = await client.chat.completions.create({
  model: "claude-sonnet-4-6",
  messages: [{ role: "user", content: "Explain quantum computing." }],
  stream: true,
});

for await (const chunk of stream) {
  if (chunk.choices[0]?.delta?.content) {
    process.stdout.write(chunk.choices[0].delta.content);
  }
}

curl https://api.bentoo.ai/v1/chat/completions \
  -H "Authorization: Bearer $BENTOO_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "messages": [{"role": "user", "content": "Explain quantum computing."}],
    "stream": true
  }'
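When no SDK is in the way, the raw SSE body can be parsed by hand. A minimal sketch, assuming the stream follows the OpenAI convention this page claims compatibility with: each event is a data: line carrying one JSON chunk, and the stream ends with data: [DONE]. That sentinel and the helper name iter_deltas are assumptions, not something this page confirms:

```python
import json


def iter_deltas(lines):
    """Yield content strings from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel (OpenAI convention)
            break
        chunk = json.loads(data)
        choices = chunk.get("choices") or []
        if not choices:
            continue
        content = (choices[0].get("delta") or {}).get("content")
        if content:
            yield content


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Qubits "}}]}',
    'data: {"choices": [{"delta": {"content": "superpose."}}]}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # prints: Qubits superpose.
```

The same generator can be fed lines decoded from any HTTP client that exposes the response body incrementally.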
To release the connection when you stop reading a stream early, wrap the loop in try / finally and call stream.close(), or use a context manager.
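The cleanup pattern can be sketched with the standard library alone. FakeStream below is a hypothetical stand-in for an SDK stream object; the only thing assumed about the real one is the stream.close() method mentioned above:

```python
from contextlib import closing


class FakeStream:
    """Hypothetical stand-in for an SDK stream: iterable, with a close() method."""

    def __init__(self, tokens):
        self._tokens = iter(tokens)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._tokens)

    def close(self):
        self.closed = True


def first_n_tokens(stream, n):
    """Read up to n tokens (n >= 1), closing the stream even on early exit or error."""
    out = []
    with closing(stream):  # calls stream.close() on the way out, like try/finally
        for token in stream:
            out.append(token)
            if len(out) >= n:
                break
    return out


stream = FakeStream(["a", "b", "c", "d"])
print(first_n_tokens(stream, 2), stream.closed)  # prints: ['a', 'b'] True
```

contextlib.closing works with any object that has a close() method, so it applies even when the stream object is not itself a context manager.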
5. Request parameters
Core fields for the chat.completions endpoint. The full spec is in the API reference.
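The constraints documented in this section (temperature between 0 and 2, the four message roles, stream defaulting to false) can be checked client-side before a request is sent. A minimal sketch: build_payload is a hypothetical helper, and the response_format field name is assumed to follow the OpenAI API:

```python
ALLOWED_ROLES = {"system", "user", "assistant", "tool"}


def build_payload(model, messages, temperature=1.0, stream=False, response_format=None):
    """Validate the core fields and return a JSON-serializable request body."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages must contain at least one message")
    for msg in messages:
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unknown role: {msg.get('role')!r}")
        if "content" not in msg:
            raise ValueError("each message needs a content field")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "stream": stream,
    }
    if response_format is not None:
        # Assumed field name, following the OpenAI API this endpoint mirrors.
        payload["response_format"] = response_format
    return payload
```

For example, build_payload("claude-sonnet-4-6", [{"role": "user", "content": "hi"}], response_format={"type": "json_object"}) returns a body ready for json.dumps, while a temperature of 3 raises ValueError before any network call.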
model: for example gpt-5, claude-sonnet-4-6, gemini-2.5-pro, deepseek-v3. See the model registry for the full list.
messages: a list of {role, content} objects. Roles: system, user, assistant, tool.
temperature: 0–2. Lower = more deterministic. Default 1.
stream: when true, partial deltas are sent over SSE. Default false.
response_format: { "type": "json_object" } or a JSON Schema for strict structured output.
6. Response codes
Bentoo AI follows standard HTTP semantics. Errors return a JSON body with error.code and error.message.
… choices[] and usage.
… Retry-After header and back off.
… X-Bentoo-Fallback: off.
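Error handling and backoff can be sketched as follows. The error.code and error.message fields and the Retry-After header come from this section; the helper names, the example error code rate_limited, and the capped exponential fallback are illustrative assumptions. Retry-After is treated as a number of seconds, and the HTTP-date form is not handled:

```python
import json


def parse_error(body):
    """Return (code, message) from a JSON error body of the documented shape."""
    err = json.loads(body).get("error", {})
    return err.get("code"), err.get("message")


def retry_delay(headers, attempt, base=0.5, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based).

    Prefers the server's Retry-After value; otherwise falls back to
    capped exponential backoff (an assumed policy, not a documented one).
    `headers` is a plain dict here, so the lookup is case-sensitive.
    """
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return max(float(value), 0.0)
        except ValueError:
            pass  # e.g. an HTTP-date; fall through to exponential backoff
    return min(base * (2 ** attempt), cap)


# "rate_limited" is a made-up code for illustration only.
print(parse_error('{"error": {"code": "rate_limited", "message": "Slow down"}}'))
print(retry_delay({"Retry-After": "2"}, attempt=0))  # prints: 2.0
print(retry_delay({}, attempt=3))                    # prints: 4.0
```

Honoring the server's value first keeps clients from retrying earlier than the service asked, and the cap keeps the fallback from growing without bound.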