Developer Docs

Integrate in 5 minutes.

One-line change. Works with every LLM provider. Drop-in compatible with the OpenAI SDK, the Anthropic SDK, LangChain, and more.

Python

    import openai

    # Before Trimio
    client = openai.OpenAI(
        api_key="sk-..."
    )

    # After Trimio: one line change
    client = openai.OpenAI(
        api_key="sk-...",
        base_url="https://api.trimio.ai/v1"
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )

Node.js

    import OpenAI from 'openai';

    // After Trimio: one line change
    const client = new OpenAI({
      apiKey: 'sk-...',
      baseURL: 'https://api.trimio.ai/v1'
    });

    const response = await client.chat.completions.create({
      model: 'gpt-4o',
      messages: [{ role: 'user', content: 'Hello!' }]
    });

curl

    curl https://api.trimio.ai/v1/chat/completions \
      -H "Authorization: Bearer sk-..." \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-4o",
        "messages": [
          {"role": "user", "content": "Hello!"}
        ]
      }'

Getting Started

Live in three steps.

01
Get your virtual key
Sign up and get a scoped virtual API key. Your provider keys stay in your environment — we never see them.
02
Change one URL
Point your SDK's base_url (baseURL in the Node.js SDK) to https://api.trimio.ai/v1. That's it.
03
Watch savings accumulate
Dashboard populates in real time. First savings report within 30 days. Zero maintenance from your team.
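
To make the "change one URL" step concrete, here is a minimal sketch that uses only the Python standard library. It builds, but does not send, the same chat completion request against two base URLs to show that only the host changes. The helper name build_chat_request is an assumption made for this illustration, and "sk-..." is a placeholder key.

```python
import json
import urllib.request

OPENAI_BASE_URL = "https://api.openai.com/v1"
TRIMIO_BASE_URL = "https://api.trimio.ai/v1"  # base URL from the quick-start samples


def build_chat_request(base_url: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request (hypothetical helper)."""
    body = json.dumps({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


before = build_chat_request(OPENAI_BASE_URL, "sk-...", "Hello!")
after = build_chat_request(TRIMIO_BASE_URL, "sk-...", "Hello!")

# The payload and headers are identical; only the URL differs.
print(before.full_url)
print(after.full_url)
```

Passing either request to urllib.request.urlopen would send it; that call is left out so the sketch runs without network access or a real key.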
Features

What you get out of the box.

OBSERVABILITY
Full Request Logging
Every request logged with 40+ fields: latency, tokens, model, cache status, cost, savings. Prometheus + OTEL native.
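
As a rough picture of what a per-request log record might contain, the sketch below models the six fields named above as a Python dataclass and serializes it to JSON. The class name, field names, units, and sample values are all assumptions for illustration; the actual schema is not specified here.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RequestLog:
    # Illustrative subset only; field names and units are assumptions.
    model: str
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int
    cache_status: str  # e.g. "hit" or "miss"
    cost_usd: float
    savings_usd: float

    def to_json(self) -> str:
        """Serialize the record with stable key order for log pipelines."""
        return json.dumps(asdict(self), sort_keys=True)


# Made-up sample values, only to show the shape of a record.
record = RequestLog(
    model="gpt-4o",
    latency_ms=412.0,
    prompt_tokens=1200,
    completion_tokens=85,
    cache_status="hit",
    cost_usd=0.0021,
    savings_usd=0.0054,
)
print(record.to_json())
```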
RELIABILITY
Automatic Fallbacks
If a provider is down, requests automatically route to the next best option. Zero downtime, zero code changes.
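
The idea behind automatic fallbacks can be sketched in a few lines. This is a generic, simplified illustration of ordered failover, not Trimio's routing logic: the ProviderError type, the call_with_fallback helper, and the two fake providers are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


class ProviderError(Exception):
    """Raised when a provider call fails (hypothetical error type)."""


Provider = Tuple[str, Callable[[str], str]]


def call_with_fallback(providers: Sequence[Provider], prompt: str) -> Tuple[str, str]:
    """Try each provider in priority order; return (provider_name, response)."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # remember why this one failed
    raise ProviderError("all providers failed: " + "; ".join(errors))


def primary(prompt: str) -> str:
    raise ProviderError("503 service unavailable")  # simulated outage


def secondary(prompt: str) -> str:
    return f"echo: {prompt}"  # stand-in for a real completion


used, reply = call_with_fallback(
    [("primary", primary), ("secondary", secondary)], "Hello!"
)
print(used, reply)
```

With a gateway in front of the providers, this loop runs server-side, which is why the calling code does not need to change.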
CACHING
Provider Cache Optimization
Cache-aware request handling that maximizes hit rates against the native prompt caching offered by Anthropic, OpenAI, and Google. 93% average token savings on cache hits. No Trimio-side cache state to manage.
GOVERNANCE
Rate Limiting
Per-key and per-team rate limits. Prevent runaway scripts from generating surprise bills overnight.
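
One common way to implement per-key limits is a token bucket per key. The sketch below illustrates that general technique under stated assumptions (class names, rates, and the injectable clock are hypothetical); it is not a description of how Trimio enforces limits.

```python
import time
from typing import Callable, Dict


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate_per_sec` tokens/second."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class PerKeyLimiter:
    """Keeps an independent bucket for each key (or team)."""

    def __init__(self, rate_per_sec: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic):
        self._args = (rate_per_sec, capacity, clock)
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, key: str) -> bool:
        if key not in self._buckets:
            self._buckets[key] = TokenBucket(*self._args)
        return self._buckets[key].allow()


# Deterministic demo with a fake clock instead of real time.
fake_now = [0.0]
limiter = PerKeyLimiter(rate_per_sec=1.0, capacity=2, clock=lambda: fake_now[0])
results = [limiter.allow("team-a") for _ in range(3)]  # burst of 3 at t=0
fake_now[0] = 1.0
results.append(limiter.allow("team-a"))  # one token has refilled after 1s
other_team = limiter.allow("team-b")     # separate bucket, unaffected
print(results, other_team)
```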
BUDGETS
Budget Controls
Set monthly spend limits per team, project, or key. Get alerted before limits are hit — not after.
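
A budget with an early alert can be modeled as a running total checked against two thresholds. The following is a minimal sketch; the Budget class, the 80% alert fraction, and the "ok" / "alert" / "blocked" statuses are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    monthly_limit_usd: float
    alert_fraction: float = 0.8  # hypothetical: alert at 80% of the limit
    spent_usd: float = 0.0

    def record(self, cost_usd: float) -> str:
        """Add a cost and return 'ok', 'alert', or 'blocked'."""
        if self.spent_usd + cost_usd > self.monthly_limit_usd:
            return "blocked"  # would exceed the limit, so the cost is not recorded
        self.spent_usd += cost_usd
        if self.spent_usd >= self.alert_fraction * self.monthly_limit_usd:
            return "alert"  # still under the limit, but past the alert threshold
        return "ok"


budget = Budget(monthly_limit_usd=100.0)
statuses = [budget.record(cost) for cost in (50.0, 35.0, 20.0)]
print(statuses, budget.spent_usd)
```

The alert fires on the second charge, while spend is still under the limit, which is the "before, not after" behavior described above.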
SECURITY
Virtual Key Governance
Issue scoped virtual keys per team. Rotate and revoke without touching provider credentials.
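
The separation between virtual keys and provider credentials can be pictured as a lookup table that is edited independently of the provider key. This is a conceptual sketch only: the VirtualKeyRegistry class, the "vk-" prefix, and the method names are hypothetical.

```python
import secrets
from typing import Dict, Optional


class VirtualKeyRegistry:
    """Maps virtual keys to teams; the provider credential is never touched."""

    def __init__(self) -> None:
        self._teams: Dict[str, str] = {}  # virtual key -> team name

    def issue(self, team: str) -> str:
        key = "vk-" + secrets.token_hex(16)  # hypothetical key format
        self._teams[key] = team
        return key

    def revoke(self, key: str) -> None:
        self._teams.pop(key, None)

    def rotate(self, key: str) -> str:
        """Invalidate the old key and issue a new one for the same team."""
        team = self._teams.pop(key)
        return self.issue(team)

    def team_for(self, key: str) -> Optional[str]:
        return self._teams.get(key)


registry = VirtualKeyRegistry()
old_key = registry.issue("search-team")
new_key = registry.rotate(old_key)

print(registry.team_for(old_key))  # old key no longer resolves
print(registry.team_for(new_key))
```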
Observability

Every metric, every request.

40+ fields per request
<20ms added latency
99.99% uptime SLA
1,600+ LLMs supported
SDK & Framework Compatibility
| SDK / Framework    | Status  | Notes                       |
|--------------------|---------|-----------------------------|
| OpenAI Python SDK  | Drop-in | Change base_url only        |
| OpenAI Node.js SDK | Drop-in | Change baseURL only         |
| Anthropic SDK      | Drop-in | OpenAI-compatible endpoint  |
| LangChain          | Drop-in | Works with all LLM wrappers |
| LlamaIndex         | Drop-in | All model integrations      |
| curl / HTTP        | Full    | Standard OpenAI REST API    |
Exports natively to your existing stack, including Prometheus and OpenTelemetry (OTEL).

Start saving in 5 minutes.

No infrastructure changes. No code rewrites. Just one URL.

Get API Access →