LiteLLM Integration
LiteLLM is a Python SDK that lets you call 100+ LLM providers with one unified interface. By pointing LiteLLM’sapi_base at the LeanMCP AI Gateway, every request — including tool calls, token usage, and cost — gets logged to your observability dashboard with zero changes to your model code.
This is useful when you:
- Run evaluations or benchmarks across multiple models and want a single place to inspect every call
- Use tool-calling agents and need to see the full request/response cycle per tool invocation
- Want cost and latency tracking without adding custom instrumentation
Prerequisites
Get Credits
Purchase credits at app.leanmcp.com/billing
Create API Key
Create an API key at app.leanmcp.com/api-keys with SDK permissions
Gateway Endpoints
| Provider | Gateway Base URL |
|---|---|
| OpenAI | https://aigateway.leanmcp.com/v1/openai |
| Anthropic | https://aigateway.leanmcp.com/v1/anthropic |
| xAI (Grok) | https://aigateway.leanmcp.com/v1/xai |
| Fireworks | https://aigateway.leanmcp.com/v1/fireworks |
Basic Usage
Passapi_base and api_key to litellm.completion(). LiteLLM forwards them to the provider — except now the request goes through the gateway first.
- Python
- cURL
Using Different Providers
Swap theapi_base URL and use the provider-specific model prefix that LiteLLM expects.
OpenAI
- Python
- cURL
Anthropic
- Python
- cURL
Fireworks
LiteLLM requires thefireworks_ai/ prefix for Fireworks models.
- Python
- cURL
Streaming
Streaming works the same way. Setstream=True and iterate over chunks.
- Python
- cURL
Tool Calling
LiteLLM supports tool/function calling. When routed through the gateway, every tool call and its response is captured in the observability dashboard.Using with Existing Frameworks
LiteLLM is often used as the LLM backend for evaluation frameworks, agent harnesses, and batch pipelines. You can route all of those calls through the gateway by passingapi_base and api_key as extra kwargs.
Example: Evaluation Framework
This pattern comes from a real benchmark runner that uses LiteLLM under the hood. The gateway endpoint and key are passed as JSON kwargs to the framework’s CLI:litellm.completion() will pick up the gateway routing automatically.
Environment Setup
Debugging
Turn on LiteLLM verbose logging to see the exact URL, headers, and body of each outgoing request:aigateway.leanmcp.com and not the provider directly.
Troubleshooting
litellm.completion returns an auth error
litellm.completion returns an auth error
- Verify
LEANMCP_API_KEYis set and starts withleanmcp_ - Check that the key has SDK permissions at app.leanmcp.com/api-keys
- Make sure you have credits in your account
Model not found or routing error
Model not found or routing error
- Confirm the model string uses the correct LiteLLM prefix (e.g.
fireworks_ai/for Fireworks,anthropic/for Anthropic) - Verify the
api_basematches the provider (e.g./v1/fireworksfor Fireworks models, not/v1/openai)
Requests not showing up in the dashboard
Requests not showing up in the dashboard
- Enable debug logging (
litellm._turn_on_debug()) and confirm the request URL starts withhttps://aigateway.leanmcp.com - Check app.leanmcp.com/observability — requests appear within a few seconds
Streaming not working
Streaming not working
- Make sure you pass
stream=Truetolitellm.completion() - The gateway supports streaming for all providers. If you get buffered responses, check your HTTP client settings
Next Steps
Observability Dashboard
Inspect every request, response, and token count
SDK Integration
OpenAI and Anthropic SDK examples
Security
Block sensitive data before it reaches providers
Token Optimization
A/B testing and cost reduction

