User Guide
v3.8.1 · Last updated: 2026-05-13
Languages: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino | 🇨🇿 Čeština
| 💳 SUBSCRIPTION |
| FREE |
| 🔑 API KEY |
| 💰 CHEAP |
| 🆓 FREE |
💡 Pro Tip: Start with a Gemini CLI (180K free/month) + Qoder (unlimited free) combo for $0 cost!
Problem: Quota expires unused, rate limits during heavy coding
Problem: Can't afford subscriptions, need reliable AI coding
Problem: Deadlines, can't afford downtime
Problem: Need AI assistant in messaging apps, completely free
Dashboard → Providers → Connect Claude Code
→ OAuth login
→ Auto token refresh
→ 5-hour + weekly quota tracking
Models:
cc/claude-opus-4-7
cc/claude-sonnet-4-6
cc/claude-haiku-4-5-20251001
Pro Tip: Use Opus for complex tasks and Sonnet for speed. OmniRoute tracks quota per model!
Dashboard → Providers → Connect Codex
→ OAuth login (port 1455)
→ 5-hour + weekly reset
Models:
cx/gpt-5.5
cx/gpt-5.4
cx/gpt-5.3-codex
cx/gpt-5.3-codex-spark
Dashboard → Providers → Connect Gemini CLI
→ Google OAuth
→ 180K completions/month + 1K/day
Models:
gemini-cli/gemini-3.1-pro-preview
gemini-cli/gemini-3-flash-preview
gemini-cli/gemini-3.1-flash-lite-preview
Best Value: Huge free tier! Use this before paid tiers.
Dashboard → Providers → Connect GitHub
→ OAuth via GitHub
→ Monthly reset (1st of month)
Models:
gh/gpt-5.5
gh/gpt-5.4
gh/claude-sonnet-4.6
gh/claude-opus-4.7
gh/gemini-3.1-pro-preview
Use:
Pro Tip: Coding Plan offers 3× quota at 1/7 the cost! Quota resets daily at 10:00 AM.
Use:
Pro Tip: Cheapest option for long context (1M tokens)!
Use:
Pro Tip: Fixed $9/month for 10M tokens = $0.90/1M effective cost!
Use: , , or another Qianfan OpenAI-compatible model ID.
Dashboard → Connect Qoder
→ OAuth login
→ Unlimited usage
Models: if/kimi-k2, if/qwen3-coder-plus, if/qwen3-max, if/qwen3-235b, if/deepseek-r1, if/deepseek-v3.2
Dashboard → Connect Kiro
→ AWS Builder ID or Google/GitHub
→ Unlimited
Models: kr/claude-sonnet-4.5, kr/claude-haiku-4.5
Dashboard → Combos by dragging the handle on each card. The order is stored in SQLite and restored on reload.
:
{
"env": {
"ANTHROPIC_BASE_URL": "http://localhost:20128",
"ANTHROPIC_AUTH_TOKEN": "your-omniroute-api-key"
}
}
to .
export OPENAI_BASE_URL="http://localhost:20128"
export OPENAI_API_KEY="your-omniroute-api-key"
codex "your prompt"
:
{
"agents": {
"defaults": {
"model": { "primary": "omniroute/if/kimi-k2" }
}
},
"models": {
"providers": {
"omniroute": {
"baseUrl": "http://localhost:20128/v1",
"apiKey": "your-omniroute-api-key",
"api": "openai-completions",
"models": [{ "id": "if/kimi-k2", "name": "kimi-k2" }]
}
}
}
}
Or use Dashboard: CLI Tools → OpenClaw → Auto-config
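Any OpenAI-compatible client can target the same endpoint as the configuration above. The following is a minimal sketch using only the Python standard library; it assumes the `/v1/chat/completions` route and Bearer authentication used elsewhere in this guide, and `build_chat_request` is a hypothetical helper name, not part of OmniRoute.

```python
import json
import urllib.request

BASE_URL = "http://localhost:20128/v1"  # baseUrl from the config above
API_KEY = "your-omniroute-api-key"      # placeholder, as in the config above


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("if/kimi-k2", "Say hello")
# To send it against a running server: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before pointing a tool at a live instance.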
npm install -g omniroute
# Create config directory
mkdir -p ~/.omniroute
# Create .env file (see .env.example)
cp .env.example ~/.omniroute/.env
# Start server
omniroute
# Or with custom port:
omniroute --port 3000
from or .
keeps your DB and configurations in . |
|
| erases all configurations, keys, and databases. |
.
git clone https://github.com/diegosouzapw/OmniRoute.git
cd OmniRoute && npm install && npm run build
export JWT_SECRET="your-secure-secret-change-this"
export INITIAL_PASSWORD="your-password"
export DATA_DIR="/var/lib/omniroute"
export PORT="20128"
export HOSTNAME="0.0.0.0"
export NODE_ENV="production"
export NEXT_PUBLIC_BASE_URL="http://localhost:20128"
export API_KEY_SECRET="endpoint-proxy-api-key-secret"
npm run start
# Or: pm2 start npm --name omniroute -- start
# With 512MB limit (default)
pm2 start npm --name omniroute -- start
# Or with custom memory limit
OMNIROUTE_MEMORY_MB=512 pm2 start npm --name omniroute -- start
# Or using ecosystem.config.js
pm2 start ecosystem.config.js
:
module.exports = {
apps: [
{
name: "omniroute",
script: "npm",
args: "start",
env: {
NODE_ENV: "production",
OMNIROUTE_MEMORY_MB: "512",
JWT_SECRET: "your-secret",
INITIAL_PASSWORD: "your-password",
},
node_args: "--max-old-space-size=512",
max_memory_restart: "300M",
},
],
};
# Build image (default = runner-cli with codex/claude/droid preinstalled)
docker build -t omniroute:cli .
# Portable mode (recommended)
docker run -d --name omniroute -p 20128:20128 --env-file ./.env -v omniroute-data:/app/data omniroute:cli
cross-compilation framework. This automates the Node.js standalone build along with the required native bindings.
# Template file for 'omniroute'
pkgname=omniroute
version=3.8.0
revision=1
hostmakedepends="nodejs python3 make"
depends="openssl"
short_desc="Universal AI gateway with smart routing for multiple LLM providers"
maintainer="zenobit <zenobit@disroot.org>"
license="MIT"
homepage="https://github.com/diegosouzapw/OmniRoute"
distfiles="https://github.com/diegosouzapw/OmniRoute/archive/refs/tags/v${version}.tar.gz"
checksum=009400afee90a9f32599d8fe734145cfd84098140b7287990183dde45ae2245b
system_accounts="_omniroute"
omniroute_homedir="/var/lib/omniroute"
export NODE_ENV=production
export npm_config_engine_strict=false
export npm_config_loglevel=error
export npm_config_fund=false
export npm_config_audit=false
do_build() {
# Determine target CPU arch for node-gyp
local _gyp_arch
case "$XBPS_TARGET_MACHINE" in
aarch64*) _gyp_arch=arm64 ;;
armv7*|armv6*) _gyp_arch=arm ;;
i686*) _gyp_arch=ia32 ;;
*) _gyp_arch=x64 ;;
esac
# 1) Install all deps, skipping lifecycle scripts
NODE_ENV=development npm ci --ignore-scripts
# 2) Build the Next.js standalone bundle
npm run build
# 3) Copy static assets into standalone
cp -r .next/static .next/standalone/.next/static
[ -d public ] && cp -r public .next/standalone/public || true
# 4) Compile better-sqlite3 native binding
local _node_gyp=/usr/lib/node_modules/npm/node_modules/node-gyp/bin/node-gyp.js
(cd node_modules/better-sqlite3 && node "$_node_gyp" rebuild --arch="$_gyp_arch")
# 5) Place the compiled binding into the standalone bundle
local _bs3_release=.next/standalone/node_modules/better-sqlite3/build/Release
mkdir -p "$_bs3_release"
cp node_modules/better-sqlite3/build/Release/better_sqlite3.node "$_bs3_release/"
# 6) Remove arch-specific sharp bundles
rm -rf .next/standalone/node_modules/@img
# 7) Copy pino runtime deps omitted by Next.js static analysis:
for _mod in pino-abstract-transport split2 process-warning; do
cp -r "node_modules/$_mod" .next/standalone/node_modules/
done
}
do_check() {
npm run test:unit
}
do_install() {
vmkdir usr/lib/omniroute/.next
vcopy .next/standalone/. usr/lib/omniroute/.next/standalone
# Prevent removal of empty Next.js app router dirs by the post-install hook
for _d in \
.next/standalone/.next/server/app/dashboard \
.next/standalone/.next/server/app/dashboard/settings \
.next/standalone/.next/server/app/dashboard/providers; do
touch "${DESTDIR}/usr/lib/omniroute/${_d}/.keep"
done
cat > "${WRKDIR}/omniroute" <<'EOF'
#!/bin/sh
export PORT="${PORT:-20128}"
export DATA_DIR="${DATA_DIR:-${XDG_DATA_HOME:-${HOME}/.local/share}/omniroute}"
export APP_LOG_TO_FILE="${APP_LOG_TO_FILE:-false}"
mkdir -p "${DATA_DIR}"
exec node /usr/lib/omniroute/.next/standalone/server.js "$@"
EOF
vbin "${WRKDIR}/omniroute"
}
post_install() {
vlicense LICENSE
}
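The launcher script in the template resolves its data directory through nested shell defaults. As a rough illustration of that fallback chain, here is a small Python sketch; `resolve_data_dir` is a hypothetical name, and the function only mirrors the `${DATA_DIR:-${XDG_DATA_HOME:-${HOME}/.local/share}/omniroute}` expression rather than anything inside OmniRoute itself.

```python
def resolve_data_dir(env: dict) -> str:
    """Mirror ${DATA_DIR:-${XDG_DATA_HOME:-${HOME}/.local/share}/omniroute}."""
    # In shell, ${VAR:-default} falls back when VAR is unset OR empty,
    # which is why truthiness checks are used here instead of `in`.
    if env.get("DATA_DIR"):
        return env["DATA_DIR"]
    base = env.get("XDG_DATA_HOME") or f"{env.get('HOME', '')}/.local/share"
    return f"{base}/omniroute"


print(resolve_data_dir({"HOME": "/home/alice"}))
# /home/alice/.local/share/omniroute
print(resolve_data_dir({"HOME": "/home/alice", "XDG_DATA_HOME": "/data/xdg"}))
# /data/xdg/omniroute
print(resolve_data_dir({"HOME": "/home/alice", "DATA_DIR": "/var/lib/omniroute"}))
# /var/lib/omniroute
```

An explicit `DATA_DIR` always wins, so a system service can point at `/var/lib/omniroute` while per-user runs land under the XDG data directory.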
| change in production) | ||
| in examples) | ||
| ) | ||
| for deploy | ||
| ) | ||
| ) | ||
| auth cookie (behind HTTPS reverse proxy) | ||
| binary instead of managed download | ||
, , or |
||
README.
for v3.8.0. Cloud catalogs (Gemini, OpenRouter, etc.) are synced dynamically; for the full live catalog, open Dashboard → Providers → [provider] → Available Models or call .
Claude Code () - Pro/Max OAuth: , , , , ,
Codex () - Plus/Pro OAuth: (+ effort tiers: , , , ), , , , ,
Gemini CLI () - FREE OAuth: , , ,
GitHub Copilot () - OAuth: , , , , , , , , , , , , ,
Kiro () - FREE OAuth: , , , , , , , , , ,
Qoder () - FREE OAuth: , , , , , , , , , , , , ,
Qwen () - FREE OAuth (chat.qwen.ai): ,
GLM (, , , ) - $0.2–0.6/1M: , , , , , , , , ,
MiniMax (, ) - $0.2/1M: , , ,
Kimi (, , ) - $9/mo flat or per-use: ,
DeepSeek () - API key: ,
Groq () - Ultra-fast: , , ,
xAI () - Grok native: , , ,
Mistral () - EU-hosted: , , , ,
Perplexity () - Search-augmented: , , ,
Together AI () - Open-source: (free), , , , ,
Fireworks AI () - Fast inference: , , , ,
Cerebras () - Wafer-scale: ,
Cohere () - RAG-focused: , , ,
NVIDIA NIM () - Enterprise: , , , , , , , ,
Baidu Qianfan () - ERNIE: , ,
Ollama Cloud (): , , , , , ,
Gemini (Google Cloud ): Synced live per API key from Google; there is no static list. Connect a key in Dashboard → Providers, then use Available Models to import the current catalog (e.g. , ).
Other compatible providers (selected): , , , , , , , (via ), , (passthrough catalog), , , , , , , , , , , , , , , . Each maintains its own model list in and can be auto-synced when the provider exposes a endpoint.
Note on model IDs: OmniRoute uses provider-native IDs (, , , , , ). Some IDs include dotted versions because that is how the upstream API expects them. If a model is not listed above, run or hit to confirm availability.
# Via API
curl -X POST http://localhost:20128/api/provider-models \
-H "Content-Type: application/json" \
-d '{"provider": "openai", "modelId": "gpt-5.2", "modelName": "GPT-5.2"}'
# List: curl http://localhost:20128/api/provider-models?provider=openai
# Remove: curl -X DELETE "http://localhost:20128/api/provider-models?provider=openai&model=gpt-5.2"
Providers → [Provider] → Custom Models.
POST http://localhost:20128/v1/providers/openai/chat/completions
POST http://localhost:20128/v1/providers/openai/embeddings
POST http://localhost:20128/v1/providers/fireworks/images/generations
.
# Set global proxy
curl -X PUT http://localhost:20128/api/settings/proxy \
-d '{"global": {"type":"http","host":"proxy.example.com","port":"8080"}}'
# Per-provider proxy
curl -X PUT http://localhost:20128/api/settings/proxy \
-d '{"providers": {"openai": {"type":"socks5","host":"proxy.example.com","port":"1080"}}}'
# Test proxy
curl -X POST http://localhost:20128/api/settings/proxy/test \
-d '{"proxy":{"type":"socks5","host":"proxy.example.com","port":"1080"}}'
Precedence: Key-specific → Combo-specific → Provider-specific → Global → Environment.
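The precedence rule can be read as "the most specific scope that defines a proxy wins". A minimal sketch of that lookup follows; the scope labels and the `resolve_proxy` helper are illustrative assumptions, not OmniRoute's internal names.

```python
from typing import Optional

# Most specific scope first, matching the precedence order above.
# These labels are illustrative only.
PRECEDENCE = ("key", "combo", "provider", "global", "environment")


def resolve_proxy(configured: dict) -> Optional[dict]:
    """Return the proxy from the most specific scope that defines one."""
    for scope in PRECEDENCE:
        proxy = configured.get(scope)
        if proxy:
            return proxy
    return None  # no proxy configured anywhere: connect directly


configured = {
    "global": {"type": "http", "host": "proxy.example.com", "port": "8080"},
    "provider": {"type": "socks5", "host": "proxy.example.com", "port": "1080"},
}
chosen = resolve_proxy(configured)
print(chosen["type"])  # socks5
```

Here the provider-level SOCKS5 proxy overrides the global HTTP proxy because it is the more specific scope.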
curl http://localhost:20128/api/models/catalog
, , ).
in production endpoint if you want to override the managed transport choice binary instead of the managed download)
- Request Idempotency: deduplicates requests within 5s via or header
- Progress Tracking: opt-in SSE events via header
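To make the 5-second deduplication window concrete, here is a small sketch of an idempotency cache. It is a conceptual illustration, not OmniRoute's implementation: the class name, the key format, and the choice to return the cached response for a repeated key are all assumptions.

```python
import time


class IdempotencyCache:
    """Replay the first response for any key seen again within the window."""

    def __init__(self, window_seconds=5.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self._seen = {}  # key -> (first-seen timestamp, cached response)

    def get_or_run(self, key, handler):
        now = self.clock()
        # Forget entries whose window has elapsed.
        self._seen = {
            k: v for k, v in self._seen.items() if now - v[0] < self.window
        }
        if key in self._seen:
            return self._seen[key][1]  # duplicate: skip the handler
        response = handler()
        self._seen[key] = (now, response)
        return response


cache = IdempotencyCache()
first = cache.get_or_run("req-123", lambda: "upstream response")
second = cache.get_or_run("req-123", lambda: "should not run")
print(first == second)  # True
```

Injecting the clock keeps the window logic testable without sleeping.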
Dashboard → Translator. Debug and visualize how OmniRoute translates API requests between providers.
| Playground | |
| Chat Tester | |
| Test Bench | |
| Live Monitor |
Use cases:
Dashboard → Settings → Routing. The dashboard exposes the six most-used strategies; combos and the auto-router internally support a wider set.
Dashboard-visible strategies (account-level routing):
| Fill First | |
| Round Robin | |
| P2C (Power of Two Choices) | |
| Random | |
| Least Used | timestamp, distributing traffic evenly |
| Cost Optimized |
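As an illustration of one of the strategies in the table, the sketch below shows the general Power of Two Choices technique: sample two candidates at random and keep the less loaded one. Using in-flight request counts as the load metric, and the account names, are assumptions for the example; the metric OmniRoute actually uses is not stated here.

```python
import random


def pick_p2c(in_flight: dict, rng: random.Random) -> str:
    """Sample two distinct accounts at random and keep the less loaded one.

    Requires at least two accounts.
    """
    first, second = rng.sample(sorted(in_flight), 2)
    return first if in_flight[first] <= in_flight[second] else second


# Hypothetical accounts mapped to their current in-flight request counts.
load = {"account-a": 9, "account-b": 1, "account-c": 2}
choice = pick_p2c(load, random.Random())
print(choice)  # never "account-a": it loses every pairwise comparison
```

Compared with always picking the global minimum, P2C needs only two load lookups per request and avoids stampeding a single account when load data is slightly stale.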
Advanced combo and auto strategies (configurable per combo or via prefixes; see AUTO-COMBO.md):
/
X-Session-Id: your-session-key
and returns the effective session key in .
underscores_in_headers on;
(any characters) and (single character).
Dashboard → Settings → Resilience.
or reset hints when provided
Rate limits stay in Connection Cooldown and do not count toward the provider breaker.
Dashboard → Health only.
Pro Tip: Use the Health page to inspect and reset live provider breakers after an outage. The Resilience page only changes configuration.
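The split between cooldowns and breakers can be sketched as follows. This is a conceptual model only: the threshold, the open duration, and the treatment of 5xx responses as failures are hypothetical defaults, and `ProviderBreaker` is not an OmniRoute class. What it does reflect from the text above is that rate-limit responses do not count toward the provider breaker and that a breaker can be reset manually.

```python
import time


class ProviderBreaker:
    """Opens after repeated provider failures; rate limits are ignored."""

    def __init__(self, failure_threshold=3, open_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # hypothetical default
        self.open_seconds = open_seconds            # hypothetical default
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def record(self, status: int) -> None:
        if status == 429:
            return  # rate limits belong to connection cooldown, not the breaker
        if status >= 500:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
        else:
            self.reset()  # a healthy response clears the failure streak

    def allows_request(self) -> bool:
        if self.opened_at is None:
            return True
        # Once the open window elapses, let traffic probe the provider again.
        return self.clock() - self.opened_at >= self.open_seconds

    def reset(self) -> None:
        """Manual reset, comparable to clearing a breaker after an outage."""
        self.failures = 0
        self.opened_at = None


breaker = ProviderBreaker(failure_threshold=2)
breaker.record(429)
breaker.record(429)
print(breaker.allows_request())  # True: rate limits did not trip it
```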
Dashboard → Settings → System & Storage.
| Export Database | file |
| Export All (.tar.gz) | |
| Import Database | file to replace the current database. A pre-import backup is automatically created unless |
# API: Export database
curl -o backup.sqlite http://localhost:20128/api/db-backups/export
# API: Export all (full archive)
curl -o backup.tar.gz http://localhost:20128/api/db-backups/exportAll
# API: Import database
curl -X POST http://localhost:20128/api/db-backups/import \
-F "file=@backup.sqlite"
Import Validation: The imported file is validated for integrity (SQLite pragma check), required tables (, , , ), and size (max 100MB).
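The same three checks can be run locally before uploading a backup. The sketch below uses Python's built-in `sqlite3` module; `validate_backup` is a hypothetical helper, and the table names in the demo are placeholders because the required table list is not reproduced here.

```python
import os
import sqlite3
import tempfile

MAX_BYTES = 100 * 1024 * 1024  # the 100MB limit stated above


def validate_backup(path, required_tables):
    """Return a list of problems; an empty list means the file passed."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file exceeds 100MB")
    conn = sqlite3.connect(path)
    try:
        status = conn.execute("PRAGMA integrity_check").fetchone()[0]
        if status != "ok":
            problems.append(f"integrity check failed: {status}")
        rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        missing = sorted(set(required_tables) - {row[0] for row in rows})
        if missing:
            problems.append("missing tables: " + ", ".join(missing))
    except sqlite3.DatabaseError as exc:
        problems.append(f"not a valid SQLite database: {exc}")
    finally:
        conn.close()
    return problems


# Demo against a throwaway database with one placeholder table.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "backup.sqlite")
    demo = sqlite3.connect(demo_path)
    demo.execute("CREATE TABLE example_a (id INTEGER PRIMARY KEY)")
    demo.commit()
    demo.close()
    ok_result = validate_backup(demo_path, ["example_a"])
    bad_result = validate_backup(demo_path, ["example_a", "example_b"])

print(ok_result)   # []
print(bad_result)  # ['missing tables: example_b']
```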
Use Cases:
7 tabs for easy navigation:
| General | |
| Appearance | |
| AI | |
| Security | , Provider Blocking, prompt-injection guard |
| Routing | |
| Resilience | |
| Advanced |
Dashboard → Costs.
| Budget | |
| Pricing |
# API: Set a budget
curl -X POST http://localhost:20128/api/usage/budget \
-H "Content-Type: application/json" \
-d '{"keyId": "key-123", "limit": 50.00, "period": "monthly"}'
# API: Get current budget status
curl http://localhost:20128/api/usage/budget
Cost Tracking: Every request logs token usage and calculates cost using the pricing table. View breakdowns in Dashboard → Usage by provider, model, and API key.
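The per-request cost calculation amounts to multiplying token counts by per-million-token prices. A minimal sketch follows, assuming separate input and output prices; the model ID and prices are made up for illustration and do not come from OmniRoute's pricing table.

```python
# Hypothetical prices in USD per 1M tokens; real values come from the pricing table.
PRICING = {"example/model": {"input": 0.20, "output": 0.60}}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request, given per-1M-token prices."""
    price = PRICING[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


def within_budget(spent: float, limit: float) -> bool:
    """True while accumulated spend has not exceeded the budget limit."""
    return spent <= limit


cost = request_cost("example/model", input_tokens=1_000_000, output_tokens=500_000)
print(round(cost, 2))             # 0.5
print(within_budget(cost, 50.0))  # True
```

The same arithmetic explains flat-rate comparisons such as $9 for 10M tokens working out to $0.90 per 1M.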
POST /v1/audio/transcriptions
Authorization: Bearer your-api-key
Content-Type: multipart/form-data
# Example with curl
curl -X POST http://localhost:20128/v1/audio/transcriptions \
-H "Authorization: Bearer your-api-key" \
-F "file=@audio.mp3" \
-F "model=deepgram/nova-3"
Speech-to-Text (transcription) providers:
Text-to-Speech () providers:
, , , , , . TTS output formats depend on the provider (mp3, wav, opus, pcm, mulaw).
Dashboard → Combos → Create/Edit → Strategy.
| Round-Robin | |
| Priority | |
| Random | |
| Weighted | |
| Least-Used | |
| Cost-Optimized |
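To illustrate how a weighted strategy differs from plain random selection, the sketch below picks a model in proportion to its weight. The weights, the `pick_weighted` helper, and the pairing of these two models are assumptions for the example; how OmniRoute stores combo weights is not shown here.

```python
import random


def pick_weighted(weights: dict, rng: random.Random) -> str:
    """Pick one model with probability proportional to its weight."""
    models = list(weights)
    return rng.choices(models, weights=[weights[m] for m in models], k=1)[0]


# Hypothetical 3:1 split between two combo members.
combo = {"cc/claude-opus-4-7": 3, "glm/glm-5.1": 1}
rng = random.Random(42)
picks = [pick_weighted(combo, rng) for _ in range(10_000)]
share = picks.count("cc/claude-opus-4-7") / len(picks)
print(0.7 < share < 0.8)  # True: roughly three quarters of the traffic
```

A weight of zero removes a model from rotation without deleting it from the combo.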
Dashboard → Settings → Routing → Combo Defaults.
Dashboard → Health. Real-time system health overview with 6 cards:
| System Status | |
| Provider Health | |
| Rate Limits | |
| Active Lockouts | |
| Signature Cache | |
| Latency Telemetry |
Pro Tip: The Health page auto-refreshes every 10 seconds. Use the circuit breaker card to identify which providers are experiencing issues.
score-driven auto-router that picks the best model for each request across every connected provider, with no combo to maintain. Just send the request with one of the prefixes and OmniRoute will assemble a virtual combo on the fly, scoring candidates on latency, cost, success rate, context fit, model fitness for the task, recent failures, quota, and circuit-breaker state.
curl -X POST http://localhost:20128/v1/chat/completions \
-H "Authorization: Bearer $OMNIROUTE_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "auto/coding",
"messages": [{ "role": "user", "content": "Refactor this Python function" }],
"stream": true
}'
AUTO-COMBO.md, including how to tune scoring weights, blacklist providers, and inspect routing decisions in Dashboard → Auto Combo.
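A score-driven router of this kind can be pictured as a weighted sum over normalized metrics, with unhealthy candidates filtered out first. The sketch below is a simplified model under stated assumptions: it uses only three of the factors listed above, the weights are invented, and each metric is assumed to be pre-normalized to the range 0 to 1 with 1 being best. The real scoring formula is described in AUTO-COMBO.md, not here.

```python
# Hypothetical weights; the real router's weights and scales may differ.
WEIGHTS = {"latency": 0.3, "cost": 0.3, "success_rate": 0.4}


def score(candidate: dict) -> float:
    """Weighted sum of metrics already normalized to 0..1 (1 is best)."""
    return sum(weight * candidate[name] for name, weight in WEIGHTS.items())


def pick_best(candidates: list):
    """Skip candidates with an open breaker, then take the top score."""
    usable = [c for c in candidates if not c["breaker_open"]]
    return max(usable, key=score)["model"] if usable else None


candidates = [
    {"model": "a", "latency": 0.9, "cost": 0.9, "success_rate": 0.9, "breaker_open": True},
    {"model": "b", "latency": 0.5, "cost": 0.5, "success_rate": 0.5, "breaker_open": False},
    {"model": "c", "latency": 0.2, "cost": 0.6, "success_rate": 0.4, "breaker_open": False},
]
print(pick_best(candidates))  # b
```

Model `a` has the best raw metrics but is excluded because its breaker is open, so the router falls through to the next-best healthy candidate.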
MCP server (Model Context Protocol) and an A2A server (Agent-to-Agent JSON-RPC 2.0). Any MCP-compatible IDE or agent host can call OmniRoute tools directly, with no extra wrapper required.
- Streamable HTTP:
- stdio:
(for IDE plugins that prefer stdio)
(macOS) or the equivalent on Windows/Linux:
{
"mcpServers": {
"omniroute": {
"command": "omniroute",
"args": ["--mcp"]
}
}
}
and a Bearer API key generated in Dashboard → API Manager.
, , , , , , , , , . Each Bearer key can be limited to specific scopes; see MCP-SERVER.md for the full tool catalog and A2A-SERVER.md for the JSON-RPC schema.
skill framework () so agents and the A2A endpoint can run domain-specific routines (e.g. , , , ).
, register it, and it becomes immediately invocable over A2A. See SKILLS.md.
long-term conversational memory with hybrid retrieval:
table
Dashboard → Memory (search, edit, export, purge). The HTTP surface () lets agents push and query facts programmatically; see MEMORY.md.
, , , , , WEBHOOKS.md.
OpenAI Codex Cloud, Devin, Jules, Antigravity) so you can dispatch long-running tasks from the same dashboard that handles your local routing.
-
-
CLOUD_AGENT.md.
Bearer key with the scope.
Dashboard → API Manager → New Key → Scope: manage, then:
# List providers
curl http://localhost:20128/api/providers \
-H "Authorization: Bearer $OMNIROUTE_MANAGE_KEY"
# Add a provider connection
curl -X POST http://localhost:20128/api/providers \
-H "Authorization: Bearer $OMNIROUTE_MANAGE_KEY" \
-H "Content-Type: application/json" \
-d '{ "provider": "openai", "apiKey": "sk-...", "name": "main" }'
# Create a combo
curl -X POST http://localhost:20128/api/combos \
-H "Authorization: Bearer $OMNIROUTE_MANAGE_KEY" \
-H "Content-Type: application/json" \
-d '{ "name": "premium", "strategy": "priority", "models": [{ "model": "cc/claude-opus-4-7" }, { "model": "glm/glm-5.1" }] }'
# List/create API keys
curl http://localhost:20128/api/keys -H "Authorization: Bearer $OMNIROUTE_MANAGE_KEY"
curl -X POST http://localhost:20128/api/keys -H "Authorization: Bearer $OMNIROUTE_MANAGE_KEY" \
-d '{ "name": "ci-bot", "scopes": ["chat"] }'
API_REFERENCE.md for the full endpoint catalog and request/response schemas.
) for setup, diagnostics, and runtime control. This is separate from the "CLI Tools" page in the dashboard, which configures third-party CLIs (Claude Code, Cursor, Codex, Cline, ...) so they can talk to OmniRoute.
omniroute setup                      # Interactive wizard (password, providers, combos)
omniroute setup --non-interactive    # CI-friendly
omniroute doctor                     # Health diagnostics (data dir, DB, providers, ports)
omniroute providers available        # List supported providers
omniroute providers list             # List configured connections
omniroute providers test <id>        # Live test a provider connection
omniroute combos list                # List combos
omniroute combos switch <name>       # Set default combo
omniroute models                     # List available models (--json, --search)
omniroute keys add | list | remove   # Manage API keys from the terminal
omniroute backup                     # Snapshot config + DB
omniroute restore [<timestamp>]      # Restore from a snapshot
omniroute health                     # Detailed health (breakers, cache, memory)
omniroute quota                      # Provider quota usage
omniroute mcp status                 # MCP server status
omniroute a2a status                 # A2A server status
omniroute tunnel list|create|stop    # Cloudflare/Tailscale/ngrok tunnels
omniroute reset-password             # Reset the admin password
omniroute --mcp                      # Start MCP server over stdio
omniroute --port 3000                # Start the server on a custom port
with your monitoring tool to alert on unhealthy provider connections.
# From the electron directory:
cd electron
npm install
# Development mode (connect to running Next.js dev server):
npm run dev
# Production mode (uses standalone build):
npm start
cd electron
npm run build        # Current platform
npm run build:win    # Windows (.exe NSIS)
npm run build:mac    # macOS (.dmg universal)
npm run build:linux  # Linux (.AppImage)
| Server Readiness | |
| System Tray | |
| Port Management | |
| Content Security Policy | |
| Single Instance | |
| Offline Mode |