OmniRoute — Dashboard Features Gallery
v3.8.1 · Last updated: 2026-05-13
Main README translations: 🇺🇸 English | 🇧🇷 Português (Brasil) | 🇪🇸 Español | 🇫🇷 Français | 🇮🇹 Italiano | 🇷🇺 Русский | 🇨🇳 中文 (简体) | 🇩🇪 Deutsch | 🇮🇳 हिन्दी | 🇹🇭 ไทย | 🇺🇦 Українська | 🇸🇦 العربية | 🇯🇵 日本語 | 🇻🇳 Tiếng Việt | 🇧🇬 Български | 🇩🇰 Dansk | 🇫🇮 Suomi | 🇮🇱 עברית | 🇭🇺 Magyar | 🇮🇩 Bahasa Indonesia | 🇰🇷 한국어 | 🇲🇾 Bahasa Melayu | 🇳🇱 Nederlands | 🇳🇴 Norsk | 🇵🇹 Português (Portugal) | 🇷🇴 Română | 🇵🇱 Polski | 🇸🇰 Slovenčina | 🇸🇪 Svenska | 🇵🇭 Filipino | 🇨🇿 Čeština
Last updated: 2026-05-13 — v3.8.0
, , , , , . Backed by a 9-factor scoring engine and 4 curated mode packs (ship-fast, cost-saver, quality-first, offline-friendly)
- Command Code provider (#2199) — first-class registration with model catalog and quota tracking
- Z.AI provider — new free-tier provider with quota labels
- KIE media expansion — extended catalog including video generation models
- Windsurf + Devin CLI OAuth flows (#2168) — end-to-end browser-based login
- 9 new free providers — LLM7, Lepton, Kluster, UncloseAI, BazaarLink, Completions, Enally, FreeTheAi, Command Code
- Manifest-aware tier routing W1–W4 — provider manifests drive weighted tier selection
- Cursor full OpenAI parity — tool calls, streaming, session management end-to-end
- Cursor Pro plan usage — quota & cycle data surfaced in the provider-limits dashboard
- Service tier breakdown / Codex fast tier analytics — per-tier consumption visibility
- Per-session sticky routing — Codex sessions pin to the same account between turns
- Inworld TTS enhancements — voice catalogs, streaming, and latency improvements
- Kiro headless auth — login via local SQLite store, no browser required
- DeepSeek quota and limit monitoring — daily/monthly usage exposed via dashboard
- Reset-aware routing strategy — combos now prefer accounts whose quota window resets soonest
and dynamic tool limit detection — finer fallback timing + per-provider tool-count limits
- Background mode degradation (Responses API) — falls back to synchronous mode with a structured warning when an upstream lacks background polling
- Per-provider 429 classification +
toggle — finer breaker behavior using upstream rate-limit hints
- Model cooldowns dashboard — observe per-model lockouts and manually re-enable from the UI
- MITM dynamic Linux cert detection — works across Debian/Ubuntu, Fedora/RHEL, Arch, and other distros
- CLI enhancement suite — 20+ commands including
, , ,
- Qdrant embedding model discovery — automatic vector-store model probe
- API Manager / Bearer keys with
scope — perform admin operations programmatically via API
- Combo target health analytics + structured combo builder — per-target health & UI builder for assembling
steps
- GitLab Duo OAuth provider — login with GitLab credentials
- Reasoning Replay Cache — hybrid in-memory + SQLite persistence of reasoning traces
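The reset-aware routing strategy listed above can be pictured as an ordering over candidate accounts. The following is a minimal sketch, not OmniRoute's implementation: the `AccountQuota` shape, its field names, and the rule that exhausted accounts go last are all assumptions made for illustration.

```typescript
// Hypothetical record shape; the real account model is not shown on this page.
interface AccountQuota {
  id: string;
  remaining: number; // requests or tokens left in the current quota window
  resetAt: number;   // epoch milliseconds at which the quota window resets
}

// Order candidates so that accounts with quota left come first, and within
// each group the account whose window resets soonest is preferred.
function orderByReset(accounts: AccountQuota[]): AccountQuota[] {
  return [...accounts].sort((a, b) => {
    const aExhausted = a.remaining <= 0 ? 1 : 0;
    const bExhausted = b.remaining <= 0 ? 1 : 0;
    if (aExhausted !== bExhausted) return aExhausted - bExhausted;
    return a.resetAt - b.resetAt;
  });
}

const ordered = orderByReset([
  { id: "a", remaining: 5, resetAt: 3000 },
  { id: "b", remaining: 0, resetAt: 1000 },
  { id: "c", remaining: 2, resetAt: 2000 },
]);
console.log(ordered.map((x) => x.id).join(","));
```

Spending quota that is about to reset anyway leaves the longer-lived windows untouched for later requests, which is one plausible motivation for this ordering.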
Related docs: Skills Framework · Memory System · Cloud Agents · Webhooks · Reasoning Replay Cache
context-relay. Each combo chains multiple models with automatic fallback and includes quick templates and readiness checks.
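The automatic fallback that a combo applies across its chained models can be sketched as a loop that tries each step in order and returns the first success. This is a simplified, synchronous illustration under assumed names (`Step`, `runWithFallback`); real model calls would be asynchronous and the actual combo engine is not shown in this page.

```typescript
// Hypothetical step type: each step stands in for one model call in a combo.
type Step<T> = { model: string; call: () => T };

interface FallbackResult<T> {
  model: string;    // the model that finally answered
  value: T;         // its result
  errors: string[]; // failures collected from earlier steps
}

// Try each step in order; return the first success, or throw once all fail.
function runWithFallback<T>(steps: Step<T>[]): FallbackResult<T> {
  const errors: string[] = [];
  for (const step of steps) {
    try {
      return { model: step.model, value: step.call(), errors };
    } catch (err) {
      const msg = err instanceof Error ? err.message : String(err);
      errors.push(`${step.model}: ${msg}`);
    }
  }
  throw new Error(`all combo steps failed: ${errors.join("; ")}`);
}

const result = runWithFallback<string>([
  { model: "primary", call: () => { throw new Error("rate limited"); } },
  { model: "backup", call: () => "ok" },
]);
console.log(result.model, result.value, result.errors);
```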
tuple is unique now influences runtime execution/fallback order for top-level combo steps
Playground (format converter), Chat Tester (live requests), Test Bench (batch tests), and Live Monitor (real-time stream).
7 tabs:
, , , ), reasoning replay cache, and skill/memory toggles, per-session sticky routing toggle, model cooldowns
Windsurf, Devin CLI, Kimi Coding, Command Code) with:
Context Relay documentation.
assigned to routing combos; the default stacked math reaches average and eligible-context savings when both engines apply. See the Compression Guide, RTK Compression, and Compression Engines.
) routes through , honoring provider-level and global proxy settings errors on Node.js 22) to prevent accidental exposure when sharing screenshots or recording demos. The full email address remains accessible via hover tooltip ( attribute).
catalog) — Shows at a glance how many models are enabled vs total. Automatically detects and repairs:
keeps your DB and configurations in . |
|
| erases all configurations, keys, and databases. |
shuts down Next.js cleanly, preventing SQLite WAL database locks (v3.6.2+) for full documentation.
OpenAI-compatible WebSocket clients via the upgrade endpoint. The custom server wraps Next.js and upgrades WS connections to full bidirectional streaming sessions. Authentication uses the same API key or session cookie as HTTP requests.
scoped sync tokens:
— Issue a new sync token (scoped, with optional expiry)
— Revoke a token
— Download a versioned, ETag-keyed JSON snapshot of all non-sensitive settings (passwords redacted)
. Consumers compare the response header to detect changes without re-downloading the full payload.
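A consumer-side change check along the lines described above might look like the sketch below. It rests on several assumptions: the snapshot URL is supplied by the caller, the token is sent as a Bearer header, and the server honours `If-None-Match` with a `304` response. None of these are confirmed by this page. The `FetchLike` type is a stand-in so the example does not depend on a particular runtime.

```typescript
// Minimal response/fetch shapes so the sketch is runtime-independent (assumed).
interface MinimalResponse {
  status: number;
  ok: boolean;
  headers: { get(name: string): string | null };
  json(): Promise<unknown>;
}
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<MinimalResponse>;

// Strip the weak-validator prefix so W/"abc" and "abc" compare equal.
function normalizeEtag(tag: string): string {
  return tag.trim().replace(/^W\//, "");
}

// True when the snapshot must be (re)downloaded.
function snapshotChanged(cached: string | null, current: string | null): boolean {
  if (cached === null || current === null) return true; // nothing to compare
  return normalizeEtag(cached) !== normalizeEtag(current);
}

// Conditional download: sends the cached ETag and skips the body on 304.
async function fetchSnapshotIfChanged(
  fetchFn: FetchLike,
  url: string,
  token: string,
  cachedEtag: string | null,
) {
  const headers: Record<string, string> = { Authorization: `Bearer ${token}` };
  if (cachedEtag !== null) headers["If-None-Match"] = cachedEtag;
  const res = await fetchFn(url, { headers });
  if (res.status === 304) return { changed: false, etag: cachedEtag, body: null };
  if (!res.ok) throw new Error(`snapshot request failed: ${res.status}`);
  const etag = res.headers.get("ETag");
  return { changed: snapshotChanged(cachedEtag, etag), etag, body: await res.json() };
}
```

In a runtime with a global `fetch`, it can be passed in as `fetchFn`; keeping it as a parameter also makes the function easy to exercise with a stub.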
GLM Thinking () is now a registered first-class provider: 65 536 max output tokens, 24 576 thinking budget, 900 s default timeout, Claude-compatible API format, and shared usage sync with the GLM family.
Hybrid token counting also lands in v3.6.6: when a Claude-compatible provider exposes , OmniRoute calls it before large requests with graceful estimation fallback.
) — Blocks private/loopback/link-local IP ranges before the socket is opened.
- Safe fetch wrapper (
) — Applies the URL guard, normalises timeouts, and retries transient errors with exponential backoff.
) and are written to the compliance audit log via .
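The URL guard and the retry backoff described above can be illustrated with two small pure functions. This is a rough sketch, not the project's code: the function names, the base delay, and the cap are assumptions, and the guard here inspects only literal hosts. A production guard would also have to check the address a hostname resolves to before the socket is opened.

```typescript
// Parse a dotted-quad IPv4 literal into four octets, or null if malformed.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

// True for loopback, private, and link-local literals (IPv4 and common IPv6).
function isBlockedHost(host: string): boolean {
  const h = host.toLowerCase().replace(/^\[|\]$/g, ""); // drop IPv6 brackets
  if (h === "localhost" || h === "::1") return true;     // loopback
  if (/^fe[89ab][0-9a-f]:/.test(h)) return true;          // fe80::/10 link-local
  if (/^f[cd][0-9a-f]{2}:/.test(h)) return true;          // fc00::/7 unique local
  const v4 = parseIPv4(h);
  if (v4 === null) return false; // not an IP literal; resolution check is out of scope here
  const [a, b] = v4;
  return (
    a === 10 ||                          // 10.0.0.0/8
    a === 127 ||                         // 127.0.0.0/8 loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) ||          // 192.168.0.0/16
    (a === 169 && b === 254)             // 169.254.0.0/16 link-local
  );
}

// Exponential backoff: base * 2^attempt, capped (both values are assumed defaults).
function backoffDelayMs(attempt: number, baseMs = 250, capMs = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

console.log(isBlockedHost("169.254.169.254"), isBlockedHost("8.8.8.8"), backoffDelayMs(3));
```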
automatically retry when an upstream provider returns a model-scoped cooldown. Configurable via (default: 2) and (default: 30 s). Rate-limit header learning improved across , , and — per-model cooldown state is visible in the Resilience dashboard.
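The retry behaviour above can be read as a small decision rule. The sketch below assumes that the first setting (default 2) is a maximum retry count and the second (default 30 s) is the longest cooldown the router is willing to wait out. Both readings, and the option names `maxRetries` and `maxWaitMs`, are hypothetical, since the real setting names are not shown here.

```typescript
// Hypothetical option names; defaults mirror the values quoted in the text.
interface CooldownRetryOptions {
  maxRetries: number; // assumed meaning of "default: 2"
  maxWaitMs: number;  // assumed meaning of "default: 30 s"
}

const COOLDOWN_RETRY_DEFAULTS: CooldownRetryOptions = { maxRetries: 2, maxWaitMs: 30000 };

type CooldownDecision =
  | { action: "retry"; waitMs: number }
  | { action: "fail"; reason: string };

// Decide whether to wait out a model-scoped cooldown and retry, or give up.
// `attempt` is the number of retries already made for this request.
function decideCooldownRetry(
  attempt: number,
  cooldownMs: number,
  opts: CooldownRetryOptions = COOLDOWN_RETRY_DEFAULTS,
): CooldownDecision {
  if (attempt >= opts.maxRetries) {
    return { action: "fail", reason: "retry budget exhausted" };
  }
  if (cooldownMs > opts.maxWaitMs) {
    return { action: "fail", reason: "cooldown exceeds max wait" };
  }
  return { action: "retry", waitMs: Math.max(0, cooldownMs) };
}

console.log(decideCooldownRetry(0, 5000));
console.log(decideCooldownRetry(2, 5000));
```

Returning a decision object instead of sleeping inside the function keeps the rule easy to test and leaves the actual waiting to the caller.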