Memory System
v3.8.1 (last updated: 2026-05-13)
Source of truth: and
Last updated: 2026-05-13 (v3.8.0)
Memory is scoped per API key, not per user: every request authenticated
with the same API key shares the same memory pool, with optional further
scoping by .
(look for , ,
and ).
:
, , |
||
/ |
||
| means permanent | ||
| to bridge UUIDs to FTS5 rowids |
, , , , plus the unique
index.
Upsert semantics: looks for an existing row with the same
and updates it in place when found (merging via
shallow spread). This keeps the table from growing unbounded for repeated
preference statements.
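
The upsert behaviour described above can be sketched with an in-memory `Map` standing in for the SQLite table. This is only an illustration: `MemoryRow`, `upsertMemory`, and the `apiKeyId` / `key` / `metadata` fields are hypothetical names, and the sketch assumes the value merged by shallow spread is a metadata object.

```typescript
// Hypothetical row shape; the real table columns are not reproduced here.
interface MemoryRow {
  id: string;
  apiKeyId: string;
  key: string;
  content: string;
  metadata: Record<string, unknown>;
  updatedAt: number;
}

// In-memory stand-in for the table, indexed by a composite lookup key.
const table = new Map<string, MemoryRow>();
let nextId = 1;

function upsertMemory(
  apiKeyId: string,
  key: string,
  content: string,
  metadata: Record<string, unknown> = {},
  now: number = Date.now(),
): MemoryRow {
  const lookup = `${apiKeyId}\u0000${key}`;
  const existing = table.get(lookup);
  if (existing !== undefined) {
    // Update in place. The shallow spread lets new metadata keys win
    // while keys that only exist on the old row survive.
    const merged: MemoryRow = {
      ...existing,
      content,
      metadata: { ...existing.metadata, ...metadata },
      updatedAt: now,
    };
    table.set(lookup, merged);
    return merged;
  }
  const row: MemoryRow = {
    id: `mem-${nextId++}`,
    apiKeyId,
    key,
    content,
    metadata,
    updatedAt: now,
  };
  table.set(lookup, row);
  return row;
}
```

Because the spread is shallow, nested objects inside the metadata are replaced wholesale rather than merged recursively.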
creates an FTS5 virtual table over and
. fixes a real-world bug where the UUID
primary key did not join to FTS5's integer rowid; the migration adds the
column, recreates the FTS table, and wires triggers
(, , ) that keep FTS in sync on
INSERT, DELETE, and UPDATE.
for the and strategies (see below).
The retrieval code guards with and falls back to
chronological order if the FTS table is missing or the FTS query throws.
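
The guard-and-fallback behaviour described above might look roughly like the sketch below. The `MemoryQueries` interface and its three callbacks are hypothetical stand-ins for the real SQLite queries, which are not reproduced here.

```typescript
interface Memory {
  id: string;
  content: string;
  createdAt: number;
}

// Hypothetical data-access callbacks standing in for the real queries.
interface MemoryQueries {
  ftsTableExists(): boolean;
  searchFts(query: string, limit: number): Memory[];
  listChronological(limit: number): Memory[];
}

function retrieveWithFallback(
  queries: MemoryQueries,
  query: string,
  limit: number,
): Memory[] {
  // Older databases may not have the FTS table at all.
  if (!queries.ftsTableExists()) {
    return queries.listChronological(limit);
  }
  try {
    return queries.searchFts(query, limit);
  } catch {
    // A malformed FTS query (or any other FTS failure) degrades to
    // chronological order instead of failing the request.
    return queries.listChronological(limit);
  }
}
```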
implements an optional Qdrant integration for true semantic memory:
- Embed the memory text with the configured
  embedding model, ensure the collection exists (creating cosine-distance
  vectors on first use), and upsert a point with payload .
- Embed the query, search the
  collection filtered by
  and optionally by
  / . Caps to .
- Single point delete.
- Bulk delete points whose
  is in the past or whose is older than the
  retention cutoff. Counts first so the dashboard can show actual numbers.
- Health probe with latency.
TODO: The chat pipeline () and the in-tree
implementation do not currently call or
. The Qdrant integration is feature-flagged via
in settings, but at the time of writing the
results are not fused into retrieval; the
/ retrieval strategies use SQLite FTS5 only. The settings UI
in exposes Qdrant config, health,
search test, and cleanup. However, the corresponding ,
, , and
routes are referenced from the UI but are not
present under (only is
wired). Treat Qdrant as preview/optional plumbing.
():
retrieval strategy is one of , , or ,
and scope is one of , , or . The default scope from
is .
Extraction is regex-based, not LLM-based. It runs in-process with
so it never blocks the response stream:
(e.g. , , , )
- Decision patterns
(e.g. , , , )
- Pattern patterns
(e.g. , , )
, whitespace-collapse, capped at 500 chars),
deduplicated within the batch via a stable , and
stored via with metadata
. Input text is capped at
64 KiB (); when longer, the tail of the text
is used so the most recent assistant content always participates.
is exported for tests and returns the structured facts without storing them.
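
As a minimal sketch of the extraction steps above (tail-capped input, regex matching, whitespace collapse, 500-character cap, in-batch deduplication), consider the following. The regular expressions, the `extractFacts` name, and the dedup key format are illustrative assumptions, not the project's actual patterns; the sketch also caps the input by characters rather than bytes for simplicity.

```typescript
type FactType = "preference" | "decision" | "pattern";

interface Fact {
  type: FactType;
  content: string;
}

const MAX_INPUT_CHARS = 64 * 1024; // assumption: characters, not bytes
const MAX_FACT_CHARS = 500;

// Hypothetical example patterns, one capture group each.
const RULES: Array<{ type: FactType; regex: RegExp }> = [
  { type: "preference", regex: /\bI (?:prefer|like)\s+([^.!?]+)/gi },
  { type: "decision", regex: /\b(?:I|we) (?:decided to|will) use\s+([^.!?]+)/gi },
  { type: "pattern", regex: /\bI (?:usually|often|typically)\s+([^.!?]+)/gi },
];

function normalise(raw: string): string {
  // Trim, collapse whitespace runs, then cap the length.
  return raw.trim().replace(/\s+/g, " ").slice(0, MAX_FACT_CHARS);
}

function extractFacts(text: string): Fact[] {
  // Keep the tail so the most recent content always participates.
  const input =
    text.length > MAX_INPUT_CHARS ? text.slice(-MAX_INPUT_CHARS) : text;
  const seen = new Set<string>();
  const facts: Fact[] = [];
  for (const rule of RULES) {
    rule.regex.lastIndex = 0; // global regexes keep state between calls
    let match: RegExpExecArray | null;
    while ((match = rule.regex.exec(input)) !== null) {
      const content = normalise(match[1] ?? "");
      if (content.length === 0) continue;
      // Stable key: the same fact phrased twice in one batch is stored once.
      const dedupKey = `${rule.type}:${content.toLowerCase()}`;
      if (seen.has(dedupKey)) continue;
      seen.add(dedupKey);
      facts.push({ type: rule.type, content });
    }
  }
  return facts;
}
```

Returning the structured facts without storing them keeps the function pure, which is what makes it convenient to call from tests.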
is the main entry point. It:
- is false or .
- table) so older databases keep working.
- cutoff.
- (default): chronological .
- : if and exists, JOIN
  and order by FTS rank; fall back to chronological
  when FTS returns 0 rows.
- : union of FTS results (higher relevance) and the
  chronological set, deduplicated by id.
- , , and JSON when a query is provided. Rows with
  zero score are filtered out.
- ) stays under the budget. Always
  returns at least one entry when any matched.

is exported and used by retrieval, summarisation, and the MCP
tool.
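
Two of the retrieval steps above, the deduplicated union and the token-budget trim that still returns at least one entry, can be sketched as follows. The function names and the four-characters-per-token estimate are assumptions made for the example, not the project's actual helpers.

```typescript
interface Entry {
  id: string;
  content: string;
}

// Rough estimate (about four characters per token); an assumption.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Union of FTS hits and the chronological set. FTS rows come first
// because they are treated as more relevant; duplicates are dropped by id.
function mergeHybrid(ftsRows: Entry[], chronological: Entry[]): Entry[] {
  const seen = new Set<string>();
  const merged: Entry[] = [];
  for (const row of [...ftsRows, ...chronological]) {
    if (seen.has(row.id)) continue;
    seen.add(row.id);
    merged.push(row);
  }
  return merged;
}

// Keep rows in order until the budget would be exceeded, but always
// keep the first row so a match is never reduced to nothing.
function fitToBudget(rows: Entry[], maxTokens: number): Entry[] {
  const kept: Entry[] = [];
  let used = 0;
  for (const row of rows) {
    const cost = estimateTokens(row.content);
    if (used + cost > maxTokens && kept.length > 0) break;
    kept.push(row);
    used += cost;
  }
  return kept;
}
```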
:
- ahead of any existing system
  messages so user system prompts still take precedence.
- : , , ,
  , , , , . These reject the system role
  and would 400 otherwise (cf. issue #1701 for GLM/Zhipu).

is exported for callers that need to
make routing decisions of their own. Unknown providers default to
(system role allowed) for safety.
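
A sketch of the injection rule described above follows. The provider ids are placeholders, the function names are hypothetical, and the handling of providers that reject the system role (folding the memory text into the first user message) is an assumption made for the example rather than confirmed behaviour.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// Placeholder ids; the real provider list is not reproduced here.
const NO_SYSTEM_ROLE_PROVIDERS = new Set<string>([
  "example-provider-a",
  "example-provider-b",
]);

function supportsSystemRole(provider: string): boolean {
  // Unknown providers fall through to true (system role allowed).
  return !NO_SYSTEM_ROLE_PROVIDERS.has(provider.toLowerCase());
}

function injectMemory(
  messages: ChatMessage[],
  memoryText: string,
  provider: string,
): ChatMessage[] {
  if (memoryText.trim() === "") return messages;
  if (supportsSystemRole(provider)) {
    // Memory goes first, ahead of any existing system messages, so the
    // caller's own system prompts come later and keep precedence.
    return [{ role: "system", content: memoryText }, ...messages];
  }
  // Assumption: without a system role, prepend the memory text to the
  // first user message instead of adding a new message.
  const firstUser = messages.findIndex((m) => m.role === "user");
  if (firstUser === -1) {
    return [{ role: "user", content: memoryText }, ...messages];
  }
  return messages.map((m, i) =>
    i === firstUser ? { ...m, content: `${memoryText}\n\n${m.content}` } : m,
  );
}
```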
Settings are stored in the DB settings table, not in env vars.
reads from and caches the result
in-process; is called by the settings PUT
route after writes.
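
The read-through cache with explicit invalidation described above can be illustrated with a small closure. The `MemorySettings` fields and the `createSettingsCache` helper are hypothetical; the loader callback stands in for the real database read.

```typescript
// Hypothetical settings shape, for illustration only.
interface MemorySettings {
  enabled: boolean;
  maxTokens: number;
}

type SettingsLoader = () => MemorySettings;

function createSettingsCache(load: SettingsLoader) {
  let cached: MemorySettings | null = null;
  return {
    // First call hits the loader; later calls reuse the in-process copy.
    get(): MemorySettings {
      if (cached === null) cached = load();
      return cached;
    },
    // Called after a settings write so the next read sees fresh values.
    invalidate(): void {
      cached = null;
    },
  };
}
```

The trade-off of an in-process cache is that it is per process: in a multi-process deployment, each process would only see a write after its own cache is invalidated.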
(range ) |
|||
(range ) |
|||
(one of , |
|||
maps to the internal retrieval
strategy via (chronological order).
, , ,
, default ,
default ) are read by
in .
No or env vars exist today; everything is per-instance
DB settings. (commented out in ) is
unrelated and refers to Node heap sizing.
compacts older
content when the running token total over a key's memories exceeds the
budget. It iterates rows DESC by , keeps rows that fit, and for
the rest replaces in place with the first three sentences of the
original. is the difference in between old and
new content.
Summarisation is available but not called automatically in the current
chat pipeline; call it from a cron, an admin action, or
glue if you need ongoing compaction. Compaction is lossy and
one-way: the original text is overwritten.
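
The compaction pass described above (newest rows first, keep what fits, truncate the rest to three sentences) could be sketched like this. The `compact` and `firstSentences` helpers, the row shape, and the four-characters-per-token estimate are assumptions for the example; the sketch returns new rows instead of writing to a database.

```typescript
interface Row {
  id: string;
  content: string;
  createdAt: number;
}

// Rough estimate (about four characters per token); an assumption.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function firstSentences(text: string, count: number): string {
  const sentences = text.match(/[^.!?]+[.!?]+/g);
  if (sentences === null) return text; // no sentence terminators found
  return sentences.slice(0, count).join("").trim();
}

function compact(
  rows: Row[],
  budget: number,
): { rows: Row[]; tokensSaved: number } {
  // Newest first, so recent memories are the ones kept verbatim.
  const newestFirst = [...rows].sort((a, b) => b.createdAt - a.createdAt);
  let running = 0;
  let tokensSaved = 0;
  const result = newestFirst.map((row) => {
    const cost = estimateTokens(row.content);
    if (running + cost <= budget) {
      running += cost;
      return row;
    }
    // Over budget: replace the content with its first three sentences.
    const summary = firstSentences(row.content, 3);
    tokensSaved += cost - estimateTokens(summary);
    return { ...row, content: summary };
  });
  return { rows: result, tokensSaved };
}
```

Since the original text cannot be recovered afterwards, a caller may want to export or back up rows before running a pass like this.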
).
, , |
||
, , optional |
||
round-trip create, then list, then delete to confirm the store is alive. Returns |
||
, , |
The list query supports either -based pagination
() or raw ; when is present it
takes precedence and a derived is computed for the response shape.
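
One way to resolve the two pagination styles into a single response shape is sketched below. The parameter names `page`, `limit`, and `offset`, the default limit of 20, and the `resolvePagination` helper are all assumptions for illustration; the actual query parameter names are not reproduced here.

```typescript
// Hypothetical query and response shapes.
interface ListQuery {
  page?: number;
  limit?: number;
  offset?: number;
}

interface ResolvedPage {
  page: number;
  limit: number;
  offset: number;
}

function resolvePagination(query: ListQuery, defaultLimit = 20): ResolvedPage {
  const limit = Math.max(1, Math.floor(query.limit ?? defaultLimit));
  if (query.offset !== undefined) {
    // A raw offset wins; the page number is derived from it so the
    // response shape stays the same for both styles.
    const offset = Math.max(0, Math.floor(query.offset));
    return { page: Math.floor(offset / limit) + 1, limit, offset };
  }
  const page = Math.max(1, Math.floor(query.page ?? 1));
  return { page, limit, offset: (page - 1) * limit };
}
```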
- Wraps with , optionally
  filters by , and reports .
- Wraps .
- Lists matching
  entries, optionally filters by created-before timestamp, then deletes each
  via .
See MCP-SERVER.md for transport and scope details.
provides:
/ / / all)., (the latter two come
from the API stats payload). ().
keeps an in-process LRU-ish cache
(, , with 20 %
oldest eviction) for reads, plus a generic key/value
layer () with //
methods used by callers that want their own scoped cache (1 000-entry LRU,
default TTL 5 min).
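
For readers unfamiliar with the pattern, a generic key/value cache with a size cap and per-entry TTL can be built on a `Map`, which preserves insertion order. The class below is a minimal sketch under that assumption; `TtlLruCache` and its method names are hypothetical and not the project's actual cache API.

```typescript
class TtlLruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private maxEntries: number;
  private defaultTtlMs: number;

  constructor(maxEntries = 1000, defaultTtlMs = 5 * 60 * 1000) {
    this.maxEntries = maxEntries;
    this.defaultTtlMs = defaultTtlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.entries.get(key);
    if (hit === undefined) return undefined;
    if (hit.expiresAt <= now) {
      this.entries.delete(key); // expired entries are dropped lazily
      return undefined;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V, ttlMs?: number, now: number = Date.now()): void {
    const ttl = ttlMs ?? this.defaultTtlMs;
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + ttl });
    // Evict from the front of the Map: the least recently used keys.
    while (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest);
    }
  }

  delete(key: string): boolean {
    return this.entries.delete(key);
  }

  get size(): number {
    return this.entries.size;
  }
}
```

This sketch evicts one entry at a time; evicting a batch of the oldest entries at once, as the 20 % rule above does, trades a little precision for fewer eviction passes.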
). Without an neither retrieval nor injection
nor extraction runs.
- are filtered out of retrieval; old
entries beyond
are excluded by the
clause in .
- or
.
- ; failures are logged under
and never surface to the caller.
- ) clean up their own
test entries in a
block.
setting injects tool
definitions alongside memory.
- MCP-SERVER.md: MCP transport / scopes.
- API_REFERENCE.md: broader API surface.
- Tuto_Qdrant.md: repository-root Qdrant setup tutorial (integration currently dormant; see status banner at top of that file).
- ,
- ,
,
- ,
,
- ,
,
- ,
,
- ,
,
- (injection / extraction wiring)