Hermes-inspired memory hardening: hybrid retrieval, trust feedback, auto-consolidation, tiered context, embedding blobs by idapixl · Pull Request #35 · Fozikio/cortex-engine

idapixl · 2026-06-10T21:06:45Z

Summary

Audit of cortex-engine's storage and retrieval systems, cross-referenced against Hermes Agent (Nous Research, MIT). Full findings in docs/hermes-audit.md.

Audit fixes

last_retrieval_score, last_hop_count, memory_origin were silently dropped by the SQLite and Firestore backends — the dream pipeline's FSRS rating feedback loop never fired. Now persisted as real columns with ALTER TABLE migration shims.
Six secondary indexes added (edges, observations, memories, ops, beliefs) — these queries were all full table scans.

Patterns borrowed from Hermes

Hybrid retrieval + trust feedback — searchText() on CortexStore (FTS5/BM25 on SQLite with trigger-synced external-content index; weighted token-overlap fallback on JSON/Firestore). Lexical hits merge into the vector candidate set in query, re-scored by cosine. New feedback tool applies asymmetric confidence deltas (+0.05 helpful / −0.10 unhelpful) so polluted memories decay out of top ranks quickly.
Auto-consolidation — SessionConsolidator triggers dreamPhaseA (NREM: cluster → refine → create) in the background after 10 pending observations per namespace; SIGTERM/SIGINT/beforeExit flush so sessions that end early don't strand knowledge.
Tiered context loading — new context tool with L0 (~100 tokens, salience × FSRS retrievability), L1 (~2k tokens, semantic top-15 + one-hop edges), L2 (multi-anchor retrieval + 2-hop spreading activation + full metadata).
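
As a rough illustration of the asymmetric trust deltas mentioned above, here is a minimal sketch rather than the code in src/tools/feedback.ts. Only the two delta values (+0.05 and −0.10) come from this description; the function name `applyFeedback` and the [0, 1] clamping range are assumptions made for the example.

```typescript
// Deltas taken from the PR description.
const HELPFUL_DELTA = 0.05;
const UNHELPFUL_DELTA = -0.1;

// Assumed bounds: the PR mentions floor/ceiling clamping but not the values.
const MIN_CONFIDENCE = 0;
const MAX_CONFIDENCE = 1;

// Hypothetical helper: returns the new confidence after one feedback event.
function applyFeedback(confidence: number, helpful: boolean): number {
  const next = confidence + (helpful ? HELPFUL_DELTA : UNHELPFUL_DELTA);
  return Math.min(MAX_CONFIDENCE, Math.max(MIN_CONFIDENCE, next));
}
```

Under these assumptions, one helpful and one unhelpful rating leave a memory 0.05 below where it started, which is what lets bad retrievals sink faster than good ones rise.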

Embedding storage migration

SQLite embeddings are now raw Float32Array blobs (~4× smaller, parse-free reads) instead of JSON text. Legacy rows are converted in place at store-open time (idempotent). verifyMigration compares embeddings at float32 precision so json→sqlite migrations verify clean.
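
For readers unfamiliar with the blob layout, the round trip can be sketched as follows. This is an illustrative sketch, not the code in src/stores/sqlite.ts: the helper names `encodeEmbedding` and `decodeEmbedding` are hypothetical, and it assumes the blob is simply the raw bytes of a Float32Array in platform byte order.

```typescript
// Hypothetical helper: pack an embedding into raw float32 bytes (4 bytes per value).
function encodeEmbedding(values: number[]): Uint8Array {
  const floats = Float32Array.from(values);
  return new Uint8Array(floats.buffer);
}

// Hypothetical helper: view the stored bytes as float32 values again.
// The bytes are copied first so the Float32Array starts at an aligned offset,
// which a blob handed back by a database driver does not guarantee.
function decodeEmbedding(blob: Uint8Array): Float32Array {
  const copy = new Uint8Array(blob);
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}
```

Values that are not exactly representable in float32 (0.1, for example) come back as their float32 rounding, which is why verification has to compare at float32 precision.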

Test plan

28 new tests: FTS5 search + sync (including the INSERT OR REPLACE / recursive_triggers edge case), lexical fallback, feedback deltas + clamping + audit log, consolidator thresholds/flush/error-tolerance, blob round-trips, legacy text→blob conversion, idempotency.
Full suite: 18 files, 150 tests, all passing. tsc clean.

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4

Generated by Claude Code

Inspired by Hermes Agent's Holographic memory provider, this adds three interlocking improvements to the cortex-engine cognitive loop:

1. Hybrid lexical+vector retrieval (FTS5 on SQLite, token-overlap fallback on JSON/Firestore). The `query` tool now merges BM25 full-text hits into the semantic candidate set, re-scoring them by cosine before ranking. Exact IDs, proper nouns, and rare terms that embeddings miss are now surfaced. Controlled by `lexical: false` to opt out.

2. Asymmetric trust feedback tool. New `feedback` tool lets agents close the retrieval loop: helpful memories gain +0.05 confidence, unhelpful ones lose -0.10. The asymmetry mirrors Hermes' holographic trust scoring: bad retrievals decay out of rankings faster than good ones earn their way in. Every event logged to `feedback_log` for `retrieval_audit`.

3. Schema hardening on SQLite. `last_retrieval_score`, `last_hop_count`, and `memory_origin` are now first-class persisted columns (with migration shims for existing DBs). FTS5 external-content index kept in sync by triggers. Six missing indexes added (edges, obs, memories, ops, beliefs). `recursive_triggers = ON` so INSERT OR REPLACE correctly fires the FTS delete trigger on upserts.

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
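
To illustrate the merge step described in item 1, here is a minimal sketch, not the code from src/tools/query.ts. The `Candidate` shape and the `mergeAndRescore` helper are hypothetical names introduced for this example; the only behavior taken from the commit message is that lexical hits join the vector candidate set and everything is ranked by cosine similarity to the query embedding.

```typescript
// Hypothetical shapes for illustration only.
interface Candidate { id: string; embedding: number[]; }
interface Scored { id: string; score: number; }

// Plain cosine similarity; returns 0 when either vector has zero length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Union the two hit lists by id, then rank every candidate by cosine to the query.
function mergeAndRescore(
  query: number[],
  vectorHits: Candidate[],
  lexicalHits: Candidate[],
  limit: number,
): Scored[] {
  const byId = new Map<string, Candidate>();
  for (const c of vectorHits.concat(lexicalHits)) {
    if (!byId.has(c.id)) byId.set(c.id, c);
  }
  return Array.from(byId.values())
    .map((c) => ({ id: c.id, score: cosine(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Because the lexical hits are re-scored on the same cosine scale as the vector hits, a keyword match only reaches the top ranks if it is also semantically close to the query; it just gets a chance to be considered.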

Two of the three Hermes Agent patterns that were not yet in cortex-engine:

## Thing 2: Automatic session-end memory extraction

Hermes syncs conversation turns to memory after each response and extracts on session end. SessionConsolidator (engines/auto-consolidate.ts) replicates this loop:

- observe / wonder / speculate call consolidator.notifyObservation() after every successful write.
- When pending count crosses AUTO_THRESHOLD (10) per namespace, dreamPhaseA fires in the background without blocking the calling tool. Phase A only (NREM: cluster → refine → create), which is lightweight enough to run per session; REM stays in the scheduled dream cron.
- SIGTERM / SIGINT / beforeExit handlers flush all namespaces with unprocessed observations before the process dies.
- Background errors are swallowed (best-effort); CORTEX_DEBUG=1 surfaces them to stderr.

## Thing 3: Tiered context loading (L0 / L1 / L2)

New `context` tool mirrors Hermes OpenViking's progressive context tiers:

- L0 (~100 tokens): top-3 by salience × FSRS retrievability. One vector search, no LLM call. Designed for system-prompt injection on every turn.
- L1 (~2k tokens): semantic top-15, full definitions, tags, immediate graph edges (one hop). Working-memory refresh mid-conversation.
- L2 (full): multi-anchor retrieval (4 query reformulations, Borda count), spreading activation (2 hops), full metadata including provenance, FSRS state, activation path. Maximum recall for deep research tasks.

All tiers support HyDE expansion (default on, disable with hyde: false). L0 always skips HyDE; it is the latency-zero path.

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
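
The threshold behaviour described under Thing 2 can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the SessionConsolidator in engines/auto-consolidate.ts: the class name `ThresholdConsolidator` and the injected `run` callback (standing in for dreamPhaseA) are hypothetical, while the per-namespace counter, the default threshold of 10, the counter reset, and the swallowed background errors follow the commit message.

```typescript
// Hypothetical callback standing in for the Phase A consolidation of one namespace.
type ConsolidateFn = (namespace: string) => Promise<void>;

class ThresholdConsolidator {
  private pending = new Map<string, number>();
  private inFlight = new Set<Promise<void>>();
  private run: ConsolidateFn;
  private threshold: number;

  constructor(run: ConsolidateFn, threshold: number = 10) {
    this.run = run;
    this.threshold = threshold;
  }

  // Called after each successful write; never blocks the caller.
  notifyObservation(namespace: string): void {
    const count = (this.pending.get(namespace) ?? 0) + 1;
    if (count >= this.threshold) {
      this.pending.set(namespace, 0); // reset the counter before firing
      this.fire(namespace);
    } else {
      this.pending.set(namespace, count);
    }
  }

  // Consolidate every namespace that still has pending observations,
  // then wait for all background runs to settle.
  async flush(): Promise<void> {
    this.pending.forEach((count, namespace) => {
      if (count > 0) {
        this.pending.set(namespace, 0);
        this.fire(namespace);
      }
    });
    await Promise.all(Array.from(this.inFlight));
  }

  private fire(namespace: string): void {
    const task: Promise<void> = this.run(namespace)
      .catch(() => { /* best-effort: background errors are swallowed */ })
      .then(() => { this.inFlight.delete(task); });
    this.inFlight.add(task);
  }
}
```

Tracking in-flight promises is a design choice of this sketch: it lets `flush()` wait for runs that a threshold crossing already started, not only the ones it starts itself.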

27 new tests:

- search-text.test.ts: FTS5 keyword search (name/definition/tags), faded exclusion, MATCH-syntax injection safety, trigger sync across updateMemory and upsertMemory (the INSERT OR REPLACE path that needs recursive_triggers), FTS rebuild on reopened DBs, JSON lexical fallback ranking, and round-trip persistence of last_retrieval_score / last_hop_count / memory_origin.
- feedback.test.ts: asymmetric deltas (+0.05/-0.10), floor/ceiling clamping, access reinforcement only on helpful, feedback_log contents, unknown-id and missing-arg errors. Runs against real in-memory SQLite so the withTransaction path is exercised.
- auto-consolidate.test.ts: threshold triggering (exactly at 10, not below), counter reset, per-namespace isolation, flush() draining, and error swallowing.

docs/hermes-audit.md records the audit findings (severity + fix), the three Hermes patterns borrowed, and the gaps deliberately left open (embedding blob storage, ANN scaling, generic-collection scans).

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4

Embeddings were stored as JSON text (~4x larger, parsed on every read) even though the read path already understood Float32Array blobs. All write paths (putMemory, updateMemory, upsertMemory, putObservation, upsertObservation) now encode blobs, and legacy JSON-text rows are converted in place at store-open time — idempotent, only text-typed rows are touched. Because float32 truncation changes embedding values vs the float64 kept by the JSON backend, verifyMigration now compares embeddings at float32 precision (Math.fround) so json->sqlite migrations verify clean. https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
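
The float32-precision comparison can be illustrated with a small sketch. It is not the code in src/bin/migrate-cmd.ts; the helper name `embeddingsMatchAtFloat32` is hypothetical, and the only detail taken from the commit message is that both sides are passed through Math.fround before being compared.

```typescript
// Hypothetical helper: true when two embeddings are equal after rounding
// every component to the nearest float32 value.
function embeddingsMatchAtFloat32(
  a: ArrayLike<number>,
  b: ArrayLike<number>,
): boolean {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    if (Math.fround(a[i]) !== Math.fround(b[i])) return false;
  }
  return true;
}
```

A float64 value such as 0.1 is not strictly equal to its float32 rounding, so a naive element-wise comparison between the JSON backend and the SQLite blobs would report mismatches even though nothing was lost beyond the expected truncation.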

Copilot

Pull request overview

This PR hardens cortex-engine’s memory subsystem (storage + retrieval) with Hermes-inspired patterns: hybrid lexical+vector retrieval with trust feedback, automatic session consolidation, tiered context loading, and SQLite embedding/storage upgrades (FTS5, indexes, float32-blob embeddings, and schema migrations).

Changes:

Adds hybrid retrieval via CortexStore.searchText() (FTS5/BM25 on SQLite; token-overlap fallback for JSON/Firestore) and merges lexical hits into query.
Introduces new memory tools: feedback (asymmetric confidence deltas + audit log) and context (L0/L1/L2 tiered loading).
Upgrades SQLite: persists retrieval-feedback fields, adds secondary indexes + trigger-synced FTS5 table, and migrates embeddings from JSON text to Float32Array BLOBs (idempotent).

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.


| File | Description |
| --- | --- |
| src/tools/wonder.ts | Notifies the session auto-consolidator after writes. |
| src/tools/speculate.ts | Notifies the session auto-consolidator after writes. |
| src/tools/query.ts | Hybrid lexical+vector recall and additional response metadata. |
| src/tools/observe.ts | Notifies the session auto-consolidator after writes. |
| src/tools/feedback.ts | New tool: asymmetric trust scoring + feedback log writes. |
| src/tools/feedback.test.ts | Tests for feedback deltas, clamping, logging, and errors. |
| src/tools/context.ts | New tool: tiered context retrieval (L0/L1/L2). |
| src/stores/sqlite.ts | Schema fixes, indexes, FTS5+triggers, embedding blobs + migration, `searchText()`. |
| src/stores/sqlite.test.ts | Tests for blob embeddings and legacy JSON→BLOB migration/idempotency. |
| src/stores/search-text.test.ts | Tests for SQLite FTS5 search, JSON fallback lexical search, and field persistence. |
| src/stores/json.ts | Implements `searchText()` via shared lexical fallback. |
| src/stores/firestore.ts | Persists retrieval-feedback fields + lexical fallback `searchText()` implementation. |
| src/stores/_lexical.ts | New shared token-overlap lexical search implementation. |
| src/namespace/scoped-store.ts | Pass-through implementation for `searchText()`. |
| src/mcp/tools.ts | Registers `context` and `feedback` tools; adds consolidator to tool context. |
| src/mcp/server.ts | Instantiates `SessionConsolidator` and hooks shutdown flush handlers. |
| src/engines/memory.ts | Exports `cosineSimilarity` for reuse. |
| src/engines/auto-consolidate.ts | New `SessionConsolidator` (threshold-triggered background Phase A + flush). |
| src/engines/auto-consolidate.test.ts | Tests for consolidator thresholds, namespace isolation, flush, and error tolerance. |
| src/core/store.ts | Adds `searchText()` to the `CortexStore` interface. |
| src/bin/migrate-cmd.ts | Normalizes embedding comparisons at float32 precision for migration verification. |
| docs/hermes-audit.md | Audit write-up describing findings, fixes, and known gaps. |


 export function createTools(): ToolDefinition[] {
  return [
    // Core cognitive tools
+    contextTool,
    queryTool,
+    feedbackTool,
    observeTool,


+  // 7c. Flush pending observations to memory on shutdown
+  const consolidatorFlush = () => { consolidator.flush().catch(() => {}); };
+  process.once('SIGTERM', consolidatorFlush);
+  process.once('SIGINT', consolidatorFlush);
+  process.once('beforeExit', consolidatorFlush);


+      const rawEmbedding = await ctx.embed.embed(text);
+      const candidates = await store.findNearest(rawEmbedding, 20);
+      const now = new Date();
+
+      const scored = candidates.map((r) => {
+        const daysSince = r.memory.fsrs.last_review
+          ? elapsedDaysSince(r.memory.fsrs.last_review)
+          : 0;
+        const ret = retrievability(r.memory.fsrs.stability, daysSince);
+        return { r, score: r.memory.salience * ret };
+      });
+
+      const top = scored
+        .sort((a, b) => b.score - a.score)
+        .slice(0, 3);
+
+      void now;


+      const nearest = await store.findNearest(embedding, 15);
+
+      const now = new Date();
+      const results = await Promise.all(
+        nearest.map(async (r) => {
+          const daysSince = r.memory.fsrs.last_review
+            ? elapsedDaysSince(r.memory.fsrs.last_review)
+            : 0;
+          const ret = retrievability(r.memory.fsrs.stability, daysSince);
+          const salienceFactor = 0.5 + r.memory.salience * 0.5;
+          const compositeScore = r.score * ret * salienceFactor;
+
+          const edges = await store.getEdgesFrom(r.memory.id);
+          const links = edges.slice(0, 5).map((e) => ({
+            target_id: e.target_id,
+            relation: e.relation,
+            weight: e.weight,
+          }));
+
+          return { r, compositeScore, ret, links };
+        }),
+      );
+
+      void now;


…down, remove dead vars

- context and feedback tools were gated behind namespace cognitive_tools config and not in CORE_TOOLS, so they never appeared in ListTools. Added both to CORE_TOOLS so they are always active like query/observe.
- SIGTERM/SIGINT consolidator flush handler returned immediately, leaving the flush promise racing against process exit. Handlers now call process.exit(0) in the .finally() callback so the process stays alive until flush completes. beforeExit keeps the existing pattern (flush promise holds the event loop).
- Removed two dead `now` variable declarations in context.ts L0/L1 handlers (elapsedDaysSince() computes its own reference time internally).

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
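
The shutdown fix described in this commit can be sketched as follows. This is a minimal illustration, not the code in src/mcp/server.ts: the `makeShutdownHandler` factory, the injected `exit` callback, and the `started` guard against repeated signals are assumptions added for the example. The commit message mentions `.finally()`; the sketch chains `.then()` after `.catch()`, which behaves the same way here because the error has already been swallowed.

```typescript
// Hypothetical factory: builds a signal handler that exits only after the
// flush promise has settled, instead of returning immediately.
function makeShutdownHandler(
  flush: () => Promise<void>,
  exit: (code: number) => void,
): () => void {
  let started = false; // assumption: ignore repeated signals while flushing
  return () => {
    if (started) return;
    started = true;
    flush()
      .catch(() => { /* best-effort flush: errors do not block shutdown */ })
      .then(() => exit(0));
  };
}

// Intended wiring (assumes a Node.js process and a `consolidator` instance):
//   const onSignal = makeShutdownHandler(() => consolidator.flush(), (c) => process.exit(c));
//   process.once('SIGTERM', onSignal);
//   process.once('SIGINT', onSignal);
```

Injecting `exit` rather than calling `process.exit` directly is only there to keep the sketch testable without terminating the process.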

claude added 4 commits June 10, 2026 04:16

Copilot AI review requested due to automatic review settings June 10, 2026 21:06

Copilot started reviewing on behalf of idapixl June 10, 2026 21:06 View session

Copilot AI reviewed Jun 10, 2026


idapixl merged commit d1afb05 into master Jun 10, 2026
9 checks passed

idapixl merged 5 commits into master from claude/hermes-agent-audit-91i4n6


3 participants
