
Hermes-inspired memory hardening: hybrid retrieval, trust feedback, auto-consolidation, tiered context, embedding blobs #35

Merged
idapixl merged 5 commits into master from claude/hermes-agent-audit-91i4n6 on Jun 10, 2026

@idapixl (Collaborator) commented Jun 10, 2026

Summary

Audit of cortex-engine's storage and retrieval systems, cross-referenced against Hermes Agent (Nous Research, MIT). Full findings in docs/hermes-audit.md.

Audit fixes

  • last_retrieval_score, last_hop_count, memory_origin were silently dropped by the SQLite and Firestore backends — the dream pipeline's FSRS rating feedback loop never fired. Now persisted as real columns with ALTER TABLE migration shims.
  • Six secondary indexes added (edges, observations, memories, ops, beliefs) — these queries were all full table scans.
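
A migration shim of the kind mentioned in the first bullet might be sketched as follows. This is only an illustration, not the PR's code: the `memories` table name, the SQL column types, and the `pendingMigrations` helper are assumptions. The function is pure, so it can be checked without a database; in practice the `existing` list would come from something like `PRAGMA table_info`.

```typescript
// Hypothetical column spec. The three column names come from the PR
// description; the SQL types are assumptions.
interface ColumnSpec {
  name: string;
  sqlType: string;
}

const FEEDBACK_COLUMNS: ColumnSpec[] = [
  { name: "last_retrieval_score", sqlType: "REAL" },
  { name: "last_hop_count", sqlType: "INTEGER" },
  { name: "memory_origin", sqlType: "TEXT" },
];

// Given the columns a table already has, return the ALTER TABLE statements
// that are still needed. Once every column exists the result is empty, so
// running the shim on each store open is a no-op for migrated databases.
function pendingMigrations(
  table: string,
  existing: string[],
  wanted: ColumnSpec[],
): string[] {
  const have = new Set(existing.map((c) => c.toLowerCase()));
  return wanted
    .filter((c) => !have.has(c.name.toLowerCase()))
    .map((c) => `ALTER TABLE ${table} ADD COLUMN ${c.name} ${c.sqlType}`);
}
```

Checking for the column before altering matters because SQLite's `ALTER TABLE ... ADD COLUMN` fails if the column already exists.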

Patterns borrowed from Hermes

  1. Hybrid retrieval + trust feedback: searchText() on CortexStore (FTS5/BM25 on SQLite with a trigger-synced external-content index; weighted token-overlap fallback on JSON/Firestore). Lexical hits merge into the vector candidate set in query and are re-scored by cosine. The new feedback tool applies asymmetric confidence deltas (+0.05 helpful / −0.10 unhelpful) so polluted memories decay out of top ranks quickly.
  2. Auto-consolidation: SessionConsolidator triggers dreamPhaseA (NREM: cluster → refine → create) in the background after 10 pending observations per namespace; SIGTERM/SIGINT/beforeExit handlers flush pending work so sessions that end early don't strand knowledge.
  3. Tiered context loading: new context tool with L0 (~100 tokens, salience × FSRS retrievability), L1 (~2k tokens, semantic top-15 + one-hop edges), L2 (multi-anchor retrieval + 2-hop spreading activation + full metadata).
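
The asymmetric delta in item 1 can be illustrated with a small sketch. The two delta values are taken from the description above; the `applyFeedback` name and the [0, 1] clamp bounds are assumptions made for the example.

```typescript
const HELPFUL_DELTA = 0.05; // value from the PR description
const UNHELPFUL_DELTA = -0.1; // value from the PR description

// Apply one feedback event to a confidence score and clamp the result.
// The default floor and ceiling are assumed; the PR only states that
// clamping exists.
function applyFeedback(
  confidence: number,
  helpful: boolean,
  floor = 0,
  ceiling = 1,
): number {
  const next = confidence + (helpful ? HELPFUL_DELTA : UNHELPFUL_DELTA);
  return Math.min(ceiling, Math.max(floor, next));
}
```

Because the penalty is twice the reward, a memory that is helpful half the time still loses confidence on average, which is what pushes unreliable memories out of the top ranks.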

Embedding storage migration

SQLite embeddings are now raw Float32Array blobs (~4× smaller, parse-free reads) instead of JSON text. Legacy rows are converted in place at store-open time (idempotent). verifyMigration compares embeddings at float32 precision so json→sqlite migrations verify clean.
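
A rough sketch of the blob round-trip, assuming the store hands blobs back as byte arrays. The `encodeEmbedding` and `decodeEmbedding` names are hypothetical, and the byte order is whatever the host platform uses.

```typescript
// Encode an embedding as raw float32 bytes: 4 bytes per dimension.
function encodeEmbedding(values: number[]): Uint8Array {
  const f32 = Float32Array.from(values);
  return new Uint8Array(f32.buffer);
}

// Decode raw bytes back into a Float32Array. The bytes are copied first
// because a blob returned by a driver may sit at a byte offset that is not
// a multiple of 4, which a Float32Array view cannot be created on.
function decodeEmbedding(blob: Uint8Array): Float32Array {
  const copy = blob.slice();
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}
```

Values come back rounded to float32 precision, so `0.1` decodes as `Math.fround(0.1)` rather than the original float64.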

Test plan

  • 28 new tests: FTS5 search + sync (including the INSERT OR REPLACE / recursive_triggers edge case), lexical fallback, feedback deltas + clamping + audit log, consolidator thresholds/flush/error-tolerance, blob round-trips, legacy text→blob conversion, idempotency.
  • Full suite: 18 files, 150 tests, all passing. tsc clean.

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4


Generated by Claude Code

claude added 4 commits June 10, 2026 04:16
Inspired by Hermes Agent's Holographic memory provider, this adds three
interlocking improvements to the cortex-engine cognitive loop:

1. Hybrid lexical+vector retrieval (FTS5 on SQLite, token-overlap fallback
   on JSON/Firestore). The `query` tool now merges BM25 full-text hits into
   the semantic candidate set, re-scoring them by cosine before ranking.
   Exact IDs, proper nouns, and rare terms that embeddings miss are now
   surfaced. Controlled by `lexical: false` to opt out.

2. Asymmetric trust feedback tool. New `feedback` tool lets agents close
   the retrieval loop: helpful memories gain +0.05 confidence, unhelpful
   ones lose -0.10. The asymmetry mirrors Hermes' holographic trust scoring
   — bad retrievals decay out of rankings faster than good ones earn their
   way in. Every event logged to `feedback_log` for `retrieval_audit`.

3. Schema hardening on SQLite. `last_retrieval_score`, `last_hop_count`,
   and `memory_origin` are now first-class persisted columns (with
   migration shims for existing DBs). FTS5 external-content index kept in
   sync by triggers. Six missing indexes added (edges, obs, memories, ops,
   beliefs). `recursive_triggers = ON` so INSERT OR REPLACE correctly
   fires the FTS delete trigger on upserts.
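
The merge step in item 1 could look roughly like this sketch: take the union of vector and lexical candidates, then rank everything by cosine similarity to the query embedding. The `Candidate` shape and the `mergeHybrid` name are assumptions; the actual `query` tool may weight or filter differently.

```typescript
// Hypothetical candidate shape for the example.
interface Candidate {
  id: string;
  embedding: number[];
}

interface Scored {
  id: string;
  score: number;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Union the two candidate sets by id, then re-score every candidate by
// cosine so lexical-only hits compete on the same scale as vector hits.
function mergeHybrid(
  query: number[],
  vectorHits: Candidate[],
  lexicalHits: Candidate[],
  limit: number,
): Scored[] {
  const byId = new Map<string, Candidate>();
  vectorHits.concat(lexicalHits).forEach((c) => {
    if (!byId.has(c.id)) byId.set(c.id, c);
  });
  const scored: Scored[] = [];
  byId.forEach((c) => scored.push({ id: c.id, score: cosine(query, c.embedding) }));
  return scored.sort((x, y) => y.score - x.score).slice(0, limit);
}
```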

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
This commit adds two of the three Hermes Agent patterns that were not yet in cortex-engine:

## Thing 2 — Automatic session-end memory extraction

Hermes syncs conversation turns to memory after each response and
extracts on session end. SessionConsolidator (engines/auto-consolidate.ts)
replicates this loop:

- observe / wonder / speculate call consolidator.notifyObservation() after
  every successful write.
- When pending count crosses AUTO_THRESHOLD (10) per namespace,
  dreamPhaseA fires in the background without blocking the calling tool.
  Phase A only (NREM: cluster → refine → create) — lightweight enough to
  run per session; REM stays in the scheduled dream cron.
- SIGTERM / SIGINT / beforeExit handlers flush all namespaces with
  unprocessed observations before the process dies.
- Background errors are swallowed (best-effort); CORTEX_DEBUG=1 surfaces
  them to stderr.
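
The threshold-and-flush behaviour listed above can be sketched as follows. This is a simplified stand-in, not the SessionConsolidator from engines/auto-consolidate.ts: the `ThresholdConsolidator` class, the `PhaseRunner` type, and the injected error callback are assumptions for the example.

```typescript
// Stand-in for dreamPhaseA: any async job that consolidates one namespace.
type PhaseRunner = (namespace: string) => Promise<void>;

class ThresholdConsolidator {
  private pending = new Map<string, number>();
  private run: PhaseRunner;
  private threshold: number;
  private onError: (err: unknown) => void;

  constructor(
    run: PhaseRunner,
    threshold = 10,
    onError: (err: unknown) => void = () => {},
  ) {
    this.run = run;
    this.threshold = threshold;
    this.onError = onError;
  }

  // Called after every successful write. Never awaits the runner, so the
  // calling tool is not blocked; background errors go to onError only.
  notifyObservation(namespace: string): void {
    const count = (this.pending.get(namespace) ?? 0) + 1;
    if (count < this.threshold) {
      this.pending.set(namespace, count);
      return;
    }
    this.pending.set(namespace, 0);
    this.run(namespace).catch(this.onError);
  }

  // Run the job for every namespace that still has pending observations.
  flush(): Promise<void> {
    const tasks: Promise<void>[] = [];
    this.pending.forEach((count, namespace) => {
      if (count > 0) {
        this.pending.set(namespace, 0);
        tasks.push(this.run(namespace).catch(this.onError));
      }
    });
    return Promise.all(tasks).then(() => undefined);
  }
}
```

Resetting the counter before starting the job keeps observations that arrive during a run counted toward the next threshold.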

## Thing 3 — Tiered context loading (L0 / L1 / L2)

New `context` tool mirrors Hermes OpenViking's progressive context tiers:

- L0 (~100 tokens): top-3 by salience × FSRS retrievability. One vector
  search, no LLM call. Designed for system-prompt injection on every turn.
- L1 (~2k tokens): semantic top-15, full definitions, tags, immediate
  graph edges (one hop). Working-memory refresh mid-conversation.
- L2 (full): multi-anchor retrieval (4 query reformulations, Borda count),
  spreading activation (2 hops), full metadata including provenance,
  FSRS state, activation path. Maximum recall for deep research tasks.

All tiers support HyDE expansion (default on, disable with hyde: false).
L0 always skips HyDE — it is the latency-zero path.
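
The Borda count used to fuse the L2 reformulation results can be illustrated with a minimal sketch. It uses one common Borda variant (the top item of a list of n gets n points); the `bordaFuse` name and the alphabetical tie-break are assumptions, and the tool's own scoring may differ.

```typescript
// Fuse several ranked id lists into one ranking by summing Borda points.
function bordaFuse(rankings: string[][], limit: number): string[] {
  const points = new Map<string, number>();
  rankings.forEach((ranking) => {
    ranking.forEach((id, position) => {
      // First place in a list of n earns n points, last place earns 1.
      const earned = ranking.length - position;
      points.set(id, (points.get(id) ?? 0) + earned);
    });
  });
  const ids: string[] = [];
  points.forEach((_score, id) => ids.push(id));
  return ids
    .sort((a, b) => (points.get(b) ?? 0) - (points.get(a) ?? 0) || a.localeCompare(b))
    .slice(0, limit);
}
```

An item that ranks consistently well across reformulations beats one that tops a single list, which is the point of multi-anchor retrieval.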

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
27 new tests:

- search-text.test.ts — FTS5 keyword search (name/definition/tags), faded
  exclusion, MATCH-syntax injection safety, trigger sync across
  updateMemory and upsertMemory (the INSERT OR REPLACE path that needs
  recursive_triggers), FTS rebuild on reopened DBs, JSON lexical fallback
  ranking, and round-trip persistence of last_retrieval_score /
  last_hop_count / memory_origin.

- feedback.test.ts — asymmetric deltas (+0.05/-0.10), floor/ceiling
  clamping, access reinforcement only on helpful, feedback_log contents,
  unknown-id and missing-arg errors. Runs against real in-memory SQLite
  so the withTransaction path is exercised.

- auto-consolidate.test.ts — threshold triggering (exactly at 10, not
  below), counter reset, per-namespace isolation, flush() draining, and
  error swallowing.
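
One common way to get the MATCH-syntax injection safety that search-text.test.ts checks is to quote every user token as an FTS5 string, so operators such as NEAR or parentheses are treated as plain text. The sketch below shows that approach; whether searchText() does exactly this is not shown in the PR, and the `toFtsMatch` name and the OR joining are assumptions.

```typescript
// Turn free text into an FTS5 MATCH expression made only of quoted strings.
// Inside an FTS5 string, a double quote is escaped by doubling it.
function toFtsMatch(input: string): string {
  const tokens = input.split(/\s+/).filter((t) => t.length > 0);
  return tokens.map((t) => `"${t.replace(/"/g, '""')}"`).join(" OR ");
}
```

An empty result means the input had no tokens, so a caller would skip the FTS query rather than pass an empty MATCH expression.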

docs/hermes-audit.md records the audit findings (severity + fix), the
three Hermes patterns borrowed, and the gaps deliberately left open
(embedding blob storage, ANN scaling, generic-collection scans).

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
Embeddings were stored as JSON text (~4x larger, parsed on every read)
even though the read path already understood Float32Array blobs. All
write paths (putMemory, updateMemory, upsertMemory, putObservation,
upsertObservation) now encode blobs, and legacy JSON-text rows are
converted in place at store-open time — idempotent, only text-typed
rows are touched.

Because float32 truncation changes embedding values vs the float64
kept by the JSON backend, verifyMigration now compares embeddings at
float32 precision (Math.fround) so json->sqlite migrations verify
clean.
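
The float32 comparison can be sketched like this. It is an illustration of the Math.fround idea described above, not the code in migrate-cmd.ts; the `embeddingsMatchAtFloat32` name is an assumption.

```typescript
// Compare a float64 embedding (as kept by the JSON backend) with one read
// back from a float32 blob, after rounding both sides to float32.
function embeddingsMatchAtFloat32(
  source: ArrayLike<number>,
  migrated: ArrayLike<number>,
): boolean {
  if (source.length !== migrated.length) return false;
  for (let i = 0; i < source.length; i++) {
    if (Math.fround(source[i]) !== Math.fround(migrated[i])) return false;
  }
  return true;
}
```

A strict comparison would report a mismatch for most values: `0.1` as float64 is not equal to `0.1` after a float32 round-trip, even though nothing was corrupted.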

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
Copilot AI review requested due to automatic review settings June 10, 2026 21:06

Copilot AI left a comment

Pull request overview

This PR hardens cortex-engine’s memory subsystem (storage + retrieval) with Hermes-inspired patterns: hybrid lexical+vector retrieval with trust feedback, automatic session consolidation, tiered context loading, and SQLite embedding/storage upgrades (FTS5, indexes, float32-blob embeddings, and schema migrations).

Changes:

  • Adds hybrid retrieval via CortexStore.searchText() (FTS5/BM25 on SQLite; token-overlap fallback for JSON/Firestore) and merges lexical hits into query.
  • Introduces new memory tools: feedback (asymmetric confidence deltas + audit log) and context (L0/L1/L2 tiered loading).
  • Upgrades SQLite: persists retrieval-feedback fields, adds secondary indexes + trigger-synced FTS5 table, and migrates embeddings from JSON text to Float32Array BLOBs (idempotent).

Reviewed changes

Copilot reviewed 21 out of 22 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/tools/wonder.ts | Notifies the session auto-consolidator after writes. |
| src/tools/speculate.ts | Notifies the session auto-consolidator after writes. |
| src/tools/query.ts | Hybrid lexical+vector recall and additional response metadata. |
| src/tools/observe.ts | Notifies the session auto-consolidator after writes. |
| src/tools/feedback.ts | New tool: asymmetric trust scoring + feedback log writes. |
| src/tools/feedback.test.ts | Tests for feedback deltas, clamping, logging, and errors. |
| src/tools/context.ts | New tool: tiered context retrieval (L0/L1/L2). |
| src/stores/sqlite.ts | Schema fixes, indexes, FTS5+triggers, embedding blobs + migration, searchText(). |
| src/stores/sqlite.test.ts | Tests for blob embeddings and legacy JSON→BLOB migration/idempotency. |
| src/stores/search-text.test.ts | Tests for SQLite FTS5 search, JSON fallback lexical search, and field persistence. |
| src/stores/json.ts | Implements searchText() via shared lexical fallback. |
| src/stores/firestore.ts | Persists retrieval-feedback fields + lexical fallback searchText() implementation. |
| src/stores/_lexical.ts | New shared token-overlap lexical search implementation. |
| src/namespace/scoped-store.ts | Pass-through implementation for searchText(). |
| src/mcp/tools.ts | Registers context and feedback tools; adds consolidator to tool context. |
| src/mcp/server.ts | Instantiates SessionConsolidator and hooks shutdown flush handlers. |
| src/engines/memory.ts | Exports cosineSimilarity for reuse. |
| src/engines/auto-consolidate.ts | New SessionConsolidator (threshold-triggered background Phase A + flush). |
| src/engines/auto-consolidate.test.ts | Tests for consolidator thresholds, namespace isolation, flush, and error tolerance. |
| src/core/store.ts | Adds searchText() to the CortexStore interface. |
| src/bin/migrate-cmd.ts | Normalizes embedding comparisons at float32 precision for migration verification. |
| docs/hermes-audit.md | Audit write-up describing findings, fixes, and known gaps. |

Comment thread src/mcp/tools.ts
Comment on lines 210 to 216
export function createTools(): ToolDefinition[] {
  return [
    // Core cognitive tools
    contextTool,
    queryTool,
    feedbackTool,
    observeTool,
Comment thread src/mcp/server.ts Outdated
Comment on lines +145 to +149
// 7c. Flush pending observations to memory on shutdown
const consolidatorFlush = () => { consolidator.flush().catch(() => {}); };
process.once('SIGTERM', consolidatorFlush);
process.once('SIGINT', consolidatorFlush);
process.once('beforeExit', consolidatorFlush);
Comment thread src/tools/context.ts Outdated
Comment on lines +67 to +83
const rawEmbedding = await ctx.embed.embed(text);
const candidates = await store.findNearest(rawEmbedding, 20);
const now = new Date();

const scored = candidates.map((r) => {
  const daysSince = r.memory.fsrs.last_review
    ? elapsedDaysSince(r.memory.fsrs.last_review)
    : 0;
  const ret = retrievability(r.memory.fsrs.stability, daysSince);
  return { r, score: r.memory.salience * ret };
});

const top = scored
  .sort((a, b) => b.score - a.score)
  .slice(0, 3);

void now;
Comment thread src/tools/context.ts Outdated
Comment on lines +104 to +127
const nearest = await store.findNearest(embedding, 15);

const now = new Date();
const results = await Promise.all(
  nearest.map(async (r) => {
    const daysSince = r.memory.fsrs.last_review
      ? elapsedDaysSince(r.memory.fsrs.last_review)
      : 0;
    const ret = retrievability(r.memory.fsrs.stability, daysSince);
    const salienceFactor = 0.5 + r.memory.salience * 0.5;
    const compositeScore = r.score * ret * salienceFactor;

    const edges = await store.getEdgesFrom(r.memory.id);
    const links = edges.slice(0, 5).map((e) => ({
      target_id: e.target_id,
      relation: e.relation,
      weight: e.weight,
    }));

    return { r, compositeScore, ret, links };
  }),
);

void now;
…down, remove dead vars

- context and feedback tools were gated behind namespace cognitive_tools config
  and not in CORE_TOOLS, so they never appeared in ListTools. Added both to
  CORE_TOOLS so they are always active like query/observe.

- SIGTERM/SIGINT consolidator flush handler returned immediately, leaving the
  flush promise racing against process exit. Handlers now call process.exit(0)
  in the .finally() callback so the process stays alive until flush completes.
  beforeExit keeps the existing pattern (flush promise holds the event loop).

- Removed two dead `now` variable declarations in context.ts L0/L1 handlers
  (elapsedDaysSince() computes its own reference time internally).
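
The signal-handler fix in the second bullet can be sketched as below. The `makeShutdownHandler` helper is hypothetical, and `flush` and `exit` are injected so the example runs without a real process; in the server they would presumably be `consolidator.flush()` and `process.exit`.

```typescript
// Build a shutdown handler that exits only after the flush has settled.
// Flush errors are ignored (best-effort), and exit(0) runs either way,
// which matches calling process.exit(0) from a .finally() callback.
function makeShutdownHandler(
  flush: () => Promise<void>,
  exit: (code: number) => void,
): () => Promise<void> {
  return () =>
    flush()
      .catch(() => undefined)
      .then(() => exit(0));
}

// Possible wiring, assuming a Node.js process and a consolidator instance:
//   const onSignal = makeShutdownHandler(() => consolidator.flush(), (c) => process.exit(c));
//   process.once('SIGTERM', onSignal);
//   process.once('SIGINT', onSignal);
```

Once a listener is registered for SIGTERM or SIGINT, Node.js no longer exits by default on that signal, so the handler has to call exit itself after the flush.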

https://claude.ai/code/session_01DAZ3GzRri9hqxkTyqmSpc4
@idapixl idapixl merged commit d1afb05 into master Jun 10, 2026
9 checks passed