Memory Sidecar v3.1.0

A production memory system for any AI agent. Keep knowledge across sessions, without touching agent internals.

What This Is

AI agents forget things. Every new session starts blank.

Memory Sidecar runs alongside your agent — Hermes, Claude Code, Cursor, Codex, whatever — and gives it a real memory. It saves important conversations, builds long-term knowledge, and feeds relevant context back when needed.

It doesn't patch the agent. It's a sidecar: separate process, shared data directory.

Three things it actually does:

Archives sessions to permanent knowledge — conversations aren't lost when you restart
Recalls what matters — layered retrieval: recent context → semantic search → knowledge graph
Tracks important topics — people, projects, recurring problems get their own "dossier"

Architecture at a Glance

Agent writes sessions → state.db + session files
              ↓
Sidecar reads checkpoint, processes new sessions
              ↓
  ┌───────────┼───────────┐
  │           │           │
  ▼           ▼           ▼
Hot Layer   Warm Layer  Cold Layer
(memory     (Hindsight  (gbrain graph
 tool,      PostgreSQL)  + FTS5 search)
 5KB cap)               
              ↓
  Tiered context injection → agent's system prompt

The full stack is documented in ARCHITECTURE.md. Short version:

Layer	What	Technology	Speed
Hot	Current user + system facts	memory tool injection	0ms
Warm	Extracted facts, recurring patterns	Hindsight (PostgreSQL 16)	~50ms
Cold	Permanent archive, knowledge graph	gbrain + FTS5 session search	~500ms–2s

We dropped the intermediate agentmemory bridge layer from earlier versions. It added Docker overhead with barely any data. The current three layers are simpler, faster, and more reliable.
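
To make the hot-layer cap concrete, here is a minimal sketch of how a context block could be assembled under a byte budget. It is not the code in tiered_context_injector.py: the function names, the section labels, and the reading of "5KB" as 5 × 1024 bytes are all assumptions made for illustration.

```python
from typing import Iterable, List

# Assumption: the diagram's "5KB cap" is read here as 5 * 1024 bytes of UTF-8.
HOT_CAP_BYTES = 5 * 1024


def fit_to_cap(facts: Iterable[str], cap_bytes: int = HOT_CAP_BYTES) -> List[str]:
    """Keep facts in order until the next one would push past the cap."""
    kept: List[str] = []
    used = 0
    for fact in facts:
        size = len(fact.encode("utf-8")) + 1  # +1 for the joining newline
        if used + size > cap_bytes:
            break
        kept.append(fact)
        used += size
    return kept


def build_context(hot: Iterable[str], warm: Iterable[str], cold: Iterable[str]) -> str:
    """Concatenate the three layers, hot first, skipping empty layers."""
    sections = [
        ("Hot", fit_to_cap(hot)),  # only the hot layer is capped in this sketch
        ("Warm", list(warm)),
        ("Cold", list(cold)),
    ]
    lines: List[str] = []
    for name, items in sections:
        if items:
            lines.append(f"[{name}]")  # hypothetical section label
            lines.extend(items)
    return "\n".join(lines)
```

Cutting off at the first fact that does not fit keeps the ordering stable, so whichever facts the caller ranks first are the ones that survive.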

Quick Start

What you need

Python 3.9+
gbrain (knowledge graph, running on port 8787)
Hindsight (fact store, port 8890)
PostgreSQL 16 (backing store for both of the above)
An AI agent producing sessions (Hermes, Claude Code, etc.)

Install

git clone https://github.com/mage0535/hermes-memory-installer.git
cd hermes-memory-installer

# Set AGENT_HOME to point to your agent's data directory
export AGENT_HOME="$HOME/.hermes"   # or ~/.claude, ~/.cursor, etc.
./install.sh

The installer will:

Check your environment — Python, PostgreSQL, Hindsight, gbrain reachability
Let you pick an embedding model — for semantic search (optional but recommended)
Deploy sidecar scripts — to $AGENT_HOME/scripts/
Patch agent config — adds memory provider settings if a config file is found

Non-interactive mode:

./install.sh --noninteractive --agent-home "$HOME/.my-agent"

After Installing

# Run one archive pass
python3 "$AGENT_HOME/scripts/session_to_gbrain.py" --resume

# Run the full maintenance cycle
python3 "$AGENT_HOME/scripts/memory_maintenance_cycle.py"

# Verify everything works
python3 "$AGENT_HOME/scripts/sidecar_acceptance_check.py"

For ongoing operation, schedule the maintenance cycle via cron (or your agent's built-in scheduler). See ARCHITECTURE.md for recommended schedules.
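
If you would rather schedule a single entry point than three separate commands, a small wrapper can run the steps above in order. This is a sketch, not part of the repository: `build_commands` and `run_all` are hypothetical names, and it assumes each script exits with a non-zero status on failure.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Script names and the --resume flag come from the commands above.
STEPS = [
    ["session_to_gbrain.py", "--resume"],
    ["memory_maintenance_cycle.py"],
    ["sidecar_acceptance_check.py"],
]


def build_commands(agent_home: str) -> List[List[str]]:
    """Return one argv list per step, using the current Python interpreter."""
    scripts = Path(agent_home) / "scripts"
    return [[sys.executable, str(scripts / step[0]), *step[1:]] for step in STEPS]


def run_all(agent_home: str) -> int:
    """Run each step in order and stop at the first non-zero exit code."""
    for cmd in build_commands(agent_home):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            # Assumption: a non-zero exit code signals failure.
            return result.returncode
    return 0
```

Calling `run_all(os.environ["AGENT_HOME"])` from a scheduled job would then stop the chain as soon as one step fails, so the acceptance check never runs against a half-finished archive pass.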

The Scripts

The scripts below run the sidecar. All live in $AGENT_HOME/scripts/ after install:

Script	Role
`session_to_gbrain.py`	Incremental session → gbrain archive with MCP API bridge
`memory_governance_rebuild.py`	Rebuild session index, hubs, canonical objects, vector index
`memory_guardian.py`	Capacity monitoring, backlog detection, stuck operation recovery
`memory_family_registry.py`	Query intent classification + Focused Dossier routing
`tiered_context_injector.py`	Layered recall: Hot → Warm → Cold → RRF fusion
`memory_maintenance_cycle.py`	Orchestrator: archive → rebuild → drain → recall → health
`sidecar_acceptance_check.py`	Production validation suite
`archive_sessions.py`	Bulk session archival to gbrain (in cron at 2am)
`auto_session_summary.py`	Session digest generation, runs every 6 hours

Running in production (cron): session_to_gbrain.py, archive_sessions.py, auto_session_summary.py

Available on-demand: memory_governance_rebuild.py, memory_guardian.py, memory_family_registry.py, tiered_context_injector.py, memory_maintenance_cycle.py, sidecar_acceptance_check.py
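
The "RRF fusion" step listed for tiered_context_injector.py refers to reciprocal rank fusion, a standard way to merge several ranked lists without comparing their raw scores. The sketch below shows the general technique under stated assumptions. It is not the sidecar's implementation; `rrf_fuse` is a hypothetical name, and k = 60 is the commonly used default rather than a value taken from this project.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Higher totals rank first.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Illustrative result lists, one per layer.
hot = ["a", "b"]
warm = ["b", "c"]
cold = ["c", "b"]
fused = rrf_fuse([hot, warm, cold])
```

Here `b` appears in all three lists and ends up first, ahead of `c` (two lists) and `a` (one list). Because only ranks are used, layers with incomparable scoring (FTS5 relevance, vector distance, graph hits) can be fused directly.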

Focused Dossiers

Some things matter more than others. A key person. A long-running project. A recurring incident.

v3.1.0 lets you declare Focused Dossiers — high-priority memory profiles that get special treatment in recall. A dossier has:

aliases — all the names it's referred to by
topic markers — keywords that trigger dossier-first retrieval
retention priority — don't let this get pruned
timeline tracking — chronological entries for major events

The first production dossier is kiki — a relationship memory profile that demonstrated the pattern works at scale (hundreds of sessions, thousands of extracted facts, timeline-aware recall).

To add your own, edit memory_family_registry.py and add a new profile entry. The format is self-documenting in the file.
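
As a rough mental model, a dossier can be thought of as a small record plus a matching rule. The sketch below is illustrative only: the real profile format lives in memory_family_registry.py and may differ, and the `Dossier` class, its field names, the integer priority scale, and the substring matching are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class Dossier:
    """Hypothetical in-memory shape of a Focused Dossier."""
    name: str
    aliases: List[str] = field(default_factory=list)
    topic_markers: List[str] = field(default_factory=list)
    retention_priority: int = 0  # assumed scale: higher means pruned later
    timeline: List[Tuple[str, str]] = field(default_factory=list)  # (ISO date, event)

    def matches(self, query: str) -> bool:
        """True if the query mentions the name, an alias, or a topic marker."""
        q = query.lower()
        terms = [self.name, *self.aliases, *self.topic_markers]
        return any(term.lower() in q for term in terms)


def route(query: str, dossiers: Iterable[Dossier]) -> Optional[Dossier]:
    """Pick the matching dossier with the highest retention priority, if any."""
    hits = [d for d in dossiers if d.matches(query)]
    return max(hits, key=lambda d: d.retention_priority, default=None)


# Made-up example profile, not one that ships with the sidecar.
atlas = Dossier(
    name="project-atlas",
    aliases=["atlas"],
    topic_markers=["migration"],
    retention_priority=10,
    timeline=[("2026-01-15", "kickoff")],
)
```

A query that matches a dossier would go to dossier-first retrieval; one that returns `None` would follow the normal layered path.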

Embedding Model Selection

Semantic search needs embeddings. The sidecar supports pluggable models via sentence-transformers.

During install, you pick one. The installer records your choice but doesn't deploy the model — you run the embedding service separately.

How it affects retrieval:

Semantic matching catches meaning, not just keywords
Cross-lingual: Chinese queries find English content
Better clustering of related facts even when wording differs

Supported models:

Model	Langs	Dim	Size	Best For
`intfloat/multilingual-e5-small` ★	100+	384d	~470MB	Default. Balanced multilingual
`BAAI/bge-small-zh-v1.5`	Chinese	512d	~96MB	Tiny Chinese-first deployment
`paraphrase-multilingual-MiniLM-L12-v2`	50+	384d	~471MB	Mature ST ecosystem
`Alibaba-NLP/gte-multilingual-base`	75+	768d	~610MB	Higher recall, more RAM
`sentence-transformers/LaBSE`	109	768d	~471MB	Strong cross-lingual alignment
`BAAI/bge-m3`	100+	1024d	~2GB	Maximum precision, needs resources

Deploying the Embedding Service

pip install sentence-transformers

Minimal server:

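import json
from http.server import BaseHTTPRequestHandler, HTTPServer

from sentence_transformers import SentenceTransformer

# Load the model once at startup; swap in any model from the table above.
model = SentenceTransformer("intfloat/multilingual-e5-small")

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read and parse the JSON body: {"input": "text"} or {"input": ["text", ...]}
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length) or b"{}")
        texts = body.get("input", [])
        if isinstance(texts, str):
            texts = [texts]  # a bare string would otherwise yield one flat vector
        # Unit-normalized vectors, so cosine similarity is a plain dot product.
        emb = model.encode(texts, normalize_embeddings=True).tolist()
        payload = json.dumps({"data": [{"embedding": e} for e in emb]}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

# Bind to localhost only; the sidecar reaches it through EMBEDDING_API_URL.
HTTPServer(("127.0.0.1", 8766), Handler).serve_forever()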

Set the URL and rebuild governance:

export EMBEDDING_API_URL=http://127.0.0.1:8766/v1/embeddings
python3 "$AGENT_HOME/scripts/memory_maintenance_cycle.py"

No embedding service? No problem — text-based retrieval (FTS5, LIKE, Hindsight, gbrain) works without it.
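
If you do run the service, a quick round trip confirms it answers in the shape the minimal server produces. The client below is a sketch using only the standard library. The helper names are hypothetical, and the request and response shapes (`{"input": [...]}` in, `{"data": [{"embedding": [...]}]}` out) are taken from the server code above rather than from the sidecar itself.

```python
import json
import urllib.request
from typing import List, Sequence


def build_request(url: str, texts: List[str]) -> urllib.request.Request:
    """Build the POST request the minimal server expects."""
    body = json.dumps({"input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(payload: bytes) -> List[List[float]]:
    """Extract the vectors from a {"data": [{"embedding": [...]}]} response."""
    return [item["embedding"] for item in json.loads(payload)["data"]]


def embed(url: str, texts: List[str], timeout: float = 10.0) -> List[List[float]]:
    """Send texts to the embedding service and return one vector per text."""
    with urllib.request.urlopen(build_request(url, texts), timeout=timeout) as resp:
        return parse_embeddings(resp.read())


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity when both vectors are unit-normalized."""
    return sum(x * y for x, y in zip(a, b))
```

With the server running, `embed("http://127.0.0.1:8766/v1/embeddings", ["hello", "你好"])` should return two vectors, and `cosine` on that pair gives a rough check that cross-lingual matching behaves as expected for your chosen model.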

Works With Any Agent

Memory Sidecar is agent-agnostic. It reads from $AGENT_HOME/state.db and session files, and operates entirely outside the agent process.

Tested with:

Hermes Agent — original companion, 2+ months production
Claude Code — via AGENT_HOME=~/.claude
Cursor / Codex — shared data directory pattern

The installer respects AGENT_HOME (falls back to HERMES_HOME for backward compatibility). If your agent stores data somewhere non-standard, point --agent-home at it.
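
The lookup order described above can be sketched in a few lines. This is not the installer's code: `resolve_agent_home` is a hypothetical helper, and both the precedence of --agent-home over the environment and the final `~/.hermes` default are assumptions (the default is borrowed from the install example).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_agent_home(
    cli_value: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Path:
    """Resolve the agent data directory.

    Order: explicit --agent-home value, then AGENT_HOME, then HERMES_HOME
    (backward compatibility), then an assumed default of ~/.hermes.
    """
    candidate = (
        cli_value
        or env.get("AGENT_HOME")
        or env.get("HERMES_HOME")
        or "~/.hermes"  # assumption: default taken from the install example
    )
    return Path(candidate).expanduser()
```

The state database would then sit at `resolve_agent_home() / "state.db"`, matching the path the sidecar reads from.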

Production Track Record

This isn't a prototype. The current stack has been running continuously on a production Hermes installation since April 2026:

10,885 gbrain pages — full knowledge graph with timeline tracking
42,481 Hindsight nodes — extracted facts with auto-retain/recall/reflect
105,601 indexed messages — FTS5 searchable session archive
100% embedding coverage — vector search across all content
brain score 73 — gbrain content quality metric

Repository Layout

installer/     Entry point, config patching, environment checks
scripts/       Supported sidecar scripts
skills/        Agent-side memory skills (starter-kit, proactive, archivist)
templates/     Memory templates

Acknowledgements

Core Projects

Hermes Agent — the agent this sidecar was built alongside
Hindsight — short/medium-term fact graph
gbrain — personal knowledge graph engine
sentence-transformers — embedding model framework
PostgreSQL + pgvector — vector storage backbone
OpenCode — guided architecture and production iteration

Embedding Models

Community

Shoutout to everyone who filed issues, surfaced recall gaps, and pushed the design forward. GitHub Issues, Discussions, Reddit (r/LocalLLaMA, r/MachineLearning), V2EX, and direct production feedback all shaped v3.1.0.

If this project helps you, drop a star ⭐ — it helps others find it too.

License

MIT. See bundled dependencies for their respective licenses.
