Agent memory compaction via LLM reflection #23

@michaelzwang13

Why this is interesting

Memory in Phase D is naive on purpose: a key/value table the agent writes to via the update-memory skill, with the platform injecting all of it into role_context at dispatch. That is correct for hackathon scale and forward-compatible, but memory grows linearly over the agent's lifetime. The interesting engineering problem is what to do once an agent has hundreds of keys, many of them stale, redundant, or contradictory.

This has the same shape as MemGPT, Letta, and the generative-agents reflection pattern. It is also a real product moat ("your AI employee actually learns and consolidates how it works for you"), not just maintenance.

The forcing function

Compaction starts mattering when injected memory exceeds the role-context budget. Rough thresholds:

  • ~1–2k tokens of role_context (~50–100 keys at typical sizes) → noticeable signal dilution.
  • Above ~4k tokens → the agent's actual instruction starts losing salience against accumulated trivia.

Below the lower threshold (~1k tokens), all-in injection (the Phase D default) is fine.
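
A minimal sketch of the budget check, assuming a crude 4-characters-per-token estimate and the rough thresholds above. The constant names and the `compaction_pressure` function are hypothetical; a real version would use the model's tokenizer and measured thresholds.

```python
from typing import Dict

# Assumed thresholds, copied from the rough numbers above (not measured).
DILUTION_TOKENS = 1_000   # signal dilution becomes noticeable
SALIENCE_TOKENS = 4_000   # instructions start losing salience


def estimate_tokens(memory: Dict[str, str]) -> int:
    """Crude estimate: about 4 characters per token, counting keys and values."""
    chars = sum(len(key) + len(value) for key, value in memory.items())
    return chars // 4


def compaction_pressure(memory: Dict[str, str]) -> str:
    """Classify injected memory as 'ok', 'warn', or 'compact'."""
    tokens = estimate_tokens(memory)
    if tokens > SALIENCE_TOKENS:
        return "compact"
    if tokens >= DILUTION_TOKENS:
        return "warn"
    return "ok"
```

The same function could double as the token-budget trigger for reflection, so the threshold lives in one place.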

Three tiers of approach (build order)

  1. Mechanical eviction. LRU on agent_memory: touch updated_at on every read and, once the table exceeds N keys, evict the least recently used. Cheap, no LLM. Ship first as a safety net.

  2. Heuristic clustering. Group keys by namespace prefix (style.*, repos.*, people.*); show counts; let the user prune manually via a UI. Mid-effort, no LLM, gives the user a knob.

  3. LLM reflection (the real prize). A scheduled (or budget-triggered) job loads the agent's full memory plus a recent slice of agent_action_log and asks the LLM:

    "Consolidate this into the smallest set of facts/preferences that still capture how this user wants you to work. Drop stale/contradicted preferences. Surface contradictions."

    The job rewrites memory in place, and the agent's next session reads the consolidated form.
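
Tiers 1 and 2 are small enough to sketch in memory. This assumes an agent_memory row is roughly (key, value, updated_at); the `MemoryRow` class and both function names are hypothetical, and a real version would run as SQL against the table rather than over a Python list.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class MemoryRow:
    key: str
    value: str
    updated_at: datetime  # assumed to be touched on reads as well as writes


def evict_lru(rows: List[MemoryRow], max_keys: int) -> List[MemoryRow]:
    """Tier 1: keep only the max_keys most recently touched rows."""
    ranked = sorted(rows, key=lambda row: row.updated_at, reverse=True)
    return ranked[:max(max_keys, 0)]


def namespace_counts(rows: List[MemoryRow]) -> Dict[str, int]:
    """Tier 2: count keys per namespace prefix (the text before the first dot)."""
    return dict(Counter(row.key.split(".", 1)[0] for row in rows))
```

One design caveat: if updated_at is touched on reads for LRU, it no longer records the last write, so last-write-wins may eventually want its own column.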

What makes (3) genuinely hard

  • Race conditions. The agent may be writing memory while reflection runs. Need pessimistic locking, an agent_memory_version snapshot, or a "dirty" flag.
  • Versioning + rollback. If reflection produces worse memory than before, you need a way to revert. Store the pre-reflection snapshot for N days.
  • Trust & visibility. The user should be able to see what reflection did — diff view of before vs after, with the LLM's reasoning. Without that, the agent feels like it's silently forgetting things.
  • Triggering policy. Cron (every Sunday at 3am)? Token-budget triggered (when injected memory > X)? Manual user button? Probably all three with different defaults.
  • Action-log distillation. The cousin problem: turning thousands of agent_action_log rows into a "what has my agent been doing" narrative for the work-log surface. Same reflection-style approach.
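
The race-condition and rollback bullets can be sketched together. This is a minimal in-memory sketch using an optimistic version check (a swapped-in alternative to the pessimistic locking mentioned above); `MemoryStore`, `reflect`, and `rollback` are hypothetical names, and `consolidate` stands in for the LLM call. In a real database the version comparison and the write would need to happen atomically in one transaction.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Memory = Dict[str, str]


@dataclass
class MemoryStore:
    data: Memory = field(default_factory=dict)
    version: int = 0                       # bumped on every change
    snapshots: List[Memory] = field(default_factory=list)

    def write(self, key: str, value: str) -> None:
        """Agent-side write via the update-memory path."""
        self.data[key] = value
        self.version += 1

    def reflect(self, consolidate: Callable[[Memory], Memory]) -> bool:
        """Apply consolidation only if nothing was written meanwhile."""
        seen_version = self.version
        before = dict(self.data)
        after = consolidate(before)        # slow in practice; writes may land here
        if self.version != seen_version:
            return False                   # stale input: discard and retry later
        self.snapshots.append(before)      # kept so the change can be reverted
        self.data = dict(after)
        self.version += 1
        return True

    def rollback(self) -> bool:
        """Restore the most recent pre-reflection snapshot, if any."""
        if not self.snapshots:
            return False
        self.data = self.snapshots.pop()
        self.version += 1
        return True
```

The stored `before` snapshot is also what a before/after diff view would read from, so versioning and visibility can share one mechanism.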

Forward-compatible defaults Phase D will set

Both are lock-in-free, so compaction can be added without schema changes:

  • agent_memory rows carry updated_at for last-write-wins and future LRU.
  • Memory injection at dispatch is "all keys for this agent" — swap the function later for a filtered/compacted version. The dispatch contract is stable.
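
A sketch of what "swap the function later" could look like: dispatch takes a selector, and the Phase D default returns every key. The names `build_role_context`, `select_all`, and `make_prefix_selector`, as well as the rendered text format, are assumptions for illustration and not the platform's actual dispatch code.

```python
from typing import Callable, Dict

Memory = Dict[str, str]
MemorySelector = Callable[[Memory], Memory]


def select_all(memory: Memory) -> Memory:
    """Phase D default: inject every key for this agent."""
    return dict(memory)


def make_prefix_selector(*prefixes: str) -> MemorySelector:
    """Later variant: inject only keys under the given namespace prefixes."""
    def select(memory: Memory) -> Memory:
        return {k: v for k, v in memory.items() if k.startswith(prefixes)}
    return select


def build_role_context(
    instructions: str,
    memory: Memory,
    select: MemorySelector = select_all,
) -> str:
    """Render instructions plus the selected memory into one context string."""
    lines = [f"- {key}: {value}" for key, value in sorted(select(memory).items())]
    return instructions + "\n\nMemory:\n" + "\n".join(lines)
```

Callers that never pass `select` keep the all-keys behavior, which is what keeps the dispatch contract stable when a compacted selector arrives.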

Open questions

  • Reflection model: Kimi (in-container) or Claude on the platform? Probably Claude — the platform has the full memory and shouldn't depend on the agent's container being up.
  • Should the user be able to edit memory directly, bypassing the agent? (Probably yes; surfaces as a settings panel post-Phase-D.)
  • Is there a public eval for "did this reflection make memory better"? (Probably not without ground truth — but a held-out task can A/B "task quality before vs after reflection.")

Depends on #4 (Phase D).

Labels: enhancement
