Skip to content

fix(#2861): release per-message render caches when streaming completes#2866

Open
aheritier wants to merge 1 commit into
mainfrom
fix-tui-message-render-leak
Open

fix(#2861): release per-message render caches when streaming completes#2866
aheritier wants to merge 1 commit into
mainfrom
fix-tui-message-render-leak

Conversation

@aheritier
Copy link
Copy Markdown
Contributor

@aheritier aheritier commented May 21, 2026

Summary

messageModel held its rendered ANSI output (renderCache.result) and a per-message IncrementalRenderer (inputPrefix + outputPrefix + codeBlocksPrefix) for the full lifetime of the session. With ~80–120 KB of retained strings per assistant turn and no eviction, a long TUI session grew its heap linearly until macOS's memorystatus (jetsam) subsystem killed docker-agent as the largest compressed process.

Reported in #2861 via:

kernel: [com.apple.xnu:memorystatus] memorystatus: killing largest compressed process docker-agent [<pid>] 390361 MB

The compressed-pages number is dramatic because ANSI output (escape sequences + line padding to terminal width) compresses extremely well, but the underlying retained heap is real and grows at +161 KB / assistant message in a synthetic harness.

Fix

  • New Finalize() and HasLiveRenderState() on the message.Model interface and *messageModel. Finalize() drops renderCache and Reset()s + nils mdRenderer. Sets a finalized flag so Render() skips repopulating renderCache, and renderAssistantMarkdown() falls back to a transient renderer rather than persisting one on the struct.
  • Finalize() is gated on MessageTypeAssistant. Only assistant views allocate an IncrementalRenderer and accumulate large rendered ANSI strings during streaming. User messages, tool calls and other types have nothing to release; setting finalized = true on them would only have the side effect of permanently disabling their renderCache for selected/hovered states.
  • SetMessage() un-finalizes the view if streaming ever resumes into it (Finalize is purely a cache hint).
  • messages.model.finalizePreviousMessageView() runs whenever a new top-level entry is appended (addMessage, AddCancelledMessage, AddOrUpdateToolCall).
  • addReasoningBlock() deliberately does not finalize. Reasoning blocks interleave with an actively streaming assistant turn, and finalizing mid-stream would force every subsequent chunk through the slow transient-renderer path. The next non-reasoning entry finalizes via addMessage() when the streaming turn actually ends.
  • LoadFromSession finalizes every loaded view except the last.
  • removeSpinner() no longer calls invalidateAllItems(). The spinner is always at the tail, so wiping the entire 500-entry rendered-item LRU on every assistant turn was both wasteful and immediately re-armed the leak by forcing a full re-render of every previous message into a fresh renderCache. It now only resets the joined renderedLines slab; spinner-driven messages are never put in the LRU (shouldCacheMessage returns false), so no LRU Delete is needed.

The chat-list's bounded LRU continues to memoize finalized messages' rendered output, so subsequent re-renders are still O(1) when a viewed message is in cache.

Validation

pkg/tui/components/messages/leak_test.go streams 400 8-KiB assistant messages (200 warmup + 200 measured) and asserts both a per-message heap budget and a structural invariant. The structural assertion uses the new exported HasLiveRenderState() method rather than cross-package reflection on unexported fields.

per-message slope live render-state holders
HEAD before fix 161 KB / msg (linear, no plateau) 400 / 400
HEAD after fix 18 KB / msg 1 / 800
$ task build && task lint
Built ./bin/docker-agent
0 issues.
$ go test ./pkg/tui/components/message ./pkg/tui/components/messages -count=1
ok  	github.com/docker/docker-agent/pkg/tui/components/message
ok  	github.com/docker/docker-agent/pkg/tui/components/messages

task test passes everywhere except the two pkg/teamloader/TestLoadExamples/.../dmr.yaml cases that require Docker Model Runner to be running locally — environmental, unrelated to this change.

Fixes #2861


Assisted-By: docker-agent

@aheritier aheritier requested a review from a team as a code owner May 21, 2026 19:17
@aheritier aheritier added area/tui For features/issues/fixes related to the TUI priority:high kind/fix PR fixes a bug (maps to fix: commit prefix) labels May 21, 2026
@aheritier aheritier marked this pull request as draft May 21, 2026 19:18
Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟡 NEEDS ATTENTION

This PR correctly addresses the memory leak in long TUI sessions by introducing Finalize() on messageModel to drop renderCache and mdRenderer once streaming completes. The removeSpinner() targeted-invalidation improvement is architecturally sound. Two issues were found in the new code — one performance edge case and one dead code placement.

Comment thread pkg/tui/components/messages/messages.go Outdated
Comment thread pkg/tui/components/messages/messages.go Outdated
@aheritier aheritier changed the title fix(tui): release per-message render caches when streaming completes fix(#2861): release per-message render caches when streaming completes May 21, 2026
@aheritier aheritier self-assigned this May 21, 2026
@aheritier aheritier force-pushed the fix-tui-message-render-leak branch from ff46eb0 to 306af6a Compare May 21, 2026 19:48
@aheritier
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

The memory-leak fix is well-reasoned and correctly targets the dominant sources of retained heap in long TUI sessions: the per-message renderCache.result ANSI string and the IncrementalRenderer's prefix cache. The Finalize()/finalizePreviousMessageView() lifecycle is sound, the SetMessage() un-finalize path is correct, and the regression test methodology (heap-inuse slope + structural invariant via reflection) is solid.

Two minor issues found in the added code (see inline comments).

Comment thread pkg/tui/components/messages/messages.go Outdated
Comment thread pkg/tui/components/message/message.go
@aheritier aheritier marked this pull request as ready for review May 21, 2026 20:07
@aheritier aheritier requested review from dgageot and rumpl May 21, 2026 20:07
Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🔴 CRITICAL

Comment thread pkg/tui/components/messages/leak_test.go Outdated
messageModel held its rendered ANSI output (renderCache.result) and a
per-message IncrementalRenderer (inputPrefix + outputPrefix +
codeBlocksPrefix) for the lifetime of the session. With ~80-120 KB of
retained strings per assistant turn and no eviction, a long TUI session
grew its heap linearly until macOS's memorystatus subsystem killed
docker-agent as the largest compressed process.

Add Finalize() on messageModel that drops both caches when a message is
no longer the active streaming target, and call it on the previous view
whenever a new top-level entry is appended (and on every loaded view
except the last in LoadFromSession). Render() and renderAssistantMarkdown()
gracefully fall back to a transient renderer for finalized messages, so
output stays correct; the chat list's bounded LRU still memoizes their
rendered output.

Finalize() is gated on MessageTypeAssistant: only assistant views allocate
an IncrementalRenderer and accumulate large rendered ANSI strings during
streaming, so user messages, tool calls, and other types have nothing to
release. Setting finalized=true on those views would only have the side
effect of permanently disabling renderCache for selected/hovered states
(which bypass the parent's bounded LRU), forcing a fresh re-render every
animation tick.

addReasoningBlock() does not finalize the previous view: reasoning
blocks routinely interleave with an actively streaming assistant turn,
and finalizing the in-progress view would force every subsequent chunk
through the transient-renderer path. The next non-reasoning entry will
finalize via addMessage() when the streaming turn actually ends.

removeSpinner() no longer calls invalidateAllItems(): the spinner is
always at the tail, so wiping the LRU on every assistant turn was both
wasteful and immediately re-armed the leak by forcing a full re-render
of every previous message into a fresh renderCache. Spinner-driven
messages are never put in the LRU (shouldCacheMessage returns false),
so only the joined renderedLines slab needs to be reset.

Adds a regression test that streams 400 large assistant messages and
asserts both heap growth (<30 KB/message vs. the 161 KB/message
pre-fix slope) and the structural invariant that at most one view
retains render-cache state. The assertion uses a new exported
HasLiveRenderState() method on the message.Model interface instead of
cross-package reflection on unexported fields.

Fixes #2861

Assisted-By: docker-agent
@aheritier aheritier force-pushed the fix-tui-message-render-leak branch from 306af6a to 243c31f Compare May 21, 2026 20:27
@aheritier
Copy link
Copy Markdown
Contributor Author

/review

Copy link
Copy Markdown

@docker-agent docker-agent left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assessment: 🟢 APPROVE

This PR introduces a well-designed memory leak fix for the TUI's messageModel. The Finalize() / HasLiveRenderState() / finalizePreviousMessageView() mechanics are coherent, the gating on MessageTypeAssistant is correct, and the reasoning about why addReasoningBlock deliberately skips finalization is sound.

What was reviewed:

  • pkg/tui/components/message/message.go — new Finalize(), HasLiveRenderState(), renderAssistantMarkdown transient-renderer path, SetMessage un-finalize logic, shouldCacheMessage guard
  • pkg/tui/components/messages/messages.gofinalizePreviousMessageView() call sites, removeSpinner narrowing from invalidateAllItems to targeted reset, LoadFromSession finalization loop
  • pkg/tui/components/messages/leak_test.go — heap-slope and live-render-state assertions

No confirmed or likely bugs were found in the added code. The implementation correctly addresses the linear heap growth described in #2861.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/tui For features/issues/fixes related to the TUI kind/fix PR fixes a bug (maps to fix: commit prefix)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

tui: per-message render caches leak across long sessions, triggering macOS jetsam kills

2 participants