fix(chat): surface interrupted/truncated replies + code-block copy button#207

Merged
AVADSA25 merged 1 commit into main from fix/chat-stream-visibility on Jul 2, 2026

Conversation

@AVADSA25 AVADSA25 commented Jul 2, 2026

Owner

What

Fixes the two chat annoyances from today's report (empty bubbles / replies cut mid-sentence) and adds the missing copy button on code blocks.

Why replies died silently: codec_llm.stream never raises — a connection drop, model-server non-200, or read timeout just ended the stream, which the chat handler couldn't tell apart from a clean finish. Separately, when the model stopped at the max_tokens cap (finish_reason="length" — and the cap includes <think> tokens in thinking mode), that was swallowed too. Result: blank bubble or mid-sentence stop, no clue why.
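To make the ambiguity concrete, here is a minimal sketch of a stream that swallows transport errors. `never_raising_stream` and its `fail_after` parameter are hypothetical stand-ins written for illustration; they are not the actual `codec_llm.stream` code.

```python
from typing import Iterable, Iterator, Optional


def never_raising_stream(
    chunks: Iterable[str], fail_after: Optional[int] = None
) -> Iterator[str]:
    """Hypothetical stand-in for a stream that never raises.

    If fail_after is set, a connection drop is simulated before the chunk
    at that index is delivered.
    """
    try:
        for i, chunk in enumerate(chunks):
            if fail_after is not None and i >= fail_after:
                raise ConnectionError("simulated connection drop")
            yield chunk
    except ConnectionError:
        # Swallowed: the generator simply ends, exactly like a clean finish.
        return


# A complete two-chunk reply.
clean = list(never_raising_stream(["Hello", ", world"]))

# A three-chunk reply whose connection drops before the last chunk.
dropped = list(never_raising_stream(["Hello", ", world", "!"], fail_after=2))

# The consumer sees identical output in both cases.
print(clean == dropped)
```

From the consumer's side the two runs are the same list of chunks, which is why the handler had nothing to report.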

Now:

  • stream(error_sentinel=True) yields STREAM_ERROR / FINISH_LENGTH sentinels (opt-in; zero change for other callers)
  • The chat stream appends a visible note: "⚠️ Reply interrupted — connection to the local model dropped…" or "⚠️ Reply truncated — raise chat.max_tokens…"
  • The blank-bubble fallback is honest: it only blames a tool if [SKILL:] tags were actually resolved (new tags_resolved counter); otherwise it says the model returned nothing
  • chat.max_tokens (default 28000) and chat.llm_timeout_s (default 300) are tunable via ~/.codec/config.json, so deep-chat length is now operator-controlled
  • Copy button on every code block in the chat UI (floats top-right of the code window, reuses the existing clipboard + toast feedback)
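The sentinel flow and the fallback rule in the list above can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: representing the sentinels as `object()` instances, the toy event format, the `render_reply` helper, and the exact note and fallback wording are all hypothetical. Only the names `STREAM_ERROR`, `FINISH_LENGTH`, `error_sentinel`, and `tags_resolved` come from this PR.

```python
from typing import Iterable, Iterator, Optional, Tuple

# Assumed representation; the PR names the sentinels but not their type.
STREAM_ERROR = object()
FINISH_LENGTH = object()

# Illustrative wording only.
INTERRUPTED_NOTE = "\n\n⚠️ Reply interrupted: connection to the local model dropped."
TRUNCATED_NOTE = "\n\n⚠️ Reply truncated: raise chat.max_tokens."
TOOL_FALLBACK = "The tool didn't apply."
EMPTY_FALLBACK = "The model returned an empty reply, try again."

Event = Tuple[str, Optional[str]]


def stream(events: Iterable[Event], error_sentinel: bool = False) -> Iterator[object]:
    """Toy stream. Events are ("text", str), ("error", None) or ("finish", reason)."""
    for kind, value in events:
        if kind == "text":
            yield value
        elif kind == "error":
            if error_sentinel:
                yield STREAM_ERROR
            return  # still never raises, matching the old contract
        elif kind == "finish":
            if value == "length" and error_sentinel:
                yield FINISH_LENGTH
            return


def render_reply(chunks: Iterable[object], tags_resolved: int = 0) -> str:
    """Join text chunks, turn sentinels into visible notes, pick an honest fallback."""
    parts = []
    note = ""
    for chunk in chunks:
        if chunk is STREAM_ERROR:
            note = INTERRUPTED_NOTE
            break
        if chunk is FINISH_LENGTH:
            note = TRUNCATED_NOTE
            break
        parts.append(str(chunk))
    text = "".join(parts)
    if not text.strip() and not note:
        # Blame a tool only if skill tags were actually resolved.
        text = TOOL_FALLBACK if tags_resolved > 0 else EMPTY_FALLBACK
    return text + note


events = [("text", "Partial ans"), ("error", None)]
print(render_reply(stream(events, error_sentinel=True)))
```

Callers that leave `error_sentinel` at its default only ever see text chunks, which is how the opt-in design keeps existing consumers unchanged.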

Tests

8 new tests (5 sentinel, 3 tags_resolved). Full suite: 2,440 passed, 4 pre-existing pilot-e2e failures (documented in known-issues).

🤖 Generated with Claude Code

… copy button

Chat replies sometimes rendered as empty bubbles or stopped mid-sentence
with zero explanation. Root cause: codec_llm.stream never raises by
contract — a non-200, connection drop, or read timeout just ENDED the
stream, indistinguishable from a clean finish; and a finish_reason=
"length" stop (max_tokens cap, which includes <think> tokens in
thinking mode) was silently swallowed.

- codec_llm.stream(error_sentinel=True): yields STREAM_ERROR on abnormal
  death and FINISH_LENGTH on the token-cap stop. Off by default — all
  existing callers keep the old contract.
- routes/chat.py stream path consumes the sentinels and tells the user:
  "reply interrupted" / "reply truncated (raise chat.max_tokens)".
- Blank-bubble fallback now honest: "tool didn't apply" only when
  [SKILL:] tags were actually resolved (new SkillTagBuffer.tags_resolved
  counter); otherwise "model returned an empty reply, try again".
- chat.max_tokens (default 28000) + chat.llm_timeout_s (default 300) are
  now operator-tunable via ~/.codec/config.json.
- codec_chat.html: every ``` code block gets a floating copy button
  (reuses the existing clipboard + "copied" feedback path).
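For the operator-tunable settings, a loader along these lines would apply the stated defaults when the file or a key is missing. This is a sketch under assumptions: the commit message gives only the dotted names `chat.max_tokens` and `chat.llm_timeout_s`, so the nested `"chat"` object, the `load_chat_settings` helper, and the validation rule are hypothetical.

```python
import json
from pathlib import Path

# Defaults as stated in the commit message.
DEFAULTS = {"max_tokens": 28000, "llm_timeout_s": 300}


def load_chat_settings(path):
    """Return chat settings, falling back to DEFAULTS for missing or bad values.

    Assumes a layout like {"chat": {"max_tokens": ..., "llm_timeout_s": ...}};
    the real file layout is not shown in this PR.
    """
    settings = dict(DEFAULTS)
    try:
        raw = json.loads(Path(path).expanduser().read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: keep the defaults.
        return settings
    chat = raw.get("chat") if isinstance(raw, dict) else None
    if not isinstance(chat, dict):
        return settings
    for key in DEFAULTS:
        value = chat.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value > 0:
            settings[key] = value
    return settings


settings = load_chat_settings("~/.codec/config.json")
print(settings["max_tokens"], settings["llm_timeout_s"])
```

Ignoring invalid values instead of raising keeps the chat path usable when the config file has a typo, at the cost of silently using defaults.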

Tests: 5 new stream-sentinel tests, 3 new tags_resolved tests.
Full suite 2440 passed (4 pre-existing pilot-e2e failures, known-issues).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@AVADSA25 AVADSA25 merged commit da9b224 into main Jul 2, 2026
1 check passed
@AVADSA25 AVADSA25 deleted the fix/chat-stream-visibility branch July 2, 2026 10:51
