fix: queue telegram bridge messages per chat to prevent session lock contention #860

Open
Junior00619 wants to merge 1 commit into NVIDIA:main from Junior00619:fix/telegram-bridge-session-lock

@Junior00619 Junior00619 commented Mar 25, 2026

Fixes #831

Problem
Concurrent inbound messages for the same Telegram chat each trigger an independent runAgentInSandbox call keyed on the same session ID (tg-). Because the underlying session store uses file-level locking, the parallel SSH+agent processes contend for the same lock, and the ones that lose the race fail with "session file locked (timeout 10000ms)" errors.

Root Cause
The poll loop dispatches agent calls inline with await, but the for...of iteration over updates only serializes messages within a single polling batch. Under sustained load, the next getUpdates batch can begin processing before prior agent calls resolve, producing concurrent sandbox processes for the same chat.

Fix
A per-chat promise chain (chatQueues: Map<string, Promise>) gates runAgentInSandbox so at most one invocation is in-flight per chat at any time. Subsequent messages for the same chat are appended to the chain via .then(job, job) — the rejection handler ensures the queue drains even if an individual job throws. Cross-chat concurrency is unaffected.

Cleanup is handled in .finally(): the map entry is removed only when the stored reference matches the completing promise, preventing a late-settling chain from stomping a freshly enqueued one. The /reset command now also evicts the queue entry so a user reset doesn't block behind a stale in-flight call.
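
For reviewers, here is a minimal sketch of the promise-chain pattern described above. It is not the code from this PR: only `chatQueues`, the `.then(job, job)` chaining, and the identity-checked `.finally()` cleanup come from the description, while the `enqueue` helper and its signature are assumptions made for illustration.

```javascript
// Illustrative sketch only; `enqueue` is a hypothetical helper name.
const chatQueues = new Map(); // chatId -> tail promise of that chat's chain

function enqueue(chatId, job) {
  // Start from the current tail, or from a resolved promise for an idle chat.
  const prev = chatQueues.get(chatId) ?? Promise.resolve();

  // `job` is both the fulfilment and the rejection handler, so it runs
  // after the previous job settles, whether that job succeeded or threw.
  const run = prev.then(job, job);

  // Remove the map entry only if no newer job has replaced this tail.
  const tail = run.finally(() => {
    if (chatQueues.get(chatId) === tail) chatQueues.delete(chatId);
  });
  tail.catch(() => {}); // avoid an unhandled rejection on the stored tail

  chatQueues.set(chatId, tail);
  return run; // callers may await the job's own result
}
```

Because the job is installed as both handlers, it is called with the previous job's result or error as its argument and should ignore it. Jobs for different chat IDs start from separate chains, so they still run concurrently.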

Summary by CodeRabbit

  • Bug Fixes
    • Messages for each Telegram chat are now processed sequentially so one message completes before the next starts.
    • Typing indicator is sent and maintained while a message is being processed, then cleared when a response or error is delivered.
    • Reset now clears any pending chat queue entries as well as the session history.

coderabbitai bot commented Mar 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 32ef2829-331f-45f3-9b97-f4cc74065984

📥 Commits

Reviewing files that changed from the base of the PR and between 2a61512 and 1f1be1a.

📒 Files selected for processing (1)
  • scripts/telegram-bridge.js
🚧 Files skipped from review as they are similar to previous changes (1)
  • scripts/telegram-bridge.js

📝 Walkthrough

Per-chat Promise-based queues were added to serialize agent execution for each Telegram chat. Incoming messages are enqueued; jobs send typing indicators, run the sandboxed agent, then send responses. The chat queue entry is removed on /reset alongside session history.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Telegram bridge queuing: scripts/telegram-bridge.js | Adds a chatQueues Map to chain Promises per chatId, enqueuing jobs that emit typing, run runAgentInSandbox, clear typing intervals, and send the response or error. /reset now deletes the chat's queue entry as well as clearing stored session history. Replaces immediate agent invocation with serialized job execution to avoid concurrent session file locks. |
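
The job lifecycle summarized above (emit typing, run the agent, clear the typing interval, send the response or error) could look roughly like the sketch below. This is an assumption-laden illustration, not the bridge's code: `makeJob`, `sendTyping`, `runAgent`, `sendReply`, and the 4000 ms default refresh interval are all hypothetical stand-ins.

```javascript
// Illustrative sketch only; every name here is a hypothetical stand-in.
// sendTyping, runAgent and sendReply are assumed to return promises.
function makeJob({ sendTyping, runAgent, sendReply, intervalMs = 4000 }) {
  return async function job() {
    // Show the typing indicator right away, then refresh it periodically.
    await sendTyping().catch(() => {});
    const timer = setInterval(() => {
      sendTyping().catch(() => {}); // a failed refresh must not kill the job
    }, intervalMs);

    let reply;
    try {
      reply = await runAgent();
    } catch (err) {
      reply = `Error: ${err.message}`;
    } finally {
      clearInterval(timer); // stop refreshing before anything is sent
    }
    await sendReply(reply);
  };
}
```

Clearing the interval in `finally` means the indicator stops being refreshed on both the success and the error path, before the reply is delivered.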

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant User as Telegram User
    participant TG as Telegram Bot API
    participant Bridge as telegram-bridge.js
    participant Agent as OpenClaw Sandbox
    participant Store as Session Store

    User->>TG: send message
    TG->>Bridge: deliver update (message, chatId, message_id)
    Bridge->>Bridge: enqueue job in chatQueues[chatId]
    Bridge->>TG: (optional) send typing action
    Bridge->>Agent: runAgentInSandbox(message, sessionId)
    Agent->>Store: read/write session file
    Agent-->>Bridge: agent response / error
    Bridge->>TG: send reply (using original message_id)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰
I hop in lines, one by one,
Chats wait patient till the job is done.
No more locked files, no startled cries—
A tidy queue beneath my skies. ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: introducing per-chat message queuing to prevent session lock contention in the Telegram bridge. |
| Linked Issues check | ✅ Passed | The PR implements the per-chat queuing mechanism as specified in #831, preventing concurrent runAgentInSandbox calls for the same chat and addressing session lock contention. |
| Out of Scope Changes check | ✅ Passed | All changes in the PR are directly related to implementing the per-chat queuing mechanism described in issue #831; no out-of-scope modifications detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

Junior00619 commented

@cv friendly ping, this is ready for review whenever you have a moment 🙏

Development

Successfully merging this pull request may close these issues.

[bug] Telegram bridge fails with session file lock error — "Agent exited with code 255"

1 participant