fix: queue telegram bridge messages per chat to prevent session lock contention #860
Junior00619 wants to merge 1 commit into NVIDIA:main
Walkthrough
Per-chat Promise-based queues were added to serialize agent execution for each Telegram chat. Incoming messages are enqueued; jobs send typing indicators, run the sandboxed agent, then send responses. The chat queue entry is removed on …
Sequence Diagram(s)
sequenceDiagram
autonumber
participant User as Telegram User
participant TG as Telegram Bot API
participant Bridge as telegram-bridge.js
participant Agent as OpenClaw Sandbox
participant Store as Session Store
User->>TG: send message
TG->>Bridge: deliver update (message, chatId, message_id)
Bridge->>Bridge: enqueue job in chatQueues[chatId]
Bridge->>TG: (optional) send typing action
Bridge->>Agent: runAgentInSandbox(message, sessionId)
Agent->>Store: read/write session file
Agent-->>Bridge: agent response / error
Bridge->>TG: send reply (using original message_id)
Estimated code review effort: 3 (Moderate), ~20 minutes
2a61512 to 1f1be1a
@cv friendly ping, this is ready for review whenever you have a moment 🙏
Fixes #831
Problem
Concurrent inbound messages for the same Telegram chat each trigger an independent runAgentInSandbox call keyed on the same session ID (tg-). Because the underlying session store uses file-level locking, overlapping writes from parallel SSH+agent processes race on the lock and surface as session file locked (timeout 10000ms) errors.
Root Cause
The poll loop dispatches agent calls inline with await, but the for...of iteration over updates only serializes messages within a single polling batch. Under sustained load, the next getUpdates batch can begin processing before prior agent calls resolve, producing concurrent sandbox processes for the same chat.
Fix
A per-chat promise chain (chatQueues: Map<string, Promise>) gates runAgentInSandbox so at most one invocation is in-flight per chat at any time. Subsequent messages for the same chat are appended to the chain via .then(job, job) — the rejection handler ensures the queue drains even if an individual job throws. Cross-chat concurrency is unaffected.
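The chaining described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `enqueue` and `handleMessage` are hypothetical helper names, the `runAgentInSandbox` body is a stand-in that only simulates latency, and the session ID is passed in by the caller rather than derived here.

```javascript
// Illustrative sketch only. chatQueues and runAgentInSandbox are named in the PR;
// enqueue and handleMessage are hypothetical helpers introduced for this example.
const chatQueues = new Map(); // chatId -> tail promise of that chat's chain

// Stand-in for the real sandbox call (assumption: it returns a promise).
async function runAgentInSandbox(message, sessionId) {
  await new Promise((resolve) => setTimeout(resolve, 10));
  return `reply to "${message}" (${sessionId})`;
}

// Append a job to the chain for one chat. The job must ignore its argument,
// because .then(job, job) calls it with the previous result or the previous error.
function enqueue(chatId, job) {
  const tail = chatQueues.get(chatId) ?? Promise.resolve();
  // Using job as both the fulfillment and the rejection handler means a failed
  // predecessor does not stall the jobs queued behind it.
  const next = tail.then(job, job);
  chatQueues.set(chatId, next);
  return next; // may reject if this job throws; the caller should handle that
}

// Example call site: one agent run per message, serialized per chat.
function handleMessage(chatId, sessionId, text) {
  return enqueue(chatId, () => runAgentInSandbox(text, sessionId));
}
```

Because each chat has its own chain, two different chat IDs never wait on each other; only messages that share a chat ID are serialized.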
Cleanup is handled in .finally(): the map entry is removed only when the stored reference matches the completing promise, preventing a late-settling chain from stomping a freshly enqueued one. The /reset command now also evicts the queue entry so a user reset doesn't block behind a stale in-flight call.
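The cleanup rule can be illustrated with a short sketch, again under assumptions: `enqueue` and `resetChat` are hypothetical names, and the PR may store a different promise in the map than the one used here. The point is the identity check, which leaves a newer entry alone when an older job settles late.

```javascript
// Illustrative sketch only; enqueue and resetChat are hypothetical helper names.
const chatQueues = new Map(); // chatId -> tail promise of that chat's chain

function enqueue(chatId, job) {
  const tail = chatQueues.get(chatId) ?? Promise.resolve();
  const next = tail.then(job, job);
  chatQueues.set(chatId, next);
  next
    .finally(() => {
      // Delete only if no newer job has replaced this entry in the meantime.
      if (chatQueues.get(chatId) === next) chatQueues.delete(chatId);
    })
    // Silence the rejection on this cleanup branch; callers still see it on `next`.
    .catch(() => {});
  return next;
}

// Sketch of what a /reset handler might do: drop the entry so the next message
// starts a fresh chain instead of waiting behind a stale in-flight call.
function resetChat(chatId) {
  chatQueues.delete(chatId);
}
```

One consequence worth noting: in this sketch, a message enqueued right after a reset can run while the stale call is still in flight, since the new chain no longer waits on it. Whether the PR guards against that overlap is not shown here.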