fix: handle and propagate errors in event channel monitors#573
fix: handle and propagate errors in event channel monitors#573xdustinface wants to merge 4 commits intorefactor/move-callbacks-to-cratesfrom
Conversation
Introduce an `EventHandler` trait in `dash-spv` with default no-op methods (`on_sync_event`, `on_network_event`, `on_progress`, `on_wallet_event`, `on_error`). The `run()` method now accepts an `Arc<H: EventHandler>`, subscribes to internal channels, and dispatches events to the handler via monitoring tasks. Add `TestEventHandler` in `test_utils` that bridges events back to tokio channels for ergonomic `select!`-based integration tests.
Implement the `EventHandler` trait with tracing-based logging for sync events, wallet events, progress updates, and errors in `main.rs`.
Pass all event callbacks as a single `FFIEventCallbacks` struct directly to `run()` instead of requiring separate set_*_callbacks() calls. This removes 5 Mutex fields from `FFIDashSpvClient` and 10 FFI functions.
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughThe changes implement a failure detection mechanism for event and progress monitoring tasks. A dedicated Changes
Sequence DiagramsequenceDiagram
participant Coordinator as Coordinator<br/>(run loop)
participant Monitor as Monitor Task<br/>(broadcast)
participant Handler as Event<br/>Handler
participant Token as on_failure<br/>Token
Monitor->>Monitor: Detect channel lag/<br/>closure
Monitor->>Handler: Call on_error()
Monitor->>Handler: Log error
Monitor->>Token: Cancel on_failure token
Monitor->>Monitor: Break from loop
Coordinator->>Coordinator: tokio::select! loop
Coordinator->>Token: Poll on_failure
Token-->>Coordinator: Cancellation detected
Coordinator->>Coordinator: Return SpvError::<br/>ChannelFailure
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@CodeRabbit review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
dash-spv/src/client/sync_coordinator.rs (1)
53-85:⚠️ Potential issue | 🟠 MajorReturn the real monitor failure instead of synthesizing
"broadcast receiver lagged".All four monitors cancel the same token, but this branch always constructs
ChannelFailure("event monitor", "broadcast receiver lagged"). That misreports progress-channel failures and unexpected broadcast closes, and it drops the more specific failure message already produced by the monitor task. Please propagate the concrete failure back intorun()and return that here. As per coding guidelines, "Use proper error types (thiserror) and propagate errors appropriately in Rust code".Also applies to: 106-111
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dash-spv/src/client/sync_coordinator.rs` around lines 53 - 85, The code in run() currently maps any monitor cancellation to a synthesized ChannelFailure("event monitor", "broadcast receiver lagged") which hides the real cause; change the logic that handles monitor_shutdown/monitor_failure so it returns the concrete failure produced by the monitor tasks (propagate the actual error from monitor_failure) instead of constructing the static "broadcast receiver lagged" message. Locate the monitor spawns (spawn_broadcast_monitor and spawn_progress_monitor) and the branch that creates ChannelFailure("event monitor", "broadcast receiver lagged") and replace it so run() awaits or inspects monitor_failure (the propagated error type) and returns that error (preserving its thiserror type) when cancelling the token; ensure the same fix is applied for the second occurrence referenced (lines ~106-111).
🧹 Nitpick comments (1)
dash-spv/src/client/event_handler.rs (1)
242-263: Add a teardown test that closes the channel aftershutdown.cancel().The new guards on Line 55 and Line 98 only matter when the channel closes after shutdown has been requested. The added tests cover unexpected drops and explicit shutdown separately, but they still don't exercise that path, so a regression there would go unnoticed. As per coding guidelines, "Write unit tests for new functionality in Rust code" and "Place unit tests alongside code with
#[cfg(test)]attribute".Also applies to: 319-337
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dash-spv/src/client/event_handler.rs` around lines 242 - 263, Add a new unit test placed alongside the existing tests (under #[cfg(test)]) that exercises the path where the broadcast channel is closed after shutdown has been requested: create a broadcast::channel::<SyncEvent>, spawn the same spawn_broadcast_monitor using the RecordingHandler, call shutdown.cancel() first, then drop the sender to close the channel, await the task, and assert the handler/error and on_failure behavior matches expectations; reference the existing test broadcast_monitor_fails_on_unexpected_channel_close and reuse its setup (RecordingHandler, shutdown CancellationToken, on_failure CancellationToken, and spawn_broadcast_monitor invocation) but invert the order of shutdown.cancel() and drop(tx) so the guards on lines 55 and 98 are exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@dash-spv/src/client/sync_coordinator.rs`:
- Around line 53-85: The code in run() currently maps any monitor cancellation
to a synthesized ChannelFailure("event monitor", "broadcast receiver lagged")
which hides the real cause; change the logic that handles
monitor_shutdown/monitor_failure so it returns the concrete failure produced by
the monitor tasks (propagate the actual error from monitor_failure) instead of
constructing the static "broadcast receiver lagged" message. Locate the monitor
spawns (spawn_broadcast_monitor and spawn_progress_monitor) and the branch that
creates ChannelFailure("event monitor", "broadcast receiver lagged") and replace
it so run() awaits or inspects monitor_failure (the propagated error type) and
returns that error (preserving its thiserror type) when cancelling the token;
ensure the same fix is applied for the second occurrence referenced (lines
~106-111).
---
Nitpick comments:
In `@dash-spv/src/client/event_handler.rs`:
- Around line 242-263: Add a new unit test placed alongside the existing tests
(under #[cfg(test)]) that exercises the path where the broadcast channel is
closed after shutdown has been requested: create a
broadcast::channel::<SyncEvent>, spawn the same spawn_broadcast_monitor using
the RecordingHandler, call shutdown.cancel() first, then drop the sender to
close the channel, await the task, and assert the handler/error and on_failure
behavior matches expectations; reference the existing test
broadcast_monitor_fails_on_unexpected_channel_close and reuse its setup
(RecordingHandler, shutdown CancellationToken, on_failure CancellationToken, and
spawn_broadcast_monitor invocation) but invert the order of shutdown.cancel()
and drop(tx) so the guards on lines 55 and 98 are exercised.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e4a090ec-4d14-4e67-b83c-1db7a7f43090
📒 Files selected for processing (2)
dash-spv/src/client/event_handler.rsdash-spv/src/client/sync_coordinator.rs
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## refactor/move-callbacks-to-crates #573 +/- ##
=====================================================================
+ Coverage 66.32% 66.55% +0.22%
=====================================================================
Files 311 312 +1
Lines 64976 65058 +82
=====================================================================
+ Hits 43097 43298 +201
+ Misses 21879 21760 -119
|
- Add `on_failure` cancellation token to `spawn_broadcast_monitor` and `spawn_progress_monitor` to signal the run loop on fatal monitor errors - Treat Lagged broadcast errors as fatal, lost events are unrecoverable at the moment - Distinguish expected vs unexpected channel close by checking the shutdown token - Run loop watches `monitor_failure.cancelled()` and exits with `ChannelFailure` error
31549f0 to
1f3f4c0
Compare
becff66 to
9ec7541
Compare
|
This PR has merge conflicts with the base branch. Please rebase or merge the base branch into your branch to resolve them. |
aaddece to
23accc0
Compare
on_failurecancellation token tospawn_broadcast_monitorandspawn_progress_monitorto signal the run loop on fatal monitor errorsmonitor_failure.cancelled()and exits withChannelFailureerrorSummary by CodeRabbit