
Fix silent event loss in DomainParticipant status channel#401

Merged
jhelovuo merged 1 commit into Atostek:master from alvgaona:fix/status-channel-overflow on Mar 8, 2026

Conversation

alvgaona (Contributor) commented Mar 7, 2026

Summary

The DomainParticipantStatusEvent channel silently drops events when its capacity is exceeded. The try_send implementation converts a Full error into Ok(()), so neither the sender nor the consumer can detect the loss. The only indication is a trace!-level log that is invisible at normal log levels.
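The swallowed-`Full` pattern described above can be sketched as follows. This is an illustrative reconstruction, not the actual RustDDS code: it uses `std::sync::mpsc::sync_channel` as a stand-in for the real status channel, and a hypothetical `StatusEvent` type in place of `DomainParticipantStatusEvent`.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

// Hypothetical stand-in for DomainParticipantStatusEvent.
#[derive(Debug)]
struct StatusEvent(u32);

// The problematic pattern: a Full error is swallowed and mapped to
// Ok(()), so the caller cannot tell whether the event was delivered.
fn send_swallowing_full(
    tx: &SyncSender<StatusEvent>,
    ev: StatusEvent,
) -> Result<(), TrySendError<StatusEvent>> {
    match tx.try_send(ev) {
        Ok(()) => Ok(()),
        Err(TrySendError::Full(_dropped)) => {
            // The original code logged this at trace! level only;
            // the PR upgrades the log to warn! so the loss is visible.
            eprintln!("warn: status channel full, event dropped");
            Ok(()) // silent loss: the event is gone, yet the call "succeeded"
        }
        Err(e) => Err(e),
    }
}

fn main() {
    let (tx, rx) = sync_channel(16);
    for i in 0..32 {
        // All 32 sends report Ok(()), but only 16 events are enqueued.
        send_swallowing_full(&tx, StatusEvent(i)).unwrap();
    }
    println!("delivered: {}", rx.try_iter().count());
}
```

Because the `Full` arm returns `Ok(())`, no amount of error handling at the call site can detect the drop; only the log line reveals it, which is why the log-level upgrade matters.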

This PR:

  • Increases the channel capacity from 16 to 2048
  • Upgrades the overflow log from trace! to warn!
  • Minor formatting fixes (comment alignment, import grouping)

How was this discovered

I discovered this bug while building a ROS 2 application that uses RustDDS. When using the status_listener() API to introspect the DDS graph, some nodes' services and topics would not show up at all. Which nodes were visible varied between runs. After ruling out consumer-side issues and confirming SEDP was delivering all data correctly, we traced the problem to the bounded status channel silently dropping events.

Why 16 is not enough

A single DDS participant can expose many endpoints. When SEDP discovers a remote participant, it generates WriterDetected and ReaderDetected events for every endpoint in a burst — faster than the consumer can drain them.

For example, a typical ROS 2 node creates ~9 writers and ~7 readers (~16 status events per node). Two nodes produce ~32 events, already exceeding the channel capacity. At 10 nodes (~160 events), 90% of events are silently lost. This is not specific to ROS 2 — any DDS application with multiple endpoints per participant will hit this with modest scale.
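The arithmetic above can be reproduced with a small experiment. This sketch again uses `std::sync::mpsc::sync_channel` as a stand-in for the real status channel; it models a discovery burst where no consumer drains the channel until the burst is over, and counts how many `try_send` calls fail with `Full`.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

// Simulate a discovery burst of `events` sends into a bounded channel
// of the given capacity, with no consumer draining during the burst.
// Returns the number of events that would be silently dropped.
fn dropped_during_burst(capacity: usize, events: usize) -> usize {
    let (tx, _rx) = sync_channel::<u32>(capacity);
    (0..events)
        .filter(|&i| matches!(tx.try_send(i as u32), Err(TrySendError::Full(_))))
        .count()
}

fn main() {
    // ~10 ROS 2 nodes => ~160 status events in one SEDP burst.
    println!("capacity 16:   {} of 160 dropped", dropped_during_burst(16, 160));
    println!("capacity 2048: {} of 160 dropped", dropped_during_burst(2048, 160));
}
```

With capacity 16, 144 of 160 events (90%) are dropped; with capacity 2048, none are.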

Why 2048

  • Covers realistic deployments. 2048 handles ~120 participants in a single SEDP burst, which covers the vast majority of DDS systems.
  • Negligible cost. The channel is a notification pipe — the actual endpoint data already lives in the DiscoveryDB. At ~200-500 bytes per event, peak buffer is ~1 MB. That's nothing compared to the SPDP/SEDP threads, network buffers, and DDSCache already maintained.
  • No performance impact. Send/receive are O(1) regardless of capacity. In steady state the channel is nearly empty — it only fills during initial discovery bursts.
  • No behavioral change. Events that previously went through still go through. The only difference is that events that were silently dropped now have room in the channel.

The DomainParticipantStatusEvent channel had a capacity of 16 and
silently dropped events when full, causing downstream consumers to miss
endpoint discoveries entirely with no indication of data loss.

A single participant can expose many endpoints (e.g. ~16 in a typical
ROS 2 node), so even two participants overwhelm a 16-slot channel
during the initial SEDP burst.

Changes:
- Increase status channel capacity from 16 to 2048
- Upgrade log level from trace! to warn! on channel overflow
alvgaona force-pushed the fix/status-channel-overflow branch from 008c582 to 5b70e40 on March 7, 2026 at 22:05
alvgaona (Contributor, Author) commented Mar 7, 2026

@jhelovuo would you mind reviewing this PR? It fixes an issue affecting interoperability with ROS 2 DDS.

@jhelovuo jhelovuo merged commit 724950c into Atostek:master Mar 8, 2026
7 checks passed
jhelovuo (Member) commented Mar 8, 2026

Nice debugging work. Thank you for the contribution.

@alvgaona alvgaona deleted the fix/status-channel-overflow branch March 8, 2026 16:04
