fix: tighten oneshot memory ordering by tisonkun · Pull Request #111 · fast/mea

tisonkun · 2026-05-13T01:07:06Z

Summary

tighten oneshot sender/receiver state transitions to synchronize final channel ownership handoff
avoid racing with the sender when dropping a pending async receiver by first moving RECEIVING back to EMPTY
add concurrent send/drop with try_recv/poll regression tests inspired by upstream oneshot Miri coverage

Verification

cargo x test
cargo +nightly x lint

Copilot

Pull request overview

Tightens the oneshot channel’s state-transition memory orderings to better synchronize message/waker visibility and final channel deallocation ownership, and adds regression tests covering concurrent send/drop vs try_recv/poll.

Changes:

Adjusted atomic orderings and added dedicated acquire fences around MESSAGE-handling paths to explicitly synchronize message visibility and final deallocation handoff.
Updated drop paths (Sender/Receiver/Recv) to better coordinate which side frees the channel under races.
Added concurrent send/drop completion tests for both try_recv and Future::poll paths.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`mea/src/oneshot/mod.rs`	Updates atomic orderings/fences and drop-path state transitions to strengthen synchronization and ownership handoff.
`mea/src/oneshot/tests.rs`	Adds multi-threaded regression tests for concurrent send/drop vs `try_recv`/`poll`, plus a small thread helper.

Comments suppressed due to low confidence (3)

mea/src/oneshot/tests.rs:363

This test’s receiver thread loops with spin_loop() until Disconnected is observed; without any timeout/iteration cap, failures can turn into an infinite hang during join(). Consider adding a timeout/budget and failing fast if the expected state isn’t reached.

    let receiver_thread = spawn_named("receiver", move || {
        loop {
            match receiver.try_recv() {
                Ok(value) => panic!("unexpected value: {value}"),
                Err(TryRecvError::Empty) => spin_loop(),
                Err(TryRecvError::Disconnected) => break,
            }
        }

mea/src/oneshot/tests.rs:389

This test polls in a tight unbounded spin_loop() until completion. If the wake/state transition breaks, the test will hang indefinitely at join(). Add a bounded timeout/iteration budget (and panic on timeout) to keep CI reliable.

        loop {
            match Pin::new(&mut receiver).poll(&mut context) {
                Poll::Ready(Ok(999)) => break,
                Poll::Ready(result) => panic!("unexpected result: {result:?}"),
                Poll::Pending => spin_loop(),
            }
        }

mea/src/oneshot/tests.rs:415

Similar to the other concurrent tests, this thread spins in an unbounded loop until it observes the disconnect. A regression could cause an infinite hang. Add a timeout/iteration cap and fail fast rather than spinning forever.

        loop {
            match Pin::new(&mut receiver).poll(&mut context) {
                Poll::Ready(Err(oneshot::RecvError::Disconnected)) => break,
                Poll::Ready(result) => panic!("unexpected result: {result:?}"),
                Poll::Pending => spin_loop(),
            }
        }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

orthur2 · 2026-05-13T18:20:57Z

Memory ordering here is tough to think through and guarantee correctness, so my review may not be very comprehensive.

First and foremost, following Copilot's review, Recv::drop needs acquire synchronization before dropping the stored waker. Channel::write_waker publishes the waker with a Release CAS, but the new RECEIVING -> EMPTY CAS in Recv::drop is Relaxed. Since Recv<T> is Send, the future may be polled on one thread and dropped on another, so Relaxed is too weak here. And the Recv::poll waker-replacement path also has the same issue.

Then, I found an issue that existed before this PR. I am concerned that the AWAKING branch in Recv::drop can still deallocate the channel too early. This path writes DISCONNECTED before it handles AWAKING. So if the sender has just moved the state to AWAKING and is still taking the waker, the drop path can read back its own DISCONNECTED write and free the channel too early.

As Copilot noted, the new spin-loop tests should also be bounded with a timeout or iteration budget.

Use acquire ordering before dropping stored wakers in Recv::poll and Recv::drop. When Recv::drop observes AWAKING, wait for the sender to finish the transition instead of writing DISCONNECTED first. Also bound the spin-loop tests and add coverage for pending receiver drop and cross-thread waker replacement.

orthur2 · 2026-05-22T05:23:17Z

I pushed a follow-up commit to address it.

Signed-off-by: tison <wander4096@gmail.com>

orthur2

LGTM from my side.

tisonkun · 2026-05-23T04:14:39Z

I'm reviewing the comments and order. From agent review, it says:

Blockers:

Please explain the Recv::drop interleaving around the RECEIVING -> EMPTY -> DISCONNECTED path. In particular: if a sender concurrently succeeds in send during the temporary EMPTY window, who owns dropping the message, the waker, and the allocation, and in what order? The current code may rely on the next loop iteration doing EMPTY -> DISCONNECTED, but that safety argument is not written down clearly enough for this kind of code.
Fix the misleading comments in oneshot/mod.rs, especially the poll MESSAGE branch saying the sender was dropped when it actually sent a message, and the stale memory-barrier comment around RECEIVING -> EMPTY. In unsafe/atomic code, wrong comments are not cosmetic.

Non-blocking: the new 5s spin/concurrency tests are useful as smoke/stress tests, but they should not be presented as proof of the memory-ordering argument. A loom/model test would be a good follow-up if this code is staying subtle.

orthur2 · 2026-05-23T09:11:14Z

I'm reviewing the comments and order. From agent review, it says:

Blockers:

Please explain the Recv::drop interleaving around the RECEIVING -> EMPTY -> DISCONNECTED path. In particular: if a sender concurrently succeeds in send during the temporary EMPTY window, who owns dropping the message, the waker, and the allocation, and in what order? The current code may rely on the next loop iteration doing EMPTY -> DISCONNECTED, but that safety argument is not written down clearly enough for this kind of code.

Fix the misleading comments in oneshot/mod.rs, especially the poll MESSAGE branch saying the sender was dropped when it actually sent a message, and the stale memory-barrier comment around RECEIVING -> EMPTY. In unsafe/atomic code, wrong comments are not cosmetic.

Non-blocking: the new 5s spin/concurrency tests are useful as smoke/stress tests, but they should not be presented as proof of the memory-ordering argument. A loom/model test would be a good follow-up if this code is staying subtle.

The agent review makes sense. I focused on the state-machine change and missed that some existing comments became stale or misleading.

I added a small follow-up to fix the MESSAGE / RECEIVING -> EMPTY comments and spell out the Recv::drop case a bit more.

For the temporary EMPTY window, after Recv::drop successfully CAS RECEIVING -> EMPTY with Acquire, it has retracted the published waker and owns dropping that waker. The Acquire CAS synchronizes with write_waker's Release CAS, so the waker is initialized before drop_waker().If a sender's send transition wins during this temporary EMPTY window, it observes EMPTY, publishes MESSAGE, and returns Ok. Because it did not observe RECEIVING, it never takes or drops the waker, and because send succeeded from EMPTY, it does not own deallocating the channel. Recv::drop still owns the retracted waker, drops it, and loops. Once the sender has published MESSAGE, Recv::drop will observe MESSAGE with Acquire, drop the message, and deallocate the channel.If no sender wins that temporary EMPTY window, Recv::drop's next loop iteration can complete EMPTY -> DISCONNECTED. In that case, any later cleanup is owned by the sender side: either Sender::send observes DISCONNECTED and the SendError path owns the message/allocation cleanup, or Sender::drop observes DISCONNECTED and deallocates the channel.

fix: tighten oneshot memory ordering

f1166a8

tisonkun requested review from Copilot and orthur2 May 13, 2026 03:14

Copilot started reviewing on behalf of tisonkun May 13, 2026 03:15 View session

Copilot AI reviewed May 13, 2026

View reviewed changes

Comment thread mea/src/oneshot/mod.rs Outdated

Comment thread mea/src/oneshot/tests.rs Outdated

tisonkun marked this pull request as draft May 13, 2026 03:20

orthur2 reviewed May 13, 2026

View reviewed changes

Comment thread mea/src/oneshot/mod.rs Outdated

tisonkun marked this pull request as ready for review May 22, 2026 14:24

fixup style and comments

930a6a6

Signed-off-by: tison <wander4096@gmail.com>

orthur2 approved these changes May 23, 2026

View reviewed changes

docs(oneshot): clarify receiver drop ownership

ed17497

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: tighten oneshot memory ordering#111

fix: tighten oneshot memory ordering#111
tisonkun wants to merge 4 commits into
mainfrom
codex/oneshot-ordering-fixes

tisonkun commented May 13, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

orthur2 commented May 13, 2026

Uh oh!

Uh oh!

orthur2 commented May 22, 2026

Uh oh!

orthur2 left a comment

Uh oh!

tisonkun commented May 23, 2026

Uh oh!

orthur2 commented May 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

tisonkun commented May 13, 2026

Summary

Verification

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

orthur2 commented May 13, 2026

Uh oh!

Uh oh!

orthur2 commented May 22, 2026

Uh oh!

orthur2 left a comment

Choose a reason for hiding this comment

Uh oh!

tisonkun commented May 23, 2026

Uh oh!

orthur2 commented May 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants