add tmc::qu_mpsc_bounded #236
Merged
`tmc::qu_mpsc_bounded` is a bounded MPSC queue. It's the worst-performing TMC queue overall due to the combination of multi-producer atomic interactions and bounded queue behavior. Still, the performance is respectable. Benchmarks in SPSC mode:

- `tmc::qu_mpsc_bounded`: 26M elems/sec, using capacity = 16384
- `tmc::qu_spsc_bounded`: 70M elems/sec, or 90M elems/sec in polling-only mode, using capacity = 16384
- `tmc::qu_spsc_unbounded`: 200M elems/sec
- `tmc::qu_mpsc_unbounded`: 45M elems/sec (underperforms in SPSC scenarios due to its reclamation scheme)
- `tmc::channel`: 60M elems/sec

`push()` is an awaitable that will suspend the producer if the queue is full, until a slot becomes available. `try_push()` and `push_bulk()` are not provided, due to the 2-stage producer commit protocol making them impossible to safely implement.

Compared to `tmc::channel`:

Advantages of `tmc::qu_mpsc_bounded`:

- `std::optional`/`std::variant`. The zero-copy scope objects handle the optionality.

Advantages of `tmc::channel`:

Consumer functions (can only be called by the single consumer):

- `co_await pull()`
- `try_pull()`
- `empty()`

Producer functions (can be called by anyone):

- `co_await push()`
- `close()`
- `close_resume_inline()` - When a queue and executor are destroyed nearly simultaneously, and the consumer is running on the executor that's going out of scope, there would be a race condition where `close()` / the destructor post the consumer back to the (soon to be destroyed) executor and then return before it actually finishes. `close_resume_inline()` solves this by setting the close signal and then resuming the awaiter immediately inline. This is a special-purpose tool that should only be used when you can be sure that the coroutine will exit promptly when it receives the close signal. This is recommended for use when building an "actor" class.

It also supports these configurable features:

- `ConsumerCanSuspend` enables the `pull()` operation. This is enabled by default, but can be disabled to slightly improve performance.
- `PackingLevel` - by default, queue elements are padded to avoid false sharing. If this is set to `1` then no padding will be applied.

Capacity is set at runtime as a constructor parameter.
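To illustrate what padding queue elements to avoid false sharing means in general, here is a minimal sketch. It is not the layout used by `tmc::qu_mpsc_bounded`, and how `PackingLevel` maps onto the real element type is not shown in this PR. The names `PackedSlot`, `PaddedSlot`, and `kCacheLine` are hypothetical, and the 64-byte cache line is an assumption that holds on many but not all targets.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache-line size (hypothetical constant, not taken from TMC).
constexpr std::size_t kCacheLine = 64;

// Packed layout: several neighbouring slots can share one cache line,
// so threads touching adjacent slots may invalidate each other's line.
template <typename T>
struct PackedSlot {
  std::atomic<std::size_t> state{0};
  T value{};
};

// Padded layout: alignas forces each slot to start on its own cache line,
// and sizeof is rounded up to a multiple of the alignment.
template <typename T>
struct alignas(kCacheLine) PaddedSlot {
  std::atomic<std::size_t> state{0};
  T value{};
};

static_assert(alignof(PaddedSlot<int>) == kCacheLine,
              "padded slots are cache-line aligned");
static_assert(sizeof(PaddedSlot<int>) % kCacheLine == 0,
              "padded slots occupy whole cache lines");
static_assert(sizeof(PackedSlot<int>) < kCacheLine,
              "packed slots are smaller than a cache line");
```

The trade-off is the usual one: the padded form spends more memory per element in exchange for less cross-thread cache-line contention.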
The implementation of the consumer / multi-producer suspension protocol is fairly complicated, because wraparound allows multiple producers to wait on the same slot. For example, if the queue capacity is 2 and 6 producers try to enqueue, producer 0 can produce to slot 0, producer 2 will suspend on slot 0, and producer 4 will also suspend on slot 0. Then a consumer can come and consume the items from producers 0 and 1, wake producer 2, and suspend waiting for producer 2 to produce data. At this point both producer 4 and the consumer are suspended on the same slot, and an in-flight producer 2 is about to produce to that slot. This all works, but requires careful handling.
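The wraparound in that example can be sketched as follows. This assumes producers claim sequential tickets and that a ticket maps to slot `ticket % capacity`, which matches the numbers above but is not necessarily how the implementation computes indexes. `slot_for` and `tickets_on_slot` are hypothetical helpers written for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed mapping from a producer's ticket to a ring-buffer slot.
std::size_t slot_for(std::size_t ticket, std::size_t capacity) {
  return ticket % capacity;
}

// Lists every producer ticket that lands on the given slot.
std::vector<std::size_t> tickets_on_slot(std::size_t slot,
                                         std::size_t capacity,
                                         std::size_t producers) {
  std::vector<std::size_t> out;
  for (std::size_t t = 0; t < producers; ++t) {
    if (slot_for(t, capacity) == slot) {
      out.push_back(t);
    }
  }
  return out;
}
```

With capacity 2 and 6 producers, `tickets_on_slot(0, 2, 6)` yields tickets 0, 2 and 4, so one slot can have two suspended producers queued behind the current occupant, plus a consumer waiting for the next item.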
I did also test a simpler version that just used a `qu_mpsc_unbounded` wrapped with a semaphore, but it was slower across all combinations of producer count and capacity, was not allocation-free, and didn't offer any additional APIs (`push_bulk` / `try_push` were still unimplementable with the current version of `tmc::semaphore`).

Other changes in this PR:

- `tmc/detail/qu_storage.hpp`.
- `std::atomic<T>::is_always_lock_free` asserts into `compat.hpp`
- `qu_*` types.
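For reference, the general shape of the semaphore-wrapped alternative mentioned above looks roughly like this. It is a blocking, standard-library analog only, not the coroutine-based code that was tested: `CountingSemaphore` and `SemaphoreBoundedQueue` are hypothetical names, a mutex-protected `std::deque` stands in for `qu_mpsc_unbounded`, and a thread-blocking semaphore stands in for `tmc::semaphore`.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Small blocking counting semaphore (stand-in, not tmc::semaphore).
class CountingSemaphore {
public:
  explicit CountingSemaphore(std::size_t initial) : count_(initial) {}
  void acquire() {
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return count_ > 0; });
    --count_;
  }
  void release() {
    {
      std::lock_guard<std::mutex> lock(m_);
      ++count_;
    }
    cv_.notify_one();
  }
  std::size_t available() {
    std::lock_guard<std::mutex> lock(m_);
    return count_;
  }

private:
  std::mutex m_;
  std::condition_variable cv_;
  std::size_t count_;
};

// Unbounded container made bounded by a semaphore that counts free slots.
template <typename T>
class SemaphoreBoundedQueue {
public:
  explicit SemaphoreBoundedQueue(std::size_t capacity)
      : free_slots_(capacity) {}

  // Blocks while the queue is full, then enqueues.
  void push(T value) {
    free_slots_.acquire();
    std::lock_guard<std::mutex> lock(m_);
    items_.push_back(std::move(value));  // may allocate
  }

  // Returns false if empty; otherwise dequeues and frees one slot.
  bool try_pull(T& out) {
    {
      std::lock_guard<std::mutex> lock(m_);
      if (items_.empty()) {
        return false;
      }
      out = std::move(items_.front());
      items_.pop_front();
    }
    free_slots_.release();
    return true;
  }

  std::size_t free_slots() { return free_slots_.available(); }

private:
  CountingSemaphore free_slots_;
  std::mutex m_;
  std::deque<T> items_;
};
```

Even in this simplified form, the underlying container still allocates as it grows, which is in line with the observation that the wrapped version was not allocation-free.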