Async fork/join#32
Conversation
|
Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours has expired. Before we can review or merge your code, we need you to email cla@fb.com with your details so we can update your status. |
|
Thanks for sharing this Lee. I am starting to grok this. I am a little troubled by the need for an |
|
Thank you for signing our Contributor License Agreement. We can now accept your code for this (and any) Facebook open source project. Thanks! |
|
It isn't really necessary if we define all algorithms to use customisation points underneath, and then have a trait for an async token. We could just switch to the async version under the hood when we have an incoming token. I just wanted to make this a purely additive change to Kirk's functionality. The big picture is that we cannot wait for the incoming value to do work queuing, because that implies completion of the previous task, so we need some mechanism to handle that. So this instead proposes waiting for a token to say that the previous task has been queued, and have a means to transform from a value stream to that token stream. It also provides a means of carrying the executor through when you need it, but doing so explicitly without changing the underlying concepts. |
But why is this necessary? It is anti-generic to require a reimplementation of all the algorithms each time you want to support a new way of queueing work and waiting for completion. Can the algorithms be parameterized on a queuing / signaling concept? |
|
How else would you parameterise without using customisation points to change the behaviour based on those parameters? That strikes me as the most generic way to go about it. The algorithms will often require reimplementation anyway, a bulk algorithm is not going to be implemented the same way when asked to enqueue to a GPU compared with asking it to run on a CPU. |
Rebase on master and rewrite async.h to fix symbol name length problems
add async.h to the single header
…s on the actual type of the Sender, only its properties
This change demonstrates the most general form of what we should build bulk on top of.
A bulk task is just some sort of parallel operation with a single input and output. Bulk as envisaged, however, also implies some sort of asynchronous queuing. We can build asynchronous queuing directly into pushmi with no extra fundamental concepts, instead a set of customisation points.
This change shows how that works with a very simple implementation - an async transform that, for each transform operation it hits enqueues a new thread. Each transform waits on the previous one using a condition variable and heap-allocated value.
The main executor control flow simply passes a synchronisation token along, firing off the transform to one side and using a join customisation point to rejoin and turn this back into a more traditional value flow at the end.
We customise here on an inline_executor, which is not async at all, and on the new_thread executor which customises the thread creation behaviour. A GPU executor would fire off GPU enqueue primitives, only adding a bulk operation in addition to the scalar transform we see here.