Description
The Dask tutorial could cover many different topics, and we have limited time, so we need to define its scope and create a clear agenda.
From our conversation with @dcherian and @vanderwb: this is going to be a half-day event with topics and sessions for the following audiences:
(a) folks who are Python-aware but essentially total novices to Dask.
(b) those who use Dask regularly but would like optimization guidance and tips/tricks.
The following are my initial thoughts on what would be appropriate for this meeting:
- Introductory Dask + Xarray:
  - What is Dask
  - Dask + Xarray
  - Dask-backed Xarray objects (lazy computations, actual values)
  - Distributed clusters
  - Extract Dask arrays from Xarray objects and use Dask arrays directly
  - Dask Delayed to parallelize any code (do we have time to include this?)
- Intermediate Topics:
  - Dask chunking schemes (performance and rechunking)
  - Apply unvectorized functions (`apply_ufunc`)
  - More advanced collection of custom operations (`map_blocks`, `map_partitions`, `map_overlap`; do we have enough time for this?)
  - Blockwise computation
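To give a sense of how much material the chunking and custom-operation items involve, here is a minimal sketch using plain Dask arrays. The array sizes, chunk shapes, and the `demean` helper are all made up for illustration and are not proposed tutorial content.

```python
import numpy as np
import dask.array as da

# A lazy 1000 x 1000 array split into 250 x 250 chunks (a 4 x 4 grid of chunks).
x = da.ones((1000, 1000), chunks=(250, 250))

# Rechunk so that every chunk holds complete columns; this only
# extends the task graph, nothing is computed yet.
y = x.rechunk((1000, 100))


def demean(block):
    # Hypothetical custom operation: subtract each column's mean.
    # It runs independently on each chunk, which is only correct here
    # because each chunk of `y` contains whole columns.
    return block - block.mean(axis=0, keepdims=True)


# Apply the function chunk by chunk; still lazy.
z = da.map_blocks(demean, y, dtype=y.dtype)

# Trigger the actual computation and get a NumPy array back.
result = z.compute()
```

Even this small example touches lazy evaluation, rechunking, and `map_blocks`, so it may be a useful yardstick for how many of the intermediate items fit in the available time.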
Since this is going to be a half-day tutorial, we need to be mindful of the time.
We are going to solicit more feedback on this from the community and ESDS forum.