Posts tagged distributed
Shuffling large data at constant memory in Dask
- Mar 15, 2023
This work was engineered and supported by Coiled. In particular, thanks to Florian Jetter, Gabe Joseph, Hendrik Makait, and Matt Rocklin. The original version of this post appears on blog.coiled.io.
Data Proximate Computation on a Dask Cluster Distributed Between Data Centres
- Jul 19, 2022
This work is a joint venture between the Met Office and the European Weather Cloud, which is a partnership of ECMWF and EUMETSAT.
Measuring Dask memory usage with dask-memusage
- Mar 11, 2021
Using too many computing resources can get expensive when you’re scaling up in the cloud.
Configuring a Distributed Dask Cluster
- Jul 30, 2020
Configuring a Dask cluster can seem daunting at first, but the good news is that the Dask project has a lot of built-in heuristics that try their best to anticipate and adapt to your workload based on the machine it is deployed on and the work it receives. You can possibly get away with not configuring anything special for a long time. That said, if you are looking for tips to move on from using Dask locally, or have a Dask cluster that you are ready to optimize with more in-depth configuration, these tips and tricks will help guide you and link you to the best Dask docs on the topic!
Dask-jobqueue
- Oct 08, 2018
This work was done in collaboration with Matthew Rocklin (Anaconda), Jim Edwards (NCAR), Guillaume Eynard-Bontemps (CNES), and Loïc Estève (INRIA), and is supported, in part, by the US National Science Foundation Earth Cube program. The dask-jobqueue package is a spinoff of the Pangeo Project. This blog post was previously published here.