Posts tagged dataframe
Dask DataFrame is Fast Now
- May 30, 2024
This work was engineered and supported by Coiled and NVIDIA. Thanks to Patrick Hoefler and Rick Zamora, in particular. The original version of this post appears on docs.coiled.io.
Do you need consistent environments between the client, scheduler and workers?
- Apr 14, 2023
Update May 3rd 2023: Clarify GPU recommendations.
Deep Dive into creating a Dask DataFrame Collection with from_map
- Apr 12, 2023
Dask DataFrame provides dedicated IO functions for several popular tabular-data formats, like CSV and Parquet. If you are working with a supported format, then the corresponding function (e.g. read_csv) is likely to be the most reliable way to create a new Dask DataFrame collection. For other workflows, from_map now offers a convenient way to define a DataFrame collection as an arbitrary function mapping. While these kinds of workflows have historically required users to adopt the Dask Delayed API, from_map now makes custom collection creation both easier and more performant.
Understanding Dask’s meta keyword argument
- Aug 09, 2022
If you have worked with Dask DataFrames or Dask Arrays, you have probably come across the meta keyword argument. Perhaps, while using methods like apply():
Building GPU Groupby-Aggregations for Dask
- Mar 04, 2019
Single-Node Multi-GPU Dataframe Joins
- Jan 29, 2019