---
blogpost: true
date: Sep 5, 2018
title: Dask Release 0.19.0
tags: Programming, Python, scipy, dask
---

_This work is supported by [Anaconda Inc.](http://anaconda.com)_

I'm pleased to announce the release of Dask version 0.19.0. This is a major
release with bug fixes and new features. The last release was 0.18.2 on July
23rd. This blogpost outlines notable changes since the last release blogpost
for 0.18.0 on June 14th.

You can conda install Dask:

    conda install dask

or pip install from PyPI:

    pip install dask[complete] --upgrade

Full changelogs are available here:

- [dask/dask](https://github.com/dask/dask/blob/master/docs/source/changelog.rst)
- [dask/distributed](https://github.com/dask/distributed/blob/master/docs/source/changelog.rst)

## Notable Changes

A ton of work has happened over the past two months, but most of the changes
are small and diffuse. Stability, feature parity with upstream libraries (like
Numpy and Pandas), and performance have all significantly improved, but in ways
that are difficult to condense into blogpost form.

That being said, here are a few of the more exciting changes in the new
release.

### Python Versions

We've dropped official support for Python 3.4 and added official support for
Python 3.7.

### Deploy on Hadoop Clusters

Over the past few months [Jim Crist](https://jcrist.github.io/) has bulit a
suite of tools to deploy applications on YARN, the primary cluster manager used
in Hadoop clusters.

- [Conda-pack](https://conda.github.io/conda-pack/): packs up Conda
  environments for redistribution to distributed clusters, especially when
  Python or Conda may not be present.
- [Skein](https://jcrist.github.io/skein/): easily launches and manages YARN
  applications from non-JVM systems
- [Dask-Yarn](https://dask-yarn.readthedocs.io/en/latest/): a thin library
  around Skein to launch and manage Dask clusters

Jim has written about Skein and Dask-Yarn in two recent blogposts:

- [jcrist.github.io/dask-on-yarn](https://jcrist.github.io/dask-on-yarn)
- [jcrist.github.io/introducing-skein.html](https://jcrist.github.io/introducing-skein.html)

### Implement Actors

Some advanced workloads want to directly manage and mutate state on workers. A
task-based framework like Dask can be forced into this kind of workload using
long-running-tasks, but it's an uncomfortable experience.

To address this we've added an experimental Actors framework to Dask alongside
the standard task-scheduling system. This provides reduced latencies, removes
scheduling overhead, and provides the ability to directly mutate state on a
worker, but loses niceties like resilience and diagnostics.
The idea to adopt Actors was shamelessly stolen from the [Ray Project](http://ray.readthedocs.io/en/latest/) :)

```python
class Counter:
    def __init__(self):
        self.n = 0

    def increment(self):
        self.n += 1
        return self.n

counter = client.submit(Counter, actor=True).result()

>>> future = counter.increment()
>>> future.result()
1
```

You can read more about actors in the [Actors documentation](https://distributed.readthedocs.io/en/latest/actors.html).

### Dashboard improvements

The Dask dashboard is a critical tool to understand distributed performance.
There are a few accessibility issues that trip up beginning users that we've
addressed in this release.

#### Save task stream plots

You can now save a task stream record by wrapping a computation in the
`get_task_stream` context manager.

```python
from dask.distributed import Client, get_task_stream
client = Client(processes=False)

import dask
df = dask.datasets.timeseries()

with get_task_stream(plot='save', filename='my-task-stream.html') as ts:
    df.x.std().compute()
```

```python
>>> ts.data
[{'key': "('make-timeseries-edc372a35b317f328bf2bb5e636ae038', 0)",
  'nbytes': 8175440,
  'startstops': [('compute', 1535661384.2876947, 1535661384.3366017)],
  'status': 'OK',
  'thread': 139754603898624,
  'worker': 'inproc://192.168.50.100/15417/2'},

  ...
```

This gives you the start and stop time of every task on every worker done
during that time. It also saves that data as an HTML file that you can share
with others. This is very valuable for communicating performance issues within
a team. I typically upload the HTML file as a gist and then share it with
rawgit.com

```
$ gist my-task-stream.html
https://gist.github.com/f48a121bf03c869ec586a036296ece1a
```

<iframe src="https://rawgit.com/mrocklin/f48a121bf03c869ec586a036296ece1a/raw/d2c1a83d5dc62996eeabca495d5284e324d71d0c/my-task-stream.html" width="800" height="400"></iframe>

#### Robust to different screen sizes

The Dashboard's layout was designed to be used on a single screen, side-by-side
with a Jupyter notebook. This is how many Dask developers operate when working
on a laptop, however it is not how many users operate for one of two reasons:

1.  They are working in an office setting where they have several screens
2.  They are new to Dask and uncomfortable splitting their screen into two
    halves

In these cases the styling of the dashboard becomes odd. Fortunately, [Luke
Canavan](https://github.com/canavandl) and [Derek
Ludwig](https://github.com/dsludwig) recently improved the CSS for the
dashboard considerably, allowing it to switch between narrow and wide screens.
Here is a snapshot.

<a href="/images/dashboard-widescreen.png"><img src="/images/dashboard-widescreen.png" width="70%"></a>

#### Jupyter Lab Extension

You can now embed Dashboard panes directly within Jupyter Lab using the newly
updated [dask-labextension](https://github.com/dask/dask-labextension/).

```
jupyter labextension install dask-labextension
```

This allows you to layout your own dashboard directly within JupyterLab. You
can combine plots from different pages, control their sizing, and so on. You
will need to provide the address of the dashboard server
(`http://localhost:8787` by default on local machines) but after that
everything should persist between sessions. Now when I open up JupyterLab and
start up a Dask Client, I get this:

<a href="/images/dashboard-jupyterlab.png"><img src="/images/dashboard-jupyterlab.png" width="70%"></a>

Thanks to [Ian Rose](https://github.com/ian-r-rose) for doing most of the work
here.

## Outreach

### Dask Stories

People who use Dask have been writing about their experiences at [Dask
Stories](https://dask-stories.readthedocs.io/en/latest/). In the last couple
months the following people have written about and contributed their experience:

1.  [Civic Modelling at Sidewalk Labs](https://dask-stories.readthedocs.io/en/latest/sidewalk-labs.html) by [Brett Naul](https://github.com/bnaul)
2.  [Genome Sequencing for Mosquitoes](https://dask-stories.readthedocs.io/en/latest/mosquito-sequencing.html) by [Alistair Miles](http://alimanfoo.github.io/about/)
3.  [Lending and Banking at Full Spectrum](https://dask-stories.readthedocs.io/en/latest/fullspectrum.html) by [Hussain Sultan](https://www.linkedin.com/in/hussainsultan/)
4.  [Detecting Cosmic Rays at IceCube](https://dask-stories.readthedocs.io/en/latest/icecube-cosmic-rays.html) by [James Bourbeau](https://github.com/jrbourbeau)
5.  [Large Data Earth Science at Pangeo](https://dask-stories.readthedocs.io/en/latest/pangeo.html) by [Ryan Abernathey](http://rabernat.github.io/)
6.  [Hydrological Modelling at the National Center for Atmospheric Research](https://dask-stories.readthedocs.io/en/latest/hydrologic-modeling.html) by [Joe Hamman](http://joehamman.com/about/)
7.  [Mobile Networks Modeling](https://dask-stories.readthedocs.io/en/latest/network-modeling.html) by [Sameer Lalwani](https://www.linkedin.com/in/lalwanisameer/)
8.  [Satellite Imagery Processing at the Space Science and Engineering Center](https://dask-stories.readthedocs.io/en/latest/satellite-imagery.html) by [David Hoese](http://github.com/djhoese)

These stories help people understand where Dask is and is not applicable, and
provide useful context around how it gets used in practice. We welcome further
contributions to this project. It's very valuable to the broader community.

### Dask Examples

The [Dask-Examples repository](https://github.com/dask/dask-examples) maintains
easy-to-run examples using Dask on a small machine, suitable for an entry-level
laptop or for a small cloud instance. These are hosted on
[mybinder.org](https://mybinder.org) and are integrated into our documentation.
A number of new examples have arisen recently, particularly in machine
learning. We encourage people to try them out by clicking the link below.

[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/dask/dask-examples/main)

## Other Projects

- The [dask-image](https://dask-image.readthedocs.io/en/latest/) project was
  recently released. It includes a number of image processing routines around
  dask arrays.

  This project is mostly maintained by [John Kirkham](https://github.com/jakirkham).

- [Dask-ML](https://dask-ml.readthedocs.io/en/latest/) saw a recent bugfix release

- The [TPOT](http://epistasislab.github.io/tpot/) library for automated
  machine learning recently published a new release that adds Dask support to
  parallelize their model training. More information is available on the
  [TPOT documentation](http://epistasislab.github.io/tpot/using/#parallel-training-with-dask)

## Acknowledgements

Since June 14th, the following people have contributed to the following repositories:

The core Dask repository for parallel algorithms:

- Anderson Banihirwe
- Andre Thrill
- Aurélien Ponte
- Christoph Moehl
- Cloves Almeida
- Daniel Rothenberg
- Danilo Horta
- Davis Bennett
- Elliott Sales de Andrade
- Eric Bonfadini
- GPistre
- George Sakkis
- Guido Imperiale
- Hans Moritz Günther
- Henrique Ribeiro
- Hugo
- Irina Truong
- Itamar Turner-Trauring
- Jacob Tomlinson
- James Bourbeau
- Jan Margeta
- Javad
- Jeremy Chen
- Jim Crist
- Joe Hamman
- John Kirkham
- John Mrziglod
- Julia Signell
- Marco Rossi
- Mark Harfouche
- Martin Durant
- Matt Lee
- Matthew Rocklin
- Mike Neish
- Robert Sare
- Scott Sievert
- Stephan Hoyer
- Tobias de Jong
- Tom Augspurger
- WZY
- Yu Feng
- Yuval Langer
- minebogy
- nmiles2718
- rtobar

The dask/distributed repository for distributed computing:

- Anderson Banihirwe
- Aurélien Ponte
- Bartosz Marcinkowski
- Dave Hirschfeld
- Derek Ludwig
- Dror Birkman
- Guillaume EB
- Jacob Tomlinson
- Joe Hamman
- John Kirkham
- Loïc Estève
- Luke Canavan
- Marius van Niekerk
- Martin Durant
- Matt Nicolls
- Matthew Rocklin
- Mike DePalatis
- Olivier Grisel
- Phil Tooley
- Ray Bell
- Tom Augspurger
- Yu Feng

The dask/dask-examples repository for easy-to-run examples:

- Albert DeFusco
- Dan Vatterott
- Guillaume EB
- Matthew Rocklin
- Scott Sievert
- Tom Augspurger
- mholtzscher
