<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <id>https://blog.dask.org</id>
  <title>Dask Working Notes - Posts tagged life science</title>
  <updated>2026-03-05T15:05:26.531990+00:00</updated>
  <link href="https://blog.dask.org"/>
  <link href="https://blog.dask.org/blog/tag/life-science/atom.xml" rel="self"/>
  <generator uri="https://ablog.readthedocs.io/" version="0.11.12">ABlog</generator>
  <entry>
    <id>https://blog.dask.org/2021/12/15/dask-fellow-reflections/</id>
    <title>Reflections on one year as the Dask life science fellow</title>
    <updated>2021-12-15T00:00:00+00:00</updated>
    <author>
      <name>Genevieve Buckley</name>
    </author>
    <content type="html">&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/12/15/dask-fellow-reflections.md&lt;/span&gt;, line 9)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;section id="summary"&gt;

&lt;p&gt;&lt;a class="reference external" href="https://github.com/GenevieveBuckley/"&gt;Genevieve Buckley&lt;/a&gt; was hired as a Dask Life Science Fellow in 2021 &lt;a class="reference external" href="https://chanzuckerberg.com/eoss/proposals/"&gt;funded by CZI&lt;/a&gt;. The goal was to improve Dask, with a &lt;a class="reference external" href="https://blog.dask.org/2021/03/04/the-life-science-community"&gt;specific focus on the life science community&lt;/a&gt;. This blogpost contains another progress update, and some personal reflections looking back over this year.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="contents"&gt;
&lt;h1&gt;Contents&lt;/h1&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#progress-update"&gt;&lt;span class="xref myst"&gt;Progress update&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#personal-reflections"&gt;&lt;span class="xref myst"&gt;Personal reflections&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#highlights-from-this-year"&gt;&lt;span class="xref myst"&gt;Highlights from this year&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#what-worked-well"&gt;&lt;span class="xref myst"&gt;What worked well&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#what-didnt-work-so-well"&gt;&lt;span class="xref myst"&gt;What didn’t work so well&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#whats-next-in-dask"&gt;&lt;span class="xref myst"&gt;What’s next in Dask?&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="progress-update"&gt;
&lt;h1&gt;Progress update&lt;/h1&gt;
&lt;p&gt;A previous progress update for February to September 2021 is &lt;a class="reference external" href="https://blog.dask.org/2021/10/20/czi-eoss-update"&gt;available here&lt;/a&gt;. Read on for a progress update for the period September to December 2021.&lt;/p&gt;
&lt;p&gt;To summarize, between September and December 2021 inclusive, there were:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;32 merged pull requests across 7 repositories (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;distributed&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-image&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-tutorial&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;ITK&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;napari&lt;/span&gt;&lt;/code&gt;, and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;napari.github.io&lt;/span&gt;&lt;/code&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;8 pending pull requests&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;1 new &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-image&lt;/span&gt;&lt;/code&gt; release&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;1 Dask tutorial run, and assisted with a second tutorial.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;4 new Dask blogposts published (five, if we count this one)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Read on for a more detailed description of special projects within this time.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dask stale issues sprint&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In two weeks I was able to:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;close 117 stale issues, and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;identify another 25 potential easy wins for the maintainer team to investigate further.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Lots of other people did work around the same time, following up on old pull requests and other maintenance work. The sprint was very successful overall.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Dask user survey results analysis&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In September I analyzed the results from the 2021 Dask user survey.
This was a really fun task. Because we asked a lot more questions in 2021 (18 new questions, 43 questions in total), there was a lot more data to dig into compared with previous years. You can read the &lt;a class="reference external" href="https://blog.dask.org/2021/09/15/user-survey"&gt;full details about it here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The biggest benefit from this work is that now we can use this data to prioritize improvements to the documentation and examples.
The top two user requests are for more documentation and more examples from their industry. But it wasn’t until this year that we started asking what industries people worked in, so we can target new narrative documentation to the areas that need it most (geoscience, life science, and finance).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ITK compatibility with Dask&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I implemented &lt;a class="reference external" href="https://github.com/InsightSoftwareConsortium/ITK/pull/2829/"&gt;pickle serialization for itk images (ITK PR #2829)&lt;/a&gt;. This should be one of the last major pieces of the puzzle needed to make ITK images compatible with Dask. It builds on earlier work by Matt McCormick and John Kirkham (you can read a blog post about their earlier work &lt;a class="reference external" href="https://blog.dask.org/2019/08/09/image-itk"&gt;here&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Better cross-compatibility for Dask with other projects was a major goal of mine, so this is an important piece of work. I outline the next steps in the section &lt;a class="reference internal" href="#what-s-next-in-dask"&gt;&lt;span class="xref myst"&gt;What’s next in Dask?&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Improve rechunking&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I implemented &lt;a class="reference external" href="https://github.com/dask/dask/pull/8124"&gt;PR #8124&lt;/a&gt; fix a bug where reshaping a Dask array can cause an output array with chunks that are much too large to fit in memory.
Feedback from the life science user survey indicates that improving Dask’s performance around rechunking is a priority. This work helps to address that.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;High level graph work&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;A major piece of work earlier this year was introducing high level graphs for array slicing and array overlap operations. That is a big effort requiring a lot of ongoing work.
&lt;a class="reference external" href="https://github.com/dask/dask/pull/8467"&gt;PR #8467&lt;/a&gt; tackles one of the next steps for this work.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Find objects function for dask-image&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I implemented a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;find_objects&lt;/span&gt;&lt;/code&gt; function for &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-image&lt;/span&gt;&lt;/code&gt; in &lt;a class="reference external" href="https://github.com/dask/dask-image/pull/240"&gt;PR #240&lt;/a&gt;. This implementation does not need to know the maximum label number ahead of time, a substantial improvement over the previous attempt. This is a big step forward, because it removes a major blocker to introducing scikit-image like &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;regionprops&lt;/span&gt;&lt;/code&gt; functionality.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Blogposts&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Dask blogposts published between September and December 2021 include:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/11/02/choosing-dask-chunk-sizes"&gt;Choosing good chunk sizes in Dask&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;This blogpost addresses some very common concerns and questions about using Dask.
I’m very pleased with this article. Thanks to several thoughtful reviewers, the final work is much stronger and more comprehensive than the &lt;a class="reference external" href="https://twitter.com/DataNerdery/status/1424953376043790341"&gt;twitter thread&lt;/a&gt; that inspired it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It’s also high impact work. In the Dask survey the most common request is for more documentation, and this content helps to address that. Twitter analytics also show much higher engagement with this content than for other similar tweets, indicating a demand in the community for this type of explanation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/12/01/mosaic-fusion"&gt;Mosaic Image Fusion&lt;/a&gt; (co-authored with Volker Hisenstein and Marvin Albert)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;This blogpost was several months in the making (started in mid-August and published in December). It’s fantastic to have people sharing some of the very cool work they do with Dask on real world problems.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/10/20/czi-eoss-update"&gt;CZI EOSS Update&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;This blogpost shares with the community an interim progress update provided to CZI.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/09/15/user-survey"&gt;2021 Dask user survey results&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Discussed in more detail above, the analysis results from the Dask User Survey were published in September 2021.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Tutorials&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;I presented a Dask tutorial at the &lt;a class="reference external" href="https://resbaz.github.io/resbaz2021/sydney/"&gt;ResBaz Sydney online conference&lt;/a&gt; on the 25th of November 2021. Thanks to the ResBaz organisers and to David McFarlane, Svetlana Tkachenko, and Oksana Tkachenko for monitoring the chat for questions on the day.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Naty Clementi ran a Dask tutorial for the Women Who Code DC meetup on the 4th of November 2021. I assisted Naty, mostly by monitoring questions in the chat.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="personal-reflections"&gt;
&lt;h1&gt;Personal reflections&lt;/h1&gt;
&lt;p&gt;Reflecting back over the whole year, there were some things that worked well and some things that were less successful.&lt;/p&gt;
&lt;section id="highlights-from-this-year"&gt;
&lt;h2&gt;Highlights from this year&lt;/h2&gt;
&lt;p&gt;My personal highlights include:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;ITK + Dask integration work (discussed in more detail above).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;A find objects function for &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-image&lt;/span&gt;&lt;/code&gt; (discussed in more detail above).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Visualization work, because it’s very high impact. We’re solving issues raised by life science groups, but the improved tools benefit EVERYONE who uses Dask.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;This bugfix from &lt;a class="reference external" href="https://github.com/dask/dask/pull/7391"&gt;dask PR #7391&lt;/a&gt;, because this single change fixed problems in four places at once (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;scikit-image&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-ml&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;xgcm/xhistogram&lt;/span&gt;&lt;/code&gt;, and the cupy dask tests).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Community building, conferences, and engagement. Lots of effort went into events over this year, and it’s certainly paid dividends.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="what-worked-well"&gt;
&lt;h2&gt;What worked well&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Dask stale issues sprint&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;This was useful for the project, as well as useful for me.
Sorting through old issues was an incredibly effective way to get familiar with who the experts are for particular topics. It would have been even better if this happened in the first few months of working on Dask, instead of the last few months.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It’s been suggested that one good way to gain familiarity is spending 6 months full time managing the issue tracker. Maybe that’s true, but the much shorter stale issue sprint was a very efficient way of getting a lot of the same benefits in a short space of time. I’d recommend it for new maintainers or triage team members.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Community building events&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We had a very successful year in terms of community building and events. This included tutorials, workshops, conferences, and community outreach. Summary of major events:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Led a Dask tutorial at &lt;a class="reference external" href="https://resbaz.github.io/resbaz2021/sydney/"&gt;ResBaz Sydney 2021&lt;/a&gt; in November.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Co-led a half-day tutorial on napari and Dask at the &lt;a class="reference external" href="https://www.lmameeting.com.au/"&gt;Light Microscopy Australia Meeting&lt;/a&gt; in August.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;SciPy 2021 presentation &lt;a class="reference external" href="https://www.youtube.com/watch?v=tY_lCGS1BMk&amp;amp;t=60s"&gt;Scaling Science: leveraging Dask for life sciences&lt;/a&gt; in July.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Organized the &lt;a class="reference external" href="https://blog.dask.org/2021/05/24/life-science-summit-workshop"&gt;Dask Life Science workshop&lt;/a&gt; at the Dask Summit in May 2021. The life science workshop included 15 pre-recorded talks, and 3 interactive discussions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Co-organised the &lt;a class="reference external" href="https://blog.dask.org/2021/06/25/dask-down-under"&gt;Dask Down Under&lt;/a&gt; workshop for the Dask Summit in May 2021. Dask Down Under contained 5 talks, 2 tutorials, 1 panel discussion, and 1 meet and greet networking event.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Expert panelist at the &lt;a class="reference external" href="https://www.vis2021.com.au/"&gt;VIS2021 symposium&lt;/a&gt; in February.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Visualization work&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This has been very high impact work, and I’m pleased with what we’ve achieved. Improved tools for visualization were requested by users in our survey of the life science community. This was a high priority, because improvements to visualization tools benefit EVERYONE who uses Dask.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-didn-t-work-so-well"&gt;
&lt;h2&gt;What didn’t work so well&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Technical resources&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We never really solved the problem of finding someone I could go to with technical questions. I did have people to ask about some specific projects, but in most cases I didn’t have a good way to direct questions to the right people. This is a challenging problem, especially because most Dask maintainers and contributors have full time jobs doing other things too. In my opinion, this negatively impacted the work and what we were able to achieve.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Being added to the &amp;#64;dask/maintenance team&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There’s no point getting notifications if you don’t have GitHub permissions to do anything about them. In future I think we should add only people with at least triage or write permissions to the GitHub teams.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Real time interaction&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;We tried out “Ask a maintainer” office hours for the life science community, but they were poorly attended, so we cancelled this.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We added some “Dask social chat” events to the calendar, but they were not very well attended outside of the first few. Most often, zero people attended. (There is another social chat for the Americas/Europe time zones, which is at a more convenient time for most people and might be more popular.)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Slack&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Slack works well for sending direct messages to specific people to set up meeting times, etc., but the public channels didn’t end up being very useful for me personally.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Lack of integration with other project teams&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;You can only get so much done as a solo developer. We had hoped that I would naturally end up working with teams from several different projects, but this didn’t really end up being the case. The &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;napari&lt;/span&gt;&lt;/code&gt; project is an exception to this, and that relationship was well established before starting work for Dask. Perhaps there’s something more we could have done here to facilitate more interaction.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="what-s-next-for-genevieve"&gt;
&lt;h1&gt;What’s next for Genevieve?&lt;/h1&gt;
&lt;p&gt;Genevieve will be starting a new job next year. You can find her on GitHub &lt;a class="reference external" href="https://github.com/GenevieveBuckley/"&gt;&amp;#64;GenevieveBuckley&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-s-next-in-dask"&gt;
&lt;h1&gt;What’s next in Dask?&lt;/h1&gt;
&lt;p&gt;Lots of stuff has happened in Dask, but there is still lots left to do.
Here is a summary of the next steps for several projects. We’d love it if new people would like to take up the torch and contribute to any of these projects.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;ITK image compatibility with Dask&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;The next steps for the ITK + Dask project require ITK release candidate 5.3rc3 or above to become available (likely early in 2022).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When the release is available, the next step is to try to re-run the code from the original &lt;a class="reference external" href="https://blog.dask.org/2019/08/09/image-itk"&gt;ITK blogpost&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If there’s still work to be done we’ll need to open issues for the remaining blockers. And if it all works well, we’d like someone to write a second ITK + Dask blogpost to publicize the new functionality.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Improving performance around rechunking&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;More performance improvements related to rechunking are required (see &lt;a class="reference external" href="https://github.com/dask/dask/pull/7950"&gt;#7950&lt;/a&gt; and &lt;a class="reference external" href="https://github.com/dask/dask/pull/7980"&gt;#7980&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;High level graph work for arrays and slicing&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The high level graph work for slicing and overlapping arrays has a number of next steps.
Ian Rose has written &lt;a class="reference external" href="https://gist.github.com/ian-r-rose/4221ebf52f3423203640c498fb815f21"&gt;an excellent summary here&lt;/a&gt;. Briefly, the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;cull&lt;/span&gt;&lt;/code&gt; and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;get_output_keys&lt;/span&gt;&lt;/code&gt; methods must be implemented, then low level fusion and optimizations can be done.&lt;/p&gt;
&lt;p&gt;Relevant links:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Implement &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;cull&lt;/span&gt;&lt;/code&gt; method for ArrayOverlapLayer &lt;a class="reference external" href="https://github.com/dask/dask/issues/7789"&gt;#7789&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Implement &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;get_output_keys&lt;/span&gt;&lt;/code&gt; method for ArrayOverlapLayer &lt;a class="reference external" href="https://github.com/dask/dask/issues/7791"&gt;#7791&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7655"&gt;Array slicing HighLevelGraph layer #7655&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Documentation&lt;/strong&gt;&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Dask needs better documentation for high level graphs. Both &lt;a class="reference external" href="https://github.com/dask/dask/issues/7709"&gt;user documentation&lt;/a&gt; and &lt;a class="reference external" href="https://github.com/dask/dask/issues/7755"&gt;developer documentation&lt;/a&gt; are required.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;At some future point, it might be worthwhile integrating blogpost content from
&lt;a class="reference external" href="https://blog.dask.org/2021/11/02/choosing-dask-chunk-sizes"&gt;Choosing good chunk sizes in Dask&lt;/a&gt; into the main &lt;a class="reference external" href="https://docs.dask.org/en/latest/"&gt;Dask documentation&lt;/a&gt;, for better discoverability.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
</content>
    <link href="https://blog.dask.org/2021/12/15/dask-fellow-reflections/"/>
    <summary>A progress update and some personal reflections looking back over one year as the Dask life science fellow.</summary>
    <category term="lifescience" label="life science"/>
    <published>2021-12-15T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://blog.dask.org/2021/12/01/mosaic-fusion/</id>
    <title>Mosaic Image Fusion</title>
    <updated>2021-12-01T00:00:00+00:00</updated>
    <author>
      <name>Genevieve Buckley</name>
    </author>
    <content type="html">&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/12/01/mosaic-fusion.md&lt;/span&gt;, line 9)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;section id="executive-summary"&gt;

&lt;p&gt;This blogpost shows a case study where a researcher uses Dask for mosaic image fusion.
Mosaic image fusion is when you combine multiple smaller images taken at known locations and stitch them together into a single image with a very large field of view. Full code examples are available on GitHub from the &lt;a class="reference external" href="https://github.com/VolkerH/DaskFusion"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskFusion&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; repository:
&lt;a class="github reference external" href="https://github.com/VolkerH/DaskFusion"&gt;VolkerH/DaskFusion&lt;/a&gt;&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-problem"&gt;
&lt;h1&gt;The problem&lt;/h1&gt;
&lt;section id="image-mosaicing-in-microscopy"&gt;
&lt;h2&gt;Image mosaicing in microscopy&lt;/h2&gt;
&lt;p&gt;In optical microscopy, a single field of view captured with a 20x objective typically
has a diagonal on the order of a few hundred μm (exact dimensions depend on other
parts of the optical system, including the size of the camera chip). A typical
sample slide has a size of 25mm by 75mm.
Therefore, when imaging a whole slide, one has to acquire hundreds of images, typically
with some overlap between individual tiles. With increasing magnification,
the required number of images increases accordingly.&lt;/p&gt;
&lt;p&gt;To obtain an overview one has to fuse this large number of individual
image tiles into a large mosaic image. Here, we assume that the information required for
positioning and alignment of the individual image tiles is known. In the example presented here,
this information is available as metadata recorded by the microscope, namely the microscope stage
position and the pixel scale. Alternatively, this
information could also be derived from the image data directly, e.g. through a
registration step that matches corresponding image features in the areas where tiles overlap.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="the-solution"&gt;
&lt;h1&gt;The solution&lt;/h1&gt;
&lt;p&gt;The array that can hold the resulting mosaic image will often have a size that is too large
to fit in RAM, therefore we will use Dask arrays and the &lt;a class="reference external" href="https://docs.dask.org/en/latest/generated/dask.array.map_blocks.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_blocks&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; function to enable
out-of-core processing. The &lt;a class="reference external" href="https://docs.dask.org/en/latest/generated/dask.array.map_blocks.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_blocks&lt;/span&gt;&lt;/code&gt;&lt;/a&gt;
function will process smaller blocks (a.k.a. chunks) of the output array individually, thus eliminating the need to
hold the whole output array in memory. If sufficient resources are available, Dask will also distribute the processing of blocks across several workers,
thus we also get parallel processing for free, which can help speed up the fusion process.&lt;/p&gt;
&lt;p&gt;Typically whenever we want to join Dask arrays, we use &lt;a class="reference external" href="https://docs.dask.org/en/latest/array-stack.html"&gt;Stack, Concatenate, and Block&lt;/a&gt;. However, these are not good tools for mosaic image fusion, because:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;The image tiles will be overlapping,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Tiles may not be positioned on an exact grid and will typically also have slight rotations as the alignment of stage and camera is not perfect. In the most general case, for example in panoramic photo mosaics,
individual image tiles could be arbitrarily rotated or skewed.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The starting point for this mosaic prototype was some code that reads in the stage metadata for all tiles and calculates an affine transformation for each tile that would place it at the correct location
in the output array.&lt;/p&gt;
&lt;p&gt;The image below shows preliminary work placing mosaic image tiles into the correct positions using the napari image viewer.
Shown here is a small example with 63 image tiles.&lt;/p&gt;
&lt;img src="/images/mosaic-fusion/NapariMosaics.png" alt="Mosaic fusion images in the napari image viewer" width="700" height="265"&gt;
&lt;p&gt;And here is an animation of placing the individual tiles.&lt;/p&gt;
&lt;img src="/images/mosaic-fusion/Lama_whole_slide.gif" alt="Animation of whole slide mosaic fusion images" width="700" height="361"&gt;
&lt;p&gt;To leverage processing with Dask we created a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;fuse&lt;/span&gt;&lt;/code&gt; function that generates a small block of the final mosaic and is invoked by &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_blocks&lt;/span&gt;&lt;/code&gt; for each chunk of the output array.
On each invocation of the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;fuse&lt;/span&gt;&lt;/code&gt; function &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_blocks&lt;/span&gt;&lt;/code&gt; passes a dictionary (&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;block_info&lt;/span&gt;&lt;/code&gt;). From the &lt;a class="reference external" href="https://docs.dask.org/en/latest/generated/dask.array.map_blocks.html?highlight=block_info#dask.array.map_blocks"&gt;Dask documentation&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;div&gt;&lt;p&gt;Your block function gets information about where it is in the array by accepting a special &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;block_info&lt;/span&gt;&lt;/code&gt; or &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;block_id&lt;/span&gt;&lt;/code&gt; keyword argument.&lt;/p&gt;
&lt;/div&gt;&lt;/blockquote&gt;
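&lt;p&gt;As a minimal sketch of what &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;block_info&lt;/span&gt;&lt;/code&gt; provides (this toy example is not taken from DaskFusion; the function name &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;label_with_offset&lt;/span&gt;&lt;/code&gt; and the array sizes are made up for illustration), the function below fills each chunk with a value derived from where that chunk starts in the output array:&lt;/p&gt;
&lt;pre&gt;
```python
import dask.array as da
import numpy as np


def label_with_offset(block, block_info=None):
    # block_info[None] describes the output chunk. Its "array-location"
    # entry holds one (start, stop) pair per axis, in output coordinates.
    (row_start, _), (col_start, _) = block_info[None]["array-location"]
    # Encode the chunk origin in the pixel values so it is easy to inspect.
    return np.full(block.shape, row_start * 100 + col_start, dtype=block.dtype)


# A 4 x 6 array split into four chunks of shape (2, 3).
canvas = da.zeros((4, 6), chunks=(2, 3), dtype=np.int64)
labelled = canvas.map_blocks(label_with_offset, dtype=np.int64).compute()
print(labelled)
```
&lt;/pre&gt;
&lt;p&gt;The four chunks start at (0, 0), (0, 3), (2, 0) and (2, 3), so each one is filled with a different value. That chunk origin is the piece of information a function needs in order to work out which part of a larger mosaic it is responsible for.&lt;/p&gt;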
&lt;p&gt;The basic outline of the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;fuse&lt;/span&gt;&lt;/code&gt; function of the mosaic workflow is as follows.
For each chunk of the output array:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;Determine which source image tiles intersect with the chunk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Adjust the image tiles’ affine transformations to take the offset of the chunk within the array into account.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Load all intersecting image tiles and apply their respective adjusted affine transformation to map them into the chunk.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Blend the tiles using a simple maximum projection.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Return the blended chunk.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Using a maximum projection to blend areas with overlapping tiles can lead to artifacts such as ghost images and visible tile
seams, so you would typically want to use something more sophisticated in production.&lt;/p&gt;
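&lt;p&gt;The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the DaskFusion implementation: the names &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;fuse_chunk&lt;/span&gt;&lt;/code&gt; and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;translation&lt;/span&gt;&lt;/code&gt; are hypothetical, the tiles are small in-memory arrays instead of files loaded from disk, each transform is a 3 x 3 homogeneous matrix mapping tile pixels to mosaic coordinates, and nearest-neighbour interpolation is used for simplicity.&lt;/p&gt;
&lt;pre&gt;
```python
import dask.array as da
import numpy as np
from scipy.ndimage import affine_transform


def fuse_chunk(tiles=None, block_info=None):
    """Build one chunk of the mosaic from the tiles that overlap it."""
    info = block_info[None]
    chunk_start = np.array([start for start, _ in info["array-location"]], dtype=float)
    chunk_stop = np.array([stop for _, stop in info["array-location"]], dtype=float)
    fused = np.zeros(info["chunk-shape"], dtype=info["dtype"])
    for image, tile_to_mosaic in tiles:
        # Step 1: skip tiles whose bounding box does not overlap this chunk.
        rows, cols = image.shape
        corners = np.array([[0, 0, 1], [rows, 0, 1], [0, cols, 1], [rows, cols, 1]])
        mapped = (tile_to_mosaic @ corners.T)[:2].T
        overlap = np.minimum(mapped.max(axis=0), chunk_stop) - np.maximum(
            mapped.min(axis=0), chunk_start
        )
        if not np.all(np.greater(overlap, 0)):
            continue
        # Step 2: scipy wants the output-to-input mapping, so invert the
        # transform and fold the chunk origin into the offset.
        mosaic_to_tile = np.linalg.inv(tile_to_mosaic)
        offset = (mosaic_to_tile @ np.append(chunk_start, 1.0))[:2]
        # Step 3: resample the tile into the chunk (a real workflow would
        # load the tile from disk here instead of holding it in memory).
        warped = affine_transform(
            image,
            mosaic_to_tile[:2, :2],
            offset=offset,
            output_shape=fused.shape,
            order=0,
        )
        # Step 4: blend with a simple maximum projection.
        fused = np.maximum(fused, warped)
    # Step 5: return the blended chunk.
    return fused


def translation(row_shift, col_shift):
    """Homogeneous matrix that shifts tile pixels into mosaic coordinates."""
    return np.array([[1.0, 0.0, row_shift], [0.0, 1.0, col_shift], [0.0, 0.0, 1.0]])


# Two 4 x 5 tiles that overlap by two columns in a 4 x 8 mosaic.
tiles = [
    (np.full((4, 5), 1, dtype=np.uint8), translation(0, 0)),
    (np.full((4, 5), 2, dtype=np.uint8), translation(0, 3)),
]
mosaic = da.map_blocks(fuse_chunk, chunks=((4,), (4, 4)), dtype=np.uint8, tiles=tiles)
result = mosaic.compute()
```
&lt;/pre&gt;
&lt;p&gt;In this toy mosaic the brighter second tile wins wherever the two tiles overlap, which is exactly the behaviour of a maximum projection. Each chunk is computed independently, so only the tiles that touch a chunk need to be resampled for it.&lt;/p&gt;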
&lt;section id="results"&gt;
&lt;h2&gt;Results&lt;/h2&gt;
&lt;p&gt;For datasets with many image tiles (~500-1000 tiles), we could speed up the mosaic generation from several hours to tens of minutes using this Dask-based method
(compared to a previous workflow using ImageJ plugins running on the same workstation).
Because Dask can process data out-of-core and zarr provides chunked array storage, it is also possible to run the
fusion on hardware with limited RAM.&lt;/p&gt;
&lt;p&gt;Finally, we have the final mosaic fusion result.&lt;/p&gt;
&lt;img src="/images/mosaic-fusion/final-mosaic-fusion-result.png" alt="Final mosaic fusion result" width="700" height="486"&gt;
&lt;/section&gt;
&lt;section id="code"&gt;
&lt;h2&gt;Code&lt;/h2&gt;
&lt;p&gt;Code relating to this mosaic image fusion project can be found in the &lt;a class="reference external" href="https://github.com/VolkerH/DaskFusion"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskFusion&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; GitHub repository here:
&lt;a class="github reference external" href="https://github.com/VolkerH/DaskFusion"&gt;VolkerH/DaskFusion&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;There is a self-contained example available in &lt;a class="reference external" href="https://github.com/VolkerH/DaskFusion/blob/main/DaskFusion_Example.ipynb"&gt;this notebook&lt;/a&gt;, which downloads reduced-size example data to demonstrate the process.&lt;/p&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/12/01/mosaic-fusion.md&lt;/span&gt;, line 97)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="what-s-next"&gt;
&lt;h1&gt;What’s next?&lt;/h1&gt;
&lt;p&gt;Currently, the DaskFusion code is a proof of concept for single-channel 2D images, with a simple maximum projection for blending the tiles in overlapping areas. It is not production code.
However, the same principle can be used for fusing multi-channel image volumes,
such as light-sheet data, if the tile-chunk intersection calculation is extended to higher-dimensional arrays.
Such larger datasets will benefit even more from Dask,
as the processing can be distributed across multiple nodes of an HPC cluster using &lt;a class="reference external" href="http://jobqueue.dask.org/en/latest/"&gt;dask jobqueue&lt;/a&gt;.&lt;/p&gt;
&lt;section id="also-see"&gt;
&lt;h2&gt;Also see&lt;/h2&gt;
&lt;p&gt;Marvin’s lightning talk on multi-view image fusion:
&lt;a class="reference external" href="https://www.youtube.com/watch?v=YIblUvonMvo&amp;amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;amp;index=10"&gt;15 minute video available here on YouTube&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The GitHub repository &lt;a class="reference external" href="https://github.com/m-albert/MVRegFus"&gt;MVRegFus&lt;/a&gt; that Marvin talks about in the video is available here:
&lt;a class="github reference external" href="https://github.com/m-albert/MVRegFus"&gt;m-albert/MVRegFus&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;The &lt;a class="reference external" href="https://github.com/manzt/napari-lazy-openslide"&gt;napari-lazy-openslide&lt;/a&gt; visualization plugin by &lt;a class="reference external" href="https://github.com/manzt"&gt;Trevor Manz&lt;/a&gt;: &lt;em&gt;“An experimental plugin to lazily load multiscale whole-slide tiff images with openslide and dask.”&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;For further information on alternative approaches to image stitching:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;ASHLAR: Alignment by Simultaneous Harmonization of Layer / Adjacency Registration&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://labsyspharm.github.io/ashlar/"&gt;ASHLAR homepage&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/labsyspharm/ashlar"&gt;ASHLAR GitHub repository&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://doi.org/10.1101/2021.04.20.440625"&gt;ASHLAR biorxiv pre-print&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Microscopy Image Stitching Tool (MIST)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://pages.nist.gov/MIST/"&gt;MIST homepage&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/usnistgov/MIST"&gt;MIST GitHub repository&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://raw.githubusercontent.com/wiki/USNISTGOV/MIST/assets/mist-algorithm-documentation.pdf"&gt;MIST algorithm documentation (PDF)&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The &lt;a class="reference external" href="https://github.com/yfukai/m2stitch"&gt;m2stitch&lt;/a&gt; python package by &lt;a class="reference external" href="https://github.com/yfukai"&gt;Yohsuke T. Fukai&lt;/a&gt;: &lt;em&gt;“Provides robust stitching of tiled microscope images on a regular grid”&lt;/em&gt; (based on the MIST algorithm)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/12/01/mosaic-fusion.md&lt;/span&gt;, line 127)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="acknowledgements"&gt;
&lt;h1&gt;Acknowledgements&lt;/h1&gt;
&lt;p&gt;This computational work was done by Volker Hilsenstein, in conjunction with Marvin Albert.
Volker Hilsenstein is a scientific software developer at &lt;a class="reference external" href="https://www.embl.org/groups/alexandrov/"&gt;EMBL in Theodore Alexandrov’s lab&lt;/a&gt; with a focus on spatial metabolomics and bio-image analysis.&lt;/p&gt;
&lt;p&gt;The sample images were prepared and imaged by Mohammed Shahraz from the Alexandrov lab at EMBL Heidelberg.&lt;/p&gt;
&lt;p&gt;Genevieve Buckley and Volker Hilsenstein wrote this blogpost.&lt;/p&gt;
&lt;/section&gt;
</content>
    <link href="https://blog.dask.org/2021/12/01/mosaic-fusion/"/>
    <category term="imageanalysis" label="image analysis"/>
    <category term="lifescience" label="life science"/>
    <published>2021-12-01T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://blog.dask.org/2021/10/20/czi-eoss-update/</id>
    <title>CZI EOSS Update</title>
    <updated>2021-10-20T00:00:00+00:00</updated>
    <author>
      <name>Genevieve Buckley</name>
    </author>
    <content type="html">&lt;p&gt;Dask was awarded funding last year in round 2 of the &lt;a class="reference external" href="https://chanzuckerberg.com/eoss/proposals/"&gt;CZI Essential Open Source Software&lt;/a&gt; grant program.
That funding was used to hire &lt;a class="reference external" href="https://github.com/GenevieveBuckley/"&gt;Genevieve Buckley&lt;/a&gt; to work on Dask with a focus on &lt;a class="reference external" href="https://blog.dask.org/2021/03/04/the-life-science-community"&gt;life sciences&lt;/a&gt;.
Last month Dask submitted an interim progress report to CZI, covering the period from February to September 2021.
That progress update is published verbatim below, to share with the wider Dask community.&lt;/p&gt;
&lt;hr class="docutils" /&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/10/20/czi-eoss-update.md&lt;/span&gt;, line 16)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;section id="progress-overview"&gt;

&lt;section id="brief-summary"&gt;
&lt;h2&gt;Brief summary&lt;/h2&gt;
&lt;p&gt;The scope of work performed by the Dask fellow includes code contributions, conference presentations and tutorials, community engagement, and outreach including blogposts.&lt;/p&gt;
&lt;blockquote&gt;
&lt;div&gt;&lt;p&gt;The primary deliverable of this proposal is consistency and the success of neighboring software
projects&lt;/p&gt;
&lt;/div&gt;&lt;/blockquote&gt;
&lt;p&gt;Project work to date includes:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;38 pull requests merged (plus 6 draft pull requests) across 5 different repositories.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;3 conferences (presentations and organising of specialist workshops)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;1 half day workshop (plus another one upcoming)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Student supervision for Dask’s Google Summer of Code project&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;9 blogposts (plus 2 drafts for upcoming publication)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="code-contributions"&gt;
&lt;h2&gt;Code contributions&lt;/h2&gt;
&lt;p&gt;Code contributions are not limited to the main Dask repository, but also extend to neighbouring software projects which use Dask (like the &lt;a class="reference external" href="https://napari.org/"&gt;napari&lt;/a&gt; software project), including: &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-image&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-examples&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;napari&lt;/span&gt;&lt;/code&gt;, &amp;amp; &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;napari.github.io&lt;/span&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To date, across the five repositories named above the Dask fellow has contributed:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;38 pull requests&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;6 draft pull requests&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;12 closed pull requests (not merged, discarded in favour of another approach)&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The Dask fellow is an official maintainer of the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-image&lt;/span&gt;&lt;/code&gt; project, and additional milestones achieved for that project include:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;The maintainer team has been grown by one (we welcome Marvin Albert to our ranks)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;2 new dask-image releases in 2020&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="code-contribution-highlights"&gt;
&lt;h2&gt;Code contribution highlights&lt;/h2&gt;
&lt;p&gt;Highlights include:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Bugfixes benefitting the broader community&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7391"&gt;dask PR #7391&lt;/a&gt;: This PR fixed slicing the output from Dask’s bincount function. The impact of this fix was substantial, as it solved issues filed in four separate projects: &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;scikit-image&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-ml&lt;/span&gt;&lt;/code&gt;, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;xgcm/xhistogram&lt;/span&gt;&lt;/code&gt; and the cupy dask tests.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Expanded GPU support&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/6680"&gt;dask PR #6680&lt;/a&gt;: This PR provided support for different array types in the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;*_like&lt;/span&gt;&lt;/code&gt; array creation functions. Now users can create &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;cupy&lt;/span&gt;&lt;/code&gt; like Dask arrays for GPU processing, or indeed any other array type (eg: &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;sparse&lt;/span&gt;&lt;/code&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask-image/pull/157"&gt;dask-image PR #157&lt;/a&gt;: This PR provided GPU support for binary morphological functions in the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-image&lt;/span&gt;&lt;/code&gt; project.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Visualization tools benefitting all Dask users&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7716"&gt;dask PR #7716&lt;/a&gt;: This PR automatically displays the high level graph visualization in the jupyter notebook cell output (somthing already done automatically for low level graphs).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7763"&gt;dask PR #7763&lt;/a&gt;: This PR introduced a HTML representation for Dask &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;HighLevelGraph&lt;/span&gt;&lt;/code&gt; objects. This allows users and developers a much easier way to inspect the structure and status of HighLevelGraphs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Further developed on during the Dask Google Summer of Code project, full report available &lt;a class="reference external" href="https://blog.dask.org/2021/08/23/gsoc-2021-project"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;High Level Graphs&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7595"&gt;dask PR #7595&lt;/a&gt;: This PR introduced a high level graph layer for array overlaps. High level graphs are a tool we can use to optimize Dask’s performance.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7655"&gt;dask PR #7655&lt;/a&gt; (ongoing): This PR introduces a high level graph for Dask array slicing operations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Memory improvements (ongoing)&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/8124"&gt;dask PR #8124&lt;/a&gt; (ongoing): This PR investigates improved automatic rechunking strategies for &lt;a class="reference external" href="https://github.com/dask/dask/issues/8110"&gt;memory problems&lt;/a&gt; caused by reshaping Dask arrays.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7950"&gt;dask PR #7950&lt;/a&gt; (ongoing): This PR aims to improve memory and performance of the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;tensordot&lt;/span&gt;&lt;/code&gt; function with auto-rechunking of Dask arrays.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask/pull/7980"&gt;dask PR #7980&lt;/a&gt; (ongoing): This PR aims to fix the unbounded memory use problem in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;tensordot&lt;/span&gt;&lt;/code&gt;, reported &lt;a class="reference external" href="https://github.com/dask/dask/issues/6916"&gt;here&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="conferences"&gt;
&lt;h2&gt;Conferences&lt;/h2&gt;
&lt;p&gt;Notable conference events in 2021 included the SciPy conference, the Dask Summit, and VIS2021.&lt;/p&gt;
&lt;section id="scipy-conference"&gt;
&lt;h3&gt;SciPy conference&lt;/h3&gt;
&lt;p&gt;The Dask fellow presented a talk titled &lt;em&gt;“Scaling Science: leveraging Dask for life sciences”&lt;/em&gt; at the 2021 SciPy conference. Full recording &lt;a class="reference external" href="https://www.youtube.com/watch?v=tY_lCGS1BMk&amp;amp;t=60s"&gt;available here&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="dask-summit"&gt;
&lt;h3&gt;Dask Summit&lt;/h3&gt;
&lt;p&gt;The Dask fellow organised two workshops at the 2021 &lt;a class="reference external" href="https://summit.dask.org/"&gt;Dask Summit&lt;/a&gt;:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;Dask Down Under (co-organised with Nick Mortimer), and&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The Dask life science workshop&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;section id="dask-down-under"&gt;
&lt;h4&gt;Dask Down Under&lt;/h4&gt;
&lt;p&gt;The scope of Dask Down Under was more like a mini-conference for Australian timezones, rather than a typical workshop. Dask Down Under involved two days of events, covering:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;5 talks&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;2 tutorials&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;1 panel discussion&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;1 meet and greet networking event&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It was very well received by the community. A full report on the Dask Down Under events is available &lt;a class="reference external" href="https://blog.dask.org/2021/06/25/dask-down-under"&gt;here&lt;/a&gt;. A YouTube playlist of the Dask Down Under events is available &lt;a class="reference external" href="https://www.youtube.com/watch?v=10Ws59NGDaE&amp;amp;list=PLJ0vO2F_f6OAXBfb_SAF2EbJve9k1vkQX"&gt;here on the Dask YouTube channel&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="dask-life-science-workshop"&gt;
&lt;h4&gt;Dask life science workshop&lt;/h4&gt;
&lt;p&gt;The Dask life science workshop involved:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;15 pre-recorded lightning talks&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;3 interactive discussion times (accessible across timezones in Europe, Oceania, and the Americas)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Asynchronous text chat throughout the Dask Summit&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A full report on the Dask life science workshop is available &lt;a class="reference external" href="https://blog.dask.org/2021/05/24/life-science-summit-workshop"&gt;here&lt;/a&gt;. A YouTube playlist of all the Dask life science lightning talks is available &lt;a class="reference external" href="https://www.youtube.com/watch?v=6PerbQhcupM&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0"&gt;here on the Dask YouTube channel&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="vis2021-symposium"&gt;
&lt;h4&gt;VIS2021 symposium&lt;/h4&gt;
&lt;p&gt;The Dask fellow was an invited panellist at the &lt;a class="reference external" href="https://www.vis2021.com.au/"&gt;VIS2021 symposium&lt;/a&gt; in February 2021. The “Problem Solver” panel discussion covered practical problems in image analysis and how tools like Dask and napari can help solve them.&lt;/p&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="tutorials-and-workshops"&gt;
&lt;h2&gt;Tutorials and workshops&lt;/h2&gt;
&lt;p&gt;The Dask fellow co-presented a half-day workshop (five hours) at the 2021 &lt;a class="reference external" href="https://www.lmameeting.com.au/"&gt;Light Microscopy Australia Meeting&lt;/a&gt; with Juan Nunez-Iglesias. &lt;a class="reference external" href="https://napari.org/"&gt;napari&lt;/a&gt; is an open source multidimensional image viewer built using Dask for out-of-core image processing. Workshop content is available at this link: &lt;a class="github reference external" href="https://github.com/jni/lma-2021-bioimage-analysis-python/"&gt;jni/lma-2021-bioimage-analysis-python&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Upcoming workshop:&lt;/strong&gt;
The Dask fellow has been invited to deliver a workshop on &lt;a class="reference external" href="https://napari.org/"&gt;napari&lt;/a&gt; and big data using &lt;a class="reference external" href="https://dask.org/"&gt;Dask&lt;/a&gt; at an upcoming &lt;a class="reference external" href="http://eubias.org/NEUBIAS/training-schools/neubias-academy-home/"&gt;NEUBIAS Academy&lt;/a&gt;. Workshop content is available at this link: &lt;a class="github reference external" href="https://github.com/GenevieveBuckley/napari-big-data-training"&gt;GenevieveBuckley/napari-big-data-training&lt;/a&gt;&lt;/p&gt;
&lt;/section&gt;
&lt;section id="google-summer-of-code"&gt;
&lt;h2&gt;Google Summer of Code&lt;/h2&gt;
&lt;p&gt;The Dask fellow supervised a Google Summer of Code student in 2021. Martin Durant acted as a secondary supervisor. The project ran over a 3 month period, and involved implementing a number of features to improve visualization of Dask graphs and objects. A full report on the Dask GSOC project is available &lt;a class="reference external" href="https://blog.dask.org/2021/08/23/gsoc-2021-project"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="blogposts"&gt;
&lt;h2&gt;Blogposts&lt;/h2&gt;
&lt;p&gt;We set a goal of one blogpost per month, and exceeded it. To date, nine blogposts have been published by the Dask fellow, with another two currently in draft status.&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/03/04/the-life-science-community"&gt;Getting to know the life science community&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/03/29/apply-pretrained-pytorch-model"&gt;Dask with PyTorch for large scale image analysis&lt;/a&gt; (co-authored with Nick Sofreniew)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/05/07/skeleton-analysis"&gt;Skeleton analysis&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/05/24/life-science-summit-workshop"&gt;Life sciences at the 2021 Dask Summit&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/05/25/user-survey"&gt;The 2021 Dask User Survey is out now&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/06/25/dask-down-under"&gt;Dask Down Under&lt;/a&gt; (co-authored with Nick Mortimer)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/07/02/ragged-output"&gt;Ragged output, how to handle awkward shaped results&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/07/07/high-level-graphs"&gt;High Level Graphs update&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.dask.org/2021/08/23/gsoc-2021-project"&gt;Google Summer of Code 2021 - Dask Project&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Draft status, will be published soon:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask-blog/pull/108"&gt;Mosaic Image Fusion&lt;/a&gt; (co-authored with Volker Hisenstein)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://github.com/dask/dask-blog/pull/109"&gt;2021 Dask user survey results&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;/section&gt;
</content>
    <link href="https://blog.dask.org/2021/10/20/czi-eoss-update/"/>
    <summary>Dask was awarded funding last year in round 2 of the CZI Essential Open Source Software grant program.
That funding was used to hire Genevieve Buckley to work on Dask with a focus on life sciences.
Last month Dask submitted an interim progress report to CZI, covering the period from February to September 2021.
That progress update is published verbatim below, to share with the wider Dask community.</summary>
    <category term="lifescience" label="life science"/>
    <published>2021-10-20T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://blog.dask.org/2021/05/24/life-science-summit-workshop/</id>
    <title>Life sciences at the 2021 Dask Summit</title>
    <updated>2021-05-24T00:00:00+00:00</updated>
    <author>
      <name>Genevieve Buckley</name>
    </author>
    <content type="html">&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/05/24/life-science-summit-workshop.md&lt;/span&gt;, line 9)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;section id="executive-summary"&gt;

&lt;p&gt;The Dask life science workshop ran as part of the 2021 Dask Summit. Lightning talks from this workshop are &lt;a class="reference external" href="https://www.youtube.com/playlist?list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0"&gt;available here&lt;/a&gt;, and you can read on for a summary of the event.&lt;/p&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/05/24/life-science-summit-workshop.md&lt;/span&gt;, line 13)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;section id="what-is-the-dask-life-science-workshop"&gt;
&lt;h1&gt;What is the Dask life science workshop?&lt;/h1&gt;
&lt;p&gt;The Dask life science workshop ran as part of the 2021 Dask Summit. Currently many people in life sciences use Dask, but individual groups are relatively isolated from one another. This workshop gave us an opportunity to learn from each other, as well as opportunities to identify common frustrations and areas for improvement.&lt;/p&gt;
&lt;p&gt;The workshop involved:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Pre-recorded lightning talks&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Interactive discussion times (accessible across timezones in Europe, Oceania, and the Americas)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Asynchronous text chat throughout the Dask Summit&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/05/24/life-science-summit-workshop.md&lt;/span&gt;, line 23)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;section id="if-i-missed-it-how-can-i-catch-up"&gt;
&lt;h1&gt;If I missed it, how can I catch up?&lt;/h1&gt;
&lt;p&gt;If you missed the Dask Summit, you can catch up on YouTube.
There is a playlist of all the life science lightning talks &lt;a class="reference external" href="https://www.youtube.com/playlist?list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0"&gt;available here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can also join our &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;#life-science&lt;/span&gt;&lt;/code&gt; channel on Slack:
&lt;a class="reference external" href="https://join.slack.com/t/dask/shared_invite/zt-mfmh7quc-nIrXL6ocgiUH2haLYA914g"&gt;Click here for an invitation link&lt;/a&gt;.&lt;/p&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/05/24/life-science-summit-workshop.md&lt;/span&gt;, line 31)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;section id="who-came"&gt;
&lt;h1&gt;Who came?&lt;/h1&gt;
&lt;p&gt;We invited attendees at the life science workshop to do a short Q&amp;amp;A about their work with Dask. This is a small subset of the people who joined us; many people came to the conference and did not do a Q&amp;amp;A.&lt;/p&gt;
&lt;p&gt;The responses give us an overview of the diversity of work people in the community are doing. In no particular order, here are some of those Q&amp;amp;As:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Tom White&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; EU/UK&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Statistical genetics&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt; Run per-row linear regressions at scale.&lt;br /&gt;
&lt;strong&gt;What do you want to do next with Dask?&lt;/strong&gt; Collaborative optimization of a public workflow (GWAS).&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=qt6YsHoPpZs&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=2"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Giovanni Palla&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; Helmholtz Center Munich&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; Europe&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Computational Biology and Spatial transcriptomics&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt; &lt;a class="reference external" href="http://image.dask.org/en/latest/"&gt;dask-image&lt;/a&gt; for image processing.&lt;br /&gt;
&lt;strong&gt;What do you want to do next with Dask?&lt;/strong&gt; Further integration with &lt;a class="reference external" href="https://squidpy.readthedocs.io/en/latest/"&gt;Squidpy&lt;/a&gt;.&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=sGr7O8spfvE&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=8"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Isaac Virshup&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; University of Melbourne. Open source projects Scanpy and AnnData&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; AEST&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Single cell omics data.&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt;&lt;br /&gt;
I’ve used dask for some nested embarrassingly parallel calculations. Having an intelligent scheduler with good monitoring made this task as easy as it should be, especially compared with multiprocessing or joblib.&lt;br /&gt;
&lt;strong&gt;What do you want to do next with Dask?&lt;/strong&gt;&lt;br /&gt;
I would love to get AnnData, a container for working with single cell assays, integrated with dask. Dataset sizes in this field are constantly increasing, and it would be good to be able to work with the coolest new dataset regardless of available RAM.&lt;br /&gt;
Since we rely heavily on sparse arrays, a key step towards this will be getting better sparse array support (CSC and CSR especially) inside dask. After all, it’s not great if our strategy for scaling out requires many times the total memory! As a maintainer, I’m interested in hearing people’s experience with distributing tools that integrate well with dask.&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=e8pWpRo5Ars&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=14"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Anna Kreshuk&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; European Molecular Biology Laboratory&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; CEST (GMT+2)&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Machine learning for microscopy image analysis.&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt; We run a lot of image processing workflows and want to see how Dask can be exploited in this context.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Beth Cimini&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; Broad Institute&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; US-East&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; User friendly image analysis tools for microscopy imaging.&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt; Making Dask work in CellProfiler, to make it easy to analyze big images in high throughput!&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/playlist?list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Volker Hilsenstein&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; EMBL / Alexandrov lab&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; Central European Summer Time&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Spatial Metabolomics, combining microscopy and mass spectrometry.&lt;br /&gt;
&lt;strong&gt;Something I would like to try with dask:&lt;/strong&gt; fusing large mosaics of individual images or image volumes for which affine transformations into a joint coordinate system are available.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Marvin Albert&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; University of Zurich&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; UTC/GMT +2&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Life sciences / image analysis&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask? What do you want to do next with Dask?&lt;/strong&gt; Parallelise / reduce the memory footprint of image processing tasks and define workflows that can run on different compute environments.&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=YIblUvonMvo&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=9"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Jordao Bragantini&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; CZ Biohub&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; Pacific Daylight Time (UTC -7)&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Light-sheet microscopy&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt; Image processing of very large data.&lt;br /&gt;
&lt;strong&gt;What do you want to do next with Dask?&lt;/strong&gt; Implement algorithms for cell segmentation.&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=xadb-oXMFKI&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=3"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Josh Moore&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; Open Microscopy Environment (OME)&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; CEST&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Bioimaging (infrastructure for RDM)&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt; Accessing large image (Zarr) volumes over HTTP, primarily.&lt;br /&gt;
&lt;strong&gt;What do you want to do next with Dask?&lt;/strong&gt; Improve pre-fetching for typical usage patterns, possibly integrating multiscale data (i.e. Google Maps zooming)&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=6PerbQhcupM&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=1"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Jackson Maxfield Brown&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; PST&lt;br /&gt;
&lt;strong&gt;What kind of science do you work in?&lt;/strong&gt; Cell biology, specifically microscopy and computational biology.&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt; Built a metadata aware / backed microscopy imaging reading library that uses Dask to read any size image w/ chunking by metadata dimension information. As well as TB-scale image processing pipelines using Dask + Prefect.&lt;br /&gt;
&lt;strong&gt;What do you want to do next with Dask?&lt;/strong&gt; Tighter integration with other libraries. I see cuCim from the RAPIDS team and would love to extend work with them to have a more general “bio-image-spec” so we can all play nicely together.&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=LNa_gGpSnvc&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=8"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Name:&lt;/strong&gt; Gregory R. Lee&lt;br /&gt;
&lt;strong&gt;Affiliation:&lt;/strong&gt; Quansight&lt;br /&gt;
&lt;strong&gt;Timezone:&lt;/strong&gt; EST (UTC-5)&lt;br /&gt;
&lt;strong&gt;What kind of science do you work on?&lt;/strong&gt; Scientific software development (with a background doing research in magnetic resonance imaging).&lt;br /&gt;
&lt;strong&gt;Something you’ve tried (or would like to try) with Dask?&lt;/strong&gt;&lt;br /&gt;
In past research work, I used Dask primarily in two scenarios, both on a single workstation:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;To achieve multi-threading by processing image blocks in parallel on the CPU (e.g. like in dask-image)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Serial blockwise processing of large volumetric data on the GPU (i.e. CuPy arrays of 10-100 GB in size) to reduce peak memory requirements.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;strong&gt;What do you want to do next with Dask?&lt;/strong&gt;&lt;br /&gt;
Audit scikit-image functions to determine which can easily be accelerated using block-wise approaches as in dask-image. Ideally a subset of functions would work directly with dask-arrays as inputs rather than requiring users to learn about Dask’s map_overlap, etc. to use this feature.&lt;br /&gt;
&lt;strong&gt;Lightning talk:&lt;/strong&gt; &lt;a class="reference external" href="https://www.youtube.com/watch?v=vPorCnEhM6g&amp;amp;list=PLJ0vO2F_f6OBAY6hjRHM_mIQ9yh32mWr0&amp;amp;index=16"&gt;click here&lt;/a&gt;&lt;/p&gt;
&lt;/section&gt;
&lt;section id="what-s-next"&gt;
&lt;h1&gt;What’s next?&lt;/h1&gt;
&lt;p&gt;Dask is now considering holding “office hours” for the life science community. If we can find enough maintainers able to host one-hour Q&amp;amp;A sessions, then we’ll trial this for a short period of time.&lt;/p&gt;
&lt;/section&gt;
</content>
    <link href="https://blog.dask.org/2021/05/24/life-science-summit-workshop/"/>
    <summary>Document headings start at H2, not H1 [myst.header]</summary>
    <category term="DaskSummit" label="Dask Summit"/>
    <category term="lifescience" label="life science"/>
    <published>2021-05-24T00:00:00+00:00</published>
  </entry>
  <entry>
    <id>https://blog.dask.org/2021/05/07/skeleton-analysis/</id>
    <title>Skeleton analysis</title>
    <updated>2021-05-07T00:00:00+00:00</updated>
    <author>
      <name>Genevieve Buckley</name>
    </author>
    <content type="html">
&lt;section id="executive-summary"&gt;

&lt;p&gt;In this blogpost, we show how to modify a skeleton network analysis with Dask to work with constrained RAM (e.g. on your laptop). This makes it more accessible: it can run on a small laptop, instead of requiring access to a supercomputing cluster. Example code is also &lt;a class="reference external" href="https://github.com/GenevieveBuckley/distributed-skeleton-analysis/blob/main/distributed-skeleton-analysis-with-dask.ipynb"&gt;provided here&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="contents"&gt;
&lt;h1&gt;Contents&lt;/h1&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#skeleton-structures-are-everywhere"&gt;&lt;span class="xref myst"&gt;Skeleton structures are everywhere&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#the-scientific-problem"&gt;&lt;span class="xref myst"&gt;The scientific problem&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#the-compute-problem"&gt;&lt;span class="xref myst"&gt;The compute problem&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#our-approach"&gt;&lt;span class="xref myst"&gt;Our approach&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#results"&gt;&lt;span class="xref myst"&gt;Results&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#limitations"&gt;&lt;span class="xref myst"&gt;Limitations&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#problems-encountered"&gt;&lt;span class="xref myst"&gt;Problems encountered&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#how-we-solved-them"&gt;&lt;span class="xref myst"&gt;How we solved them&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#problem-1-the-skeletonize-function-from-scikit-image-crashes-due-to-lack-of-ram"&gt;&lt;span class="xref myst"&gt;Problem 1: The skeletonize function from scikit-image crashes due to lack of RAM&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#problem-2-ragged-or-non-uniform-output-from-dask-array-chunks"&gt;&lt;span class="xref myst"&gt;Problem 2: Ragged or non-uniform output from Dask array chunks&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#problem-3-grabbing-the-image-chunks-with-an-overlap"&gt;&lt;span class="xref myst"&gt;Problem 3: Grabbing the image chunks with an overlap&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#problem-4-summary-statistics-with-skan"&gt;&lt;span class="xref myst"&gt;Problem 4: Summary statistics with skan&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#what's-next"&gt;&lt;span class="xref myst"&gt;What’s next&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference internal" href="#how-you-can-help"&gt;&lt;span class="xref myst"&gt;How you can help&lt;/span&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="skeleton-structures-are-everywhere"&gt;
&lt;h1&gt;Skeleton structures are everywhere&lt;/h1&gt;
&lt;p&gt;Lots of biological structures have a skeleton or network-like shape. We see these in all kinds of places, including:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;blood vessel branching&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the branching of airways&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;neuron networks in the brain&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the root structure of plants&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;the capillaries in leaves&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;… and many more&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Analysing the structure of these skeletons can give us important information about the biology of that system.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-scientific-problem"&gt;
&lt;h1&gt;The scientific problem&lt;/h1&gt;
&lt;p&gt;For this blogpost, we will look at the blood vessels inside of a lung. This data was shared with us by &lt;a class="reference external" href="https://research.monash.edu/en/persons/marcus-kitchen"&gt;Marcus Kitchen&lt;/a&gt;, &lt;a class="reference external" href="https://hudson.org.au/researcher-profile/andrew-stainsby/"&gt;Andrew Stainsby&lt;/a&gt;, and their team of collaborators.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Skeleton network of blood vessels within a healthy lung" src="https://blog.dask.org/_images/skeleton-screenshot-crop.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;This research group focusses on lung development.
We want to compare the blood vessels in a healthy lung, against a lung from a hernia model. In the hernia model the lung is underdeveloped, squashed, and smaller.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="the-compute-problem"&gt;
&lt;h1&gt;The compute problem&lt;/h1&gt;
&lt;p&gt;These image volumes have a shape of roughly 1000x1000x1000 pixels.
That doesn’t seem huge but given the high RAM consumption involved in processing the analysis, it crashes when running on a laptop.&lt;/p&gt;
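&lt;p&gt;To see why a volume of this shape can strain a laptop, here is a rough back-of-the-envelope sketch. The dtypes are assumptions chosen for illustration (the pixel type of the lung data is not stated here), and any intermediate copies made during filtering or skeletonization would come on top of these figures.&lt;/p&gt;

```python
import numpy as np

# Approximate volume shape quoted in the text.
shape = (1000, 1000, 1000)
n_voxels = int(np.prod(shape))

# Memory for ONE in-memory copy of the volume, for a few assumed dtypes.
sizes_gb = {}
for dtype in (np.uint8, np.float32, np.float64):
    sizes_gb[np.dtype(dtype).name] = n_voxels * np.dtype(dtype).itemsize / 1e9

for name, gigabytes in sizes_gb.items():
    print(f"{name}: {gigabytes} GB per copy")
```

&lt;p&gt;Under these assumptions, a single floating point copy already takes a sizeable share of a typical laptop's RAM, and an analysis step that holds several copies at once can exceed it.&lt;/p&gt;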
&lt;p&gt;If you’re running out of RAM, there are two possible approaches:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;Get more RAM. Run things on a bigger computer, or move things to a supercomputing cluster. This has the advantage that you don’t need to rewrite your code, but it does require access to more powerful computer hardware.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Manage the RAM you’ve got. Dask is good for this. If we use Dask, and some reasonable chunking of our arrays, we can manage things so that we never hit the RAM ceiling and crash. This has the advantage that you don’t need to buy more computer hardware, but it will require re-writing some code.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;
&lt;section id="our-approach"&gt;
&lt;h1&gt;Our approach&lt;/h1&gt;
&lt;p&gt;We took the second approach, using Dask so we can run our analysis on a small laptop with constrained RAM without crashing. This makes it more accessible to more people.&lt;/p&gt;
&lt;p&gt;All the image pre-processing steps will be done with &lt;a class="reference external" href="http://image.dask.org/en/latest/"&gt;dask-image&lt;/a&gt;, and the &lt;a class="reference external" href="https://scikit-image.org/docs/dev/auto_examples/edges/plot_skeleton.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;skeletonize&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; function of &lt;a class="reference external" href="https://scikit-image.org/"&gt;scikit-image&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We use &lt;a class="reference external" href="https://jni.github.io/skan/"&gt;skan&lt;/a&gt; as the backbone of our analysis pipeline. &lt;a class="reference external" href="https://jni.github.io/skan/"&gt;skan&lt;/a&gt; is a library for skeleton image analysis. Given a skeleton image, it can describe statistics of the branches. To make it fast, the library is accelerated with &lt;a class="reference external" href="https://numba.pydata.org/"&gt;numba&lt;/a&gt; (if you’re curious, you can hear more about that in &lt;a class="reference external" href="https://www.youtube.com/watch?v=0pUPNMglnaE"&gt;this talk&lt;/a&gt; and its &lt;a class="reference external" href="https://github.com/jni/skan-talk-scipy-2019"&gt;related notebook&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;There is an example notebook containing the full details of the skeleton analysis &lt;a class="reference external" href="https://github.com/GenevieveBuckley/distributed-skeleton-analysis/blob/main/distributed-skeleton-analysis-with-dask.ipynb"&gt;available here&lt;/a&gt;. You can read on to hear just the highlights.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="results"&gt;
&lt;h1&gt;Results&lt;/h1&gt;
&lt;p&gt;The statistics from the blood vessel branches in the healthy and herniated lung show clear differences between the two.&lt;/p&gt;
&lt;p&gt;Most striking is the difference in the number of blood vessel branches.
The herniated lung has less than 40% of the number of blood vessel branches in the healthy lung.&lt;/p&gt;
&lt;p&gt;There are also quantitative differences in the sizes of the blood vessels.
Here is a violin plot showing the distribution of the distances between the start and end points of each blood vessel branch. We can see that overall the blood vessel branches start and end closer together in the herniated lung. This is consistent with what we might expect, since the healthy lung is more well developed than the lung from the hernia model and the hernia has compressed that lung into a smaller overall space.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Violin plot comparing the euclidean distance between branch start and end points in a healthy and herniated lung" src="https://blog.dask.org/_images/compare-euclidean-distance.png" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;EDIT: This blogpost previously described the euclidean distance violin plot as measuring the thickness of the blood vessels. This is incorrect, and the mistake was not caught in the review process before publication. This post has been updated to correctly describe the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;euclidean-distance&lt;/span&gt;&lt;/code&gt; measurement as the distance between the start and end of branches, as if you pulled a string taut between those points. An alternative measurement, &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;branch-length&lt;/span&gt;&lt;/code&gt;, describes the total branch length, including any winding twists and turns.&lt;/em&gt;&lt;/p&gt;
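&lt;p&gt;The difference between the two measurements can be illustrated with a small NumPy sketch. The branch coordinates below are made up for the example, and this is not how skan computes its statistics internally; it only shows a straight-line distance next to a length measured along the path.&lt;/p&gt;

```python
import numpy as np

# Hypothetical branch: (z, y, x) coordinates of points along one winding path.
path = np.array(
    [[0, 0, 0], [0, 3, 0], [0, 3, 4], [0, 0, 4]],
    dtype=float,
)

# Straight-line distance between the first and last point of the branch,
# like a string pulled taut between the two ends.
euclidean_distance = np.linalg.norm(path[-1] - path[0])

# Length along the path: the sum of the lengths of consecutive segments.
segment_lengths = np.linalg.norm(np.diff(path, axis=0), axis=1)
branch_length = segment_lengths.sum()

print(euclidean_distance, branch_length)
```

&lt;p&gt;For a perfectly straight branch the two values agree. The more a branch winds, the larger the path length becomes relative to the straight-line distance.&lt;/p&gt;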
&lt;/section&gt;
&lt;section id="limitations"&gt;
&lt;h1&gt;Limitations&lt;/h1&gt;
&lt;p&gt;We rely on one big assumption: once skeletonized, the reduced non-zero pixel data will fit into memory. While this holds true for datasets of this size (the cropped rabbit lung datasets are roughly 1000 x 1000 x 1000 pixels), it may not hold true for much larger data.&lt;/p&gt;
&lt;p&gt;Dask computation is also triggered at a few points through our prototype workflow. Ideally all computation would be delayed until the very final stage.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="problems-encountered"&gt;
&lt;h1&gt;Problems encountered&lt;/h1&gt;
&lt;p&gt;This project was originally intended to be a quick &amp;amp; easy one. Famous last words!&lt;/p&gt;
&lt;p&gt;What I wanted to do was to put the image data in a Dask array, and then use the &lt;a class="reference external" href="https://docs.dask.org/en/latest/array-overlap.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_overlap&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; function to do the image filtering, thresholding, skeletonizing, and skeleton analysis. What I soon found was that although the image filtering, thresholding, and skeletonization worked well, the skeleton analysis step had some problems:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Dask’s map_overlap function doesn’t handle ragged or non-uniformly shaped results from different image chunks very well, and…&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Internal functions in the skan library were written in a way that was incompatible with distributed computation.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/section&gt;
&lt;section id="how-we-solved-them"&gt;
&lt;h1&gt;How we solved them&lt;/h1&gt;
&lt;section id="problem-1-the-skeletonize-function-from-scikit-image-crashes-due-to-lack-of-ram"&gt;
&lt;h2&gt;Problem 1: The skeletonize function from scikit-image crashes due to lack of RAM&lt;/h2&gt;
&lt;p&gt;The &lt;a class="reference external" href="https://scikit-image.org/docs/dev/auto_examples/edges/plot_skeleton.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;skeletonize&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; function of &lt;a class="reference external" href="https://scikit-image.org/"&gt;scikit-image&lt;/a&gt; is very memory intensive, and was crashing on a laptop with 16GB RAM.&lt;/p&gt;
&lt;p&gt;We solved this by:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Putting our image data into a Dask array with &lt;a class="reference external" href="http://image.dask.org/en/latest/dask_image.imread.html"&gt;dask-image &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;imread&lt;/span&gt;&lt;/code&gt;&lt;/a&gt;,&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://docs.dask.org/en/latest/array-chunks.html?highlight=rechunk#rechunking"&gt;Rechunking&lt;/a&gt; the Dask array. We need to change the chunk shapes from 2D slices to small cuboid volumes, so the next step in the computation is efficient. We can choose the overall size of the chunks so that we can stay under the memory threshold needed for skeletonize.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Finally, we run the &lt;a class="reference external" href="https://scikit-image.org/docs/dev/auto_examples/edges/plot_skeleton.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;skeletonize&lt;/span&gt;&lt;/code&gt; function&lt;/a&gt; on the Dask array chunks using the &lt;a class="reference external" href="https://docs.dask.org/en/latest/array-overlap.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_overlap&lt;/span&gt;&lt;/code&gt; function&lt;/a&gt;. By limiting the size of the array chunks, we stay under our memory threshold!&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
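&lt;p&gt;The rechunking and overlap steps listed above can be sketched as follows. This is a minimal sketch under stated assumptions: a small array of zeros stands in for the stack that dask-image imread would return, the chunk shape is an arbitrary choice, and per_chunk is a hypothetical placeholder marking where the scikit-image skeletonize function would be passed.&lt;/p&gt;

```python
import dask.array as da
import numpy as np

# Stand-in for an image stack read slice by slice: one 2D slice per chunk.
stack = da.zeros((64, 256, 256), dtype=np.uint8, chunks=(1, 256, 256))

# Rechunk from 2D slices into small cuboid volumes (shape chosen arbitrarily).
cuboids = stack.rechunk((32, 64, 64))

# Bytes held by one full-size chunk, a rough guide to per-task memory use.
chunk_bytes = int(np.prod(cuboids.chunksize)) * cuboids.dtype.itemsize


def per_chunk(block):
    # Hypothetical placeholder: in the real workflow the skeletonize
    # function would be applied to each chunk here.
    return block.copy()


# Apply the function chunk by chunk, sharing a 1 pixel border between chunks.
result = cuboids.map_overlap(
    per_chunk, depth=1, boundary="none", dtype=cuboids.dtype
)

print(cuboids.chunksize, cuboids.numblocks, chunk_bytes, result.shape)
```

&lt;p&gt;Shrinking the chunk shape lowers the memory needed by each task, at the cost of more tasks for the scheduler to manage.&lt;/p&gt;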
&lt;/section&gt;
&lt;section id="problem-2-ragged-or-non-uniform-output-from-dask-array-chunks"&gt;
&lt;h2&gt;Problem 2: Ragged or non-uniform output from Dask array chunks&lt;/h2&gt;
&lt;p&gt;The skeleton analysis functions will return results with ragged or non-uniform length for each image chunk. This is unsurprising, because different chunks will have different numbers of non-zero pixels in our skeleton shape.&lt;/p&gt;
&lt;p&gt;When working with Dask arrays, there are two very commonly used functions: &lt;a class="reference external" href="https://docs.dask.org/en/latest/array-api.html#dask.array.map_blocks"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_blocks&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; and &lt;a class="reference external" href="https://docs.dask.org/en/latest/array-overlap.html"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_overlap&lt;/span&gt;&lt;/code&gt;&lt;/a&gt;. Here’s what happens when we try a function with ragged outputs with &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_blocks&lt;/span&gt;&lt;/code&gt; versus &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_overlap&lt;/span&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask.array&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;da&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;numpy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;np&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ones&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  &lt;span class="c1"&gt;# our dummy analysis function&lt;/span&gt;
    &lt;span class="n"&gt;random_length&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;random_length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;With &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_blocks&lt;/span&gt;&lt;/code&gt;, everything works well:&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map_blocks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# this works well&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;But if we need some overlap for function &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;foo&lt;/span&gt;&lt;/code&gt; to work correctly, then we run into problems:&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map_overlap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# incorrect results&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Here, the first and last element of the results from foo are trimmed off before the results are concatenated, which we don’t want! Setting the keyword argument &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;trim=False&lt;/span&gt;&lt;/code&gt; would help avoid this problem, except then we get an error:&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map_overlap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;trim&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;drop_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# ValueError&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Unfortunately for us, it’s really important to have a 1 pixel overlap in our array chunks, so that we can tell if a skeleton branch is ending or continuing on into the next chunk.&lt;/p&gt;
&lt;p&gt;There’s some complexity in the way &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;map_overlap&lt;/span&gt;&lt;/code&gt; results are concatenated back together, so rather than diving into that, a more straightforward solution is to use &lt;a class="reference external" href="https://docs.dask.org/en/latest/delayed.html"&gt;Dask delayed&lt;/a&gt; instead. &lt;a class="reference external" href="https://github.com/chrisroat"&gt;Chris Roat&lt;/a&gt; shows a nice example of how we can use &lt;a class="reference external" href="https://docs.dask.org/en/latest/delayed.html"&gt;Dask delayed&lt;/a&gt; in a list comprehension whose results are then concatenated into a single Dask DataFrame (&lt;a class="reference external" href="https://github.com/dask/dask/issues/7589"&gt;link to original discussion&lt;/a&gt;).&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;numpy&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;pandas&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;pd&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask.array&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;da&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask.dataframe&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dd&lt;/span&gt;

&lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;da&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ones&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="nd"&gt;@dask&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delayed&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;randint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Make each dataframe a different size&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;x&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                         &lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;arange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)})&lt;/span&gt;

&lt;span class="n"&gt;meta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make_meta&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;x&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;y&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="n"&gt;blocks&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_delayed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ravel&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# no overlap&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_delayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;blocks&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;ddf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;ddf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compute&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;strong&gt;Warning:&lt;/strong&gt; It’s very important to pass in a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;meta&lt;/span&gt;&lt;/code&gt; keyword argument to the function &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;from_delayed&lt;/span&gt;&lt;/code&gt;. Without it, things will be extremely inefficient!&lt;/p&gt;
&lt;p&gt;If the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;meta&lt;/span&gt;&lt;/code&gt; keyword argument is not given, Dask will try and work out what it should be. Ordinarily that might be a good thing, but inside a list comprehension that means those tasks are computed slowly and sequentially before the main computation even begins, which is horribly inefficient. Since we know ahead of time what kinds of results we expect from our analysis function (we just don’t know the length of each set of results), we can use the &lt;a class="reference external" href="https://docs.dask.org/en/latest/dataframe-api.html#dask.dataframe.utils.make_meta"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;utils.make_meta&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; function to help us here.&lt;/p&gt;
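&lt;p&gt;As a rough picture of what &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;meta&lt;/span&gt;&lt;/code&gt; carries, the sketch below builds an equivalent empty DataFrame by hand with plain pandas instead of going through &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;make_meta&lt;/span&gt;&lt;/code&gt;. The &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;chunk_result&lt;/span&gt;&lt;/code&gt; values are invented for illustration. The only point is that the column names and dtypes are fixed ahead of time, while the number of rows is not.&lt;/p&gt;

```python
import pandas as pd

# Hand-built stand-in for a meta object: zero rows, but the same
# column names and dtypes as the real per-chunk results.
meta = pd.DataFrame({
    "x": pd.Series(dtype="int64"),
    "y": pd.Series(dtype="int64"),
})

# A pretend per-chunk result of arbitrary length (made-up values).
chunk_result = pd.DataFrame({"x": [0, 1, 2], "y": [10, 11, 12]})

# The schema matches even though the lengths differ.
schema_matches = (
    list(meta.columns) == list(chunk_result.columns)
    and list(meta.dtypes) == list(chunk_result.dtypes)
)
print(len(meta), schema_matches)
```

&lt;p&gt;Because the schema is supplied up front, nothing has to be computed just to discover it.&lt;/p&gt;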
&lt;/section&gt;
&lt;section id="problem-3-grabbing-the-image-chunks-with-an-overlap"&gt;
&lt;h2&gt;Problem 3: Grabbing the image chunks with an overlap&lt;/h2&gt;
&lt;p&gt;Now that we’re using &lt;a class="reference external" href="https://docs.dask.org/en/latest/delayed.html"&gt;Dask delayed&lt;/a&gt; to piece together our skeleton analysis results, it’s up to us to handle the overlap between array chunks ourselves.&lt;/p&gt;
&lt;p&gt;We’ll do that by modifying Dask’s &lt;a class="reference external" href="https://github.com/dask/dask/blob/21aaf44d4d25bdba05951b85f3f2d943b823e82d/dask/array/core.py#L209-L225"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask.array.core.slices_from_chunks&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; function into something that can handle an overlap. Some special handling is required at the boundaries of the Dask array, so that we don’t try to slice past the edge of the array.&lt;/p&gt;
&lt;p&gt;Here’s what that looks like (&lt;a class="reference external" href="https://gist.github.com/GenevieveBuckley/decd23c22ee3417f7d78e87f791bc081"&gt;gist&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;itertools&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;product&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask.array.slicing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;cached_cumsum&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;slices_from_chunks_overlap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;array_shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cumdims&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;cached_cumsum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;initial_zero&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kc"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;bds&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;slices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;starts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maxshape&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cumdims&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;array_shape&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# maxshape is the full array length along this axis&lt;/span&gt;
        &lt;span class="n"&gt;inner_slices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;starts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shapes&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;slice_start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
            &lt;span class="n"&gt;slice_stop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;dim&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;slice_start&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;slice_start&lt;/span&gt; &lt;span class="o"&gt;-=&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;slice_stop&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;maxshape&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# only extend chunks that do not touch the array edge&lt;/span&gt;
                &lt;span class="n"&gt;slice_stop&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;
            &lt;span class="n"&gt;inner_slices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;slice_start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;slice_stop&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="n"&gt;slices&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inner_slices&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;slices&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
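&lt;p&gt;To make the idea concrete, here is a minimal, dependency-free sketch of overlap slicing. It assumes each chunk should grow by &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;depth&lt;/span&gt;&lt;/code&gt; on every side and be clamped at the array edges. The helper name &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;overlap_slices&lt;/span&gt;&lt;/code&gt; is hypothetical and is not part of Dask or skan.&lt;/p&gt;

```python
from itertools import accumulate, product


def overlap_slices(chunks, depth=1):
    """Hypothetical helper: one tuple of slices per chunk, grown by depth."""
    per_axis = []
    for sizes in chunks:
        axis_len = sum(sizes)
        # Start offset of every chunk along this axis.
        starts = [0] + list(accumulate(sizes))[:-1]
        axis_slices = []
        for start, size in zip(starts, sizes):
            # Grow the chunk on both sides, then clamp to the array edges.
            lo = max(start - depth, 0)
            hi = min(start + size + depth, axis_len)
            axis_slices.append(slice(lo, hi))
        per_axis.append(axis_slices)
    return list(product(*per_axis))


# Same layout as the earlier toy array: shape (20, 10), chunks of (10, 10).
slices = overlap_slices(((10, 10), (10,)), depth=1)
print(slices)
# [(slice(0, 11, None), slice(0, 10, None)), (slice(9, 20, None), slice(0, 10, None))]
```

&lt;p&gt;The two chunks along the first axis now share rows 9 and 10, while nothing reaches past row 20 or column 10.&lt;/p&gt;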
&lt;p&gt;Now that we can slice an image chunk plus an extra pixel of overlap, all we need is a way to do that for all the chunks in an array. Drawing inspiration from this &lt;a class="reference external" href="https://github.com/dask/dask-image/blob/63543bf2f6553a8150f45289492bf614e1945ac0/dask_image/ndmeasure/__init__.py#L299-L303"&gt;block iteration&lt;/a&gt; in dask-image, we make a similar iterator.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;functools&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;operator&lt;/span&gt;

&lt;span class="n"&gt;block_iter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ndindex&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;numblocks&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;functools&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;operator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getitem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;slices_from_chunks_overlap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chunks&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;meta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;make_meta&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;row&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;col&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;int64&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;)])&lt;/span&gt;
&lt;span class="n"&gt;intermediate_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;dd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_delayed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skeleton_graph_func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;block_iter&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;concat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;intermediate_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;drop_duplicates&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# we need to drop duplicates because it counts pixels in the overlapping region twice&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;With these results, we’re able to create the sparse skeleton graph.&lt;/p&gt;
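&lt;p&gt;To see why the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;drop_duplicates&lt;/span&gt;&lt;/code&gt; step matters, here is a toy illustration in plain Python. It assumes each chunk reports its rows in global pixel coordinates, so a pixel in the shared overlap produces identical rows from both neighbouring chunks. The values are made up.&lt;/p&gt;

```python
# Rows are (row, col, data) in global coordinates; values are invented.
chunk_a = [(8, 9, 1.0), (9, 10, 1.0)]
chunk_b = [(9, 10, 1.0), (10, 11, 1.0)]  # (9, 10, 1.0) lies in the shared overlap

combined = chunk_a + chunk_b
# dict.fromkeys keeps the first occurrence of each row and preserves order.
deduplicated = list(dict.fromkeys(combined))
print(deduplicated)
# [(8, 9, 1.0), (9, 10, 1.0), (10, 11, 1.0)]
```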
&lt;/section&gt;
&lt;section id="problem-4-summary-statistics-with-skan"&gt;
&lt;h2&gt;Problem 4: Summary statistics with skan&lt;/h2&gt;
&lt;p&gt;Skeleton branch statistics can be calculated with the &lt;a class="reference external" href="https://jni.github.io/skan/api/skan.csr.html#skan.csr.summarize"&gt;skan &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;summarize&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; function. The problem here is that the function expects a &lt;a class="reference external" href="https://jni.github.io/skan/api/skan.csr.html#skan.csr.Skeleton"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Skeleton&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; object instance, but initializing a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Skeleton&lt;/span&gt;&lt;/code&gt; object calls methods that are not compatible with distributed analysis.&lt;/p&gt;
&lt;p&gt;We’ll solve this problem by first initializing a &lt;a class="reference external" href="https://jni.github.io/skan/api/skan.csr.html#skan.csr.Skeleton"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Skeleton&lt;/span&gt;&lt;/code&gt;&lt;/a&gt; object instance with a tiny dummy dataset, then overwriting the attributes of the skeleton object with our real results. This is a hack, but it lets us achieve our goal: summary branch statistics for our large dataset.&lt;/p&gt;
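&lt;p&gt;The general shape of this hack can be sketched with a toy class. Everything below is hypothetical and is not skan’s API: &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;ExpensiveInit&lt;/span&gt;&lt;/code&gt; stands in for a class whose constructor does eager work, and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;summarize_total&lt;/span&gt;&lt;/code&gt; stands in for downstream code that only reads attributes.&lt;/p&gt;

```python
class ExpensiveInit:
    """Hypothetical stand-in for a class whose __init__ does eager work."""

    def __init__(self, data):
        self.data = list(data)       # eager copy into memory
        self.total = sum(self.data)  # eager computation


def summarize_total(obj):
    # Downstream code only reads attributes; it never re-runs __init__.
    return {"n": len(obj.data), "total": obj.total}


obj = ExpensiveInit([0])   # cheap initialization with dummy data
obj.data = range(1000)     # swap in the "real" data afterwards
obj.total = 499500         # result precomputed elsewhere
print(summarize_total(obj))
# {'n': 1000, 'total': 499500}
```

&lt;p&gt;The trade-off is that the object is only as consistent as the attributes we remember to overwrite, which is why this counts as a hack.&lt;/p&gt;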
&lt;p&gt;First we make a Skeleton object instance with dummy data:&lt;/p&gt;
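&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;skan&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Skeleton&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;skan._testdata&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;skeleton0&lt;/span&gt;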

&lt;span class="n"&gt;skeleton_object&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Skeleton&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skeleton0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# initialize with dummy data&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Then we overwrite the attributes with the previously calculated results:&lt;/p&gt;
&lt;div class="highlight-default notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;skeleton_object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;skeleton_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;skeleton_object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;skeleton_object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;coordinates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;skeleton_object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;degrees&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="n"&gt;skeleton_object&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;distances&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;...&lt;/span&gt;
&lt;span class="o"&gt;...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Then finally we can calculate the summary branch statistics:&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;skan&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;summarize&lt;/span&gt;

&lt;span class="n"&gt;statistics&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;summarize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;skeleton_object&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;statistics&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;head&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="pst-scrollable-table-container"&gt;&lt;table class="table"&gt;
&lt;thead&gt;
&lt;tr class="row-odd"&gt;&lt;th class="head text-right"&gt;&lt;p&gt;&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;skeleton-id&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;node-id-src&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;node-id-dst&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;branch-distance&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;branch-type&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;mean-pixel-value&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;stdev-pixel-value&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-src-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-src-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-src-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-dst-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-dst-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-dst-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-src-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-src-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-src-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-dst-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-dst-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-dst-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;euclidean-distance&lt;/p&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class="row-even"&gt;&lt;td class="text-right"&gt;&lt;p&gt;0&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.474584&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.00262514&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;400&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;595&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;400&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;596&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;400&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;595&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;400&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;596&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;3&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;9&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;8.19615&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.464662&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.00299629&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;37&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;400&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;622&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;43&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;392&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;590&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;37&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;400&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;622&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;43&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;392&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;590&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;33.5261&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;3&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;10&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;11&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.483393&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.00771038&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;49&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;391&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;589&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;50&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;391&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;589&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;49&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;391&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;589&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;50&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;391&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;589&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td class="text-right"&gt;&lt;p&gt;3&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;13&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;19&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;6.82843&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.464325&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.0139064&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;52&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;389&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;588&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;55&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;385&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;588&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;52&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;389&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;588&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;55&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;385&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;588&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;5&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td class="text-right"&gt;&lt;p&gt;4&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;7&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;21&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;23&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.45862&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.0104024&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;57&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;382&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;587&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;58&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;380&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;586&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;57&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;382&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;587&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;58&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;380&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;586&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2.44949&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;statistics&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="pst-scrollable-table-container"&gt;&lt;table class="table"&gt;
&lt;thead&gt;
&lt;tr class="row-odd"&gt;&lt;th class="head text-left"&gt;&lt;p&gt;&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;skeleton-id&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;node-id-src&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;node-id-dst&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;branch-distance&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;branch-type&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;mean-pixel-value&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;stdev-pixel-value&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-src-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-src-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-src-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-dst-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-dst-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;image-coord-dst-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-src-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-src-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-src-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-dst-0&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-dst-1&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;coord-dst-2&lt;/p&gt;&lt;/th&gt;
&lt;th class="head text-right"&gt;&lt;p&gt;euclidean-distance&lt;/p&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr class="row-even"&gt;&lt;td class="text-left"&gt;&lt;p&gt;count&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1095&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td class="text-left"&gt;&lt;p&gt;mean&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2089.38&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;11520.1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;11608.6&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22.9079&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2.00091&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.663422&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.0418607&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;591.939&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;430.303&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;377.409&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;594.325&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;436.596&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;373.419&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;591.939&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;430.303&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;377.409&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;594.325&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;436.596&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;373.419&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;190.13&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td class="text-left"&gt;&lt;p&gt;std&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;636.377&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;6057.61&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;6061.18&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;24.2646&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.0302199&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.242828&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.0559064&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;174.04&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;194.499&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;97.0219&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;173.353&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;188.708&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;96.8276&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;174.04&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;194.499&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;97.0219&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;173.353&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;188.708&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;96.8276&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;151.171&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td class="text-left"&gt;&lt;p&gt;min&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.414659&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;6.79493e-06&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;39&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;116&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;39&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;114&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;39&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;116&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;22&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;39&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;114&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td class="text-left"&gt;&lt;p&gt;25%&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1586&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;6215.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;6429.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1.73205&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.482&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.00710439&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;468.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;278.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;313&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;475&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;299.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;307&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;468.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;278.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;313&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;475&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;299.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;307&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;72.6946&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td class="text-left"&gt;&lt;p&gt;50%&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2431&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;11977&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;12010&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;16.6814&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.552626&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.0189069&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;626&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;405&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;388&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;627&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;410&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;381&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;626&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;405&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;388&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;627&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;410&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;381&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;161.059&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-even"&gt;&lt;td class="text-left"&gt;&lt;p&gt;75%&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2542.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;16526.5&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;16583&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;35.0433&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;2&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.768359&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.0528814&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;732&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;579&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;434&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;734&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;590&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;432&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;732&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;579&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;434&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;734&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;590&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;432&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;265.948&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class="row-odd"&gt;&lt;td class="text-left"&gt;&lt;p&gt;max&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;8034&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;26820&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;26822&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;197.147&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;3&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;1.29687&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;0.357193&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;976&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;833&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;622&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;976&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;841&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;606&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;976&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;833&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;622&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;976&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;841&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;606&lt;/p&gt;&lt;/td&gt;
&lt;td class="text-right"&gt;&lt;p&gt;737.835&lt;/p&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;/div&gt;
&lt;p&gt;Success!&lt;/p&gt;
&lt;p&gt;We’ve achieved distributed skeleton analysis with Dask.
You can see the example notebook containing the full details of the skeleton analysis &lt;a class="reference external" href="https://github.com/GenevieveBuckley/distributed-skeleton-analysis/blob/main/distributed-skeleton-analysis-with-dask.ipynb"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/05/07/skeleton-analysis.md&lt;/span&gt;, line 294)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="what-s-next"&gt;
&lt;h1&gt;What’s next?&lt;/h1&gt;
&lt;p&gt;A good next step is modifying the &lt;a class="reference external" href="https://github.com/jni/skan"&gt;skan&lt;/a&gt; library code so that it directly supports distributed skeleton analysis.&lt;/p&gt;
&lt;aside class="system-message"&gt;
&lt;p class="system-message-title"&gt;System Message: WARNING/2 (&lt;span class="docutils literal"&gt;/opt/build/repo/2021/05/07/skeleton-analysis.md&lt;/span&gt;, line 298)&lt;/p&gt;
&lt;p&gt;Document headings start at H2, not H1 [myst.header]&lt;/p&gt;
&lt;/aside&gt;
&lt;/section&gt;
&lt;section id="how-you-can-help"&gt;
&lt;h1&gt;How you can help&lt;/h1&gt;
&lt;p&gt;If you’d like to get involved, there are a couple of options:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;&lt;p&gt;Try a similar analysis on your own data. The notebook with the full example code is &lt;a class="reference external" href="https://github.com/GenevieveBuckley/distributed-skeleton-analysis/blob/main/distributed-skeleton-analysis-with-dask.ipynb"&gt;available here&lt;/a&gt;. You can share your results or ask questions in the &lt;a class="reference external" href="https://join.slack.com/t/dask/shared_invite/zt-mfmh7quc-nIrXL6ocgiUH2haLYA914g"&gt;Dask Slack&lt;/a&gt; or &lt;a class="reference external" href="https://twitter.com/dask_dev"&gt;on Twitter&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Help add support for distributed skeleton analysis to skan. Head on over to the &lt;a class="reference external" href="https://github.com/jni/skan/issues/"&gt;skan issues page&lt;/a&gt; and leave a comment if you’d like to join in.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;/section&gt;
</content>
    <link href="https://blog.dask.org/2021/05/07/skeleton-analysis/"/>
    <category term="imaging" label="imaging"/>
    <category term="lifescience" label="life science"/>
    <category term="skan" label="skan"/>
    <category term="skeletonanalysis" label="skeleton analysis"/>
    <published>2021-05-07T00:00:00+00:00</published>
  </entry>
</feed>
