<?xml version='1.0' encoding='UTF-8'?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
  <id>https://blog.dask.org</id>
  <title>Dask Working Notes - Posts tagged clusters</title>
  <updated>2026-03-05T15:05:25.180180+00:00</updated>
  <link href="https://blog.dask.org"/>
  <link href="https://blog.dask.org/blog/tag/clusters/atom.xml" rel="self"/>
  <generator uri="https://ablog.readthedocs.io/" version="0.11.12">ABlog</generator>
  <entry>
    <id>https://blog.dask.org/2022/11/09/dask-kubernetes-operator/</id>
    <title>Dask Kubernetes Operator</title>
    <updated>2022-11-09T00:00:00+00:00</updated>
    <author>
      <name>Jacob Tomlinson (NVIDIA)</name>
    </author>
    <content type="html">&lt;p&gt;We are excited to announce that the &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator.html"&gt;Dask Kubernetes Operator&lt;/a&gt; is now generally available 🎉!&lt;/p&gt;
&lt;p&gt;Notable new features include:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;Dask Clusters are now &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html"&gt;native custom resources&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Clusters can be managed with &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;kubectl&lt;/span&gt;&lt;/code&gt; or the &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_kubecluster.html"&gt;Python API&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Cascaded deletions allow for proper teardown&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Multiple &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskworkergroup"&gt;worker groups&lt;/a&gt; enable heterogeneous/tagged deployments&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskjob"&gt;DaskJob&lt;/a&gt;: run Dask workloads on Kubernetes batch job infrastructure&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Clusters can be reused between different Python processes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskautoscaler"&gt;Autoscaling&lt;/a&gt; is handled by a custom Kubernetes controller instead of the user code&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Scheduler and worker Pods and Services are &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskcluster"&gt;fully configurable&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;daskcluster
&lt;span class="go"&gt;NAME         AGE&lt;/span&gt;
&lt;span class="go"&gt;my-cluster   4m3s&lt;/span&gt;

&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;all&lt;span class="w"&gt; &lt;/span&gt;-A&lt;span class="w"&gt; &lt;/span&gt;-l&lt;span class="w"&gt; &lt;/span&gt;dask.org/cluster-name&lt;span class="o"&gt;=&lt;/span&gt;my-cluster
&lt;span class="go"&gt;NAMESPACE   NAME                                       READY   STATUS    RESTARTS   AGE&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-22bd39e33a   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-5f4f2c989a   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-72418a589f   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-9b00a4e1fd   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-d6fc172526   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-scheduler                   1/1     Running   0          4m21s&lt;/span&gt;

&lt;span class="go"&gt;NAMESPACE   NAME                           TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)             AGE&lt;/span&gt;
&lt;span class="go"&gt;default     service/my-cluster-scheduler   ClusterIP   10.96.33.67   &amp;lt;none&amp;gt;        8786/TCP,8787/TCP   4m21s&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;At the start of 2022 we began the large undertaking of rewriting the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt; package in the &lt;a class="reference external" href="https://kubernetes.io/docs/concepts/extend-kubernetes/operator/"&gt;operator pattern&lt;/a&gt;. This design pattern has become very popular in the Kubernetes community, with companies like &lt;a class="reference external" href="https://www.redhat.com/en/technologies/cloud-computing/openshift/what-are-openshift-operators"&gt;Red Hat building their whole Kubernetes offering, OpenShift, around it&lt;/a&gt;.&lt;/p&gt;
&lt;section id="what-is-an-operator"&gt;

&lt;p&gt;If you’ve spent any time in the Kubernetes community you’ll have heard the term operator being thrown around and seen projects like &lt;a class="reference external" href="https://github.com/operator-framework"&gt;Golang’s Operator Framework&lt;/a&gt; being used to deploy modern applications.&lt;/p&gt;
&lt;p&gt;At its core, an operator is made up of a data structure describing the thing you want to deploy (in our case a Dask cluster) and a controller which does the actual deploying. In Kubernetes the templates for these data structures are called &lt;a class="reference external" href="https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/"&gt;Custom Resource Definitions&lt;/a&gt; (CRDs), and they allow you to extend the Kubernetes API with new resource types of your own design.&lt;/p&gt;
&lt;p&gt;For &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt; we have created a few CRDs to describe things like &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskcluster"&gt;Dask clusters&lt;/a&gt;, groups of &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskworkergroup"&gt;Dask workers&lt;/a&gt;, &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskautoscaler"&gt;adaptive autoscalers&lt;/a&gt; and a new &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskjob"&gt;Dask powered batch job&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We also built a controller using &lt;a class="reference external" href="https://kopf.readthedocs.io/en/stable/"&gt;kopf&lt;/a&gt; that handles watching for changes to any of these resources and creates/updates/deletes lower level Kubernetes resources like Pods and Services.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="why-did-we-build-this"&gt;
&lt;h1&gt;Why did we build this?&lt;/h1&gt;
&lt;p&gt;The &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/history.html"&gt;original implementation&lt;/a&gt; of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt; was started shortly after Kubernetes went &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;1.0&lt;/span&gt;&lt;/code&gt; and before any established design patterns had emerged. Its model was based on spawning Dask workers as subprocesses, except those subprocesses are Pods running in Kubernetes. This is the same way &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-jobqueue&lt;/span&gt;&lt;/code&gt; launches workers as individual job scheduler allocations or &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-ssh&lt;/span&gt;&lt;/code&gt; opens many SSH connections to various machines.&lt;/p&gt;
&lt;p&gt;Over time this has been refactored, rewritten and extended multiple times. One long-asked-for change was to also place the Dask scheduler inside the Kubernetes cluster to simplify scheduler-worker communication and network connectivity. Naturally this led to more feature requests around configuring the scheduler service and having more control over the cluster. As we extended the package more and more, the original premise of spawning worker subprocesses on a remote system became less helpful.&lt;/p&gt;
&lt;p&gt;The final straw for the original design was folks asking for the ability to leave a cluster running and come back to it later, either to reuse a cluster between separate jobs or between different stages of a multi-stage pipeline. The premise of spawning subprocesses leads to an assumption that the parent process will be around for the lifetime of the cluster, which makes it a reasonable place to hold state such as the template for launching new workers when scaling up. We attempted to implement this feature but it just wasn’t possible with the design as it stood. Moving to a model where the parent process can die and new processes can pick up where it left off means that state needs to be moved elsewhere, and things were too entangled to pull this out successfully.&lt;/p&gt;
&lt;p&gt;The classic implementation that had served us well for so long was creaking and becoming increasingly difficult to modify and maintain. The time had come to pay down our technical debt by rebuilding from scratch under a new model, the operator pattern.&lt;/p&gt;
&lt;p&gt;In this new model a Dask cluster is an abstract object that exists within a Kubernetes cluster. We use custom resources to store the state for each cluster and a custom controller to map that state onto reality by creating the individual components that make up the cluster. Want to scale up your cluster? Instead of having some Python code locally that spawns a new Pod on Kubernetes we just modify the state of the Dask cluster resource to specify the desired number of workers and the controller handles adding/removing Pods to match.&lt;/p&gt;
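&lt;p&gt;The scaling behaviour described above can be sketched in plain Python. This is only an illustration of the idea of comparing desired state with observed state, not the actual controller code: the &lt;code&gt;plan_worker_changes&lt;/code&gt; function and the worker names are hypothetical, and a real controller would create and delete Pods through the Kubernetes API rather than return a plan.&lt;/p&gt;
&lt;pre&gt;
```python
def plan_worker_changes(desired_replicas, running_workers):
    """Compare desired state with observed state and plan the difference.

    Hypothetical helper for illustration only. Returns a tuple of
    (number of workers to create, list of worker names to delete).
    """
    running = sorted(running_workers)
    # How many workers are missing, or how many are surplus (never negative).
    missing = max(desired_replicas - len(running), 0)
    surplus = max(len(running) - desired_replicas, 0)
    # Pick the surplus workers from the end of the sorted list.
    to_delete = running[len(running) - surplus:]
    return missing, to_delete


# Scaling up: three workers are running and five are desired.
print(plan_worker_changes(5, ["worker-a", "worker-b", "worker-c"]))
# Scaling down: three workers are running and one is desired.
print(plan_worker_changes(1, ["worker-a", "worker-b", "worker-c"]))
```
&lt;/pre&gt;
&lt;p&gt;The point of the sketch is that the caller only ever states the desired number of workers; working out which Pods to add or remove is left to the code that observes the cluster.&lt;/p&gt;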
&lt;/section&gt;
&lt;section id="new-features"&gt;
&lt;h1&gt;New features&lt;/h1&gt;
&lt;p&gt;While our primary goals were allowing cluster reuse between Python processes and paying down technical debt, switching to the operator pattern has also allowed us to add a bunch of nice new features. So let’s explore those.&lt;/p&gt;
&lt;section id="python-or-yaml-api"&gt;
&lt;h2&gt;Python or YAML API&lt;/h2&gt;
&lt;p&gt;With our new implementation we create Dask clusters by creating a &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskcluster"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource&lt;/a&gt; on our Kubernetes cluster. The controller sees this appear and spawns child resources for the scheduler, workers, etc.&lt;/p&gt;
&lt;img alt="Diagram of a DaskCluster resource and its child resources" src="/images/2022-kubernetes/daskcluster.png" style="max-width: 100%;" width="100%" /&gt;
&lt;p&gt;We modify our cluster by editing the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource and our controller reacts to those changes and updates the child resources accordingly.&lt;/p&gt;
&lt;p&gt;We delete our cluster by deleting the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource and Kubernetes handles the rest (see the next section on cascade deletion).&lt;/p&gt;
&lt;p&gt;Because all of our state is stored in the resource and all of our logic lives in the controller, the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; class is now much simpler. In fact, it is so simple that it is entirely optional.&lt;/p&gt;
&lt;p&gt;The primary purpose of the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; class now is to provide a nice clean API for creating/scaling/deleting your clusters in Python. It can take a small number of keyword arguments and generate all of the YAML to submit to Kubernetes.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask_kubernetes.operator&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;

&lt;span class="n"&gt;cluster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;my-cluster&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;FOO&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;bar&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The above snippet creates the following resource.&lt;/p&gt;
&lt;div class="highlight-yaml notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;kubernetes.dask.org/v1&lt;/span&gt;
&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;DaskCluster&lt;/span&gt;
&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;my-cluster&lt;/span&gt;
&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;scheduler&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;service&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;tcp-comm&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;8786&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;TCP&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;targetPort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;tcp-comm&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;http-dashboard&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;8787&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;TCP&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;targetPort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;http-dashboard&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;dask.org/cluster-name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;my-cluster&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="nt"&gt;dask.org/component&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;scheduler&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;ClusterIP&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;dask-scheduler&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;--host&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;0.0.0.0&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;FOO&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;bar&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;ghcr.io/dask/dask:latest&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;livenessProbe&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;httpGet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;/health&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;http-dashboard&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;15&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;periodSeconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;20&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;scheduler&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;ports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;containerPort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;8786&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;tcp-comm&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;TCP&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;containerPort&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;8787&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;http-dashboard&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;TCP&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;readinessProbe&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;httpGet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;/health&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;port&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;http-dashboard&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;5&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="nt"&gt;periodSeconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;10&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;null&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;cluster&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;my-cluster&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;replicas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;3&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="nt"&gt;containers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;dask-worker&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;--name&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;$(DASK_WORKER_NAME)&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;env&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;            &lt;/span&gt;&lt;span class="p p-Indicator"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;FOO&lt;/span&gt;
&lt;span class="w"&gt;              &lt;/span&gt;&lt;span class="nt"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;bar&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;ghcr.io/dask/dask:latest&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;worker&lt;/span&gt;
&lt;span class="w"&gt;          &lt;/span&gt;&lt;span class="nt"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;null&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
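&lt;p&gt;To make the mapping from keyword arguments to the resource above more concrete, here is a rough sketch of how such a manifest could be assembled as a Python dictionary. The &lt;code&gt;make_cluster_manifest&lt;/code&gt; function is hypothetical and is not how &lt;code&gt;KubeCluster&lt;/code&gt; is implemented; it also leaves out the scheduler service, ports and probes shown in the YAML for brevity.&lt;/p&gt;
&lt;pre&gt;
```python
def make_cluster_manifest(name, n_workers=3, env=None, image="ghcr.io/dask/dask:latest"):
    """Build a trimmed-down DaskCluster manifest as a plain dict.

    Hypothetical helper for illustration, not the dask-kubernetes code.
    The scheduler service, ports and probes are omitted.
    """
    # Kubernetes expects environment variables as a list of name/value pairs.
    env_list = [{"name": key, "value": value} for key, value in (env or {}).items()]
    return {
        "apiVersion": "kubernetes.dask.org/v1",
        "kind": "DaskCluster",
        "metadata": {"name": name},
        "spec": {
            "scheduler": {
                "spec": {
                    "containers": [
                        {
                            "name": "scheduler",
                            "image": image,
                            "args": ["dask-scheduler", "--host", "0.0.0.0"],
                            "env": env_list,
                        }
                    ]
                }
            },
            "worker": {
                "replicas": n_workers,
                "spec": {
                    "containers": [
                        {
                            "name": "worker",
                            "image": image,
                            "args": ["dask-worker", "--name", "$(DASK_WORKER_NAME)"],
                            "env": env_list,
                        }
                    ]
                },
            },
        },
    }


manifest = make_cluster_manifest("my-cluster", n_workers=3, env={"FOO": "bar"})
print(manifest["spec"]["worker"]["replicas"])
```
&lt;/pre&gt;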
&lt;p&gt;If I want to scale up my workers to 5 I can do this in Python.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;All this does is apply a patch to the resource and modify the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;spec.worker.replicas&lt;/span&gt;&lt;/code&gt; value to be &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;5&lt;/span&gt;&lt;/code&gt; and the controller handles the rest.&lt;/p&gt;
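&lt;p&gt;As a rough illustration of what such a patch might look like, the sketch below builds a patch body that touches only &lt;code&gt;spec.worker.replicas&lt;/code&gt; and merges it into a resource locally. The use of a merge-style patch is an assumption made for the example, and both helper functions are hypothetical; the merge is a simplified version that does not handle deleting keys.&lt;/p&gt;
&lt;pre&gt;
```python
import json


def make_scale_patch(n_workers):
    """Build a patch body that only sets spec.worker.replicas (hypothetical helper)."""
    return {"spec": {"worker": {"replicas": n_workers}}}


def apply_merge_patch(resource, patch):
    """Recursively merge patch into a copy of resource.

    Simplified merge for illustration: nested dicts are merged key by key
    and every other value is overwritten. Key deletion is not supported.
    """
    merged = dict(resource)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = apply_merge_patch(merged[key], value)
        else:
            merged[key] = value
    return merged


# A cut-down stand-in for the DaskCluster resource shown earlier.
resource = {"spec": {"worker": {"cluster": "my-cluster", "replicas": 3}}}
patched = apply_merge_patch(resource, make_scale_patch(5))
print(json.dumps(patched))
```
&lt;/pre&gt;
&lt;p&gt;Only the replica count changes; every other field of the resource is left as it was.&lt;/p&gt;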
&lt;p&gt;Ultimately our Python API is generating YAML and handing it to Kubernetes to action. Everything about our cluster is contained in that YAML. If we prefer we can write and store this YAML ourselves and manage our cluster entirely via &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;kubectl&lt;/span&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;If we put the above YAML example into a file called &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;my-cluster.yaml&lt;/span&gt;&lt;/code&gt; we can create it like this. No Python necessary.&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;apply&lt;span class="w"&gt; &lt;/span&gt;-f&lt;span class="w"&gt; &lt;/span&gt;my-cluster.yaml
&lt;span class="go"&gt;daskcluster.kubernetes.dask.org/my-cluster created&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;We can also scale our cluster with &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;kubectl&lt;/span&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;scale&lt;span class="w"&gt; &lt;/span&gt;--replicas&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="m"&gt;5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;daskworkergroup&lt;span class="w"&gt; &lt;/span&gt;my-cluster-default
&lt;span class="go"&gt;daskworkergroup.kubernetes.dask.org/my-cluster-default&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This is extremely powerful for advanced users who want to integrate with existing Kubernetes tooling and really modify everything about their Dask cluster.&lt;/p&gt;
&lt;p&gt;You can still construct a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; object later and point it at this existing cluster for convenience.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask.distributed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask_kubernetes.operator&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;

&lt;span class="n"&gt;cluster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;from_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;my-cluster&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;scale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;/section&gt;
&lt;section id="cascade-deletion"&gt;
&lt;h2&gt;Cascade deletion&lt;/h2&gt;
&lt;p&gt;Having a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource also makes deletion much more pleasant.&lt;/p&gt;
&lt;p&gt;In the old implementation, your local Python process would spawn a bunch of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Pod&lt;/span&gt;&lt;/code&gt; resources along with supporting ones like &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Service&lt;/span&gt;&lt;/code&gt; and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;PodDisruptionBudget&lt;/span&gt;&lt;/code&gt; resources. It also had some teardown functionality, called either directly or via a finalizer, that deleted all of these resources when you were done.&lt;/p&gt;
&lt;p&gt;One downside of this was that if something went wrong, either due to a bug in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt; or a more severe failure that caused the Python process to exit without calling finalizers, you would be left with a ton of orphaned resources to clean up yourself. I expect some folks have a label-based selector command stored in their snippet manager somewhere, but most folks would do this cleanup by hand.&lt;/p&gt;
&lt;p&gt;With the new model the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource is set as the &lt;a class="reference external" href="https://kubernetes.io/docs/concepts/overview/working-with-objects/owners-dependents/"&gt;owner&lt;/a&gt; of all of the other resources spawned by the controller. This means we can take advantage of &lt;a class="reference external" href="https://kubernetes.io/docs/tasks/administer-cluster/use-cascading-deletion/"&gt;cascade deletion&lt;/a&gt; for our cleanup. Regardless of how you create your cluster or whether the initial Python process still exists you can just delete the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource and Kubernetes will know to automatically delete all of its children.&lt;/p&gt;
&lt;div class="highlight-console notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;daskcluster&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# Here we see our Dask cluster resource&lt;/span&gt;
&lt;span class="go"&gt;NAME         AGE&lt;/span&gt;
&lt;span class="go"&gt;my-cluster   4m3s&lt;/span&gt;

&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;all&lt;span class="w"&gt; &lt;/span&gt;-A&lt;span class="w"&gt; &lt;/span&gt;-l&lt;span class="w"&gt; &lt;/span&gt;dask.org/cluster-name&lt;span class="o"&gt;=&lt;/span&gt;my-cluster&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# and all of its child resources&lt;/span&gt;
&lt;span class="go"&gt;NAMESPACE   NAME                                       READY   STATUS    RESTARTS   AGE&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-22bd39e33a   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-5f4f2c989a   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-72418a589f   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-9b00a4e1fd   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-default-worker-d6fc172526   1/1     Running   0          3m43s&lt;/span&gt;
&lt;span class="go"&gt;default     pod/my-cluster-scheduler                   1/1     Running   0          4m21s&lt;/span&gt;

&lt;span class="go"&gt;NAMESPACE   NAME                           TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)             AGE&lt;/span&gt;
&lt;span class="go"&gt;default     service/my-cluster-scheduler   ClusterIP   10.96.33.67   &amp;lt;none&amp;gt;        8786/TCP,8787/TCP   4m21s&lt;/span&gt;

&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;delete&lt;span class="w"&gt; &lt;/span&gt;daskcluster&lt;span class="w"&gt; &lt;/span&gt;my-cluster&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# We can delete the daskcluster resource&lt;/span&gt;
&lt;span class="go"&gt;daskcluster.kubernetes.dask.org &amp;quot;my-cluster&amp;quot; deleted&lt;/span&gt;

&lt;span class="gp"&gt;$ &lt;/span&gt;kubectl&lt;span class="w"&gt; &lt;/span&gt;get&lt;span class="w"&gt; &lt;/span&gt;all&lt;span class="w"&gt; &lt;/span&gt;-A&lt;span class="w"&gt; &lt;/span&gt;-l&lt;span class="w"&gt; &lt;/span&gt;dask.org/cluster-name&lt;span class="o"&gt;=&lt;/span&gt;my-cluster&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="c1"&gt;# all of the children are removed&lt;/span&gt;
&lt;span class="go"&gt;No resources found&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
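&lt;p&gt;To make the ownership idea concrete, here is a minimal Python sketch of cascade deletion over an in-memory store. It is a toy model and not how Kubernetes implements garbage collection: the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Resource&lt;/span&gt;&lt;/code&gt; class, the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;cascade_delete&lt;/span&gt;&lt;/code&gt; helper, the single-owner simplification, the ownership chain and the worker pod name are all assumptions made for illustration.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Resource:
    """Toy stand-in for a Kubernetes object (not the real API)."""

    kind: str
    name: str
    owner: Optional[str] = None  # name of the owning resource, if any


def cascade_delete(store, name):
    """Delete `name` from `store`, then every resource it (transitively) owns.

    Returns the list of names that were removed.
    """
    if name not in store:
        return []
    del store[name]
    removed = [name]
    # Collect dependents first so the dict is not mutated while iterating it.
    dependents = [r.name for r in store.values() if r.owner == name]
    for dependent in dependents:
        removed.extend(cascade_delete(store, dependent))
    return removed


# Hypothetical resources: the names and the ownership chain are illustrative.
store = {
    r.name: r
    for r in [
        Resource("DaskCluster", "my-cluster"),
        Resource("DaskWorkerGroup", "my-cluster-default", owner="my-cluster"),
        Resource("Service", "my-cluster-scheduler", owner="my-cluster"),
        Resource("Pod", "my-cluster-default-worker-0", owner="my-cluster-default"),
    ]
}

removed = cascade_delete(store, "my-cluster")
print(sorted(removed))
print(store)  # nothing is left once the owner is gone
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Deleting the single top-level entry empties the store, which mirrors why removing the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource is enough to clean up.&lt;/p&gt;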
&lt;/section&gt;
&lt;section id="multiple-worker-groups"&gt;
&lt;h2&gt;Multiple worker groups&lt;/h2&gt;
&lt;p&gt;We also took this opportunity to add support for &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskworkergroup"&gt;multiple worker groups&lt;/a&gt; as a first class principle. Some workflows benefit from having a few workers in your cluster with some additional resources. This may be a couple of workers with much higher memory than the rest, or GPUs for accelerated compute. Using &lt;a class="reference external" href="https://distributed.dask.org/en/stable/resources.html"&gt;resource annotations&lt;/a&gt; you can steer certain tasks to those workers, so if you have a single step that creates a large amount of intermediate memory you can ensure that task ends up on a worker with enough memory.&lt;/p&gt;
&lt;p&gt;By default when you create a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resource it creates a single &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskWorkerGroup&lt;/span&gt;&lt;/code&gt; which in turn creates the worker &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Pod&lt;/span&gt;&lt;/code&gt; resources for our cluster. If you wish you can add more worker group resources yourself with different resource configurations.&lt;/p&gt;
&lt;img alt="Diagram of a DaskWorkerGroup resource and its child resources" src="/images/2022-kubernetes/daskworkergroup.png" style="max-width: 100%;" width="100%" /&gt;
&lt;p&gt;Here is an example of creating a cluster with five workers that have 16GB of memory and two additional workers with 64GB of memory.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask_kubernetes.operator&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;

&lt;span class="n"&gt;cluster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;foo&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;n_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                          &lt;span class="s2"&gt;&amp;quot;requests&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;memory&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;16Gi&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                          &lt;span class="s2"&gt;&amp;quot;limits&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;memory&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;16Gi&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
                      &lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_worker_group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;highmem&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;n_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                         &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
                             &lt;span class="s2"&gt;&amp;quot;requests&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;memory&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;64Gi&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                             &lt;span class="s2"&gt;&amp;quot;limits&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;memory&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;64Gi&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
                         &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
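&lt;p&gt;The matching idea behind steering tasks to particular workers can be sketched in a few lines of plain Python. This is a toy model and not Dask’s scheduler: the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;place_task&lt;/span&gt;&lt;/code&gt; helper, the worker names and the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;MEMORY&lt;/span&gt;&lt;/code&gt; key (counted in GiB here) are hypothetical.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
def place_task(required, workers):
    """Return the name of the first worker whose resources cover `required`.

    `required` and each worker entry map resource names to amounts.
    Returns None when no worker qualifies.
    """
    for name, available in workers.items():
        if all(available.get(key, 0) >= amount for key, amount in required.items()):
            return name
    return None


# Hypothetical workers, one from each worker group (memory in GiB).
workers = {
    "default-0": {"MEMORY": 16},
    "highmem-0": {"MEMORY": 64},
}

print(place_task({"MEMORY": 8}, workers))   # fits on the default worker
print(place_task({"MEMORY": 48}, workers))  # only the high-memory worker qualifies
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;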
&lt;/section&gt;
&lt;section id="autoscaling"&gt;
&lt;h2&gt;Autoscaling&lt;/h2&gt;
&lt;p&gt;One of the much loved features of the classic implementation of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; was adaptive autoscaling. When enabled the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; object would regularly communicate with the scheduler and ask if it wanted to change the number of workers and then add/remove pods accordingly.&lt;/p&gt;
&lt;p&gt;In the new implementation this logic has moved to the controller so the cluster can autoscale even when there is no &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; object in existence.&lt;/p&gt;
&lt;img alt="Diagram of a DaskAutoscaler resource and how it interacts with other resources" src="/images/2022-kubernetes/daskautoscaler.png" style="max-width: 100%;" width="100%" /&gt;
&lt;p&gt;The Python API remains the same so you can still use &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; to put your cluster into adaptive mode.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask_kubernetes.operator&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;

&lt;span class="n"&gt;cluster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KubeCluster&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;my-cluster&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;cluster&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;adapt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;minimum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;maximum&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This call creates a &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskautoscaler"&gt;&lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskAutoscaler&lt;/span&gt;&lt;/code&gt; resource&lt;/a&gt; which the controller sees and periodically takes action on by asking the scheduler how many workers it wants and updating the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskWorkerGroup&lt;/span&gt;&lt;/code&gt; within the configured bounds.&lt;/p&gt;
&lt;div class="highlight-yaml notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="nt"&gt;apiVersion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;kubernetes.dask.org/v1&lt;/span&gt;
&lt;span class="nt"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;DaskAutoscaler&lt;/span&gt;
&lt;span class="nt"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;my-cluster&lt;/span&gt;
&lt;span class="nt"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;cluster&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;my-cluster&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;minimum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;1&lt;/span&gt;
&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="nt"&gt;maximum&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="l l-Scalar l-Scalar-Plain"&gt;100&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Calling &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;cluster.scale(5)&lt;/span&gt;&lt;/code&gt; will also delete this resource and set the number of workers back to 5.&lt;/p&gt;
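&lt;p&gt;The bounding step can be pictured as a simple clamp. The sketch below assumes the controller does nothing more than restrict the scheduler’s requested worker count to the configured range; the real controller may apply additional logic, and &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;clamp_replicas&lt;/span&gt;&lt;/code&gt; is a hypothetical helper, not part of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt;.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
def clamp_replicas(desired, minimum, maximum):
    """Restrict the scheduler's desired worker count to [minimum, maximum]."""
    if minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    return max(minimum, min(desired, maximum))


# Bounds taken from the DaskAutoscaler example: minimum=1, maximum=100.
print(clamp_replicas(250, 1, 100))  # 100
print(clamp_replicas(0, 1, 100))    # 1
print(clamp_replicas(12, 1, 100))   # 12
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;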
&lt;/section&gt;
&lt;section id="daskjob"&gt;
&lt;h2&gt;DaskJob&lt;/h2&gt;
&lt;p&gt;Having composable cluster resources also allows us to put together a &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_resources.html#daskjob"&gt;new &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskJob&lt;/span&gt;&lt;/code&gt; resource&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Kubernetes has some built-in &lt;a class="reference external" href="https://kubernetes.io/docs/concepts/workloads/controllers/job/"&gt;batch job style resources&lt;/a&gt; which ensure a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Pod&lt;/span&gt;&lt;/code&gt; is run to completion one or more times. You can control how many times it should run and how many concurrent pods there should be. This is useful for fire-and-forget jobs that you want to process a specific workload.&lt;/p&gt;
&lt;p&gt;The Dask Operator introduces a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskJob&lt;/span&gt;&lt;/code&gt; resource which creates a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; alongside a single client &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Pod&lt;/span&gt;&lt;/code&gt; which it attempts to run to completion. If the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Pod&lt;/span&gt;&lt;/code&gt; exits unhappily it will be restarted until it returns a &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;0&lt;/span&gt;&lt;/code&gt; exit code, at which point the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; is automatically cleaned up.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Diagram of a DaskJob resource and its child resources" src="/images/2022-kubernetes/daskjob.png"
style="max-width: 100%;" width="100%" /&gt;&lt;/p&gt;
&lt;p&gt;The client &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;Pod&lt;/span&gt;&lt;/code&gt; has all of the configuration for the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; injected at runtime via environment variables. This means your client code doesn’t need to know anything about how the Dask cluster was constructed; it just connects and makes use of it. This allows for excellent separation of concerns between your business logic and your deployment tooling.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;dask.distributed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;

&lt;span class="c1"&gt;# We don&amp;#39;t need to tell the Client anything about the cluster as&lt;/span&gt;
&lt;span class="c1"&gt;# it will find everything it needs in the environment variables&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Do some work...&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
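&lt;p&gt;As a rough illustration of what “finding everything in the environment” can look like, the sketch below reads a scheduler address from an environment mapping. It assumes the address is provided in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DASK_SCHEDULER_ADDRESS&lt;/span&gt;&lt;/code&gt;, which Dask’s configuration system maps to its &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;scheduler-address&lt;/span&gt;&lt;/code&gt; setting; the exact variables injected by the operator are not listed in this post, and the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;scheduler_address&lt;/span&gt;&lt;/code&gt; helper and the example address are hypothetical.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
import os

# Assumed variable name; see the note above.
ENV_VAR = "DASK_SCHEDULER_ADDRESS"


def scheduler_address(environ=None):
    """Return the scheduler address found in `environ`, or None if unset.

    Defaults to the real process environment when no mapping is given.
    """
    env = os.environ if environ is None else environ
    return env.get(ENV_VAR)


# A stand-in for the environment of the client Pod (illustrative value).
fake_env = {ENV_VAR: "tcp://my-job-scheduler:8786"}
address = scheduler_address(fake_env)
print(address)
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;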
&lt;p&gt;This new resource type is useful for some batch workflows, but also demonstrates how you could extend the Dask Operator with your own new resource types and hook them together with a controller plugin.&lt;/p&gt;
&lt;/section&gt;
&lt;section id="extensibility-and-plugins"&gt;
&lt;h2&gt;Extensibility and plugins&lt;/h2&gt;
&lt;p&gt;By moving to native Kubernetes resources and support for the YAML API power users can treat &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;DaskCluster&lt;/span&gt;&lt;/code&gt; resources (or any of the new Dask resources) as building blocks in larger applications. One of Kubernetes’s superpowers is managing everything as composable resources that can be combined to create complex and flexible applications.&lt;/p&gt;
&lt;p&gt;Does your Kubernetes cluster have an opinionated configuration with additional tools like &lt;a class="reference external" href="https://istio.io"&gt;Istio&lt;/a&gt; installed? Have you struggled in the past to integrate &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt; with your existing tooling because it relied on Python to create clusters?&lt;/p&gt;
&lt;p&gt;It’s increasingly common for users to need additional resources to be created alongside their Dask cluster like &lt;a class="reference external" href="https://istio.io/latest/docs/reference/config/networking/gateway/"&gt;Istio Gateway&lt;/a&gt; resources or &lt;a class="reference external" href="https://cert-manager.io/docs/concepts/certificate/"&gt;cert-manager Certificate&lt;/a&gt; resources. Now that everything in &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt; uses custom resources users can mix and match resources from many different operators to construct their application.&lt;/p&gt;
&lt;p&gt;If this isn’t enough you can also &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/operator_extending.html"&gt;extend our custom controller&lt;/a&gt;. We built the controller with &lt;a class="reference external" href="https://kopf.readthedocs.io/en/stable/"&gt;kopf&lt;/a&gt; primarily because the Dask community is strong in Python and less so in Golang (the most common way to build operators). It made sense to play into our strengths rather than using the most popular option.&lt;/p&gt;
&lt;p&gt;This also means our users should be able to modify the controller logic more easily. We’ve included a plugin system that allows you to add extra logic rules by installing a custom package into the controller container image and registering them via entry points.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# Source for my_controller_plugin.plugin&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nn"&gt;kopf&lt;/span&gt;

&lt;span class="nd"&gt;@kopf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;on&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;service&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;labels&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;dask.org/component&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;scheduler&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nf"&gt;handle_scheduler_service_create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;namespace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
   &lt;span class="c1"&gt;# Do something here like create an Istio Gateway&lt;/span&gt;
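   &lt;span class="c1"&gt;# See https://kopf.readthedocs.io/en/stable/handlers for documentation on what is possible here&lt;/span&gt;
   &lt;span class="k"&gt;pass&lt;/span&gt;  &lt;span class="c1"&gt;# placeholder so the handler is valid Python&lt;/span&gt;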
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="highlight-toml notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c1"&gt;# pyproject.toml for my_controller_plugin&lt;/span&gt;

&lt;span class="k"&gt;[project.entry-points.dask_operator_plugin]&lt;/span&gt;
&lt;span class="n"&gt;my_controller_plugin&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;my_controller_plugin.plugin&amp;quot;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="highlight-dockerfile notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;span class="c"&gt;# Dockerfile&lt;/span&gt;

&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;ghcr.io/dask/dask-kubernetes-operator:2022.10.0&lt;/span&gt;

&lt;span class="k"&gt;RUN&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;pip&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;my-controller-plugin
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;That’s it. When the controller starts up it will also import all &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;&amp;#64;kopf&lt;/span&gt;&lt;/code&gt; methods from modules listed in the &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask_operator_plugin&lt;/span&gt;&lt;/code&gt; entry point alongside the core functionality.&lt;/p&gt;
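&lt;p&gt;For readers curious how entry point discovery of this kind can work, here is a small sketch using the standard library’s &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;importlib.metadata&lt;/span&gt;&lt;/code&gt;. It shows one plausible way to load every module registered under a group and is not taken from the operator’s source; &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;load_plugins&lt;/span&gt;&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;
&lt;div class="highlight-python notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;
```python
from importlib import metadata


def load_plugins(group="dask_operator_plugin"):
    """Import and return every object registered under an entry point group.

    Importing a plugin module is what would run its decorators, so any
    handlers it defines get registered as a side effect of loading it.
    """
    entry_points = metadata.entry_points()
    if hasattr(entry_points, "select"):
        # Python 3.10 and later expose a selection API.
        selected = entry_points.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = entry_points.get(group, [])
    return [entry_point.load() for entry_point in selected]


plugins = load_plugins()
print(len(plugins), "plugin(s) loaded")
```
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;With no plugin package installed the list is simply empty, so discovery of this kind is safe to run unconditionally at startup.&lt;/p&gt;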
&lt;/section&gt;
&lt;/section&gt;
&lt;section id="migrating"&gt;
&lt;h1&gt;Migrating&lt;/h1&gt;
&lt;p&gt;One caveat to switching to the operator model is that you need to install the CRDs and controller on your Kubernetes cluster before you can start using it. While this is a small hurdle, it is a break in the user experience compared to the classic implementation.&lt;/p&gt;
&lt;div class="highlight-bash notranslate"&gt;&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;helm&lt;span class="w"&gt; &lt;/span&gt;repo&lt;span class="w"&gt; &lt;/span&gt;add&lt;span class="w"&gt; &lt;/span&gt;dask&lt;span class="w"&gt; &lt;/span&gt;https://helm.dask.org&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;helm&lt;span class="w"&gt; &lt;/span&gt;repo&lt;span class="w"&gt; &lt;/span&gt;update
kubectl&lt;span class="w"&gt; &lt;/span&gt;create&lt;span class="w"&gt; &lt;/span&gt;ns&lt;span class="w"&gt; &lt;/span&gt;dask-operator
helm&lt;span class="w"&gt; &lt;/span&gt;install&lt;span class="w"&gt; &lt;/span&gt;--namespace&lt;span class="w"&gt; &lt;/span&gt;dask-operator&lt;span class="w"&gt; &lt;/span&gt;dask-operator&lt;span class="w"&gt; &lt;/span&gt;dask/dask-kubernetes-operator
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;We also took this opportunity to make breaking changes to the constructor of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; to simplify usage for beginners or folks who are happy with the default options. By adopting the YAML API power users can tinker and tweak to their heart’s content without having to modify the Python library, so it made sense to make the Python library simpler and more pleasant to use for the majority of users.&lt;/p&gt;
&lt;p&gt;We made an explicit decision not to just replace the old &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;KubeCluster&lt;/span&gt;&lt;/code&gt; with the new one in place because people’s code would just stop working if we did. Instead we are asking folks to &lt;a class="reference external" href="https://kubernetes.dask.org/en/latest/kubecluster_migrating.html"&gt;read the migration guide&lt;/a&gt; and update their imports and construction code. Users of the classic cluster manager will start seeing a deprecation warning as of &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;2022.10.0&lt;/span&gt;&lt;/code&gt; and at some point the classic implementation will be removed altogether. If migrating is challenging to do quickly you can always pin your &lt;code class="docutils literal notranslate"&gt;&lt;span class="pre"&gt;dask-kubernetes&lt;/span&gt;&lt;/code&gt; version, but from then on you will not get bug fixes or enhancements. In all honesty, though, those have been few and far between for the classic implementation lately anyway.&lt;/p&gt;
&lt;p&gt;We are optimistic that the new cleaner implementation, faster cluster startup times and bucket of new features are enough to convince you that it’s worth the migration effort.&lt;/p&gt;
&lt;p&gt;If you want some help migrating and the migration guide doesn’t cover your use case then don’t hesitate to &lt;a class="reference external" href="https://dask.discourse.group"&gt;reach out on the forum&lt;/a&gt;. We’ve also worked hard to ensure the new implementation has feature parity with the classic one, but if anything is missing or broken then please &lt;a class="reference external" href="https://github.com/dask/dask-kubernetes/issues"&gt;open an issue on GitHub&lt;/a&gt;.&lt;/p&gt;
&lt;/section&gt;
</content>
    <link href="https://blog.dask.org/2022/11/09/dask-kubernetes-operator/"/>
    <summary>We are excited to announce that the Dask Kubernetes Operator is now generally available 🎉!</summary>
    <category term="clusters" label="clusters"/>
    <category term="dask-kubernetes" label="dask-kubernetes"/>
    <category term="deployment" label="deployment"/>
    <category term="kubernetes" label="kubernetes"/>
    <published>2022-11-09T00:00:00+00:00</published>
  </entry>
</feed>
