This post describes how to deploy the Dask Operator for Kubernetes alongside a Helm-based JupyterHub installation. The Operator provides a Kubernetes-native way to create and manage Dask clusters via custom resources, which simplifies multi-tenant setups and makes it more tightly integrated with the Kubernetes ecosystem than Dask Gateway.
These commands target a Jetstream 2 Magnum deployment. The upstream install docs are at https://kubernetes.dask.org/en/latest/installing.html; this is the condensed version aligned with the JupyterHub setup on Jetstream 2.
Preparation
Assumptions:
- Helm-based JupyterHub deployment already running.
- helm and kubectl configured with cluster-admin permissions.
- SSL termination in place.
Install the Dask Operator
Install the Operator (CRDs + controller) into its own namespace:
helm install \
--repo https://helm.dask.org \
--create-namespace \
-n dask-operator \
--generate-name \
dask-kubernetes-operator
Helm prints the release name (for example, dask-kubernetes-operator-1762815110). Keep it if you plan to upgrade later instead of reinstalling.
Validate the controller
Confirm the controller pod comes up cleanly:
kubectl get pods -A -l app.kubernetes.io/name=dask-kubernetes-operator
If you do not see the pod, verify the CRDs are present:
kubectl get crds | grep dask
Prepare a service account for JupyterHub
git clone https://github.com/zonca/jupyterhub-deploy-kubernetes-jetstream.git
cd jupyterhub-deploy-kubernetes-jetstream
cd dask_operator
kubectl apply -f dask_cluster_role.yaml
Configure JupyterHub
Apply the Dask-related JupyterHub configuration via the existing Helm deployment script. First edit install_jhub.sh in the deployment repository and add the extra --values dask_operator/jupyterhub_config.yaml flag to the helm upgrade --install line (alongside any other --values files you already use). Then run:
bash install_jhub.sh
Create a Dask cluster from JupyterHub
From JupyterHub, start a new notebook server and open a Python notebook.
In the first cell, install dask-kubernetes (this ensures the Operator Python client is available inside the user environment):
%pip install dask-kubernetes
If you want to avoid running %pip install dask-kubernetes every time a pod restarts, build your JupyterHub spawn image with dask-kubernetes already included so the dependency is available as soon as the notebook starts.
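Either way, a quick first-cell check confirms the client library is importable in the running kernel; a minimal sketch (it assumes the package exposes a __version__ attribute, as recent releases do):
import dask_kubernetes

# If this import fails, the package is not available in the current image/environment
print(dask_kubernetes.__version__)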
In the next cell, create a Dask cluster using the Operator. The namespace must match the JupyterHub namespace so the spawned pods are visible and accessible to the user environment:
from dask_kubernetes.operator import KubeCluster

cluster = KubeCluster(
    name="my-dask-cluster",
    namespace="jhub",  # important: JupyterHub namespace
    image="ghcr.io/dask/dask:latest",  # use the same image set in config_standard_storage.yaml
)
cluster.scale(10)
The image must match the one specified for the JupyterHub single-user pods (for example in config_standard_storage.yaml). Mixing images leads to client/scheduler/worker incompatibilities that are hard to debug.
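Once workers are up, you can connect a Dask distributed Client to the cluster and submit work. A minimal smoke test, assuming the cluster object created above is still in scope:
from dask.distributed import Client

# Point the notebook at the scheduler of the cluster created above
client = Client(cluster)

# Trivial computation spread across the workers
futures = client.map(lambda x: x ** 2, range(10))
print(client.gather(futures))  # [0, 1, 4, 9, ...]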
Finally, display the cluster object to get a real-time view of the number of workers:
cluster
Specify worker CPU and memory
The simplest approach is to pass a resources dict directly to KubeCluster; this sets the requests and limits for both the scheduler and the default worker group. Then scale the cluster explicitly.
from dask_kubernetes.operator import KubeCluster

cluster = KubeCluster(
    name="my-dask-cluster-resources",
    namespace="jhub",
    resources={
        "requests": {"cpu": "2", "memory": "2Gi"},
        "limits": {"cpu": "2", "memory": "2Gi"},
    },
    n_workers=0,  # start with zero, then scale explicitly
)
cluster.scale(5)  # create 5 workers with 2 CPU / 2Gi each
cluster
If you need workers of a different size, add another worker group with its own resources:
cluster.add_worker_group(
    name="highmem",
    n_workers=2,
    resources={"requests": {"memory": "8Gi"}, "limits": {"memory": "8Gi"}},
)
cluster.scale(4, worker_group="highmem")  # scale that group
Use cluster.scale(N, worker_group="groupname") to change replicas for a specific group later.
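If you prefer not to manage replica counts by hand, KubeCluster also exposes adaptive scaling; a minimal sketch that lets Dask grow and shrink the default worker group based on load:
# Dask adds or removes workers in the default group between these bounds
cluster.adapt(minimum=0, maximum=10)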
Notice that if you have the Kubernetes Autoscaler enabled in your cluster, it will automatically scale the number of nodes to accommodate the requested resources.
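If your notebook kernel restarts while the Dask cluster is still running, you should be able to reattach to it by name instead of creating a new one; a hedged sketch (the namespace argument here is an assumption, passed through to the KubeCluster constructor):
from dask_kubernetes.operator import KubeCluster

# Reconnect to the DaskCluster resource created earlier, without creating a new one
cluster = KubeCluster.from_name("my-dask-cluster", namespace="jhub")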
When you are done with the cluster, shut it down cleanly:
cluster.close()
Access the Dask dashboard
One of the most important features of Dask is its dashboard, which provides real-time insight into distributed calculations. To visualize the Dask dashboard from within JupyterHub, we need to use jupyter-server-proxy.
A critical point is that jupyter-server-proxy must be baked into the single-user Docker image that JupyterHub spawns. This is because the proxy needs to be available when the user’s pod starts. Installing it with %pip install inside a running notebook will not work, as the server proxy components are not loaded dynamically.
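A quick way to see whether the running single-user image already ships the package is to try importing it; note that this only confirms the package is installed, not that the server extension was actually loaded at startup:
try:
    import jupyter_server_proxy  # noqa: F401
    print("jupyter-server-proxy is installed in this image")
except ImportError:
    print("jupyter-server-proxy is missing: rebuild the single-user image")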
Ensure install_jhub.sh also passes --values dask_operator/jupyterhub_dask_dashboard_config.yaml to the helm upgrade --install command so JupyterHub is configured to expose the Dask dashboard through the proxy.
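For reference, the format of the clickable dashboard link is controlled by Dask's distributed.dashboard.link setting. If you ever need to override it from inside a notebook rather than through the Helm values file, a hedged sketch (it assumes JUPYTERHUB_SERVICE_PREFIX is set in the single-user pod, which JupyterHub normally does):
import dask

# {JUPYTERHUB_SERVICE_PREFIX} expands to something like /user/YOURUSER/,
# {host} and {port} are filled in by Dask with the scheduler's dashboard address
dask.config.set(
    {"distributed.dashboard.link": "{JUPYTERHUB_SERVICE_PREFIX}proxy/{host}:{port}/status"}
)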
I have created an example single-user image derived from scipy-notebook that includes jupyter-server-proxy and other useful packages for scientific computing. The repository is available at github.com/zonca/jupyterhub-dask-docker-image. The image is automatically built using GitHub Actions and hosted on the GitHub Container Registry (a container registry similar to Docker Hub or Quay.io).
The list of available image tags is at github.com/zonca/jupyterhub-dask-docker-image/pkgs/container/jupyterhub-dask-docker-image.
If you want the simplest maintenance flow for your own image, use that repository as a template and just edit requirements.txt; a GitHub Actions workflow will rebuild and publish the image automatically.
To use one of these images in your JupyterHub deployment, you need to update your config_standard_storage.yaml file for the JupyterHub Helm chart. For example, to use a specific image tag, you would add:
singleuser:
  image:
    name: ghcr.io/zonca/jupyterhub-dask-docker-image
    tag: "2025-11-18-018e8a6"
After updating your JupyterHub configuration to use a compatible image, when you create a KubeCluster and print the cluster object in your notebook, it will now include a clickable link to the Dask dashboard. The link will have a format similar to this:
/user/YOURUSER/proxy/my-dask-cluster-scheduler.jhub:8787/status
This allows you to directly access the dashboard and monitor your Dask cluster’s activity.
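The same link is also available programmatically on the cluster object, which is handy for logging or sharing:
# Proxied dashboard URL, e.g. /user/YOURUSER/proxy/my-dask-cluster-scheduler.jhub:8787/status
print(cluster.dashboard_link)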