Add Kubernetes deployment guide for Dask clusters #1324

@mithr4ndir

Description

Hello! First time contributor here. I hope to help the community with my knowledge around infrastructure. I have my own Kubernetes and Proxmox home lab, and manage it all via code (Terraform, Ansible, ArgoCD). This work should help others quickly set up containerized environments to get going with lsdb!

Proposal

Add documentation for deploying lsdb on a Dask Kubernetes cluster using the Dask Kubernetes Operator.

Per the discussion in #1323, an official container image may not be the best fit since users have varied environment needs. However, a deployment guide showing how to run lsdb on Kubernetes with Dask would be valuable — especially for users who want to scale beyond a single machine.

What the guide would cover

  • Installing the Dask Kubernetes Operator via Helm
  • Creating a DaskCluster custom resource (CR) with a scheduler and workers
  • Connecting lsdb to a Dask cluster and running distributed queries
  • Resource sizing recommendations (memory, CPU, storage for HATS catalogs)
  • Example Kubernetes manifests (DaskCluster, DaskAutoscaler, PVC for catalog storage)
  • Tips for monitoring Dask on K8s (dashboard access, Prometheus metrics)
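
To make the manifest item concrete, here is a rough sketch of a DaskCluster resource, generated from Python so the sizing knobs sit in one place. It follows the layout of the example resource in the Dask Kubernetes Operator docs (`kubernetes.dask.org/v1`). The image name, replica count, and memory/CPU figures are placeholders I am assuming for illustration, not recommendations, and the image is assumed to have dask, distributed, and lsdb installed.

```python
import json

# Hypothetical image: any image with dask, distributed, and lsdb installed.
IMAGE = "example.org/lsdb-dask:latest"

SCHEDULER_PORTS = [("tcp-comm", 8786), ("http-dashboard", 8787)]
WORKER_PORTS = [("http-dashboard", 8788)]


def container(name, args, ports, memory, cpu):
    """Build one container spec with identical resource requests and limits."""
    return {
        "name": name,
        "image": IMAGE,
        "args": args,
        "ports": [
            {"name": n, "containerPort": p, "protocol": "TCP"} for n, p in ports
        ],
        "resources": {
            "requests": {"memory": memory, "cpu": cpu},
            "limits": {"memory": memory, "cpu": cpu},
        },
    }


def dask_cluster(name, workers=3, worker_memory="8Gi", worker_cpu="2"):
    """Return a DaskCluster manifest as a plain dict (sizes are assumptions)."""
    worker_args = [
        "dask-worker", "--name", "$(DASK_WORKER_NAME)",
        "--dashboard", "--dashboard-address", "8788",
    ]
    return {
        "apiVersion": "kubernetes.dask.org/v1",
        "kind": "DaskCluster",
        "metadata": {"name": name},
        "spec": {
            "worker": {
                "replicas": workers,
                "spec": {"containers": [container(
                    "worker", worker_args, WORKER_PORTS,
                    worker_memory, worker_cpu)]},
            },
            "scheduler": {
                "spec": {"containers": [container(
                    "scheduler", ["dask-scheduler"], SCHEDULER_PORTS,
                    "2Gi", "1")]},
                "service": {
                    "type": "ClusterIP",
                    "selector": {
                        "dask.org/cluster-name": name,
                        "dask.org/component": "scheduler",
                    },
                    "ports": [
                        {"name": n, "port": p, "targetPort": n,
                         "protocol": "TCP"}
                        for n, p in SCHEDULER_PORTS
                    ],
                },
            },
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON as well as YAML.
    print(json.dumps(dask_cluster("lsdb-cluster"), indent=2))
```

If this is saved as, say, `make_cluster.py` (a hypothetical filename), the output can be piped into `kubectl apply -f -`. The guide itself would most likely ship plain YAML; the dict form is only meant to show which fields matter for sizing.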

This is based on a working deployment I have running in my home lab — 3-node K8s cluster with the Dask Operator, validated with spatial queries, crossmatches, and catalog operations.
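
As an illustration of the kind of spatial query mentioned above, here is a minimal sketch of connecting lsdb to the scheduler from a pod inside the cluster. It assumes the operator exposes the scheduler as a Service named `<cluster>-scheduler` (worth confirming with `kubectl get svc`), that the installed lsdb version provides `lsdb.open_catalog`, and that the catalog path is a HATS catalog reachable from every worker. The function names and arguments are hypothetical.

```python
def scheduler_address(cluster_name, namespace="default", port=8786):
    """In-cluster DNS address of the scheduler Service (naming is an assumption)."""
    return f"tcp://{cluster_name}-scheduler.{namespace}.svc.cluster.local:{port}"


def cone_search_row_count(address, catalog_path, ra, dec, radius_arcsec):
    """Run a cone search on the remote Dask cluster and return the row count."""
    # Imported lazily so the address helper works without dask or lsdb installed.
    from dask.distributed import Client
    import lsdb

    # Inside the context manager the client is the default scheduler,
    # so compute() runs on the Kubernetes workers.
    with Client(address):
        catalog = lsdb.open_catalog(catalog_path)
        result = catalog.cone_search(ra=ra, dec=dec, radius_arcsec=radius_arcsec)
        return len(result.compute())
```

From outside the cluster, the same function should work against a port-forwarded scheduler, for example `tcp://localhost:8786` after `kubectl port-forward svc/lsdb-cluster-scheduler 8786:8786`, where `lsdb-cluster` is a placeholder cluster name.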

Related: #1005 (container environment requests), #1285 (Dask execution patterns)
