Documentation notes for dask-operator, and service account permission problems #521

Open
@zulissi


Thanks for the great work on the dask-operator!

Three quick notes from trying it on our cluster:

  • Worker scaling assumes that each worker pod runs exactly one dask process. With multiple processes per pod, the worker names in dask get a per-process suffix (...-ef8274c-0, ...-ef8274c-1), while the pod name is just ...-ef8274c. Scaling a daskworkergroup up still works, but scaling down fails because the worker names do not match the pod name. A note in the documentation that one process per pod is required would be helpful.
  • If you use NodePort (as in the examples), the default service account created by the latest helm install doesn't have permission to list nodes, which can lead to errors. ClusterIP works great.
  • The kubeflow patch script (kubectl patch clusterrole kubeflow-kubernetes-edit --patch '{"rules": [{"apiGroups": ["kubernetes.dask.org"],"resources": ["*"],"verbs": ["*"]}]}') overwrites the existing kubeflow permissions instead of adding the dask permissions to them. I'm not sure what the right way to patch-by-adding is.
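
To illustrate the first note, here is a minimal sketch of how a dask worker name could be mapped back to its pod when a pod runs several processes. It assumes the naming described above (pod name plus a numeric process suffix); `pod_for_worker` and the example names are hypothetical and are not part of the dask-operator code.

```python
import re
from typing import Iterable, Optional


def pod_for_worker(worker_name: str, pod_names: Iterable[str]) -> Optional[str]:
    """Return the pod hosting a dask worker, or None if no pod matches.

    Assumption: a worker in a multi-process pod is named
    "<pod-name>-<process index>", as described in the note above.
    """
    pods = set(pod_names)
    # Single-process case: the worker name is the pod name.
    if worker_name in pods:
        return worker_name
    # Multi-process case: strip a trailing "-<digits>" and retry.
    match = re.fullmatch(r"(?P<pod>.+)-(?P<index>\d+)", worker_name)
    if match and match.group("pod") in pods:
        return match.group("pod")
    return None


if __name__ == "__main__":
    # Hypothetical pod names, for illustration only.
    pods = ["example-worker-ef8274c"]
    for worker in ("example-worker-ef8274c-0", "example-worker-ef8274c-1"):
        print(worker, "->", pod_for_worker(worker, pods))
```

Checking against the list of known pod names, rather than blindly stripping a suffix, avoids mangling a pod whose own name happens to end in digits.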
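
On the third note, one possible way to patch by adding is a JSON Patch (RFC 6902) `add` operation on `/rules/-`, which appends to the existing list, passed to `kubectl patch` with `--type=json`. My understanding is that the default strategic merge patch replaces the whole `rules` list, which would explain the overwrite. The sketch below only builds the command string and does not run it; the helper names are hypothetical, and I have not tested this against a kubeflow install.

```python
import json
import shlex

# The rule from the original patch script in this issue.
DASK_RULE = {
    "apiGroups": ["kubernetes.dask.org"],
    "resources": ["*"],
    "verbs": ["*"],
}


def append_rule_patch(rule: dict) -> list:
    """Build a JSON Patch that appends one rule to the end of .rules."""
    # "/rules/-" addresses the position after the last list element.
    return [{"op": "add", "path": "/rules/-", "value": rule}]


def kubectl_patch_command(cluster_role: str, rule: dict) -> str:
    """Return a shell-quoted kubectl command that applies the patch."""
    patch = json.dumps(append_rule_patch(rule))
    return " ".join([
        "kubectl", "patch", "clusterrole", shlex.quote(cluster_role),
        "--type=json", "--patch", shlex.quote(patch),
    ])


if __name__ == "__main__":
    print(kubectl_patch_command("kubeflow-kubernetes-edit", DASK_RULE))
```

Running the script prints a command that can be pasted into a shell. Because the operation is an append, running it twice would add the rule twice.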
