Open
Description
I couldn't find a simple way to pass credentials for storage services such as S3 and GCS to remote workers, even though both are widely used to store data frames.
In this ticket's scope, I propose creating plugins that will help distribute the required keys to remote workers.
GCP credentials
The path to the GCP credentials file is stored in the GOOGLE_APPLICATION_CREDENTIALS environment variable. The plugin has to create a copy of that file on each remote worker and set the environment variable there to the copy's path.
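A minimal sketch of what such a plugin could look like, assuming Dask's WorkerPlugin interface (a setup(worker) method that runs on every worker). The class name UploadGCPKey and its plugin name are hypothetical and are not taken from PR #439; the import fallback only exists so the sketch can be read and run without Dask installed.

```python
import os
import tempfile
from pathlib import Path

try:
    from distributed import WorkerPlugin
except ImportError:  # fallback so the sketch runs without Dask installed
    WorkerPlugin = object


class UploadGCPKey(WorkerPlugin):  # hypothetical name, not the PR's class
    """Copy the client's GCP credentials file to every worker."""

    name = "upload-gcp-key"  # hypothetical plugin name

    def __init__(self, env_var="GOOGLE_APPLICATION_CREDENTIALS"):
        # Runs on the client: read the key while the local file is reachable.
        self.env_var = env_var
        self.payload = Path(os.environ[env_var]).read_bytes()

    def setup(self, worker):
        # Runs on each worker: write the key to a new temporary file
        # (mkstemp creates it readable only by the owner on POSIX)
        # and point the environment variable at it.
        fd, path = tempfile.mkstemp(prefix="gcp-key-", suffix=".json")
        with os.fdopen(fd, "wb") as f:
            f.write(self.payload)
        os.environ[self.env_var] = path
        self.remote_path = path
```

On the client side this would presumably be registered with something like client.register_plugin(UploadGCPKey()). Because the key content travels inside the pickled plugin, this only makes sense over a trusted connection to the scheduler.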
S3 credentials
As with GCP, we must upload the credential files and store them on each worker.
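The AWS case could follow the same pattern. The sketch below assumes the shared credentials file (by default ~/.aws/credentials, or whatever AWS_SHARED_CREDENTIALS_FILE points to, which the AWS SDKs honor). The class name UploadAWSCredentials and the local_path parameter are hypothetical and are not taken from PR #439.

```python
import os
import tempfile
from pathlib import Path

try:
    from distributed import WorkerPlugin
except ImportError:  # fallback so the sketch runs without Dask installed
    WorkerPlugin = object


class UploadAWSCredentials(WorkerPlugin):  # hypothetical name
    """Copy the client's AWS shared credentials file to every worker."""

    name = "upload-aws-credentials"  # hypothetical plugin name

    def __init__(self, local_path=None):
        # Runs on the client: resolve and read the shared credentials file.
        if local_path is None:
            local_path = os.environ.get(
                "AWS_SHARED_CREDENTIALS_FILE",
                str(Path.home() / ".aws" / "credentials"),
            )
        self.payload = Path(local_path).read_text()

    def setup(self, worker):
        # Runs on each worker: write the file and tell the AWS SDK where it is.
        fd, path = tempfile.mkstemp(prefix="aws-credentials-")
        with os.fdopen(fd, "w") as f:
            f.write(self.payload)
        os.environ["AWS_SHARED_CREDENTIALS_FILE"] = path
        self.remote_path = path
```

Writing to a temporary file rather than ~/.aws/credentials avoids overwriting credentials that may already exist on the worker.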
PR: #439
Activity
Create credentials upload plugins for GCP and AWS (dask#438)
jacobtomlinson commented on Oct 7, 2024
Usually you would create an IAM instance role and profile that can access S3, then configure workers to have this role via the iam_instance_profile keyword argument.
The GCP equivalent is to create a service account that can access GCS and configure that with the service_account kwarg.
This way you don't have to pass credentials around. Is there a reason why you aren't doing it this way?
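As a rough illustration of the role-based setup described in this comment, the sketch below builds the two keyword arguments and passes them to dask-cloudprovider's EC2Cluster and GCPCluster. The profile name and service account email are placeholders, the helper function names are hypothetical, and the dict shape for iam_instance_profile follows the EC2 convention of a Name or Arn key.

```python
def ec2_cluster_kwargs(profile_name):
    # EC2 instance profiles are referenced by a dict with "Name" (or "Arn").
    return {"iam_instance_profile": {"Name": profile_name}}


def gcp_cluster_kwargs(service_account_email):
    # The service account that the worker VMs should run as.
    return {"service_account": service_account_email}


def create_clusters():
    # Imported lazily: requires dask-cloudprovider and cloud access to run.
    from dask_cloudprovider.aws import EC2Cluster
    from dask_cloudprovider.gcp import GCPCluster

    # Placeholder identifiers; substitute a real profile and service account.
    aws = EC2Cluster(**ec2_cluster_kwargs("my-s3-reader-profile"))
    gcp = GCPCluster(
        **gcp_cluster_kwargs("dask-worker@my-project.iam.gserviceaccount.com")
    )
    return aws, gcp
```

With this setup the workers obtain credentials from the instance metadata service, so no key files need to leave the developer's machine.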
dbalabka commented on Feb 19, 2025
@jacobtomlinson, sorry for not being active the last few months because of workload and vacation. Thanks for the question.
You are correct that using a proper service account or IAM role/profile is more secure and preferable for production workloads. However, I have a few scenarios where dynamically uploading the key is more convenient.
For local development, the recommended way on GCP is to use Application Default Credentials (ADC) acquired with the gcloud auth application-default login command. I previously provided changes and a detailed description in #429. ADC is associated with the developer's user account, which would be preferable to reuse on the workers. Otherwise, we have to create a separate service account key for each developer or automate the creation of such keys.
If Dask is deployed on on-prem Kubernetes during local development, uploading the key is the most convenient way to provide it to the workers. Otherwise, we have to keep the keys in Secrets and mount them separately, an approach that is more suitable for production workloads.
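For the ADC scenario, a plugin first has to find the file that gcloud auth application-default login wrote. A small sketch of that lookup, assuming the well-known default locations (~/.config/gcloud on Linux and macOS, %APPDATA%\gcloud on Windows); the function name find_adc_file and its parameters are hypothetical and exist mainly to make the lookup testable.

```python
import os
from pathlib import Path


def find_adc_file(environ=None, home=None):
    """Return the path of the ADC file, or None if no default file exists."""
    environ = os.environ if environ is None else environ

    # An explicitly configured path wins; it is returned without checking
    # that the file exists.
    explicit = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)

    # Otherwise fall back to the location used by the gcloud CLI.
    if os.name == "nt":
        base = Path(environ.get("APPDATA", "")) / "gcloud"
    else:
        base = Path(home or Path.home()) / ".config" / "gcloud"
    candidate = base / "application_default_credentials.json"
    return candidate if candidate.is_file() else None
```

The returned path could then be read on the client and shipped to the workers in the same way as a service account key file.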
jacobtomlinson commented on Feb 24, 2025
I see. So we could create a plugin for the client which grabs those credentials and propagates them to the workers. Do you have any interest in implementing such a plugin?
dbalabka commented on Mar 1, 2025
@jacobtomlinson, right. I've submitted PR #439. It contains the source of two separate plugins, one for AWS and one for GCP, that we are already using. Both provide a very convenient way to push the required credentials to workers: a developer simply needs to add both plugins, and no configuration is needed.