
[Enhancement] use Template together with JCEKS to mask JDBC password #1036

@kcychien

When a Template connects through JDBC, for example https://cloud.google.com/dataproc-serverless/docs/templates/storage-to-jdbc, JDBC_CONNECTION_URL requires the user to specify password=JDBC_PASSWORD, which is in plaintext.

This is a security concern.

In non-Template serverless Spark on GCP Dataproc, the plaintext password issue is solved with the Hadoop Credential Provider API (https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html): the password is stored under a key in a JCEKS file, and the file is read by setting "spark.hadoop.hadoop.security.credential.provider.path=[JCEKS_FILE_PATH]" in --properties.
Note that the spark.hadoop.javax.jdo.option.ConnectionURL=[JDBC_CONNECTION_URL] property must also be set in --properties.
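
For reference, the non-Template setup described above comes down to two Spark properties passed through --properties. Here is a minimal Python sketch that assembles that value. `build_properties` is a hypothetical helper, not part of any Dataproc tooling, and the JCEKS path and JDBC URL in the usage line are made-up placeholders.

```python
def build_properties(jceks_path: str, jdbc_url: str) -> str:
    """Return a comma-separated key=value string for the --properties flag.

    Assumption: neither value contains a comma, since the comma is used
    here as the separator between properties.
    """
    props = {
        # Points Hadoop at the JCEKS keystore holding the password.
        "spark.hadoop.hadoop.security.credential.provider.path": jceks_path,
        # JDBC URL without a plaintext password in it.
        "spark.hadoop.javax.jdo.option.ConnectionURL": jdbc_url,
    }
    return ",".join(f"{key}={value}" for key, value in props.items())


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_properties("[JCEKS_FILE_PATH]", "[JDBC_CONNECTION_URL]"))
```

The printed string is what would follow --properties when submitting the batch. The password itself never appears in it.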

However, Templated serverless Spark on GCP Dataproc, as illustrated in https://cloud.google.com/dataproc-serverless/docs/templates/storage-to-jdbc, uses --templateProperty instead of --properties. Furthermore, gcs.jdbc.output.url=[JDBC_CONNECTION_URL] is a required parameter, and it cannot be used together with spark.hadoop.javax.jdo.option.ConnectionURL=[JDBC_CONNECTION_URL].

In other words, users cannot mask the password in a JDBC connection when using Templated serverless Spark on GCP Dataproc. This is a feature request to make that possible.
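
To make the request concrete, here is a rough Python sketch of one way it could work, not a description of anything the templates do today. It assumes the template would accept a password-free JDBC URL plus a credential alias, and would look the alias up through Hadoop's `Configuration.getPassword`, which consults the providers listed in `hadoop.security.credential.provider.path`. The function names and the alias `jdbc.password` are hypothetical.

```python
def resolve_password(hadoop_conf, alias: str) -> str:
    """Look up a credential alias via Hadoop's Configuration.getPassword.

    `hadoop_conf` is any object exposing getPassword(alias), which returns
    a sequence of characters, or None when the alias is not found.
    """
    chars = hadoop_conf.getPassword(alias)
    if chars is None:
        raise KeyError(f"no credential found for alias {alias!r}")
    # Index-based access so this works for plain lists and array-like proxies.
    return "".join(str(chars[i]) for i in range(len(chars)))


def jdbc_connection_properties(hadoop_conf, user: str, alias: str) -> dict:
    """Build the properties dict for a JDBC write, keeping the password out of the URL."""
    return {"user": user, "password": resolve_password(hadoop_conf, alias)}


# Hypothetical use inside a PySpark job (assumes an active SparkSession `spark`
# and a DataFrame `df`; `_jsc` is a private PySpark attribute):
#   conf = spark.sparkContext._jsc.hadoopConfiguration()
#   props = jdbc_connection_properties(conf, "app_user", "jdbc.password")
#   df.write.jdbc(url=password_free_url, table="my_table", mode="append", properties=props)
```

With this shape, gcs.jdbc.output.url could stay a required parameter while carrying no secret. The JCEKS file path would still need to reach the Hadoop configuration somehow.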
