[GCP] Reference architecture should use private connectivity for Google APIs #224

@mwkaufman

Is your feature request related to a problem? Please describe.
Customers are deploying the patterns in this repository without realizing that more steps are needed to ensure traffic from their workspace to Google's storage buckets goes over private connectivity. Private Google Access is enabled for the subnet here: https://github.com/databricks/terraform-databricks-sra/blob/main/gcp/modules/workspace_deployment/vpc.tf#L16 but enabling it is not enough to actually use it. You also need either a private DNS zone for googleapis.com that resolves Google APIs to the IPs used for private access, or a PSC endpoint. As is, calls to storage.googleapis.com resolve to their default public IPs and follow whatever default internet route you have, which could hit (and overwhelm) an on-prem firewall, for example.
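
To make the DNS option concrete, here is a minimal sketch of a private zone that points `*.googleapis.com` at the `private.googleapis.com` VIP range (199.36.153.8/30). The variable names and resource names are hypothetical and do not come from the SRA modules; the explicit route only matters if the VPC's default internet route has been removed or overridden. Treat it as a starting point under those assumptions, not as the repository's implementation.

```hcl
# Hypothetical inputs; the names are illustrative, not taken from the SRA repo.
variable "project_id" {
  type = string
}

variable "network_self_link" {
  type        = string
  description = "Self link of the VPC that hosts the workspace subnet"
}

# Private zone that overrides public resolution of googleapis.com inside the VPC.
resource "google_dns_managed_zone" "googleapis" {
  project    = var.project_id
  name       = "googleapis-private"
  dns_name   = "googleapis.com."
  visibility = "private"

  private_visibility_config {
    networks {
      network_url = var.network_self_link
    }
  }
}

# A records for the private.googleapis.com VIP range (199.36.153.8/30).
resource "google_dns_record_set" "private_googleapis_a" {
  project      = var.project_id
  managed_zone = google_dns_managed_zone.googleapis.name
  name         = "private.googleapis.com."
  type         = "A"
  ttl          = 300
  rrdatas      = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"]
}

# Send every other googleapis.com name (e.g. storage.googleapis.com) to that VIP.
resource "google_dns_record_set" "googleapis_cname" {
  project      = var.project_id
  managed_zone = google_dns_managed_zone.googleapis.name
  name         = "*.googleapis.com."
  type         = "CNAME"
  ttl          = 300
  rrdatas      = ["private.googleapis.com."]
}

# Assumption: only needed when the default 0.0.0.0/0 route does not point at the
# default internet gateway (e.g. it is sent to an on-prem firewall).
resource "google_compute_route" "private_googleapis" {
  project          = var.project_id
  name             = "private-googleapis"
  network          = var.network_self_link
  dest_range       = "199.36.153.8/30"
  next_hop_gateway = "default-internet-gateway"
  priority         = 1000
}
```

With something like this in place, `storage.googleapis.com` should resolve to an address in 199.36.153.8/30 from inside the VPC instead of a public IP.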

Describe the solution you'd like
The reference architecture should actually use the Private Google Access that is enabled on the subnet when talking to Google APIs. Google has a good reference implementation of a PSC endpoint for Google APIs here: https://github.com/terraform-google-modules/terraform-google-network/tree/main/modules/private-service-connect
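
For the PSC option, a rough sketch using the underlying provider resources directly (rather than the linked module, whose inputs should be checked against whichever version you pin) might look like the following. The variable names, resource names, and the endpoint IP are all hypothetical.

```hcl
# Hypothetical inputs; the names are illustrative, not taken from the SRA repo.
variable "project_id" {
  type = string
}

variable "network_self_link" {
  type        = string
  description = "Self link of the VPC that hosts the workspace subnet"
}

variable "psc_endpoint_ip" {
  type        = string
  description = "Internal IP for the endpoint; it should sit outside the VPC's subnet ranges"
}

# Reserve the internal address that the PSC endpoint will use.
resource "google_compute_global_address" "psc_googleapis" {
  project      = var.project_id
  name         = "psc-googleapis-ip"
  address_type = "INTERNAL"
  purpose      = "PRIVATE_SERVICE_CONNECT"
  network      = var.network_self_link
  address      = var.psc_endpoint_ip
}

# The endpoint itself: a global forwarding rule targeting the Google APIs bundle.
# The rule name for a Google APIs endpoint is limited to letters and digits,
# hence no hyphens here.
resource "google_compute_global_forwarding_rule" "psc_googleapis" {
  project               = var.project_id
  name                  = "pscgoogleapis"
  target                = "all-apis"
  network               = var.network_self_link
  ip_address            = google_compute_global_address.psc_googleapis.id
  load_balancing_scheme = ""
}
```

The endpoint alone does not change how `storage.googleapis.com` resolves. A private DNS zone for googleapis.com with records pointing at the endpoint IP is still needed, which is part of what the linked module is meant to handle.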

Describe alternatives you've considered
Alternatively, there should at least be a callout or discussion of this in the README.

Additional context
Just trying to accelerate Databricks adoption on GCP.
