[FSTORE-2036] PR 4a/4 - Python SDK support for Unity Catalog OAuth M2M #966
Merged
jimdowling merged 5 commits on May 26, 2026
Pull request overview
Extends the Python SDK’s UnityCatalogConnector model to support Unity Catalog OAuth M2M by round-tripping new OAuth-related fields and adding “secret present” booleans that reflect backend write-only secrets, while keeping legacy PAT-only construction working unchanged.
Changes:
- Add OAuth M2M fields (`auth_method`, `client_id`, `client_secret`, `oauth_endpoint`, `account_id`, `account_host`) and write-only-friendly flags (`has_access_token`, `has_client_secret`) to `UnityCatalogConnector`.
- Update UC connector fixtures to reflect backend write-only secret behavior and add OAuth workspace/account fixtures.
- Expand `TestUnityCatalogConnector` to cover PAT back-compat defaults and OAuth endpoint defaults + parsing.
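The defaulting rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `python/hsfs/storage_connector.py`: the class name `UnityCatalogConnectorSketch` is hypothetical, only a subset of the fields is shown, and treating `None` as "flag not sent by the server" is an assumption.

```python
from typing import Optional


class UnityCatalogConnectorSketch:
    """Hypothetical stand-in for UnityCatalogConnector (not the hsfs class)."""

    def __init__(
        self,
        access_token: Optional[str] = None,
        auth_method: Optional[str] = None,
        client_id: Optional[str] = None,
        client_secret: Optional[str] = None,
        oauth_endpoint: Optional[str] = None,
        has_access_token: Optional[bool] = None,
        has_client_secret: Optional[bool] = None,
    ) -> None:
        # A missing auth_method means a legacy PAT connector.
        self.auth_method = auth_method or "PAT"
        self.access_token = access_token
        self.client_id = client_id
        self.client_secret = client_secret

        # OAuth M2M without an explicit endpoint defaults to WORKSPACE.
        if self.auth_method == "OAUTH_M2M" and oauth_endpoint is None:
            oauth_endpoint = "WORKSPACE"
        self.oauth_endpoint = oauth_endpoint

        # Prefer the server-sent flag; otherwise infer it from the local secret.
        # (Assumption: None means the server did not send the flag.)
        self.has_access_token = (
            has_access_token
            if has_access_token is not None
            else access_token is not None
        )
        self.has_client_secret = (
            has_client_secret
            if has_client_secret is not None
            else client_secret is not None
        )


# Legacy construction: only a PAT is supplied.
legacy = UnityCatalogConnectorSketch(access_token="dapi-example")

# OAuth construction without an explicit oauth_endpoint.
oauth = UnityCatalogConnectorSketch(
    auth_method="OAUTH_M2M", client_id="abc", client_secret="example-secret"
)

# Connector as read back from the server: flag present, secret absent.
from_server = UnityCatalogConnectorSketch(has_access_token=True)
```

In this sketch `legacy` ends up as a PAT connector with no `oauth_endpoint`, `oauth` picks up the `WORKSPACE` default, and `from_server` reports a token on file without holding one.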
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/hsfs/storage_connector.py | Adds UC OAuth M2M fields + secret-presence booleans and exposes them via public properties. |
| python/tests/test_storage_connector.py | Updates UC parsing assertions (write-only token) and adds OAuth + defaulting tests. |
| python/tests/fixtures/storage_connector_fixtures.json | Updates UC GET fixture to new wire format and adds OAuth workspace/account fixtures. |
bubriks reviewed May 25, 2026
…ark reads

https://hopsworks.atlassian.net/browse/FSTORE-2036

SDK half of FSTORE-2036. Extends UnityCatalogConnector to round-trip the new OAuth fields the backend (PR 1 / PR 2) and frontend (PR 3) added, and rewires the PySpark read path so the SDK calls Databricks directly for vended S3 temp-credentials. Legacy PAT-only construction keeps working unchanged.

OAuth fields. The constructor gains auth_method, client_id, client_secret, oauth_endpoint, account_id, account_host, has_access_token, and has_client_secret. auth_method defaults to "PAT" when absent so existing code paths and fixtures that construct connectors with just access_token keep producing PAT connectors. When the caller asks for OAUTH_M2M without specifying oauth_endpoint, it defaults to "WORKSPACE", matching the frontend default.

has_access_token and has_client_secret are write-only-friendly booleans: the server emits them on read so a caller can tell whether a secret is on file without ever seeing it. When constructed locally with a secret in hand, has_* falls back to "is the secret non-None" so client code that builds a connector in-process still reports the correct state.

from_response_json keeps using humps.decamelize + **kwargs splat; the new fields are picked up by name.

The existing get_unity_catalog fixture is updated to match the post-PR-1 backend wire format (hasAccessToken: true on read; no decrypted access_token in the response). Two new fixtures (get_unity_catalog_oauth_workspace, get_unity_catalog_oauth_account) cover the OAuth modes.

PySpark read path. Mirrors the existing Python / Arrow-Flight architecture where the backend provides the bearer and flyingduck owns the data plane. The backend exposes a single new endpoint GET /storageconnectors/{name}/uc_bearer that returns {access_token, expires_in_seconds} with Cache-Control: no-store; the SDK takes that bearer, calls Databricks for the table metadata and vended temp-table-credentials, validates Delta + AWS, builds per-bucket S3A keys, and runs the Delta read locally.

UnityCatalogConnector.read auto-detects Databricks-hosted Spark (cluster usage tags or DATABRICKS_RUNTIME_VERSION) and routes to spark.read.table() in that case; force_vended=True overrides the detection when the cluster identity lacks the SP's grants.

_assert_delta_extension_loaded surfaces an actionable error when the SparkSession was built without the Delta extension, with a copy-paste fix block.

Reviewed-by: GitHub Copilot <Copilot@users.noreply.github.com>
Signed-off-by: Jim Dowling <jim@logicalclocks.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
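Two steps of the PySpark read path described in this commit message, the routing decision and the per-bucket S3A keys, might look roughly like the sketch below. The helper names `should_use_native_read` and `s3a_bucket_options` are hypothetical and are not taken from the SDK. The sketch checks only the `DATABRICKS_RUNTIME_VERSION` environment variable (the cluster-usage-tag check is omitted), and it assumes the vended credentials are an AWS access key, secret key, and session token. The `fs.s3a.bucket.<bucket>.*` keys are the standard Hadoop S3A per-bucket option names; how the SDK actually applies them to the SparkSession is not shown in the source.

```python
import os
from typing import Dict, Mapping, Optional


def should_use_native_read(
    env: Optional[Mapping[str, str]] = None, force_vended: bool = False
) -> bool:
    """Return True when the read should be routed to spark.read.table().

    Hypothetical helper: only the environment-variable signal is checked.
    """
    env = os.environ if env is None else env
    if force_vended:
        # The caller explicitly asked for the vended-credentials path.
        return False
    return bool(env.get("DATABRICKS_RUNTIME_VERSION"))


def s3a_bucket_options(
    bucket: str, access_key_id: str, secret_access_key: str, session_token: str
) -> Dict[str, str]:
    """Build Hadoop S3A per-bucket options for temporary AWS credentials."""
    prefix = f"fs.s3a.bucket.{bucket}."
    return {
        # Temporary (session) credentials need the matching S3A provider.
        prefix + "aws.credentials.provider": (
            "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"
        ),
        prefix + "access.key": access_key_id,
        prefix + "secret.key": secret_access_key,
        prefix + "session.token": session_token,
    }


# Illustrative values only; real credentials would come from Databricks.
options = s3a_bucket_options("example-bucket", "AKIAEXAMPLE", "secret", "token")
```

Scoping the keys per bucket keeps the temporary credentials for one table from leaking into reads against other buckets in the same session.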
bubriks approved these changes May 26, 2026
…tore-2036-uc-oauth-m2m-sdk
https://hopsworks.atlassian.net/browse/FSTORE-2036

Post-merge formatting fixup for the CI ruff format check.

Signed-off-by: Jim Dowling <jim@logicalclocks.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…hopsworks-api into fstore-2036-uc-oauth-m2m-sdk
Summary
PR 4 of 4 for FSTORE-2099, hopsworks-api half. Extends `UnityCatalogConnector` so the Python SDK round-trips the new OAuth fields the backend (#3032 / #3033) and frontend (logicalclocks/hopsworks-front#1919) added. Legacy PAT-only construction keeps working unchanged.

Companion loadtest PR: logicalclocks/loadtest (link in this thread once the loadtest PR is opened).

Spec: `uc-oauth2/uc-oauth2.md` in the per-feature workspace.
Ticket: https://hopsworks.atlassian.net/browse/FSTORE-2099

What changes
- The constructor gains `auth_method`, `client_id`, `client_secret`, `oauth_endpoint`, `account_id`, `account_host`, `has_access_token`, `has_client_secret`.
- `auth_method` defaults to `"PAT"` when absent so existing code paths (and downstream fixtures) keep producing PAT connectors.
- `OAUTH_M2M` without an explicit `oauth_endpoint` defaults to `"WORKSPACE"`, matching the frontend default.
- `has_access_token` and `has_client_secret` come from the server (`hasAccessToken` / `hasClientSecret` in camelCase). When a caller builds a connector locally with a secret in hand, the booleans fall back to "is the secret non-None" so client code that constructs in-process still reports correct state.
- `from_response_json` is unchanged: it already uses `humps.decamelize` + `**kwargs` splat, which picks up the new fields by name once they're declared on the constructor.
- `get_unity_catalog` fixture updated to match the post-PR-1 backend wire format (`hasAccessToken: true`; no decrypted `access_token` in GET responses). Two new fixtures (`get_unity_catalog_oauth_workspace`, `get_unity_catalog_oauth_account`) cover the OAuth modes.
- `TestUnityCatalogConnector`: round-trip for both OAuth modes; legacy construction defaulting to PAT; OAuth construction defaulting `oauth_endpoint` to `WORKSPACE`.

Test plan
- `uv run pytest python/tests/test_storage_connector.py::TestUnityCatalogConnector`: 8/8 passing.
- `uv run ruff check`: clean.
- `uv run docsig python/hsfs/storage_connector.py`: clean.

🤖 Generated with Claude Code
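For readers unfamiliar with the decamelize-and-splat pattern that `from_response_json` relies on, here is a minimal sketch of how new camelCase wire fields reach constructor keyword arguments by name. It is an illustration under stated assumptions: `decamelize_keys` is a stdlib stand-in for `humps.decamelize` that handles only flat dicts, `ConnectorSketch` is a hypothetical class, and the payload is made up for the example rather than copied from the `get_unity_catalog_oauth_workspace` fixture.

```python
import re
from typing import Any, Dict, Optional


def decamelize_keys(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Convert camelCase keys to snake_case (flat dicts only).

    Stand-in for humps.decamelize so the sketch has no third-party imports.
    """
    return {
        re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower(): value
        for key, value in payload.items()
    }


class ConnectorSketch:
    """Hypothetical connector showing how declared kwargs pick up wire fields."""

    def __init__(
        self,
        name: Optional[str] = None,
        auth_method: Optional[str] = None,
        oauth_endpoint: Optional[str] = None,
        client_id: Optional[str] = None,
        client_secret: Optional[str] = None,
        has_client_secret: Optional[bool] = None,
        **kwargs: Any,
    ) -> None:
        self.name = name
        self.auth_method = auth_method
        self.oauth_endpoint = oauth_endpoint
        self.client_id = client_id
        self.client_secret = client_secret
        self.has_client_secret = has_client_secret

    @classmethod
    def from_response_json(cls, json_dict: Dict[str, Any]) -> "ConnectorSketch":
        # Each snake_case key is matched to a constructor kwarg by name;
        # undeclared keys are absorbed by **kwargs.
        return cls(**decamelize_keys(json_dict))


# Illustrative GET response: the secret is write-only, so only the flag appears.
wire = {
    "name": "uc_example",
    "authMethod": "OAUTH_M2M",
    "oauthEndpoint": "WORKSPACE",
    "clientId": "abc",
    "hasClientSecret": True,
}
connector = ConnectorSketch.from_response_json(wire)
```

Because the mapping is by keyword name, declaring a new field on the constructor is enough for it to round-trip; the parsing method itself needs no change.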