Fall back to remote runtime on Spark Connect when the legacy namespace is unavailable #1469
Open
Divyansh-db wants to merge 5 commits into
…e is unavailable

On a Databricks shared-access-mode (Spark Connect) cluster, importing databricks.sdk.runtime (which happens when WorkspaceClient.__init__ eagerly builds dbutils) materializes a legacy SparkContext via UserNamespaceInitializer.get_namespace_globals() and raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT (a PySparkRuntimeError, not ImportError). The surrounding 'except ImportError' does not catch it, so the error escapes the import and crashes WorkspaceClient construction.

Treat a namespace-materialization failure the same as 'not in a classic runtime': log a warning and fall back to the existing OSS/remote implementation, which is Spark Connect-compatible (DatabricksSession + RemoteDbUtils).

Fixes #1463

Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com>
Add a sentence to the inline comment explaining why the catch is broad rather than typed on PySparkRuntimeError — avoids importing pyspark at SDK import time and keeps unexpected runtime-namespace errors surfaced as a warning + safe fallback instead of a constructor crash.
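The fallback described in the two commit messages above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual module code: `load_runtime_namespace`, `_FakeInitializer`, and `simulate` are hypothetical names, the fake `dbruntime` module is injected only for the demo, and a plain `RuntimeError` stands in for `PySparkRuntimeError`.

```python
import logging
import sys
import types

logger = logging.getLogger("runtime_fallback_sketch")


def load_runtime_namespace():
    """Return (namespace_globals, use_runtime_namespace)."""
    try:
        # Only importable inside a Databricks runtime.
        from dbruntime import UserNamespaceInitializer

        namespace = UserNamespaceInitializer.getOrCreate().get_namespace_globals()
        return namespace, True
    except ImportError:
        # Existing branch: not in a classic runtime.
        return {}, False
    except Exception as err:
        # New branch: deliberately broad, so pyspark does not have to be
        # imported just to name PySparkRuntimeError.
        logger.warning("Runtime namespace unavailable, using remote fallback: %s", err)
        return {}, False


class _FakeInitializer:
    """Hypothetical stand-in for dbruntime.UserNamespaceInitializer."""

    def __init__(self, fail):
        self._fail = fail

    def get_namespace_globals(self):
        if self._fail:
            # Stand-in for the PySparkRuntimeError raised on Spark Connect.
            raise RuntimeError("CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT")
        return {"spark": "classic-session"}


def simulate(fail):
    """Run load_runtime_namespace() against a fake dbruntime module."""
    fake = types.ModuleType("dbruntime")
    fake.UserNamespaceInitializer = types.SimpleNamespace(
        getOrCreate=lambda: _FakeInitializer(fail)
    )
    saved = sys.modules.get("dbruntime")
    sys.modules["dbruntime"] = fake
    try:
        return load_runtime_namespace()
    finally:
        # Restore whatever was there before the simulation.
        if saved is None:
            sys.modules.pop("dbruntime", None)
        else:
            sys.modules["dbruntime"] = saved


classic = simulate(fail=False)
spark_connect = simulate(fail=True)
```

In this sketch the caller would run the remote/OSS setup whenever the second tuple element is `False`, which mirrors the role the commit message gives to the fallback path.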
3d8e600 to 61cc935
Use monkeypatch.setitem for the sys.modules injection (auto-teardown instead of manual save/restore), move the runtime reload into the fixture so test bodies stay focused on the assertion, inline the fake initializer, and strengthen the first assertion to isinstance(RemoteDbUtils) so it explicitly proves the Spark Connect fallback path was taken rather than just that some dbutils exists.
…tests

test_notebook_oauth.py caches a fake ``databricks.sdk.runtime`` directly in ``sys.modules`` without going through the import machinery, which leaves the ``runtime`` attribute on ``databricks.sdk`` unset. The previous fixture's ``import databricks.sdk.runtime`` then hit the cached fake (skipping the loader), and the follow-up ``importlib.reload(databricks.sdk.runtime)`` died with AttributeError when CI happened to run test_notebook_oauth.py first.

Drop the eager import + reload from the fixture; just delitem the stale ``sys.modules`` entry via monkeypatch so the next ``import`` in the test body triggers a fresh load (which correctly sets both ``sys.modules`` and the parent attribute).

Verified locally that the suite passes both in isolation and when ordered after test_notebook_oauth.py.
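The import-cache behaviour this commit relies on can be reproduced with the standard library alone. The sketch below uses a throwaway on-disk package with the hypothetical name `stale_demo_pkg` and plain `del sys.modules[...]` in place of pytest's `monkeypatch.delitem`; it is an illustration of the mechanism, not the fixture from the PR.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Throwaway on-disk package; "stale_demo_pkg" is a hypothetical name.
root = Path(tempfile.mkdtemp())
(root / "stale_demo_pkg").mkdir()
(root / "stale_demo_pkg" / "__init__.py").write_text("")
(root / "stale_demo_pkg" / "runtime.py").write_text("REAL = True\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

pkg = importlib.import_module("stale_demo_pkg")

# Cache a fake submodule directly, bypassing the import machinery.
sys.modules["stale_demo_pkg.runtime"] = types.ModuleType("stale_demo_pkg.runtime")

# This import is served from the cache, so the loader never runs and the
# parent package never receives a "runtime" attribute.
cached = importlib.import_module("stale_demo_pkg.runtime")
attr_after_cached_import = hasattr(pkg, "runtime")

# Dropping the stale entry makes the next import go through the loader,
# which populates sys.modules and sets the attribute on the parent.
del sys.modules["stale_demo_pkg.runtime"]
fresh = importlib.import_module("stale_demo_pkg.runtime")
attr_after_fresh_import = hasattr(pkg, "runtime")
```

Inside a pytest fixture, `monkeypatch.delitem(sys.modules, name, raising=False)` performs the same removal and restores the previous entry at teardown.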
Divyansh-db added a commit that referenced this pull request Jun 9, 2026
Regenerated ``databricks/sdk/__init__.py`` with the updated template (imports ``functools.cached_property``, drops the eager ``self._dbutils = _make_dbutils(self._config)`` from ``__init__``, emits ``dbutils`` as a ``@cached_property`` that calls ``_make_dbutils`` on first access).

Adds four ``tests/test_client.py`` tests that lock in the contract:
- ``dbutils`` is a ``functools.cached_property`` descriptor on ``WorkspaceClient``.
- ``WorkspaceClient.__init__`` does not invoke ``_make_dbutils``.
- The first ``ws.dbutils`` read invokes ``_make_dbutils`` once; subsequent reads return the cached value without re-invoking.
- Constructing ``WorkspaceClient`` on a faked Spark Connect runtime (whose ``dbruntime`` raises ``CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`` on any namespace materialization) succeeds without importing ``databricks.sdk.runtime`` at all (the durable sidestep of databricks/dbt-databricks#1252).

Complements #1469 (which catches the same failure at runtime-module import time as a defense-in-depth fallback).
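The lazy-`dbutils` contract listed above can be illustrated with a small stand-alone class. This is a sketch under stated assumptions: `LazyClient` is a hypothetical class rather than the generated `WorkspaceClient`, and `_make_dbutils` here is a stand-in that only records its calls.

```python
from functools import cached_property

build_calls = []


def _make_dbutils(config):
    """Hypothetical stand-in for the SDK helper; records each invocation."""
    build_calls.append(config)
    return {"built_from": config}


class LazyClient:
    """Illustrative class, not the generated WorkspaceClient."""

    def __init__(self, config):
        self._config = config  # no eager _make_dbutils call here

    @cached_property
    def dbutils(self):
        # Runs on first access only; the result is then stored on the instance.
        return _make_dbutils(self._config)


client = LazyClient(config="demo-config")
calls_after_init = len(build_calls)

first = client.dbutils
second = client.dbutils
calls_after_reads = len(build_calls)
```

Because construction never calls the builder, a consumer that never reads `dbutils` never triggers whatever import the builder performs, which is the sidestep the commit message describes.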
hectorcast-db approved these changes Jun 10, 2026
Summary
Importing `databricks.sdk.runtime` on a Spark Connect runtime (e.g. shared-access-mode clusters) no longer raises `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT` at import time. When the legacy user namespace cannot be materialized, the import now logs a warning and falls back to the existing Spark Connect-compatible remote implementation, so `WorkspaceClient()` construction succeeds on such clusters.

Fixes #1463. Carries forward @sd-db's work from #1464 (closed because fork PRs in this repo cannot run tests).
Why
`WorkspaceClient.__init__` eagerly builds `dbutils` via `_make_dbutils`, which on a cluster does `from databricks.sdk.runtime import dbutils`. That import calls `UserNamespaceInitializer.getOrCreate().get_namespace_globals()`, materializing a legacy `SparkContext`. On a Spark Connect cluster this raises `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT` (a `pyspark.errors.PySparkRuntimeError`, not an `ImportError`), so the existing `except ImportError:` does not catch it and the error escapes the import, crashing `WorkspaceClient` construction before any API call. This is what databricks/dbt-databricks#1252 hits in Python models on shared clusters.

The existing `except ImportError` branch is already the Spark Connect-compatible path (it builds `spark` via `DatabricksSession` and `dbutils` via `RemoteDbUtils`), so this PR routes the materialization failure there.

A complementary follow-up, making `WorkspaceClient.dbutils` lazy via a `cached_property` so consumers that never touch it skip the build entirely, is noted in #1463 as a separate discussion since it touches generated code. Related issue #986 (off-cluster eager `RemoteDbUtils` auth failure) is the symmetric case and is intentionally not addressed here; the lazy-dbutils follow-up would unify both.
What changed
Behavioral changes
On a Spark Connect runtime, importing `databricks.sdk.runtime` now logs a `WARNING` and uses the remote implementation instead of raising at import time. When `dbruntime` is absent (off-cluster) or the namespace materializes successfully (classic runtime), behavior is unchanged.
Internal changes
`databricks/sdk/runtime/__init__.py`: the runtime-namespace block is restructured into a single `try` with sibling `except ImportError` (existing: "not in a classic runtime") and `except Exception` (new: Spark Connect / `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`, logged) clauses, plus an `if not _use_runtime_namespace:` guard over the existing, unchanged OSS/remote block. The catch is intentionally broad rather than typed on `PySparkRuntimeError` to avoid pulling `pyspark` in at SDK import time just to narrow the exception type; the inline comment notes this.
How is this tested?
New `tests/test_runtime.py` simulates a Spark Connect runtime by injecting a fake `dbruntime` whose `get_namespace_globals()` raises `CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT`, and asserts that:
- `databricks.sdk.runtime` survives the failure and falls back (`is_local_implementation is True`, `dbutils is not None`)
- `WorkspaceClient(config=…)` constructs without raising, which is the direct reproduction of the reported failure

Verified red→green locally. Full unit test suite (2098 tests) passes with no regressions.