Fall back to remote runtime on Spark Connect when the legacy namespace is unavailable #1469

Open
Divyansh-db wants to merge 5 commits into main from fix/runtime-spark-connect-import
@Divyansh-db Divyansh-db commented Jun 9, 2026

Contributor

Summary

Importing databricks.sdk.runtime on a Spark Connect runtime (e.g. shared-access-mode clusters) no longer raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT at import time. When the legacy user namespace cannot be materialized, the import now logs a warning and falls back to the existing Spark Connect-compatible remote implementation, so WorkspaceClient() construction succeeds on such clusters.

Fixes #1463. Carries forward @sd-db's work from #1464 (closed because fork PRs in this repo cannot run tests).

Why

WorkspaceClient.__init__ eagerly builds dbutils via _make_dbutils, which on a cluster executes the statement from databricks.sdk.runtime import dbutils. That import calls UserNamespaceInitializer.getOrCreate().get_namespace_globals(), which materializes a legacy SparkContext. On a Spark Connect cluster this raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT, which is a pyspark.errors.PySparkRuntimeError and not an ImportError, so the existing except ImportError: clause does not catch it. The error escapes the import and crashes WorkspaceClient construction before any API call. This is what databricks/dbt-databricks#1252 hits in Python models on shared clusters.

The existing except ImportError branch is already the Spark Connect-compatible path (it builds spark via DatabricksSession and dbutils via RemoteDbUtils), so this PR routes the materialization failure there.

A complementary follow-up — making WorkspaceClient.dbutils lazy via a cached_property so consumers that never touch it skip the build entirely — is noted in #1463 as a separate discussion since it touches generated code. Related issue #986 (off-cluster eager RemoteDbUtils auth failure) is the symmetric case and is intentionally not addressed here; the lazy-dbutils follow-up would unify both.

What changed

Behavioral changes

On a Spark Connect runtime, importing databricks.sdk.runtime now logs a WARNING and uses the remote implementation instead of raising at import time. When dbruntime is absent (off-cluster) or the namespace materializes successfully (classic runtime), behavior is unchanged.

Internal changes

databricks/sdk/runtime/__init__.py: the runtime-namespace block is restructured into a single try with sibling except ImportError (existing — "not in a classic runtime") and except Exception (new — Spark Connect / CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT, logged) clauses, plus an if not _use_runtime_namespace: guard over the existing — unchanged — OSS/remote block. The catch is intentionally broad rather than typed on PySparkRuntimeError to avoid pulling pyspark in at SDK import time just to narrow the exception type; the inline comment notes this.
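
For readers who want the control flow at a glance, here is a minimal sketch of the restructured block. It is not the SDK source: `resolve_namespace`, the `materialize` callable, and `ContextUnavailableError` are hypothetical stand-ins, and the fallback dict is a placeholder for the existing OSS/remote block.

```python
import logging

logger = logging.getLogger("runtime_sketch")


class ContextUnavailableError(RuntimeError):
    """Hypothetical stand-in for the PySparkRuntimeError seen on Spark Connect."""


def resolve_namespace(materialize):
    """Return (namespace, used_runtime_namespace).

    `materialize` is a hypothetical callable standing in for
    UserNamespaceInitializer.getOrCreate().get_namespace_globals().
    """
    use_runtime_namespace = False
    namespace = {}
    try:
        namespace = materialize()
        use_runtime_namespace = True
    except ImportError:
        # Existing branch: not in a classic runtime (dbruntime is absent).
        pass
    except Exception as exc:
        # New branch, broad on purpose: narrowing the type would require
        # importing pyspark at SDK import time.
        logger.warning("Runtime namespace unavailable, using remote fallback: %s", exc)

    if not use_runtime_namespace:
        # Placeholder for the unchanged OSS/remote block.
        namespace = {"dbutils": "remote-dbutils", "is_local_implementation": True}
    return namespace, use_runtime_namespace


def classic_runtime():
    return {"dbutils": "runtime-dbutils"}


def off_cluster():
    raise ImportError("No module named 'dbruntime'")


def spark_connect():
    raise ContextUnavailableError("[CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT] simulated")


classic = resolve_namespace(classic_runtime)
local = resolve_namespace(off_cluster)
connect = resolve_namespace(spark_connect)
```

In this sketch only the Spark Connect case logs a warning; the off-cluster and classic cases behave as they did before the change.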

How is this tested?

New tests/test_runtime.py simulates a Spark Connect runtime by injecting a fake dbruntime whose get_namespace_globals() raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT, and asserts that:

  • reloading databricks.sdk.runtime survives the failure and falls back (is_local_implementation is True, dbutils is not None)
  • WorkspaceClient(config=…) constructs without raising — the direct reproduction of the reported failure
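
The injection technique can be sketched in isolation as follows. This is not the contents of tests/test_runtime.py: `load_dbutils` is a hypothetical, simplified stand-in for the import-time logic, and `unittest.mock.patch.dict` is used here as a stdlib analogue of pytest's `monkeypatch.setitem` so the example runs without pytest or the SDK installed.

```python
import sys
import types
from unittest import mock


def make_fake_dbruntime():
    """Build a fake `dbruntime` module whose namespace materialization fails."""

    class FakeInitializer:
        @staticmethod
        def getOrCreate():
            return FakeInitializer()

        def get_namespace_globals(self):
            # Simulated failure; the real error is a PySparkRuntimeError.
            raise RuntimeError("[CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT] simulated")

    module = types.ModuleType("dbruntime")
    module.UserNamespaceInitializer = FakeInitializer
    return module


def load_dbutils():
    """Hypothetical stand-in for the logic under test."""
    try:
        from dbruntime import UserNamespaceInitializer

        return UserNamespaceInitializer.getOrCreate().get_namespace_globals()["dbutils"]
    except Exception:
        # Fallback path; the real code builds RemoteDbUtils here.
        return "remote-dbutils"


# patch.dict undoes the sys.modules change when the block exits.
with mock.patch.dict(sys.modules, {"dbruntime": make_fake_dbruntime()}):
    result = load_dbutils()
```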

Verified red→green locally. Full unit test suite (2098 tests) passes with no regressions.

sd-db and others added 2 commits June 8, 2026 14:08
…e is unavailable

On a Databricks shared-access-mode (Spark Connect) cluster, importing databricks.sdk.runtime (which happens when WorkspaceClient.__init__ eagerly builds dbutils) materializes a legacy SparkContext via UserNamespaceInitializer.get_namespace_globals() and raises CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT (a PySparkRuntimeError, not ImportError). The surrounding 'except ImportError' does not catch it, so the error escapes the import and crashes WorkspaceClient construction.

Treat a namespace-materialization failure the same as 'not in a classic runtime': log a warning and fall back to the existing OSS/remote implementation, which is Spark Connect-compatible (DatabricksSession + RemoteDbUtils).

Fixes #1463

Signed-off-by: Shubham Dhal <shubham.dhal@databricks.com>
Add a sentence to the inline comment explaining why the catch is broad
rather than typed on PySparkRuntimeError — avoids importing pyspark at
SDK import time and keeps unexpected runtime-namespace errors surfaced
as a warning + safe fallback instead of a constructor crash.
@Divyansh-db Divyansh-db force-pushed the fix/runtime-spark-connect-import branch from 3d8e600 to 61cc935 Compare June 9, 2026 15:14
@Divyansh-db Divyansh-db temporarily deployed to test-trigger-is June 9, 2026 15:14 — with GitHub Actions Inactive
@Divyansh-db Divyansh-db temporarily deployed to test-trigger-is June 9, 2026 15:15 — with GitHub Actions Inactive
Use monkeypatch.setitem for the sys.modules injection (auto-teardown
instead of manual save/restore), move the runtime reload into the
fixture so test bodies stay focused on the assertion, inline the fake
initializer, and strengthen the first assertion to isinstance(
RemoteDbUtils) so it explicitly proves the Spark Connect fallback path
was taken rather than just that some dbutils exists.
…tests

test_notebook_oauth.py caches a fake ``databricks.sdk.runtime`` directly
in ``sys.modules`` without going through the import machinery, which
leaves the ``runtime`` attribute on ``databricks.sdk`` unset. The
previous fixture's ``import databricks.sdk.runtime`` then hit the cached
fake (skipping the loader), and the follow-up ``importlib.reload(
databricks.sdk.runtime)`` died with AttributeError when CI happened to
run test_notebook_oauth.py first.

Drop the eager import + reload from the fixture; just delitem the stale
``sys.modules`` entry via monkeypatch so the next ``import`` in the test
body triggers a fresh load (which correctly sets both ``sys.modules``
and the parent attribute). Verified locally that the suite passes both
in isolation and when ordered after test_notebook_oauth.py.
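
The failure mode and the fix can be reproduced outside the SDK. The sketch below uses a throwaway package named `fake_pkg_sketch` (a hypothetical name) created in a temporary directory. It shows that an import satisfied from the `sys.modules` cache does not set the attribute on the parent package, while a fresh import after deleting the stale entry does.

```python
import importlib
import os
import shutil
import sys
import tempfile
import types

# Create fake_pkg_sketch/__init__.py and fake_pkg_sketch/child.py on disk.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "fake_pkg_sketch")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "child.py"), "w") as f:
    f.write("REAL = True\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

import fake_pkg_sketch

# Cache a stub directly in sys.modules, bypassing the import machinery.
sys.modules["fake_pkg_sketch.child"] = types.ModuleType("fake_pkg_sketch.child")

import fake_pkg_sketch.child  # cache hit: the loader is skipped

# The parent attribute was never set, so `fake_pkg_sketch.child` would
# raise AttributeError, as would importlib.reload(fake_pkg_sketch.child).
attr_set_after_cached_import = hasattr(fake_pkg_sketch, "child")

# Fix: drop the stale entry so the next import performs a fresh load.
del sys.modules["fake_pkg_sketch.child"]
import fake_pkg_sketch.child

attr_set_after_fresh_import = hasattr(fake_pkg_sketch, "child")
loaded_real_module = getattr(fake_pkg_sketch.child, "REAL", False)

# Clean up the temporary package.
sys.path.remove(root)
shutil.rmtree(root)
```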
@Divyansh-db Divyansh-db temporarily deployed to test-trigger-is June 9, 2026 16:45 — with GitHub Actions Inactive
@github-actions

github-actions Bot commented Jun 9, 2026


If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

  • PR number: 1469
  • Commit SHA: 655a577e8fdc5d5b298c0fedc66d95de38761bb7

Checks will be approved automatically on success.

@Divyansh-db Divyansh-db temporarily deployed to test-trigger-is June 9, 2026 16:46 — with GitHub Actions Inactive
@Divyansh-db Divyansh-db requested a review from hectorcast-db June 9, 2026 18:17
Divyansh-db added a commit that referenced this pull request Jun 9, 2026
Regenerated ``databricks/sdk/__init__.py`` with the updated template
(imports ``functools.cached_property``, drops the eager
``self._dbutils = _make_dbutils(self._config)`` from ``__init__``,
emits ``dbutils`` as a ``@cached_property`` that calls
``_make_dbutils`` on first access).

Adds four ``tests/test_client.py`` tests that lock in the contract:

- ``dbutils`` is a ``functools.cached_property`` descriptor on
  ``WorkspaceClient``.
- ``WorkspaceClient.__init__`` does not invoke ``_make_dbutils``.
- The first ``ws.dbutils`` read invokes ``_make_dbutils`` once;
  subsequent reads return the cached value without re-invoking.
- Constructing ``WorkspaceClient`` on a faked Spark Connect runtime
  (whose ``dbruntime`` raises ``CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT``
  on any namespace materialization) succeeds without importing
  ``databricks.sdk.runtime`` at all — the durable sidestep of
  databricks/dbt-databricks#1252.

Complements #1469 (which catches the same failure at runtime-module
import time as a defense-in-depth fallback).
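
The lazy-`dbutils` contract described in this commit can be illustrated with a pared-down class. `LazyClient` and `make_dbutils` are hypothetical names for this sketch; the generated `WorkspaceClient` and the real `_make_dbutils` are not reproduced here.

```python
from functools import cached_property

calls = []


def make_dbutils(config):
    """Hypothetical stand-in for _make_dbutils; records each invocation."""
    calls.append(config)
    return object()


class LazyClient:
    """Hypothetical, pared-down client; not the generated WorkspaceClient."""

    def __init__(self, config):
        # No dbutils build here: construction stays cheap and cannot fail
        # because of a runtime-namespace problem.
        self._config = config

    @cached_property
    def dbutils(self):
        # Runs on first access only; the result is cached on the instance.
        return make_dbutils(self._config)


client = LazyClient(config="cfg")
calls_after_init = len(calls)
first = client.dbutils
second = client.dbutils
```

Consumers that never read `client.dbutils` never trigger the build, which is why this approach sidesteps the import-time failure entirely.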
Development

Successfully merging this pull request may close these issues.

WorkspaceClient construction fails on Spark Connect clusters: CONTEXT_UNAVAILABLE_FOR_REMOTE_CLIENT when importing databricks.sdk.runtime

3 participants