Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 19 additions & 0 deletions .github/ISSUE_TEMPLATE/bug-report.yml
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,25 @@ description: Report a bug or an issue you've found
title: "[Bug] <title>"
labels: ["type:bug", "triage:product"]
body:
- type: markdown
attributes:
value: |
## Issues and bug bashing

> ⚠️ Here's how you can get your issue triaged and fixed ASAP

In the age of AI, we should be innovating and shipping daily.
So in this adapter, we try to fix bugs and ship features fast - and keep a high bar of test coverage.

Once you open an issue, please - try to be **as descriptive as possible** to give our human/AI maintainers the necessary details to reproduce the issue.

For example - if you can - create a dummy repro dbt project - so the maintainer can reproduce your problem ASAP - see an example a well-written GitHub issue [here](https://github.com/microsoft/dbt-fabricspark/issues/172#issuecomment-4426907640).

Once we have a fix identified, to gain more confidence, we might ask you install the adapter right from a PR branch to ensure the repro is gone, like so:

```bash
pip install git+https://github.com/microsoft/dbt-fabricspark.git@dev/somebranch/172
```
- type: markdown
attributes:
value: Thanks for taking the time to fill out this bug report!
Expand Down
20 changes: 20 additions & 0 deletions .github/ISSUE_TEMPLATE/regression-report.yml
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,26 @@ description: Report a regression you've observed in a newer version
title: "[Regression] <title>"
labels: ["type:regression", "triage:product"]
body:
- type: markdown
attributes:
value: |
## Issues, bug-bashing, help us help you

> ⚠️ Here's how you can get your issue triaged and fixed ASAP

In the age of AI, we should be innovating and shipping **high-quality** software daily.

So in this adapter, we try to fix bugs and ship features fast - and keep an extremely high bar for test coverage in CI before PRs merge to `main`.

Once you open an issue, please - try to be **as descriptive as possible** to give our human/AI maintainers the necessary details to reproduce the issue rapidly.

For example - if you can - create a dummy repro dbt project in your GitHub account - so the maintainers can reproduce your problem ASAP - see an example a well-written GitHub issue [here](https://github.com/microsoft/dbt-fabricspark/issues/172#issuecomment-4426907640).

If the issue is complex - once we have a fix identified, to gain more confidence, we might ask you install the adapter right from a PR branch to ensure the repro is gone in your setup as well, like so:

```bash
pip install git+https://github.com/microsoft/dbt-fabricspark.git@dev/somebranch/123
```
- type: markdown
attributes:
value: Thanks for taking the time to fill out this regression report!
Expand Down
6 changes: 6 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,11 @@
# Changelog

## v1.12.2

### Bug Fixes

- Fixed every fresh Livy session being forced onto an on-demand Spark cluster instead of a Fabric starter pool. `session_idle_timeout` no longer defaults to `"30m"`; the adapter now omits `spark.livy.session.idle.timeout` from the session `conf` unless the user explicitly sets a value. Fabric treats that key as session-immutable, so its mere presence (even matching the pool's own default) emitted `FallbackReasons: UserSparkConfigMismatch` and added ~3 min of cold-start per cold session (vs ~40 s on a warm starter pool). Existing profiles that set `session_idle_timeout` explicitly continue to behave as before, with the same starter-pool trade-off (#184)

## v1.12.1

### Bug Fixes
Expand Down
8 changes: 4 additions & 4 deletions CONTRIBUTING.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,10 +62,10 @@ Therefore, if the tests pass locally, they are highly likely to pass in CI as we

1. Install recommended developer tooling (optional):

```bash
curl -fsSL https://gh.io/copilot-install | bash
$HOME/.local/bin/copilot --yolo
```
```bash
curl -fsSL https://gh.io/copilot-install | bash
$HOME/.local/bin/copilot --yolo
```

1. Login to github and ensure to authorize `Microsoft` if you're an employee (optional):

Expand Down
98 changes: 59 additions & 39 deletions README.md

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion src/dbt/adapters/fabricspark/__version__.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
version = "1.12.1"
version = "1.12.2"
8 changes: 7 additions & 1 deletion src/dbt/adapters/fabricspark/credentials.py
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,13 @@ class FabricSparkCredentials(Credentials):
environmentId: Optional[str] = None
session_id_file: Optional[str] = None
reuse_session: bool = False # When True, Fabric sessions are kept alive and reused across runs
session_idle_timeout: str = "30m" # Livy session idle timeout (e.g. "30m", "1h")
# Default ``None`` so the adapter does NOT send
# ``spark.livy.session.idle.timeout`` in the session ``conf``. Fabric
# treats that key as session-immutable, so its presence (even when the
# value matches the pool default) disqualifies starter-pool matching and
# forces on-demand cold start. Set explicitly (e.g. ``"30m"``) only when
# the trade-off is acceptable.
session_idle_timeout: Optional[str] = None

# High-concurrency Livy. When True (default), each dbt thread acquires
# its own REPL inside a single underlying Livy session shared via a
Expand Down
43 changes: 43 additions & 0 deletions tests/unit/test_concurrent_livy.py
Original file line number Diff line number Diff line change
Expand Up @@ -409,3 +409,46 @@ def test_execute_strips_trailing_semicolon(self):
with patch.object(HighConcurrencyCursor, "execute") as mock_exec:
wrapper.execute("SELECT 1;")
mock_exec.assert_called_once_with("SELECT 1")


# --------------------------------------------------------------------------- #
# _build_acquire_payload — session_idle_timeout injection #
# --------------------------------------------------------------------------- #


class TestBuildAcquirePayloadIdleTimeout:
"""Guard rails for the starter-pool fallback bug.

Fabric treats ``spark.livy.session.idle.timeout`` as a session-immutable
SparkConf; its mere presence in the acquire ``conf`` disqualifies
starter-pool matching. The adapter must therefore omit the key unless
the user has explicitly opted in by setting ``session_idle_timeout``.
"""

def test_default_credentials_omit_idle_timeout(self):
creds = _make_creds()
hc = HighConcurrencySession(creds, creds.spark_config)
payload = hc._build_acquire_payload()
assert "spark.livy.session.idle.timeout" not in payload.get("conf", {})

def test_empty_string_idle_timeout_omits_key(self):
creds = _make_creds(session_idle_timeout="")
hc = HighConcurrencySession(creds, creds.spark_config)
payload = hc._build_acquire_payload()
assert "spark.livy.session.idle.timeout" not in payload.get("conf", {})

def test_explicit_idle_timeout_injects_key(self):
creds = _make_creds(session_idle_timeout="45m")
hc = HighConcurrencySession(creds, creds.spark_config)
payload = hc._build_acquire_payload()
assert payload["conf"]["spark.livy.session.idle.timeout"] == "45m"

def test_environment_id_still_injects_when_idle_timeout_omitted(self):
creds = _make_creds(environmentId="11111111-2222-3333-4444-555555555555")
hc = HighConcurrencySession(creds, creds.spark_config)
payload = hc._build_acquire_payload()
assert (
payload["conf"]["spark.fabric.environment.id"]
== "11111111-2222-3333-4444-555555555555"
)
assert "spark.livy.session.idle.timeout" not in payload["conf"]
18 changes: 18 additions & 0 deletions tests/unit/test_credentials.py
Original file line number Diff line number Diff line change
Expand Up @@ -161,6 +161,24 @@ def test_credentials_type() -> None:
assert credentials.type == "fabricspark"


def test_credentials_session_idle_timeout_defaults_to_none() -> None:
"""Default ``session_idle_timeout`` must be falsy.

A truthy default would inject ``spark.livy.session.idle.timeout`` into
the Livy session ``conf``, which Fabric treats as session-immutable —
disqualifying starter-pool matching and forcing on-demand cold start.
"""
credentials = FabricSparkCredentials(
method="livy",
authentication="CLI",
lakehouse="tests",
workspaceid="1de8390c-9aca-4790-bee8-72049109c0f4",
lakehouseid="8c5bc260-bc3a-4898-9ada-01e433d461ba",
spark_config={"name": "test-session"},
)
assert not credentials.session_idle_timeout


def test_credentials_database_defaults_to_lakehouse() -> None:
"""Test that database is always derived from lakehouse name."""
credentials = FabricSparkCredentials(
Expand Down
57 changes: 55 additions & 2 deletions tests/unit/test_livysession.py
Original file line number Diff line number Diff line change
Expand Up @@ -490,9 +490,9 @@ def test_custom_session_file_path(self):
assert credentials.resolved_session_id_file == "/custom/path/my-session.txt"


def _make_fabric_credentials(reuse_session=False, session_id_file=None):
def _make_fabric_credentials(reuse_session=False, session_id_file=None, **overrides):
"""Helper to create Fabric mode credentials for tests."""
return FabricSparkCredentials(
base = dict(
method="livy",
livy_mode="fabric",
authentication="CLI",
Expand All @@ -504,6 +504,8 @@ def _make_fabric_credentials(reuse_session=False, session_id_file=None):
reuse_session=reuse_session,
session_id_file=session_id_file,
)
base.update(overrides)
return FabricSparkCredentials(**base)


class TestFabricSessionReuseMode:
Expand Down Expand Up @@ -672,6 +674,57 @@ def test_connect_fabric_routes_to_fresh_when_flag_not_set(self, mock_headers):
mock_reuse.assert_not_called()


class TestCreateFabricSessionIdleTimeout:
"""Guard rails for the starter-pool fallback bug on the singleton path.

Fabric treats ``spark.livy.session.idle.timeout`` as a session-immutable
SparkConf, so its mere presence in the POST ``/sessions`` ``conf``
disqualifies starter-pool matching and forces an on-demand cold start.
"""

def setup_method(self):
LivySessionManager.livy_global_session = None

def _capture_spark_config(self, credentials):
captured: dict = {}

def _capture(spark_config):
captured["spark_config"] = spark_config

with patch(
"dbt.adapters.fabricspark.livysession.LivySession.create_session",
side_effect=_capture,
):
LivySessionManager._create_fabric_session(credentials, {"name": "test-session"})
return captured["spark_config"]

def test_default_credentials_omit_idle_timeout(self):
credentials = _make_fabric_credentials()
spark_config = self._capture_spark_config(credentials)
assert "spark.livy.session.idle.timeout" not in spark_config.get("conf", {})

def test_empty_string_idle_timeout_omits_key(self):
credentials = _make_fabric_credentials(session_idle_timeout="")
spark_config = self._capture_spark_config(credentials)
assert "spark.livy.session.idle.timeout" not in spark_config.get("conf", {})

def test_explicit_idle_timeout_injects_key(self):
credentials = _make_fabric_credentials(session_idle_timeout="45m")
spark_config = self._capture_spark_config(credentials)
assert spark_config["conf"]["spark.livy.session.idle.timeout"] == "45m"

def test_environment_id_still_injects_when_idle_timeout_omitted(self):
credentials = _make_fabric_credentials(
environmentId="11111111-2222-3333-4444-555555555555"
)
spark_config = self._capture_spark_config(credentials)
assert (
spark_config["conf"]["spark.fabric.environment.id"]
== "11111111-2222-3333-4444-555555555555"
)
assert "spark.livy.session.idle.timeout" not in spark_config["conf"]


class TestFixBinding:
"""Tests for LivySessionConnectionWrapper._fix_binding."""

Expand Down