
fix: cross-workspace view/snapshot/MLV materializations omit workspace prefix (#172) #182

Merged
mdrakiburrahman merged 6 commits into main from dev/mdrrahman/172
May 15, 2026

@mdrakiburrahman

Collaborator

Summary

Fixes #172.

The cross-workspace 4-part naming feature (introduced in #167/#168) forwards
workspace_name from a model's config() onto target_relation for table
and incremental materializations only. View, snapshot, and
materialized_lake_view materializations built their target relation via
paths that drop the workspace, emitting 3-part DDL against a 4-part target:

create or replace view `OtherLakehouse`.`dbt`.test_model as ...
--> Database Error: Artifact not found: `MainWorkspace`.`OtherLakehouse`
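
To make the 3-part versus 4-part difference concrete, here is a minimal Python sketch of how a relation name might be rendered with and without a workspace. `quote` and `render_relation` are hypothetical helpers written for this illustration, not the adapter's actual rendering code, and identifier escaping is deliberately omitted.

```python
from typing import Optional


def quote(part: str) -> str:
    """Wrap one name part in backticks (escaping is omitted in this sketch)."""
    return f"`{part}`"


def render_relation(
    identifier: str,
    schema: str,
    database: str,
    workspace: Optional[str] = None,
) -> str:
    """Render workspace.database.schema.identifier, skipping a missing workspace."""
    qualifiers = [quote(p) for p in (workspace, database, schema) if p]
    return ".".join(qualifiers + [identifier])


# Workspace dropped: a 3-part name, like the failing DDL above.
three_part = render_relation("test_model", "dbt", "OtherLakehouse")
# Workspace forwarded: a 4-part name that targets the other workspace.
four_part = render_relation("test_model", "dbt", "OtherLakehouse", "OtherWorkspace")
```

With the workspace missing, the lakehouse name is the outermost qualifier, which is consistent with the error above looking for `OtherLakehouse` inside `MainWorkspace`.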

Manual reproduction (issue scenario)

Verified on a live Fabric tenant with two schema-enabled lakehouses (one in
WS1, one in WS2). The repro project mirrors the models in @cheyney-w's
dbt-fabricspark-cross-workspace-demo.

Before the fix:

1 of 1 START sql view model dbt Fabric Spark 2.<WS2_LH>.repro.view_in_other  [RUN]
1 of 1 ERROR creating sql view model dbt Fabric Spark 2.<WS2_LH>.repro.view_in_other  [ERROR]
  Runtime Error: Artifact not found: `dbt Fabric Spark 1`.`<WS2_LH>`

Same error for dbt snapshot against the same cross-workspace target.

After the fix:

1 of 1 OK created sql view model dbt Fabric Spark 2.<WS2_LH>.repro.view_in_other  [OK in 2.12s]
1 of 1 OK snapshotted dbt Fabric Spark 2.<WS2_LH>.repro.snap_in_other  [OK in 9.30s]

Rendered DDL is now correct:

create or replace view `dbt Fabric Spark 2`.`<WS2_LH>`.`repro`.view_in_other as ...
merge into `dbt Fabric Spark 2`.`<WS2_LH>`.`repro`.snap_in_other as DBT_INTERNAL_DEST
  using `dbt Fabric Spark 2`.`<WS2_LH>`.`repro`.snap_in_other__dbt_tmp as DBT_INTERNAL_SOURCE
  on DBT_INTERNAL_SOURCE.dbt_scd_id = DBT_INTERNAL_DEST.dbt_scd_id ...

Note that the snapshot staging view (__dbt_tmp) is also workspace-qualified
now — this is the second snapshot.sql change.

Root causes

| Materialization | How target_relation was built | Why workspace was lost |
| --- | --- | --- |
| view | Default global create_or_replace_view() | api.Relation.create(identifier, schema, database, type='view') (no workspace kwarg) |
| snapshot | get_or_create_relation(...) | Default dispatch: cache hit returns un-workspaced relation; fallback api.Relation.create(...) also drops workspace |
| snapshot staging view | spark_build_snapshot_staging_table | api.Relation.create(... type='view') copied schema+database from target but not workspace |
| materialized_lake_view | api.Relation.create(... type='materialized_view') | Same pattern as view |

The table and incremental materializations avoid the bug because they
either pass workspace=config.get('workspace_name') explicitly (table.sql) or
use this, which is built via FabricSparkRelation.create_from and auto-pulls
workspace_name.
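
The difference between the two build paths can be sketched in a few lines of Python. `DemoRelation` and both builder functions are hypothetical stand-ins invented for this illustration; they only mimic the shape of the problem (a config that carries `workspace_name` and a constructor call that either forwards it or does not) and are not the adapter's real classes.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DemoRelation:
    """Hypothetical stand-in for a relation with an optional workspace."""

    identifier: str
    schema: str
    database: str
    type: str = "table"
    workspace: Optional[str] = None


def build_dropping_workspace(config: Mapping[str, str], identifier: str) -> DemoRelation:
    # Buggy pattern: workspace_name is present in the config but never passed on.
    return DemoRelation(identifier, config["schema"], config["database"], "view")


def build_forwarding_workspace(config: Mapping[str, str], identifier: str) -> DemoRelation:
    # Fixed pattern: forward workspace_name explicitly when building the relation.
    return DemoRelation(
        identifier,
        config["schema"],
        config["database"],
        "view",
        workspace=config.get("workspace_name"),
    )


model_config = {
    "schema": "dbt",
    "database": "OtherLakehouse",
    "workspace_name": "OtherWorkspace",
}
buggy = build_dropping_workspace(model_config, "test_model")
fixed = build_forwarding_workspace(model_config, "test_model")
```

Using `config.get(...)` keeps same-workspace models working: when no `workspace_name` is configured, the field stays `None`.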

Fix

  • mv.sql — full fabricspark view materialization mirroring table.sql:
    builds target_relation with workspace=config.get('workspace_name').
    Preserves the existing fabricspark__handle_existing_table cleanup for
    stale tables.
  • snapshot.sql — re-incorporates workspace onto target_relation after
    get_or_create_relation (uses incorporate(workspace=workspace_name),
    which round-trips through FabricSparkRelation.from_dict and preserves the
    field). Staging tmp_relation also forwards target_relation.workspace
    so MERGE INTO sees a 4-part staging view.
  • materialized_lake_view.sql — adds workspace=workspace_name to the
    target_relation build. (The MLV REST API lakehouse-id resolution still
    uses the profile workspace; cross-workspace MLV via REST is a separate
    concern and is not in this PR's scope.)
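
As a rough illustration of the snapshot change, the sketch below shows an `incorporate`-style copy that round-trips through a dict and keeps the `workspace` field, followed by a staging relation that copies the workspace from the target. `RoundTripRelation` is a hypothetical class written for this example, and `WS2_LH` is a placeholder lakehouse name; the real FabricSparkRelation may differ in its details.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class RoundTripRelation:
    """Hypothetical relation whose copies go through to_dict/from_dict."""

    identifier: str
    schema: str
    database: str
    type: str = "table"
    workspace: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        return asdict(self)

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "RoundTripRelation":
        return cls(**data)

    def incorporate(self, **overrides: Any) -> "RoundTripRelation":
        """Return a copy with overrides applied via a dict round-trip."""
        return type(self).from_dict({**self.to_dict(), **overrides})


# Relation as it might come back from a lookup: no workspace attached.
looked_up = RoundTripRelation("snap_in_other", "repro", "WS2_LH")
# Re-attach the workspace after the lookup.
target = looked_up.incorporate(workspace="dbt Fabric Spark 2")

# The staging view copies schema, database, and workspace from the target.
staging = RoundTripRelation(
    identifier=target.identifier + "__dbt_tmp",
    schema=target.schema,
    database=target.database,
    type="view",
    workspace=target.workspace,
)
```

Because the class is frozen, `incorporate` returns a new object and leaves the looked-up relation untouched.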

Tests

Adds two functional test classes to tests/functional/adapter/test_cross_workspace.py:

  • TestCrossWorkspace4PartWriteView — compile + first-run + idempotent
    re-run, with runtime 4-part SELECT verification against WS2. Primary
    regression signal is test_cross_workspace_view_executes which fails
    outright on the un-fixed code path.
  • TestCrossWorkspace4PartWriteSnapshot — exercises both the CTAS path
    (first run) and the MERGE INTO path (mutated source between runs). Asserts
    SCD2 history (4 → 5 rows with dbt_valid_to populated on the closed-out
    row). The MERGE-into-WS2 path validates the staging-view workspace fix.

Both classes skip in no_schema mode (Fabric Livy 4-part naming is
schema-enabled-only). Reuses the existing cross_ws_write schema in WS2.
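
For readers unfamiliar with the SCD2 assertion, here is a simplified Python simulation of why mutating one source row turns 4 history rows into 5. `take_snapshot` is a hypothetical toy, not dbt's snapshot implementation: it ignores dbt_scd_id, deletes, and timestamps, and uses small integers for the validity columns.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def take_snapshot(history: List[Row], source: Dict[int, str], now: int) -> List[Row]:
    """Close out changed rows and append new versions (simplified SCD2)."""
    live = {row["id"]: row for row in history if row["dbt_valid_to"] is None}
    for key, value in source.items():
        current = live.get(key)
        if current is not None and current["value"] == value:
            continue  # unchanged row: nothing to record
        if current is not None:
            current["dbt_valid_to"] = now  # close out the old version
        history.append(
            {"id": key, "value": value, "dbt_valid_from": now, "dbt_valid_to": None}
        )
    return history


history: List[Row] = []
# First run: every source row is new, so all four are inserted.
take_snapshot(history, {1: "a", 2: "b", 3: "c", 4: "d"}, now=1)
rows_after_first_run = len(history)
# Second run: row 2 was mutated between runs.
take_snapshot(history, {1: "a", 2: "B", 3: "c", 4: "d"}, now=2)
closed_out = [row for row in history if row["dbt_valid_to"] is not None]
```

The second run adds exactly one row (the new version of id 2) and populates dbt_valid_to on the version it replaces, which is the same shape of history the snapshot test asserts.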

Validation

Run against a live Fabric tenant on this branch (WS1 + WS2 both
schema-enabled):

| Target | Result |
| --- | --- |
| npx nx run dbt-fabricspark:lint | |
| npx nx run dbt-fabricspark:build | |
| npx nx run dbt-fabricspark:test:unit | ✅ 216 passed, 11 skipped |
| pytest tests/functional/adapter/test_cross_workspace.py (all 17 tests) | |
| pytest tests/functional/adapter/basic/test_snapshot_* (regression) | |
| pytest tests/functional/adapter/basic/test_base.py (view regression) | |
| npx nx run dbt-fabricspark:test:local-e2e | |

Out of scope

  • Cross-workspace MLV REST API — the MLV materialization here is fixed at
    the relation-rendering layer for parity, but the MLV REST API path resolves
    lakehouse IDs against the profile workspace. Cross-workspace MLV via REST
    is a larger change and is intentionally not addressed here.
  • Workspace-aware adapter.get_relation — the existing cross-workspace
    incremental MERGE INTO tests already prove the cache flow works
    cross-workspace, so no Python-side change is needed for snapshot
    existence detection.

Closes #172.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI and others added 6 commits May 14, 2026 12:15
…e prefix (#172)

The cross-workspace 4-part naming feature (introduced in #167/#168) forwards
`workspace_name` from the model config onto `target_relation` for table and
incremental materializations only. View, snapshot, and materialized_lake_view
materializations built their target relation via paths that drop the workspace,
emitting 3-part DDL against a 4-part target:

    create or replace view `OtherLakehouse`.`dbt`.test_model as ...
    --> Database Error: Artifact not found: `MainWorkspace`.`OtherLakehouse`

Root causes (per materialization):

* view (`mv.sql`): delegated to default `create_or_replace_view()` which builds
  `api.Relation.create(...)` without forwarding workspace.
* snapshot (`snapshot.sql`): `get_or_create_relation(...)` returns a relation
  with no `workspace` field — both the cache hit and the
  `api.Relation.create` fallback drop it. Staging view in
  `spark_build_snapshot_staging_table` had the same omission.
* materialized_lake_view (`materialized_lake_view.sql`): built `target_relation`
  via `api.Relation.create(... type='materialized_view')` without workspace.

Fix:

* `mv.sql` — full fabricspark view materialization mirroring `table.sql`:
  builds `target_relation` with `workspace=config.get('workspace_name')`.
* `snapshot.sql` — re-incorporates workspace onto the target after
  `get_or_create_relation`; staging `tmp_relation` also forwards
  `target_relation.workspace` so MERGE INTO sees a 4-part staging view.
* `materialized_lake_view.sql` — adds `workspace=workspace_name` to the
  `target_relation` build. (MLV REST API lakehouse-id resolution is
  unaffected; cross-workspace MLV via REST is a separate concern.)

Tests (`tests/functional/adapter/test_cross_workspace.py`):

* `TestCrossWorkspace4PartWriteView` — compile + first-run + idempotent
  re-run, with runtime 4-part SELECT verification against WS2.
* `TestCrossWorkspace4PartWriteSnapshot` — CTAS path (first run) plus
  MERGE INTO path (mutated source between runs); asserts SCD2 history
  (4 → 5 rows with `dbt_valid_to` populated on the closed-out row).

Both classes skip in `no_schema` mode (Fabric Livy 4-part naming is
schema-enabled-only). Reuses the existing `cross_ws_write` schema in WS2.

Repro repo: https://github.com/cheyney-w/dbt-fabricspark-cross-workspace-demo

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The new fabricspark view materialization called `persist_docs` which
emits `ALTER TABLE ... CHANGE COLUMN` for column comments — invalid
against a view (Spark error: 'expects a table'). The pre-fix mv.sql
delegated to dbt-core's `create_or_replace_view()` which does NOT
call persist_docs; restoring that behavior unblocks the
TestPersistDocsDeltaView::test_delta_comments functional test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@mdrakiburrahman

Collaborator Author

Repro validation against @cheyney-w's demo repo

Ran the issue author's dbt-fabricspark-cross-workspace-demo end-to-end against this branch (8706820) on a live Fabric tenant — no repro, all steps green.

Setup

  • Cloned cheyney-w/dbt-fabricspark-cross-workspace-demo@main verbatim.
  • Installed this branch in-place (uv run --directory /workspaces/dbt-fabricspark --no-sync dbt …) so the adapter under test is the one from this PR.
  • Substituted hardcoded identifiers to match my tenant (workspace renames aren't scriptable):
    • MainLakehouse → dbt_e671998b_r0_1778757570_WithSchema (in dbt Fabric Spark 1)
    • OtherLakehouse → dbt_e671998b_r0_1778757586_CrossWs (in dbt Fabric Spark 2)
    • OtherWorkspace → dbt Fabric Spark 2
  • All model SQL bodies, materialization configs, source/seed/snapshot YAML, and dbt-project structure are unchanged from the demo.

Steps (per the demo's README)

| Step | Command | Result |
| --- | --- | --- |
| 1 | dbt seed | PASS=2 ERROR=0 (incl. cross-workspace seed into WS2) |
| 2 | dbt run | PASS=19 ERROR=0 (6 incremental + 7 table + 6 view, mix of WS1 ↔ WS2) |
| 3 | dbt run (incremental re-run, MERGE/APPEND path) | PASS=19 ERROR=0 |
| 4 | dbt snapshot | PASS=4 ERROR=0 (incl. the two snapshots called out in the issue) |

The originally failing models — now green

12 of 19 OK created sql view model dbt Fabric Spark 2.dbt_e671998b_r0_1778757586_CrossWs.dbt.view_in_other_using_source_from_main  [OK in 8.13s]
13 of 19 OK created sql view model dbt Fabric Spark 2.dbt_e671998b_r0_1778757586_CrossWs.dbt.view_in_other_using_source_from_other  [OK in 4.96s]
3 of 4 OK snapshotted   dbt Fabric Spark 2.dbt_e671998b_r0_1778757586_CrossWs.dbt.snapshot_in_other_using_source_from_main   [OK in 14.70s]
4 of 4 OK snapshotted   dbt Fabric Spark 2.dbt_e671998b_r0_1778757586_CrossWs.dbt.snapshot_in_other_using_source_from_other  [OK in 20.43s]

cc @cheyney-w — happy to hand over the substituted project for your own verification if useful.

@mdrakiburrahman

mdrakiburrahman commented May 15, 2026

Collaborator Author

@cheyney-w - the above was posted by AI, it validated your project repro is fixed 🙂

@mdrakiburrahman mdrakiburrahman merged commit 6cd2e64 into main May 15, 2026
2 checks passed
@mdrakiburrahman mdrakiburrahman deleted the dev/mdrrahman/172 branch May 17, 2026 01:47

Development

Successfully merging this pull request may close these issues.

[Bug] Cross-workspace view models fail with "Artifact not found" error

2 participants