Commit b13ee99

Merge pull request #576 from sdebruyn/feature/fabric-support
Add Microsoft Fabric Data Warehouse support (+ SQL Server and Synapse)
2 parents e413547 + 3117153 commit b13ee99

57 files changed

Lines changed: 380 additions & 199 deletions

README.md

Lines changed: 3 additions & 1 deletion
@@ -24,14 +24,16 @@ Currently, the following adapters are supported:
 - AWS Athena (tested manually)
 - Greenplum (tested manually)
 - ClickHouse (tested manually)
+- Microsoft Fabric Data Warehouse (tested manually)
+- Microsoft Fabric Spark (tested manually)

 ## Using This Package

 ### Cloning via dbt Package Hub

 Check [dbt Hub](https://hub.getdbt.com/dbt-labs/dbt_project_evaluator/latest/) for the latest installation instructions, or [read the docs](https://docs.getdbt.com/docs/package-management) for more information on installing packages.

-### Additional setup for Databricks/Spark/DuckDB/Redshift/ClickHouse
+### Additional setup for Databricks/Spark/DuckDB/Redshift/ClickHouse/Fabric

 In your `dbt_project.yml`, add the following config:

dbt_project.yml

Lines changed: 10 additions & 10 deletions
@@ -26,17 +26,17 @@ dispatch:

 models:
   dbt_project_evaluator:
-    +materialized: "{{ 'table' if target.type in ['duckdb'] else 'view' }}"
+    +materialized: "{{ 'table' if target.type in ['duckdb', 'fabric'] else 'view' }}"
     marts:
       core:
         int_all_graph_resources:
           +materialized: table
         int_direct_relationships:
-          # required for BigQuery and Redshift for performance/memory reasons
-          +materialized: "{{ 'table' if target.type in ['bigquery', 'redshift', 'databricks'] else 'view' }}"
+          # required for BigQuery, Redshift, Databricks, and Fabric for performance/memory reasons
+          +materialized: "{{ 'table' if target.type in ['bigquery', 'redshift', 'databricks', 'fabric'] else 'view' }}"
         int_all_dag_relationships:
-          # required for BigQuery, Redshift, and Databricks for performance/memory reasons
-          +materialized: "{{ 'table' if target.type in ['bigquery', 'redshift', 'databricks', 'clickhouse'] else 'view' }}"
+          # required for BigQuery, Redshift, Databricks, Clickhouse, and Fabric for performance/memory reasons
+          +materialized: "{{ 'table' if target.type in ['bigquery', 'redshift', 'databricks', 'clickhouse', 'fabric'] else 'view' }}"
       dag:
         +materialized: table
     staging:
@@ -45,11 +45,11 @@ models:
           +materialized: table
       variables:
         stg_naming_convention_folders:
-          # required for Redshift because listagg runs only on tables
-          +materialized: "{{ 'table' if target.type == 'redshift' else 'view' }}"
+          # required for Redshift and Fabric because listagg runs only on tables
+          +materialized: "{{ 'table' if target.type in ['redshift', 'fabric'] else 'view' }}"
         stg_naming_convention_prefixes:
-          # required for Redshift because listagg runs only on tables
-          +materialized: "{{ 'table' if target.type == 'redshift' else 'view' }}"
+          # required for Redshift and Fabric because listagg runs only on tables
+          +materialized: "{{ 'table' if target.type in ['redshift', 'fabric'] else 'view' }}"


 vars:
@@ -89,7 +89,7 @@ vars:

   # -- Execution variables --
   insert_batch_size: "{{ 500 if target.type in ['athena', 'bigquery'] else 10000 }}"
-  max_depth_dag: "{{ 9 if target.type in ['bigquery', 'spark', 'databricks'] else 4 if target.type in ['athena', 'trino', 'clickhouse'] else -1 }}"
+  max_depth_dag: "{{ 9 if target.type in ['bigquery', 'spark', 'databricks', 'fabric'] else 4 if target.type in ['athena', 'trino', 'clickhouse'] else -1 }}"

   # -- Code complexity variables --
   comment_chars: ["--"]
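
Reviewer note: the chained conditional for `max_depth_dag` is easy to misread, so here is a rough Python model of how the two execution defaults above resolve per adapter. The function names are hypothetical and exist only for illustration; the adapter lists are copied from the new side of the diff.

```python
def default_max_depth_dag(target_type: str) -> int:
    """Model of the chained `9 if ... else 4 if ... else -1` expression (hypothetical helper)."""
    # Adapters listed first in the expression get a depth of 9.
    if target_type in ("bigquery", "spark", "databricks", "fabric"):
        return 9
    # Adapters in the second list get a shallower depth of 4.
    if target_type in ("athena", "trino", "clickhouse"):
        return 4
    # Everything else falls through to -1, which the package docs describe as "no limit".
    return -1


def default_insert_batch_size(target_type: str) -> int:
    """Model of the `insert_batch_size` expression (hypothetical helper)."""
    return 500 if target_type in ("athena", "bigquery") else 10000


if __name__ == "__main__":
    for adapter in ("fabric", "clickhouse", "snowflake"):
        print(adapter, default_max_depth_dag(adapter), default_insert_batch_size(adapter))
```

With this change `fabric` joins the first group, so it resolves to a depth of 9 while keeping the larger insert batch size.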

docs/customization/overriding-variables.md

Lines changed: 2 additions & 2 deletions
@@ -103,14 +103,14 @@ vars:

 | variable | description | default |
 | ----------- | ----------- | ----------- |
-| `max_depth_dag` | limits the maximum distance between nodes calculated in `int_all_dag_relationships` | 9 for bigquery and spark, -1 for other adatpters |
+| `max_depth_dag` | limits the maximum distance between nodes calculated in `int_all_dag_relationships` | 9 for bigquery, spark, and fabric, -1 for other adapters |
 | `insert_batch_size` | number of records inserted per batch when unpacking the graph into models | 10000 |

 **Note on max_depth_dag**

 The default behavior for limiting the relationships calculated in the `int_all_dag_relationships` model differs depending on your adapter.

-- For Bigquery & Spark/Databricks the maximum distance between two nodes in your DAG, calculated in `int_all_dag_relationships`, is set by the `max_depth_dag` variable, which is defaulted to 9. So by default, `int_all_dag_relationships` contains a row for every path less than or equal to 9 nodes in length between two nodes in your DAG. This is because these adapters do not currently support recursive SQL, and queries often fail on more than 9 recursive joins.
+- For BigQuery, Spark/Databricks, and Microsoft Fabric Data Warehouse the maximum distance between two nodes in your DAG, calculated in `int_all_dag_relationships`, is set by the `max_depth_dag` variable, which is defaulted to 9. So by default, `int_all_dag_relationships` contains a row for every path less than or equal to 9 nodes in length between two nodes in your DAG. This is because these adapters do not currently support recursive SQL, and queries often fail on more than 9 recursive joins.
 - For all other adapters `int_all_dag_relationships` by default contains a row for every single path between two nodes in your DAG. If you experience long runtimes for the `int_all_dag_relationships` model, you may consider limiting the length of your generated DAG paths. To do this, set `max_depth_dag: {{ whatever limit you want to enforce }}`. The value of `max_depth_dag` must be greater than 2 for all DAG tests to work, and greater than `chained_views_threshold` to ensure your performance tests to work. By default, the value of this variable for these adapters is -1, which the package interprets as "no limit".

 ```yaml title="dbt_project.yml"

docs/index.md

Lines changed: 5 additions & 3 deletions
@@ -25,14 +25,16 @@ Currently, the following adapters are supported:
 - AWS Athena (tested manually)
 - Greenplum (tested manually)
 - ClickHouse (tested manually)
+- Microsoft Fabric Data Warehouse (tested manually)
+- Microsoft Fabric Spark (tested manually)

 ## Using This Package

 ### Cloning via dbt Package Hub

 Check [dbt Hub](https://hub.getdbt.com/dbt-labs/dbt_project_evaluator/latest/) for the latest installation instructions, or [read the docs](https://docs.getdbt.com/docs/package-management) for more information on installing packages.

-### Additional setup for Databricks/Spark/DuckDB/Redshift
+### Additional setup for Databricks/Spark/DuckDB/Redshift/Fabric

 In your `dbt_project.yml`, add the following config:

@@ -64,8 +66,8 @@ Each test warning indicates the presence of a type of misalignment. To troublesh

 ## Limitations

-### BigQuery and Databricks
+### BigQuery, Databricks, and Microsoft Fabric Data Warehouse

-BigQuery current support for recursive CTEs is limited and Databricks SQL doesn't support recursive CTEs.
+BigQuery has limited support for recursive CTEs, while Databricks SQL and Microsoft Fabric Data Warehouse do not support them.

 For those Data Warehouses, the model `int_all_dag_relationships` needs to be created by looping CTEs instead. The number of loops is configured with `max_depth_dag` and defaulted to 9. This means that dependencies between models of more than 9 levels of separation won't show in the model `int_all_dag_relationships` but tests on the DAG will still be correct. With a number of loops higher than 9 BigQuery sometimes raises an error saying the query is too complex.
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+{# Convert a Python boolean to a SQL boolean literal appropriate for the target adapter #}
+{% macro bool_literal(value) %}
+    {{ return(adapter.dispatch('bool_literal', 'dbt_project_evaluator')(value)) }}
+{% endmacro %}
+
+{% macro default__bool_literal(value) %}{{ value | trim }}{% endmacro %}
+
+{% macro fabric__bool_literal(value) %}{% if value %}1{% else %}0{% endif %}{% endmacro %}
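
Reviewer note: a small Python model of what the two `bool_literal` implementations above render, assuming Jinja's `trim` filter stringifies the value and strips surrounding whitespace. The `adapter_type` parameter is a hypothetical stand-in for the lookup that `adapter.dispatch` performs in dbt; it is not part of the package.

```python
def bool_literal(value: bool, adapter_type: str = "default") -> str:
    """Render a boolean the way the dispatched macros would (illustrative model only)."""
    if adapter_type == "fabric":
        # fabric__bool_literal: emit 1 or 0 instead of a true/false keyword.
        return "1" if value else "0"
    # default__bool_literal: stringify the value and trim whitespace.
    return str(value).strip()


if __name__ == "__main__":
    print(bool_literal(True))             # default rendering
    print(bool_literal(True, "fabric"))   # Fabric rendering
    print(bool_literal(False, "fabric"))
```

The Fabric branch sidesteps boolean keywords entirely, which fits a T-SQL dialect where flags are typically stored as `bit` values.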
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+{% macro spark__escape_single_quotes(expression) -%}
+    {{ expression | replace("'","\\'") }}
+{%- endmacro %}
+
+{% macro fabric__escape_single_quotes(expression) -%}
+    {{ expression | replace("'","''") }}
+{%- endmacro %}
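
Reviewer note: the two macros above differ only in the replacement string, which is easy to miss in a diff. The sketch below models both in Python; the single `adapter_type` switch is a hypothetical simplification of dbt's dispatch, used here only to put the two strategies side by side.

```python
def escape_single_quotes(expression: str, adapter_type: str) -> str:
    """Model the Spark and Fabric quote-escaping macros (illustrative only)."""
    if adapter_type == "spark":
        # Spark variant: prefix each single quote with a backslash.
        return expression.replace("'", "\\'")
    if adapter_type == "fabric":
        # Fabric variant: double each single quote, the standard SQL convention.
        return expression.replace("'", "''")
    raise ValueError(f"no model for adapter type: {adapter_type}")


if __name__ == "__main__":
    sample = "it's a model"
    print(escape_single_quotes(sample, "spark"))
    print(escape_single_quotes(sample, "fabric"))
```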
Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+{% macro quote_identifier(name) %}
+    {{ return(adapter.dispatch('quote_identifier', 'dbt_project_evaluator')(name)) }}
+{% endmacro %}
+
+{% macro default__quote_identifier(name) %}{{ name }}{% endmacro %}
+
+{% macro fabric__quote_identifier(name) %}[{{ name }}]{% endmacro %}

macros/cross_db_shim/spark_shims.sql

Lines changed: 0 additions & 3 deletions
This file was deleted.

macros/cross_db_shim/type_string.sql

Lines changed: 4 additions & 0 deletions
@@ -9,3 +9,7 @@
 {%- macro redshift__type_string_dpe() -%}
     {{ return(api.Column.string_type(600)) }}
 {%- endmacro -%}
+
+{%- macro fabric__type_string_dpe() -%}
+    {{ return("varchar(8000)") }}
+{%- endmacro -%}

macros/get_directory_pattern.sql

Lines changed: 16 additions & 1 deletion
@@ -23,6 +23,10 @@
 {% endmacro %}

 {% macro get_dbtreplace_directory_pattern() %}
+    {{ return(adapter.dispatch('get_dbtreplace_directory_pattern', 'dbt_project_evaluator')()) }}
+{% endmacro %}
+
+{% macro default__get_dbtreplace_directory_pattern() %}
     {% if execute %}
         {%- set on_mac_or_linux = dbt_project_evaluator.is_os_mac_or_linux() -%}
         {%- if on_mac_or_linux -%}
@@ -31,4 +35,15 @@
             {{ dbt.replace("file_path", "regexp_replace(file_path,'.*\\\\\\\\','')", "''") }}
         {% endif %}
     {% endif %}
-{% endmacro %}
+{% endmacro %}
+
+{% macro fabric__get_dbtreplace_directory_pattern() %}
+    {% if execute %}
+        {%- set on_mac_or_linux = dbt_project_evaluator.is_os_mac_or_linux() -%}
+        {%- if on_mac_or_linux -%}
+            left(file_path, len(file_path) - charindex('/', reverse(file_path)))
+        {%- else -%}
+            left(file_path, len(file_path) - charindex('\', reverse(file_path)))
+        {%- endif -%}
+    {% endif %}
+{% endmacro %}
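
Reviewer note: the Fabric branch avoids `regexp_replace` and instead finds the last path separator by searching the reversed string. The Python sketch below models that `left`/`len`/`charindex`/`reverse` expression; the helper names are hypothetical. One assumption to flag: T-SQL `LEN` ignores trailing spaces while Python's `len` does not, so the model only matches for paths without trailing whitespace.

```python
def charindex(needle: str, haystack: str) -> int:
    """1-based position of the first match, or 0 when absent, like T-SQL CHARINDEX."""
    return haystack.find(needle) + 1


def fabric_directory(file_path: str, separator: str = "/") -> str:
    """Model of left(file_path, len(file_path) - charindex(sep, reverse(file_path)))."""
    # Searching the reversed path gives the distance from the end to the last separator.
    offset = charindex(separator, file_path[::-1])
    # Keep everything before that separator; offset 0 means no separator, so keep it all.
    return file_path[: len(file_path) - offset]


if __name__ == "__main__":
    print(fabric_directory("models/staging/stg_orders.sql"))
    print(fabric_directory("models\\staging\\stg_orders.sql", separator="\\"))
```

Note that the result has no trailing separator, and a path without any separator is returned unchanged.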
