Skip to content

Snapshots Error: Correlated subqueries are not supported in JOINs yet #574

@enqueue

Description

@enqueue

Describe the bug

After the initial dbt snapshot run, an error occurs:

DB::Exception: Correlated subqueries are not supported in JOINs yet,
but found in expression: snapshotted_data

Steps to reproduce

  1. Create model foo
  2. Create snapshot foo_snapshot (see YAML below)
  3. Run dbt snapshot
  4. Run dbt snapshot

Expected behaviour

Snapshot table being updated.

Code examples, such as models or profile settings

I have tried to obfuscate our table names manually. I hope I did not break anything else while doing so.

Source SQL

{{
  config(
    materialized = 'table'
  )
}}


with foos as (
    select *
    from {{ ref('stg_foos') }}
)

select
    foo_id,
    foo_data,
    toDateTime64(coalesce(last_modified_time, '1970-01-01 00:00:00'), 3)
        as last_update,
    toStartOfDay(last_update) as last_update_day
from foos

Snapshot YAML

snapshots:
  - name: foo_snapshot
    relation: ref('sourceschema', 'foo')
    config:
      database: targetschema
      unique_key: foo_id
      strategy: timestamp
      updated_at: last_update
      dbt_valid_to_current: "toDate('2149-06-06')"
      snapshot_meta_column_names:
        dbt_valid_from: valid_from
        dbt_valid_to: valid_to
        dbt_is_deleted: is_locked

dbt and/or ClickHouse server logs

Error message

  Code: 48.
  DB::Exception: Correlated subqueries are not supported in JOINs yet, but found in expression:  snapshotted_data. Stack trace:
  
  0. ? @ 0x0000000013c42d9f
  1. ? @ 0x000000000c89690e
  2. ? @ 0x000000000c8963c0
  3. ? @ 0x000000000c895f2b
  4. ? @ 0x0000000017ddaee8
  5. ? @ 0x0000000017dd6378
  6. ? @ 0x0000000017da69ef
  7. ? @ 0x0000000017db1161
  8. ? @ 0x0000000017dd6977
  9. ? @ 0x0000000017da69ef
  10. ? @ 0x0000000017dac084
  11. ? @ 0x0000000017da41cd
  12. ? @ 0x0000000017da381c
  13. ? @ 0x0000000017df9736
  14. ? @ 0x0000000018949c8d
  15. ? @ 0x0000000018947baa
  16. ? @ 0x000000001894a0ea
  17. ? @ 0x000000001886b9f3
  18. ? @ 0x00000000188747e5
  19. ? @ 0x000000001888a778
  20. ? @ 0x0000000018d1fe44
  21. ? @ 0x0000000018d19c73
  22. ? @ 0x000000001a4837b7
  23. ? @ 0x000000001a4a5d59
  24. ? @ 0x000000001f561c07
  25. ? @ 0x000000001f562099
  26. ? @ 0x000000001f528847
  27. ? @ 0x000000001f526c41
  28. ? @ 0x000000000009698b
  29. ? @ 0x000000000011a9cc
16:00:59  
16:00:59  Done. PASS=0 WARN=0 ERROR=1 SKIP=0 NO-OP=0 TOTAL=1

Generated SQL statement

create table `targetschema`.`foo_snapshot__dbt_tmp`



  engine = MergeTree()

      order by (tuple())




                    -- end_of_sql


            empty
          as (


        with snapshot_query as (

        select * from `sourceschema`.`foo`

    ),

    snapshotted_data as (

        select *,
            foo_id as dbt_unique_key

        from `targetschema`.`foo_snapshot`
        where dbt_valid_to is null

    ),

    insertions_source_data as (

        select
            *,
            foo_id as dbt_unique_key,
            last_update_day as dbt_updated_at,
            last_update_day as dbt_valid_from,
            nullif(last_update_day, last_update_day) as dbt_valid_to,
            halfMD5(coalesce(cast(foo_id as varchar ), '')
     || '|' || coalesce(cast(last_update_day as varchar ), '')
    ) as dbt_scd_id

        from snapshot_query
    ),

    updates_source_data as (

        select
            *,
            foo_id as dbt_unique_key,
            last_update_day as dbt_updated_at,
            last_update_day as dbt_valid_from,
            last_update_day as dbt_valid_to

        from snapshot_query
    ),

    insertions as (

        select
            'insert' as dbt_change_type,
            source_data.*

        from insertions_source_data as source_data
        left outer join snapshotted_data on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
        where snapshotted_data.dbt_unique_key is null
           or (
                snapshotted_data.dbt_unique_key is not null
            and (
                (snapshotted_data.valid_from < source_data.last_update_day)
            )
        )

    ),

    updates as (

        select
            'update' as dbt_change_type,
            source_data.*,
            snapshotted_data.dbt_scd_id

        from updates_source_data as source_data
        join snapshotted_data on snapshotted_data.dbt_unique_key = source_data.dbt_unique_key
        where (
            (snapshotted_data.valid_from < source_data.last_update_day)
        )
    )

    select * from insertions
    union all
    select * from updates


          )

Configuration

Environment

  • dbt version: 1.10.15
  • dbt-clickhouse version: 1.9.6
  • Python version: 3.13.7
  • Operating system: Linux

ClickHouse server

  • ClickHouse Server version: 25.10.2
  • ClickHouse Server non-default settings, if any: none
  • CREATE TABLE statements for tables involved: I hope you can figure out without me constructing full test case, but let me know if you need complete reproducer from me

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions