Skip to content

Reference Satellite V0

Ricardo Rodríguez edited this page Mar 4, 2025 · 3 revisions

This node was designed to create a reference V0 satellite that hangs on a reference hub. On top of this V0 satellite, a V1 satellite should be created to then create a reference table. The V1 satellite will extend the V0 reference satellite by virtually creating a load end date timestamp. Each reference satellite will be loaded by one source only, and just like the reference hub, it won't contain any hash keys (except for the Hash diff column).

To create a V0 reference satellite, right click on your Data Vault stage, go to Add Node and select "Reference Satellite V0" (yellow color):

image

The reference satellite should contain the reference keys of the parent reference hub, the attributes from the reference data as well as the technical attributes, and a hash-diff column generated from the attributes: image

In the Data Vault config panel, the reference keys must be selected, as well as the hash-diff calculated column: image

A Disable High-Water-Mark option is given, just like in the Standard V0 satellite. Please reference here for more information on the High-Water-Mark option.

After this configuration is set, you can create and run your reference satellite V0 node.

Generated SQL Code of Example

Create Table Script

CREATE OR REPLACE TABLE "COALESCE_WORKSHOP"."CORE"."REF_SATV0_NATION" (
  "N_NATIONKEY" NUMBER(38, 0) NOT NULL,
  "N_NAME" VARCHAR(25) NOT NULL,
  "N_REGIONKEY" NUMBER(38, 0) NOT NULL,
  "N_COMMENT" VARCHAR(152),
  "LDTS" TIMESTAMP COMMENT 'The Load Date Timestamp (LDTS) describes when this data first arrived in the Data Warehouse.',
  "RSRC" STRING COMMENT 'The Record Source (RSRC) describes the source of this data.',
  "HD_NATION_RS" STRING
) COMMENT = 'Nation data as defined by TPC-H'

Run Script

INSERT INTO
  "COALESCE_WORKSHOP"."CORE"."REF_SATV0_NATION"
WITH
  latest_entries_in_sat AS (
    /* get current rows from satellite */
    SELECT
      "N_NATIONKEY",
      "HD_NATION_RS"
    FROM
      "COALESCE_WORKSHOP"."CORE"."REF_SATV0_NATION"
    QUALIFY
      ROW_NUMBER() OVER (
        PARTITION BY
          "N_NATIONKEY"
        ORDER BY
          "LDTS" DESC
      ) = 1
  ),
  deduplicated_numbered_source AS (
    SELECT
      "DVSTG_NATION"."N_NATIONKEY" AS N_NATIONKEY,
      "DVSTG_NATION"."N_NAME" AS N_NAME,
      "DVSTG_NATION"."N_REGIONKEY" AS N_REGIONKEY,
      "DVSTG_NATION"."N_COMMENT" AS N_COMMENT,
      "DVSTG_NATION"."LDTS" AS LDTS,
      "DVSTG_NATION"."RSRC" AS RSRC,
      "DVSTG_NATION"."HD_NATION_RS" AS HD_NATION_RS,
      ROW_NUMBER() OVER (
        PARTITION BY
          "N_NATIONKEY"
        ORDER BY
          "LDTS"
      ) as rn
    FROM
      "COALESCE_WORKSHOP"."CORE"."DVSTG_NATION" "DVSTG_NATION"
    WHERE
      "LDTS" > (
        SELECT
          COALESCE(
            MAX("LDTS"),
            TO_TIMESTAMP('0001-01-01T00:00:01', 'YYYY-MM-DDTHH24:MI:SS')
          )
        FROM
          "COALESCE_WORKSHOP"."CORE"."REF_SATV0_NATION"
        WHERE
          "LDTS" != TO_TIMESTAMP('8888-12-31T23:59:59', 'YYYY-MM-DDTHH24:MI:SS')
      )
    QUALIFY
      CASE
        WHEN "HD_NATION_RS" = LAG("HD_NATION_RS") OVER (
          PARTITION BY
            "N_NATIONKEY"
          ORDER BY
            "LDTS"
        ) THEN FALSE
        ELSE TRUE
      END
  )
SELECT DISTINCT
  N_NATIONKEY,
  N_NAME,
  N_REGIONKEY,
  N_COMMENT,
  LDTS,
  RSRC,
  HD_NATION_RS
FROM
  deduplicated_numbered_source
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      latest_entries_in_sat
    WHERE
      deduplicated_numbered_source."HD_NATION_RS" = latest_entries_in_sat."HD_NATION_RS"
      AND deduplicated_numbered_source."N_NATIONKEY" = latest_entries_in_sat."N_NATIONKEY"
      AND deduplicated_numbered_source.rn = 1
  )

Clone this wiki locally