Support for External Tables in dbt-clickhouse #585

@BentsiLeviav

Description

External tables in ClickHouse (tables backed by engines such as Kafka, PostgreSQL, or S3) are effectively infrastructure objects: they carry credentials and have a different lifecycle from regular data modeling objects. This is a conceptual mismatch with dbt's core design principles, which focus on data transformation rather than data ingestion or infrastructure management.

Current Situation

We currently have partial support for some table engines (S3, Memory) through the IGNORED_SETTINGS mechanism, which excludes MergeTree-specific settings such as replicated_deduplication_window for those engines. However, this solution is limited and does not address the broader questions:

  1. Whether external tables belong in the adapter at all
  2. How to handle credentials securely
  3. What should be the proper abstraction
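To make the current mechanism concrete for the discussion, here is a minimal sketch of the idea behind per-engine setting exclusion. It is not the adapter's actual code: the dictionary contents, the `filter_settings` helper, and the `some_other_setting` key are hypothetical and only illustrate the technique described above.

```python
# Hypothetical sketch of per-engine setting exclusion; NOT the adapter's real code.

# Settings that only make sense for MergeTree-family engines.
MERGETREE_ONLY_SETTINGS = frozenset({"replicated_deduplication_window"})

# Assumed shape: engine name -> settings to drop before rendering the DDL.
IGNORED_SETTINGS = {
    "S3": MERGETREE_ONLY_SETTINGS,
    "Memory": MERGETREE_ONLY_SETTINGS,
}


def filter_settings(engine: str, settings: dict) -> dict:
    """Return a copy of `settings` without entries the engine would reject."""
    # The engine may be given with arguments, e.g. "S3('https://...', 'CSV')",
    # so keep only the name before the first parenthesis.
    engine_name = engine.split("(", 1)[0].strip()
    ignored = IGNORED_SETTINGS.get(engine_name, frozenset())
    return {key: value for key, value in settings.items() if key not in ignored}
```

A lookup like this handles known engines one at a time, which is why it does not scale to the general question of how external tables should be modeled.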

Questions to address

1. Implementation Approach

We need to decide between:

  • Native adapter support - Extend the adapter to fully support table engines as a materialization type
  • External tables package - Use dbt-labs' dbt-external-tables package
  • Hybrid approach - Document recommendations for managing external tables outside dbt (e.g., via Terraform?)

2. Credential Management

External tables require credentials, which raises security concerns:

  • Named Collections: Named collections are supported in the OSS version and will be available in ClickHouse Cloud as well. They would allow referencing credentials without hardcoding them in dbt config files
  • Environment variables: The current workaround is to pass credentials through environment variables
  • Terraform integration: Some users prefer to manage infrastructure tables separately using IaC tools
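As a rough illustration of the difference between the first two options, the sketch below renders an S3 engine clause both ways. The helper names, the environment variable names, and the collection name are hypothetical; the snippet does no SQL escaping and is only meant to show where the secret ends up.

```python
import os


def s3_engine_from_env(path: str, fmt: str = "CSV") -> str:
    """Render an S3 engine clause with credentials read from the environment.

    The variable names are assumptions made for this example. The secret is
    interpolated into the generated DDL, so it can surface in compiled SQL.
    """
    key = os.environ["S3_ACCESS_KEY_ID"]
    secret = os.environ["S3_SECRET_ACCESS_KEY"]
    return f"S3('{path}', '{key}', '{secret}', '{fmt}')"


def s3_engine_from_named_collection(collection: str, filename: str) -> str:
    """Render an S3 engine clause that references a named collection.

    The collection (hypothetical name supplied by the caller) is assumed to
    exist on the server already and to hold the URL and credentials, so no
    secret appears in the generated DDL.
    """
    return f"S3({collection}, filename = '{filename}')"
```

With a named collection, the dbt project only needs to know the collection's name, while creating and rotating the credentials stays outside dbt.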

3. Alignment with dbt Core

See: dbt-labs/dbt-core#11265

Next Steps (TBD)

  1. Research how other adapters (Snowflake, BigQuery, Databricks) handle external tables
  2. Evaluate the dbt-external-tables package with ClickHouse
  3. Document recommended patterns for external table management (potentially using Terraform + dbt)
  4. Once Named Collections are available in ClickHouse Cloud, re-evaluate security concerns

Related Issues


We would love to get community input on how this is being done today, what requirements you have, and how you use external tables with ClickHouse overall.

Labels: enhancement