Add data-diff Support for Clickhouse #1223

@terzioglub

Description

ClickHouse is missing an implementation of the GetTableSummary method, so it cannot be used with the bruin data-diff command. Currently only postgres, bigquery, snowflake, and duckdb support data-diff.

The GetTableSummary method analyzes a table's schema and data statistics to enable comparison between tables. It returns column information (types, constraints) and statistical data (row counts, null counts, min/max values, averages) that bruin data-diff uses to identify differences between tables.

More about data-diff: https://getbruin.com/docs/bruin/commands/data-diff.html

Implementation

Add a GetTableSummary method for ClickHouse. Implementations of GetTableSummary already exist for bigquery, duckdb, snowflake, and postgres; see db.go in each of their packages. ClickHouse should support the same functionality.

func (db *DB) GetTableSummary(ctx context.Context, tableName string, schemaOnly bool) (*diff.TableSummaryResult, error) {
    // Follow the pattern from pkg/postgres/db.go or pkg/duckdb/db.go.
    // Return table statistics for data comparison.
    return nil, errors.New("GetTableSummary: not implemented") // placeholder so the stub compiles
}

Acceptance Criteria

  • GetTableSummary method implemented in pkg/clickhouse/db.go
  • bruin data-diff works for ClickHouse
  • Schema comparison works (column types, nullable, constraints)
  • Data comparison works (row counts, column statistics)
  • --schema-only flag works
  • Test cases are added for the new method
