
Support for RowBinaryWithNames #107

@avx-wlauer


Use case

As a pipeline author, I need my code to write reliably to a ClickHouse table without providing values for every column and without knowing about every column in the table. My schema is regularly updated with new columns, independently of pipeline deployments (schema updates often have to be applied before the code that actually populates the new columns is deployed). As long as the new columns have reasonable defaults, the pipeline should keep functioning and writing data without error, without needing to be updated.

Describe the solution you'd like

Support writing a subset of the columns in a table using a RowBinary-based format.

For performance reasons, I'd like to use a RowBinary-based format instead of converting data to JSON. Unfortunately, I don't have complete control over when new columns are added to my schema, so it's possible that new columns will be added while my pipeline is running. The default RowBinary format requires that ALL columns be included in the data; otherwise misalignment and other errors occur. With the RowBinaryWithNames format, ClickHouse should be able to keep mapping the data generated by the pipeline to the correct columns even after new columns are added to the table's schema.
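To make the difference concrete, here is a rough sketch of what a RowBinaryWithNames payload looks like on the wire, going by the ClickHouse format docs: a LEB128-encoded column count, then each column name as a length-prefixed string, then the rows in plain RowBinary. The `id UInt64` / `message String` columns and the helper names are made up for illustration; this is not code from this project.

```python
import struct


def leb128(n: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (varint)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_string(s: str) -> bytes:
    """RowBinary String: varint byte length followed by the raw bytes."""
    data = s.encode("utf-8")
    return leb128(len(data)) + data


def names_header(columns: list[str]) -> bytes:
    """RowBinaryWithNames header: column count, then each column name."""
    return leb128(len(columns)) + b"".join(encode_string(c) for c in columns)


def encode_row(event_id: int, message: str) -> bytes:
    """One row for the hypothetical columns (id UInt64, message String)."""
    return struct.pack("<Q", event_id) + encode_string(message)


def build_payload(columns: list[str], rows: list[tuple[int, str]]) -> bytes:
    """Header once, then every row back to back."""
    return names_header(columns) + b"".join(encode_row(*r) for r in rows)


# Only the columns the pipeline knows about are named in the header.
payload = build_payload(["id", "message"], [(1, "hello"), (2, "world")])
```

The payload would be sent as the body of something like `INSERT INTO events FORMAT RowBinaryWithNames` (table name hypothetical). Because the header names the columns, a column added to the table later is simply absent from the header; as I understand it, the server then fills it with its default, subject to settings such as `input_format_with_names_use_header`.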

Describe the alternatives you've considered

It's possible to make this work with other formats that carry the column mapping with the data, including JSON, but I don't want to deal with the performance penalty of converting to/from JSON. This could probably also be handled with the Avro or Parquet formats, but RowBinary and its derivatives are native ClickHouse formats and should have the best performance.

Additional context


Labels: enhancement (New feature or request)
