fix: deserialize parquet error when stream's base table modify column type #18828

zhyass · 2025-10-12T15:00:06Z

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

This PR fixes an issue where reading or deserializing Parquet data from a stream fails if the base table’s column type has been modified after the stream was created.

When a base table column’s data type changes, the system now drops the old column and re-adds it, so the column is assigned a new column_id. This keeps the schema definition and the column identifiers consistent and prevents schema mismatches during Parquet deserialization.
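To illustrate the idea, here is a minimal sketch of why assigning a fresh column_id helps. It is not the code in this PR: `Field`, `Schema`, `modify_column_type`, and `expected_type` are hypothetical, simplified stand-ins, and keeping the column at its original position is an assumption of the sketch.

```rust
/// Hypothetical, simplified column definition (not Databend's real type).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    column_id: u32,
}

/// Hypothetical schema that hands out monotonically increasing column ids.
#[derive(Debug, Clone)]
struct Schema {
    fields: Vec<Field>,
    next_column_id: u32,
}

impl Schema {
    fn new() -> Self {
        Schema { fields: Vec::new(), next_column_id: 0 }
    }

    /// Appends a column and returns the id allocated for it.
    fn add_column(&mut self, name: &str, data_type: &str) -> u32 {
        let id = self.next_column_id;
        self.next_column_id += 1;
        self.fields.push(Field {
            name: name.to_string(),
            data_type: data_type.to_string(),
            column_id: id,
        });
        id
    }

    /// Changes a column's type by dropping the old column and re-adding it,
    /// so the new definition gets a fresh column_id instead of reusing the old one.
    /// Returns None if no column with that name exists.
    fn modify_column_type(&mut self, name: &str, new_type: &str) -> Option<u32> {
        let pos = self.fields.iter().position(|f| f.name == name)?;
        self.fields.remove(pos);
        let id = self.next_column_id;
        self.next_column_id += 1;
        // Assumption of this sketch: the column keeps its original position.
        self.fields.insert(pos, Field {
            name: name.to_string(),
            data_type: new_type.to_string(),
            column_id: id,
        });
        Some(id)
    }
}

/// Looks up the type the current schema declares for a given column_id.
/// Returns None when the id is no longer part of the schema.
fn expected_type(schema: &Schema, column_id: u32) -> Option<&str> {
    schema
        .fields
        .iter()
        .find(|f| f.column_id == column_id)
        .map(|f| f.data_type.as_str())
}
```

In this sketch, data written under the old id can never be looked up with the new type: after `modify_column_type("a", "Float64")`, `expected_type` returns `None` for the old id and `Some("Float64")` for the new one, so a reader cannot silently decode old blocks with the wrong type.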

Modifying columns while Change Tracking is active should be avoided. Changing schema definitions during change tracking may break consistency between tracked changes and the current table schema, leading to incorrect or incomplete change records.

Fixes: #18827

Tests

Unit Test
Logic Test
Benchmark Test
No Test - Explain why

Type of change

Bug Fix (non-breaking change which fixes an issue)
New Feature (non-breaking change which adds functionality)
Breaking Change (fix or feature that could cause existing functionality not to work as expected)
Documentation Update
Refactoring
Performance Improvement
Other (please describe):

github-actions · 2025-10-12T15:59:24Z

🤖 Smart Auto-retry Analysis

Workflow: 18447157252

📊 Summary

Total Jobs: 83
Failed Jobs: 2
Retryable: 0
Code Issues: 2

❌ NO RETRY NEEDED

All failures appear to be code/test issues requiring manual fixes.

🔍 Job Details

❌ linux / sqllogic / cluster_with_minio_and_nginx (http_handler, ttc-go): Not retryable (Code/Test)
❌ linux / sqllogic / ee (parquet): Not retryable (Code/Test)

🤖 About

Automated analysis using job annotations to distinguish infrastructure issues (auto-retried) from code/test issues (manual fixes needed).

zhyass added 2 commits October 12, 2025 02:25

fix

81776b0

add test

a529363

github-actions bot added the pr-bugfix this PR patches a bug in codebase label Oct 12, 2025

zhyass requested review from TCeason and dantengsky October 12, 2025 15:06

zhyass marked this pull request as draft October 12, 2025 16:00

zhyass added 2 commits October 13, 2025 00:36

fix

b80c769

fix

96e6ed3
