Conversation
@krvikash krvikash commented Nov 20, 2025

Description

Fixes #26109

Additional context and related issues

Stack trace:

io.trino.testing.QueryFailedException: Invalid schema: multiple fields for name partition.part_trunc: 1000 and 1001

	at io.trino.testing.AbstractTestingTrinoClient.execute(AbstractTestingTrinoClient.java:138)
	at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:587)
	at io.trino.testing.DistributedQueryRunner.execute(DistributedQueryRunner.java:570)
	at io.trino.sql.query.QueryAssertions$QueryAssert.lambda$new$1(QueryAssertions.java:317)
	at com.google.common.base.Suppliers$NonSerializableMemoizingSupplier.get(Suppliers.java:201)
	at io.trino.sql.query.QueryAssertions$QueryAssert.result(QueryAssertions.java:436)
	at io.trino.sql.query.QueryAssertions$QueryAssert.matches(QueryAssertions.java:357)
	at io.trino.plugin.iceberg.BaseIcebergSystemTables.testFilesPartitionEvolutionUsingTruncateOnSameColumn(BaseIcebergSystemTables.java:574)
	at java.base/java.lang.reflect.Method.invoke(Method.java:565)
	at java.base/java.util.concurrent.ForkJoinTask.doExec$$$capture(ForkJoinTask.java:511)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1450)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:2019)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:187)
	Suppressed: java.lang.Exception: SQL: SELECT partition FROM "test_files_tableu9g13hdwhk$files"
		at io.trino.testing.DistributedQueryRunner.executeInternal(DistributedQueryRunner.java:594)
		... 12 more
Caused by: org.apache.iceberg.exceptions.ValidationException: Invalid schema: multiple fields for name partition.part_trunc: 1000 and 1001
	at org.apache.iceberg.exceptions.ValidationException.check(ValidationException.java:49)
	at org.apache.iceberg.types.IndexByName.addField(IndexByName.java:200)
	at org.apache.iceberg.types.IndexByName.field(IndexByName.java:161)
	at org.apache.iceberg.types.IndexByName.field(IndexByName.java:34)
	at org.apache.iceberg.types.TypeUtil.visit(TypeUtil.java:663)
	at org.apache.iceberg.types.TypeUtil.visit(TypeUtil.java:659)
	at org.apache.iceberg.types.TypeUtil.indexNameById(TypeUtil.java:173)
	at org.apache.iceberg.Schema.lazyIdToName(Schema.java:225)
	at org.apache.iceberg.Schema.<init>(Schema.java:154)
	at org.apache.iceberg.Schema.<init>(Schema.java:111)
	at org.apache.iceberg.Schema.<init>(Schema.java:99)
	at org.apache.iceberg.Schema.<init>(Schema.java:95)
	at org.apache.iceberg.BaseFilesTable.schema(BaseFilesTable.java:49)
	at org.apache.iceberg.FilesTable.schema(FilesTable.java:24)
	at io.trino.plugin.iceberg.system.FilesTable.splitSource(FilesTable.java:141)
	at io.trino.plugin.base.classloader.ClassLoaderSafeSystemTable.splitSource(ClassLoaderSafeSystemTable.java:123)
	at io.trino.connector.system.SystemSplitManager.getSplits(SystemSplitManager.java:74)
	at io.trino.split.SplitManager.getSplits(SplitManager.java:89)
	at io.trino.sql.planner.SplitSourceFactory$Visitor.createSplitSource(SplitSourceFactory.java:191)
	at io.trino.sql.planner.SplitSourceFactory$Visitor.visitTableScan(SplitSourceFactory.java:158)
	at io.trino.sql.planner.SplitSourceFactory$Visitor.visitTableScan(SplitSourceFactory.java:132)
	at io.trino.sql.planner.plan.TableScanNode.accept(TableScanNode.java:219)
	at io.trino.sql.planner.SplitSourceFactory$Visitor.visitOutput(SplitSourceFactory.java:368)
	at io.trino.sql.planner.SplitSourceFactory$Visitor.visitOutput(SplitSourceFactory.java:132)
	at io.trino.sql.planner.plan.OutputNode.accept(OutputNode.java:82)
	at io.trino.sql.planner.SplitSourceFactory.createSplitSources(SplitSourceFactory.java:112)
	at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.createStageScheduler(PipelinedQueryScheduler.java:1075)
	at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.create(PipelinedQueryScheduler.java:949)
	at io.trino.execution.scheduler.PipelinedQueryScheduler.createDistributedStagesScheduler(PipelinedQueryScheduler.java:328)
	at io.trino.execution.scheduler.PipelinedQueryScheduler.start(PipelinedQueryScheduler.java:311)
	at io.trino.execution.SqlQueryExecution.start(SqlQueryExecution.java:441)
	at io.trino.execution.SqlQueryManager.createQuery(SqlQueryManager.java:284)
	at io.trino.dispatcher.LocalDispatchQuery.startExecution(LocalDispatchQuery.java:150)
	at io.trino.dispatcher.LocalDispatchQuery.lambda$waitForMinimumWorkers$1(LocalDispatchQuery.java:134)
	at io.airlift.concurrent.MoreFutures.lambda$addSuccessCallback$0(MoreFutures.java:570)
	at io.airlift.concurrent.MoreFutures$3.onSuccess(MoreFutures.java:545)
	at com.google.common.util.concurrent.Futures$CallbackListener.run(Futures.java:1132)
	at io.trino.$gen.Trino_testversion____20251120_073657_1.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1090)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:614)
	at java.base/java.lang.Thread.run(Thread.java:1474)
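The failure above comes from Iceberg's by-name schema index rejecting two partition fields that derive the same name. A minimal standalone sketch (not Iceberg code; class and method names here are hypothetical) of why two historical specs that both truncate the same column produce this exact message:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: each historical partition spec contributes a partition field to the
// $files "partition" struct. After evolving truncate(part, N) on the same
// column, both fields derive the name "part_trunc", and a by-name index must
// reject the duplicate, producing the message seen in the stack trace.
public class PartitionNameIndex
{
    private final Map<String, Integer> nameToId = new HashMap<>();

    // Returns null on success, or a validation message when the name is taken
    public String addField(String name, int fieldId)
    {
        Integer existing = nameToId.putIfAbsent(name, fieldId);
        if (existing != null) {
            return "Invalid schema: multiple fields for name partition." + name
                    + ": " + existing + " and " + fieldId;
        }
        return null;
    }

    public static void main(String[] args)
    {
        PartitionNameIndex index = new PartitionNameIndex();
        index.addField("part_trunc", 1000);                 // field from the old spec
        String error = index.addField("part_trunc", 1001);  // field from the new spec
        System.out.println(error);
    }
}
```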

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(X) Release notes are required, with the following suggested text:

## Iceberg
* Fix failure when querying the `$files` table after partition evolution using `truncate` or `bucket` on the same column. ({issue}`issuenumber`)

@cla-bot cla-bot bot added the cla-signed label Nov 20, 2025
@github-actions github-actions bot added the iceberg Iceberg connector label Nov 20, 2025
@krvikash krvikash force-pushed the krvikash/fix-iceberg-file-system-table branch 2 times, most recently from e7364d6 to 95ab895 on November 20, 2025 at 10:49
  String column = fromIdentifierToColumn(match.group(1));
- builder.bucket(column, parseInt(match.group(2)), column + "_bucket" + suffix);
+ int numBuckets = parseInt(match.group(2));
+ builder.bucket(column, numBuckets, column + "_bucket_" + numBuckets + suffix);
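A standalone sketch of the naming change in this diff (the regex, helper name, and inputs below are assumptions for illustration, not the actual Trino code): including the transform parameter in the derived field name keeps two bucket transforms on the same column distinct.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BucketNaming
{
    // Hypothetical helper mirroring the diff: parse "bucket(<col>, <n>)" and
    // include the bucket count in the generated partition-field name, so
    // bucket(a, 4) and bucket(a, 8) on the same column no longer collide.
    static String bucketFieldName(String transform, String suffix)
    {
        Matcher match = Pattern.compile("bucket\\((\\w+), (\\d+)\\)").matcher(transform);
        if (!match.matches()) {
            throw new IllegalArgumentException("not a bucket transform: " + transform);
        }
        String column = match.group(1);
        int numBuckets = Integer.parseInt(match.group(2));
        return column + "_bucket_" + numBuckets + suffix;
    }

    public static void main(String[] args)
    {
        System.out.println(bucketFieldName("bucket(a, 4)", "")); // a_bucket_4
        System.out.println(bucketFieldName("bucket(a, 8)", "")); // a_bucket_8
    }
}
```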
@ebyhr ebyhr (Member) commented Nov 21, 2025
This change causes a behavior change even without partition evolution.

CREATE TABLE test(a varchar) WITH (partitioning = ARRAY['truncate(a, 1)']);
INSERT INTO test VALUES 'abc';
SELECT "$partition" FROM test;

The final SELECT returned a_trunc=a before this PR; now it returns a_trunc_1=a. Is it possible to keep the original behavior as much as possible? If I remember correctly, Spark doesn't append the number unless partition evolution happens.
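One possible shape for the compromise the review asks for can be sketched as follows (hypothetical helper, not the actual Trino or Spark logic): keep the plain `<col>_<transform>` name when it is unique, and append the transform parameter only when the plain name would collide, i.e. only when evolution reused the same column.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of conditional suffixing: the first field on a column keeps the
// historical plain name (a_trunc), and only a later colliding field from an
// evolved spec gets the parameter appended (a_trunc_2).
public class PartitionFieldNames
{
    private final Set<String> used = new HashSet<>();

    public String name(String column, String transform, int parameter)
    {
        String plain = column + "_" + transform;
        // Set.add returns false if the plain name was already taken
        String candidate = used.add(plain) ? plain : plain + "_" + parameter;
        used.add(candidate);
        return candidate;
    }

    public static void main(String[] args)
    {
        PartitionFieldNames names = new PartitionFieldNames();
        System.out.println(names.name("a", "trunc", 1)); // a_trunc (no collision)
        System.out.println(names.name("a", "trunc", 2)); // a_trunc_2 (after evolution)
    }
}
```

This preserves the pre-PR `$partition` output for tables that were never evolved, at the cost of name stability depending on spec order.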

@krvikash krvikash force-pushed the krvikash/fix-iceberg-file-system-table branch from 95ab895 to 180a6a9 on November 21, 2025 at 13:41
@krvikash krvikash force-pushed the krvikash/fix-iceberg-file-system-table branch from 180a6a9 to 0607069 on November 21, 2025 at 17:08

Labels

cla-signed iceberg Iceberg connector

Development

Successfully merging this pull request may close these issues.

Invalid schema error when querying $files table after updating Iceberg partition spec

2 participants