
Conversation

abhishekrb19 (Contributor) commented Dec 16, 2025

When the serialized size of a single column exceeds 2 GB (the smoosher's maximum), for example because that column holds large blobs, the ingestion task fails with the following stack trace, which does not indicate the problematic column:

java.lang.RuntimeException: org.apache.druid.java.util.common.IAE: Asked to add buffers[2,171,458,617] larger than configured max[2,147,483,647]
	at org.apache.druid.segment.realtime.appenderator.StreamAppenderator.mergeAndPush(StreamAppenderator.java:1015) ~[druid-server-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.realtime.appenderator.StreamAppenderator.lambda$push$1(StreamAppenderator.java:826) ~[druid-server-32.0.1.jar:32.0.1]
	at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.doTransform(AbstractTransformFuture.java:252) ~[guava-32.0.1-jre.jar:?]
	at com.google.common.util.concurrent.AbstractTransformFuture$TransformFuture.doTransform(AbstractTransformFuture.java:242) ~[guava-32.0.1-jre.jar:?]
	at com.google.common.util.concurrent.AbstractTransformFuture.run(AbstractTransformFuture.java:123) [guava-32.0.1-jre.jar:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.base/java.lang.Thread.run(Thread.java:840) [?:?]
Caused by: org.apache.druid.java.util.common.IAE: Asked to add buffers[2,171,458,617] larger than configured max[2,147,483,647]
	at org.apache.druid.java.util.common.io.smoosh.FileSmoosher.addWithSmooshedWriter(FileSmoosher.java:176) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.makeColumn(IndexMergerV9.java:788) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.makeIndexFiles(IndexMergerV9.java:291) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.merge(IndexMergerV9.java:1359) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.multiphaseMerge(IndexMergerV9.java:1177) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.IndexMergerV9.mergeQueryableIndex(IndexMergerV9.java:1119) ~[druid-processing-32.0.1.jar:32.0.1]
	at org.apache.druid.segment.realtime.appenderator.StreamAppenderator.mergeAndPush(StreamAppenderator.java:957) ~[druid-server-32.0.1.jar:32.0.1]
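For context (an assumption based on how Druid's smoosh files generally work, not something stated in this PR): smooshed chunks are read back through int-indexed ByteBuffers, which is why the configured maximum in the error is Integer.MAX_VALUE bytes. A tiny self-contained sketch of the arithmetic behind the failure above; the class and constant names are illustrative only:

public class SmooshLimit
{
  // 2,147,483,647 bytes: the largest offset an int-indexed buffer can address.
  static final long MAX_SMOOSH_CHUNK_BYTES = Integer.MAX_VALUE;

  public static void main(String[] args)
  {
    long columnBytes = 2_171_458_617L; // serialized column size from the stack trace above
    System.out.printf(
        "column size %,d bytes exceeds limit %,d by %,d bytes%n",
        columnBytes,
        MAX_SMOOSH_CHUNK_BYTES,
        columnBytes - MAX_SMOOSH_CHUNK_BYTES
    );
  }
}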

With this patch, the DruidException message includes the column name along with a few remediation suggestions as follows:

Serialized buffer size[10] for column[foo] exceeds the maximum[5]. Consider adjusting the tuningConfig - for example, reduce maxRowsPerSegment, or partition your data further.
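To illustrate the shape of the check, here is a minimal, self-contained sketch of the idea rather than the actual Druid change: the method name checkColumnFits and the plain RuntimeException are illustrative (the real patch throws a DruidException), but the message format matches the example above.

public class ColumnSizeCheck
{
  // Sketch only: fails fast with the column name and a remediation hint when a
  // column's serialized size exceeds the smoosher's per-chunk limit.
  public static void checkColumnFits(String columnName, long serializedSize, long maxChunkSize)
  {
    if (serializedSize > maxChunkSize) {
      throw new RuntimeException(
          String.format(
              "Serialized buffer size[%,d] for column[%s] exceeds the maximum[%,d]. "
              + "Consider adjusting the tuningConfig - for example, reduce maxRowsPerSegment, "
              + "or partition your data further.",
              serializedSize,
              columnName,
              maxChunkSize
          )
      );
    }
  }

  public static void main(String[] args)
  {
    try {
      checkColumnFits("foo", 10, 5);
    }
    catch (RuntimeException e) {
      System.out.println(e.getMessage()); // prints the example message above
    }
  }
}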

This PR has:

  • been self-reviewed.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.

- Include column name and a suggestion on how to remediate
