Table scan reports OOM while system memory utilization is only 64% and 22G of memory is free
org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Error during calling Java code from native code: org.apache.gluten.exception.GlutenException: org.apache.gluten.exception.GlutenException: Exception: VeloxRuntimeError
Error Source: RUNTIME
Error Code: INVALID_STATE
Reason: Operator::addInput failed for [operator: PartialAggregation, plan node ID: 2]: Error during calling Java code from native code: org.apache.gluten.memory.memtarget.ThrowOnOomMemoryTarget$OutOfMemoryException: Not enough spark off-heap execution memory. Acquired: 8.0 MiB, granted: 2.1 MiB. Try tweaking config option spark.memory.offHeap.size to get larger space to run this application (if spark.gluten.memory.dynamic.offHeap.sizing.enabled is not enabled).
Current config settings:
spark.memory.offHeap.enabled=True
spark.gluten.memory.dynamic.offHeap.sizing.enabled=N/A
spark.gluten.memory.offHeap.size.in.bytes=17.2 GiB
spark.gluten.memory.task.offHeap.size.in.bytes=1098.1 MiB
spark.gluten.memory.conservative.task.offHeap.size.in.bytes=549.1 MiB
Memory consumer stats:
Task.96: Current used bytes: 1096.0 MiB, peak bytes: N/A
\- Gluten.Tree.95: Current used bytes: 1096.0 MiB, peak bytes: 1098.1 MiB
\- Capacity[8.0 EiB].94: Current used bytes: 1096.0 MiB, peak bytes: 1098.1 MiB
+- NativePlanEvaluator-93.0: Current used bytes: 1096.0 MiB, peak bytes: 1098.1 MiB
| \- single: Current used bytes: 1096.0 MiB, peak bytes: 1096.0 MiB
| +- root: Current used bytes: 1088.0 MiB, peak bytes: 1089.0 MiB
| | +- task.Gluten_Stage_3_TID_96_VTID_93: Current used bytes: 1088.0 MiB, peak bytes: 1089.0 MiB
| | | +- node.0: Current used bytes: 1087.9 MiB, peak bytes: 1088.0 MiB
| | | | +- op.0.0.0.TableScan: Current used bytes: 1087.9 MiB, peak bytes: 1088.0 MiB
| | | | \- op.0.0.0.TableScan.test-hive: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | +- node.2: Current used bytes: 64.4 KiB, peak bytes: 1024.0 KiB
| | | | \- op.2.0.0.PartialAggregation: Current used bytes: 64.4 KiB, peak bytes: 64.4 KiB
| | | \- node.1: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | | \- op.1.0.0.FilterProject: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
+- ShuffleWriter.93: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B
| +- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- root: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B
+- VeloxBatchResizer.97.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B
+- IteratorMetrics.92: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- single: Current used bytes: 0.0 B, peak bytes: 0.0 B
| +- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
| | \- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- root: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B
+- NativePlanEvaluator-93.0.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B
+- ShuffleWriter.93.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B
+- IteratorMetrics.92.OverAcquire.0: Current used bytes: 0.0 B, peak bytes: 0.0 B
\- VeloxBatchResizer.97: Current used bytes: 0.0 B, peak bytes: 0.0 B
\- single: Current used bytes: 0.0 B, peak bytes: 0.0 B
+- root: Current used bytes: 0.0 B, peak bytes: 0.0 B
| \- default_leaf: Current used bytes: 0.0 B, peak bytes: 0.0 B
\- gluten::MemoryAllocator: Current used bytes: 0.0 B, peak bytes: 0.0 B
\- default: Current used bytes: 0.0 B, peak bytes: 0.0 B
Backend
VL (Velox)
Bug description
It's a very simple query:
The table scan reports OOM even though system memory utilization is only 64% and 22G of memory is still free.
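The numbers in the error output may explain why the failure occurs despite free system memory: the limit that was hit appears to be the per-task off-heap quota, not the machine's RAM. The sketch below redoes the arithmetic using only values copied from the log. The variable names are illustrative, and the "implied task slots" figure rests on the assumption that the per-task quota is the total off-heap size divided evenly among task slots.

```python
# Values copied from the error output (rounded as printed there).
MIB_PER_GIB = 1024

total_off_heap_mib = 17.2 * MIB_PER_GIB  # spark.gluten.memory.offHeap.size.in.bytes
task_quota_mib = 1098.1                  # spark.gluten.memory.task.offHeap.size.in.bytes
task_used_mib = 1096.0                   # Task.96 "Current used bytes"
requested_mib = 8.0                      # "Acquired: 8.0 MiB"
granted_mib = 2.1                        # "granted: 2.1 MiB"

# Assumption: the per-task quota is the total off-heap size split evenly
# across task slots, so the ratio hints at the number of slots.
implied_slots = round(total_off_heap_mib / task_quota_mib)

# Headroom left in this task's quota when the 8 MiB request arrived.
headroom_mib = task_quota_mib - task_used_mib

# The request can only be satisfied if it fits in the remaining headroom.
fits = requested_mib <= headroom_mib

print(f"implied task slots: {implied_slots}")
print(f"task headroom: {headroom_mib:.1f} MiB (granted: {granted_mib} MiB)")
print(f"request of {requested_mib} MiB fits: {fits}")
```

Under these assumptions the task had about 2.1 MiB left of its 1098.1 MiB quota, which matches the granted amount, and nearly all of the quota (1087.9 MiB) was held by the TableScan operator. Free memory elsewhere on the machine would not count toward this per-task limit.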
Gluten version
No response
Spark version
None
Spark configurations
No response
System information
No response
Relevant logs