-
Notifications
You must be signed in to change notification settings - Fork 3
Description
Bug description
Background
Velox executes SQL queries as pipelines of operators. For this bug, we are concerned with TableScan which drives the reading of the data.
-
TableScanfetches data in batches by repeatedly calling into a connector-providedDataSource, and feeds the resulting rows into the rest of the query pipeline. -
DataSourceis the connector-side interface thatTableScancalls to actually read data. Each connector (Hive, CLP, etc.) implements this interface. The lifecycle is:
- TableScan calls addSplit(split) to assign a "split" to the
DataSource. - TableScan calls next(size, future) repeatedly to fetch rows in batches of 1024 rows for a given
DataSource. - next returns a RowVector for each batch that contains matching rows.
- next returns nullptr to signal that the split is fully consumed, at which point TableScan moves on to the next split.
Note, A split must be fully drained by next before another split can be added.
Problem Statement
Bug 1: Premature split termination
ClpDataSource::next returns nullptr whenever the current batch has zero matching rows:
auto rowsScanned = cursor_->fetchNext(size);
auto rowsFiltered = cursor_->getNumFilteredRows();
if (rowsFiltered == 0) {
return nullptr;
}
Per the DataSource contract, nullptr signals that the split is fully consumed. TableScan then stops calling next and moves on to the next split. However, rowsFiltered == 0 only means the current batch had no matches — later batches in the same split may still contain matching rows.
Impact: This compromises the correctness of the returned result! Matching rows are silently dropped whenever a non-matching batch precedes them within the same split.
Example: An IR split has 2048 log events, batch size 1024. Events 1–1024 match nothing; events 1025–1100 has 1 match. The first next call scans batch 1, gets 0 filtered rows, returns nullptr. The matching row in batch 2 is never returned.
Bug 2: Inflated completedRows_
ClpIrCursor::fetchNext returns irDeserializer_->get_num_log_events_deserialized(), which is a cumulative count across all invocations, not a per-batch count. However, ClpDataSource::next adds this directly to completedRows_:
completedRows_ += rowsScanned; // rowsScanned is cumulative
This causes completedRows_ to grow quadratically.
Example: An IR split contains 3000 log events with a batch size of 1024. Every batch has matching rows.
| Call | fetchNext returns |
completedRows_ += |
completedRows_ |
Correct |
|---|---|---|---|---|
| 1 | 1024 | 1024 | 1024 | 1024 |
| 2 | 2048 | 2048 | 3072 | 2048 |
| 3 | 3000 | 3000 | 6072 | 3000 |
Impact: TableScan reads completedRows_ via getCompletedRows() and records it as
rawInputPositions in operator statistics. The inflated count causes rawInputPositions
to over-report the number of rows scanned (~2x in this example), making query performance
stats unreliable.
System information
Partial result is returned, incorrect functionality