This PR honors the kPreferredOutputBatchRows config. #7051
Currently there is a constraint that the output produced from a single input row must go into a single batch.
But when an input row contains a very large nested array+struct, the resulting output batch is also very large.
So we need to respect kPreferredOutputBatchRows strictly.
There are several strategies:
1. Split the row only if the output of a single row exceeds maxOutputBatchSize.
2. Always split the last row to match the output batch size.
I would prefer the second way, since it leads to accurate batch sizes.
We could add a benchmark to measure the performance impact of always splitting the last row.
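To make the difference between the two strategies concrete, here is a minimal Python sketch that computes the resulting batch sizes only. It is not the Velox implementation: the function names, the `row_output_counts` input (the number of output rows each input row expands to), and the handling of the remainder of an oversized row are all assumptions made for illustration.

```python
from typing import List


def split_only_oversized(
    row_output_counts: List[int], preferred_rows: int, max_rows: int
) -> List[int]:
    """Split-only-when-oversized strategy (hypothetical helper).

    The output of an input row stays in one batch unless that row alone
    produces more than max_rows output rows.
    """
    if preferred_rows <= 0 or max_rows <= 0:
        raise ValueError("preferred_rows and max_rows must be positive")
    batches: List[int] = []
    current = 0
    for count in row_output_counts:
        if count > max_rows:
            # Flush whatever is pending, then cut the big row into chunks.
            if current > 0:
                batches.append(current)
                current = 0
            full, rest = divmod(count, max_rows)
            batches.extend([max_rows] * full)
            if rest > 0:
                batches.append(rest)
        else:
            current += count
            if current >= preferred_rows:
                batches.append(current)
                current = 0
    if current > 0:
        batches.append(current)
    return batches


def split_last_row(row_output_counts: List[int], preferred_rows: int) -> List[int]:
    """Always-split-the-last-row strategy (hypothetical helper).

    Every batch is cut at exactly preferred_rows, even if that means
    splitting the output of one input row across batches.
    """
    if preferred_rows <= 0:
        raise ValueError("preferred_rows must be positive")
    batches: List[int] = []
    current = 0
    for count in row_output_counts:
        remaining = count
        while remaining > 0:
            take = min(remaining, preferred_rows - current)
            current += take
            remaining -= take
            if current == preferred_rows:
                batches.append(current)
                current = 0
    if current > 0:
        batches.append(current)
    return batches
```

For input rows that expand to 3, 10 and 2 output rows, with a preferred size of 4 and a maximum of 8, the first function yields batches of 3, 8, 2, 2 rows, while the second yields 4, 4, 4, 3. Only the final batch of the second variant can be smaller than the preferred size.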
It would be nice to figure out how to produce batches of a specified "size in bytes", rather than "number of rows". This doesn't have to happen right away, but it is something to keep in mind.
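One possible way to approach a byte-based limit is to turn a byte budget into a row count using an estimate of the average output row size. The Python sketch below is only an illustration of that idea: `rows_for_byte_budget`, `target_bytes`, `avg_row_bytes` and `max_rows` are hypothetical names, not existing Velox APIs or configs.

```python
def rows_for_byte_budget(target_bytes: int, avg_row_bytes: float, max_rows: int) -> int:
    """Pick a batch row count from a byte budget (hypothetical helper).

    avg_row_bytes is assumed to be an estimate supplied by the caller,
    for example measured from previously produced batches.
    """
    if avg_row_bytes <= 0:
        # No usable estimate yet: fall back to the row-based limit.
        return max_rows
    rows = int(target_bytes // avg_row_bytes)
    # Always emit at least one row, and never exceed the row-based limit.
    return max(1, min(max_rows, rows))
```

With a 1 MB budget and rows averaging 50 KB, this gives 20 rows per batch. With very wide rows that exceed the whole budget, it still returns 1 so that progress is made.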
#7051 (comment)