enhance: [2.6] Reduce memory allocations and copies in data loading #47088
base: 2.6
- JsonFlatIndex: reuse scratch buffer for nested JSON serialization
- ChunkWriter: cache parsed Json objects to avoid double parsing
- StringChunkWriter: reuse string_view vector across writes

Signed-off-by: Buqian Zheng <[email protected]>
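The buffer-reuse idea in the commit above can be illustrated with a minimal sketch. `ViewCollector` and its methods are hypothetical names made up for this example and are not the actual StringChunkWriter code; the only assumption is that the container lives across calls, so `clear()` drops the elements while the allocated capacity is kept.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical writer (not Milvus code): keeps one vector of string_view
// alive across calls instead of constructing a fresh vector per write.
class ViewCollector {
 public:
    // Collects views over `rows` and returns the total payload size in bytes.
    // The stored views are valid only while `rows` is alive and unmodified.
    std::size_t
    Write(const std::vector<std::string>& rows) {
        views_.clear();               // size becomes 0, capacity is kept
        views_.reserve(rows.size());  // no reallocation if capacity suffices
        std::size_t total = 0;
        for (const auto& row : rows) {
            views_.emplace_back(row);
            total += row.size();
        }
        assert(views_.size() == rows.size());
        return total;
    }

    std::size_t
    Count() const {
        return views_.size();
    }

    std::size_t
    Capacity() const {
        return views_.capacity();
    }

 private:
    std::vector<std::string_view> views_;  // reused scratch storage
};
```

After a large first write, later smaller writes through the same `ViewCollector` reuse the existing allocation, which is the effect the commit describes for repeated writes.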
- json_stats/utils: CreateArrowBuilders, CreateArrowSchema, CreateParquetKVMetadata
- FunctionFactory: RegisterFilterFunction
- IndexMeta: loop variables in ToString and constructor

Signed-off-by: Buqian Zheng <[email protected]>
- Add reserve() calls before vector population loops
- Use std::move() for rvalue parameters and return values
- Move buffer allocations outside hot loops (SubSearchResult)

Affected files: FieldData, ArrayOffsets, Schema, Slice, IndexStats, JsonContainsExpr, RescoresNode, PlanProto, SubSearchResult, bson_inverted, parquet_writer, segment_c

Signed-off-by: Buqian Zheng <[email protected]>
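For readers unfamiliar with the two patterns named above, here is a minimal sketch of `reserve()` before a population loop and `std::move()` on an rvalue-reference parameter. `MakeFieldNames` and `DemoSchema` are hypothetical names invented for illustration and do not correspond to code in the affected files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds the names "f0" .. "f{n-1}".
std::vector<std::string>
MakeFieldNames(std::size_t n) {
    std::vector<std::string> names;
    names.reserve(n);  // one allocation up front instead of repeated regrowth
    assert(names.capacity() >= n);
    for (std::size_t i = 0; i < n; ++i) {
        names.emplace_back("f" + std::to_string(i));
    }
    // A local returned by value is already moved or elided, so no
    // std::move is written on this return statement.
    return names;
}

// Hypothetical sink that takes its argument by rvalue reference.
class DemoSchema {
 public:
    void
    SetFieldNames(std::vector<std::string>&& names) {
        // Inside the function `names` is an lvalue; without std::move the
        // assignment below would copy every string.
        names_ = std::move(names);
    }

    const std::vector<std::string>&
    FieldNames() const {
        return names_;
    }

 private:
    std::vector<std::string> names_;
};
```

A caller would write `schema.SetFieldNames(MakeFieldNames(3));` or pass an existing vector with `std::move(v)`; in both cases the vector's buffer is handed over rather than duplicated.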
- ReduceUtils: assign directly to protobuf field without temporary variable
- RemoteInputStream: move fsync() outside loop for better I/O efficiency

Signed-off-by: Buqian Zheng <[email protected]>
- Call FillFieldData(arrow::StringArray) directly in FieldData.cpp
- Fix null handling in FillFieldData by using GetView() instead of iterator

Signed-off-by: Buqian Zheng <[email protected]>
Replace dynamic allocation (new uint8_t[]) with stack-allocated arrays to prevent memory leaks when test assertions fail before delete[].

Signed-off-by: Buqian Zheng <[email protected]>
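The reasoning behind this change can be sketched as follows. With a raw `new uint8_t[]`, any early exit between the allocation and the `delete[]` (for example a fatal test assertion that returns from the test body) leaks the buffer, whereas an array with automatic storage is released on every exit path. `FillPattern`, `SumBytes`, and `PatternChecksum` are hypothetical names for this sketch, not functions from the PR.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical function under test: writes seed, seed+1, ... into `out`.
void
FillPattern(std::uint8_t* out, std::size_t len, std::uint8_t seed) {
    assert(out != nullptr || len == 0);
    for (std::size_t i = 0; i < len; ++i) {
        out[i] = static_cast<std::uint8_t>(seed + i);
    }
}

// Sums the bytes of a buffer.
std::uint32_t
SumBytes(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < len; ++i) {
        sum += data[i];
    }
    return sum;
}

// Test-style body: the buffer lives on the stack, so there is no delete[]
// that an early return could skip.
std::uint32_t
PatternChecksum() {
    std::array<std::uint8_t, 16> buf{};  // zero-initialized, automatic storage
    FillPattern(buf.data(), buf.size(), 1);
    return SumBytes(buf.data(), buf.size());
}
```

When the size is not a compile-time constant, `std::vector<uint8_t>` or `std::unique_ptr<uint8_t[]>` gives the same leak-free behavior at the cost of a heap allocation.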
Signed-off-by: Buqian Zheng <[email protected]>
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: zhengbuqian
@zhengbuqian go-sdk check failed, comment
Codecov Report

@@             Coverage Diff              @@
##              2.6    #47088       +/-   ##
===========================================
+ Coverage   42.57%    76.76%   +34.19%
===========================================
  Files          12      1910     +1898
  Lines        1893    298852   +296959
===========================================
+ Hits          806    229427   +228621
- Misses       1035     62022    +60987
- Partials       52      7403     +7351
issue: #44452
pr: #46847