[perf][cp] Optimize deserialize UnsafeRows to RowVector#385
Open
guhaiyan0221 wants to merge 1 commit intobytedance:mainfrom
Open
[perf][cp] Optimize deserialize UnsafeRows to RowVector#385guhaiyan0221 wants to merge 1 commit intobytedance:mainfrom
guhaiyan0221 wants to merge 1 commit intobytedance:mainfrom
Conversation
Summary: The old API extracts the single data to std::string_view, which is not efficient, This new approach aligns with CompactRow deserialization, the benchmark shows it accelerates 0% to 5x times depending on the row type. For backward compatibility, old UnsafeRowDeserializer API calls the new API. The benchmark is not stable, the benchmark result is different depending on the machine status. ``` /mnt/DP_disk1/code/velox/build/velox/row/benchmarks# ./velox_unsafe_row_serialize_benchmark ============================================================================ [...]marks/UnsafeRowSerializeBenchmark.cpp relative time/iter iters/s ============================================================================ unsafe_deserialize_fixedWidth5 112.69us 8.87K unsafe_deserialize_fast_fixedWidth5 488.34% 23.08us 43.33K unsafe_deserialize_fixedWidth10 124.26us 8.05K unsafe_deserialize_fast_fixedWidth10 175.44% 70.83us 14.12K unsafe_deserialize_fixedWidth20 182.74us 5.47K unsafe_deserialize_fast_fixedWidth20 99.864% 182.98us 5.46K decimalsSerialize 49.70us 20.12K decimalsDeserialize 70.44us 14.20K unsafe_deserialize_strings1 61.31us 16.31K unsafe_deserialize_fast_strings1 180.84% 33.90us 29.50K unsafe_deserialize_strings5 114.08us 8.77K unsafe_deserialize_fast_strings5 113.19% 100.78us 9.92K unsafe_deserialize_arrays 126.41us 7.91K unsafe_deserialize_fast_arrays 373.10% 33.88us 29.51K unsafe_deserialize_nestedArrays 350.44us 2.85K unsafe_deserialize_fast_nestedArrays 259.99% 134.79us 7.42K unsafe_deserialize_maps 536.18us 1.87K unsafe_deserialize_fast_maps 500.97% 107.03us 9.34K unsafe_deserialize_structs 87.50us 11.43K unsafe_deserialize_fast_structs 147.26% 59.42us 16.83K ``` Corresponding PR: facebookincubator/velox#11936 But 11936 is reverted by 12978, which has been reverted introduces a core dump when the schema is `Row(MAP(INTEGER(), ROW({ARRAY(INTEGER()), INTEGER()})))` and the nested row is null, we should check the null buffer first, and then slice the ARRAY's buffer. Corresponding PR: facebookincubator/velox#12841
There was a problem hiding this comment.
Pull request overview
This PR updates the UnsafeRow deserialization path to avoid extracting rows as std::string_view and instead pass raw row pointers into a new UnsafeRowFast::deserialize implementation, aiming to improve performance and fix a reported crash in certain nested-null schemas.
Changes:
- Switch UnsafeRow vector serde and benchmarks to deserialize via
UnsafeRowFast::deserialize(std::vector<char*>). - Introduce
UnsafeRowFast::deserialize(...)andserializedRowSizes(...)APIs and implement a new multi-row deserializer inUnsafeRowFast.cpp. - Update/replace UnsafeRow tests and adjust build targets accordingly.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| bolt/serializers/UnsafeRowSerializer.cpp | Uses RowDeserializer<char*> and UnsafeRowFast::deserialize for vector serde deserialization. |
| bolt/serializers/RowSerializer.h | Extends RowDeserializer to emit either std::string_view or raw pointers. |
| bolt/row/tests/UnsafeRowTest.cpp | Reworks fuzz/roundtrip tests and adds a nested map/row regression test case. |
| bolt/row/tests/CMakeLists.txt | Renames the UnsafeRow test source in the test target. |
| bolt/row/benchmark/UnsafeRowSerializeBenchmark.cpp | Updates benchmark to deserialize using UnsafeRowFast. |
| bolt/row/benchmark/DynamicRowVectorDeserializeBenchmark.cpp | Removes an older benchmark. |
| bolt/row/UnsafeRowFast.h | Adds new APIs for batch size computation and batch deserialization. |
| bolt/row/UnsafeRowFast.cpp | Implements new batch deserializer logic for UnsafeRow into vectors. |
| bolt/row/UnsafeRowDeserializers.h | Replaces legacy implementation with a compatibility wrapper over UnsafeRowFast. |
| bolt/row/CompactRow.cpp | Minor shared typedef alignment (TRowSize). |
Comments suppressed due to low confidence (2)
bolt/row/tests/UnsafeRowTest.cpp:131
- In testRoundTrip(),
serialized.push_back(rawBuffer + offset)relies on implicit conversion fromchar*toT. If T isstd::optional<std::string_view>, this will use thestring_view(const char*)ctor and compute length viastrlen, which is incorrect for binary UnsafeRow data (and can read past the buffer). Construct the string_view with an explicit length (size) when testing the legacyUnsafeRowDeserializerpath.
bolt/row/tests/UnsafeRowTest.cpp:81 - doFuzzTest() currently generates randomized input vectors but never exercises serialization/deserialization or asserts equality (e.g., via testRoundTrip(inputVector)). As written, TEST_F(UnsafeRowTest, fast) can pass without validating anything, which weakens coverage for the new deserializer logic.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
178
to
+183
| serializedBuffers.emplace_back(std::move(serializedBuffer)); | ||
| serializedRows.push_back(std::string_view( | ||
| serializedBuffers.back().data(), serializedBuffers.back().size())); | ||
| if constexpr (std::is_same_v<SerializeView, std::string_view>) { | ||
| serializedRows.push_back(std::string_view( | ||
| serializedBuffers.back().data(), serializedBuffers.back().size())); | ||
| } else { | ||
| serializedRows.push_back(serializedBuffers.back().data()); |
Comment on lines
+1073
to
+1076
| auto* serializedNulls = readNulls(data[row] + offsets[row]); | ||
| const auto isNull = | ||
| (rawNulls != nullptr && bits::isBitNull(rawNulls, row)) || | ||
| bits::isBitSet(serializedNulls, i); |
Comment on lines
+67
to
+72
| std::vector<char*> dataPtrs; | ||
| dataPtrs.reserve(data.size()); | ||
| for (const auto& row : data) { | ||
| const char* ptr = row.value().data(); | ||
| dataPtrs.emplace_back(const_cast<char*>(ptr)); | ||
| } |
Comment on lines
+69
to
74
| for (const auto& row : data) { | ||
| const char* ptr = row.value().data(); | ||
| dataPtrs.emplace_back(const_cast<char*>(ptr)); | ||
| } | ||
| return UnsafeRowFast::deserialize(dataPtrs, asRowType(type), pool); | ||
| } |
Comment on lines
72
to
+73
| *result = std::dynamic_pointer_cast<RowVector>( | ||
| bolt::row::UnsafeRowDeserializer::deserialize( | ||
| serializedRows, type, pool)); | ||
| bolt::row::UnsafeRowFast::deserialize(serializedRows, type, pool)); |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What problem does this PR solve?
Issue Number: close #191
Type of Change
Description
Summary:
The old API extracts the single data to std::string_view, which is not efficient, This new approach aligns with CompactRow deserialization, the benchmark shows it accelerates 0% to 5x times depending on the row type. For backward compatibility, old UnsafeRowDeserializer API calls the new API. The benchmark is not stable, the benchmark result is different depending on the machine status.
Corresponding PR: facebookincubator/velox#11936
But 11936 is reverted by 12978, which has been reverted introduces a core dump when the schema is
Row(MAP(INTEGER(), ROW({ARRAY(INTEGER()), INTEGER()})))and the nested row is null, we should check the null buffer first, and then slice the ARRAY's buffer.Corresponding PR: facebookincubator/velox#12841
Performance Impact
Release Note
Please describe the changes in this PR
Release Note:
Checklist (For Author)
Breaking Changes
No
Yes (Description: ...)
Click to view Breaking Changes