Conversation
bf73a19 to
aded22e
Compare
czentgr
left a comment
There was a problem hiding this comment.
Looks pretty good. I suppose we need to see the subsequent changes to move them to the output buffer in the PartitionedOutput operator. Is this code available somewhere too?
velox/vector/PartitionedVector.h
Outdated
| BufferPtr beginPartitionOffsets; | ||
|
|
||
| /// Optional reusable buffer for in-place row swapping. | ||
| BufferPtr swappingBuffer; |
There was a problem hiding this comment.
This needs initialization. Given it is optional I expect that this member won't always be set and you'd run into compiler errors on newer compilers.
velox/vector/PartitionedVector.cpp
Outdated
| void initializeBeginPartitionOffsets( | ||
| BufferPtr& beginPartitionOffsets, | ||
| const BufferPtr& endPartitionOffsets, | ||
| int32_t numPartitions, |
There was a problem hiding this comment.
vector_size_t?
In the test I also see unit32_t being used for numPartitions. But isn't that always the same and as such should have the same type?
The PartitionedVector uses numPartitions uint32_t as storage but sets it from a const uint32_t. Why not use vector_size_t?
There was a problem hiding this comment.
vector_size_t?
Sure.
The PartitionedVector uses numPartitions uint32_t as storage but sets it from a const uint32_t. Why not use vector_size_t?
In my understanding, this is a history paradox. The PartitionFunction interface defines the partitions in uint32_t
virtual std::optional<uint32_t> partition(
const RowVector& input,
std::vector<uint32_t>& partitions) = 0;
But Velox is trying to use vector_size_t as size units and vector index everywhere. vector_size_t is defined as int32_t today. But I've seen places it overflows and may need to be int64_t instead. I think the right way is to unify the partition representation with Vector index, and both shall use vector_size_t. vector_size_t can be easily expanded or overriden in the future by defining using vector_size_t = int64_t; . But for now, I changed all numPartitions to vector_size_t. Another option is to make them all uint32_t. @czentgr Which one do you think makes more sense?
velox/vector/PartitionedVector.h
Outdated
| // TODO: This was copied from dwio::common::BufferUtil.h. However the vector | ||
| // module should not depend on dwio. Move this to a common place | ||
| template <typename T> | ||
| void ensureCapacity( |
There was a problem hiding this comment.
Should go into the cpp file and declared for use in the test.
Or as you suggest a new utility?
There was a problem hiding this comment.
It is a template function and the definition needs to be in the header file. IMHO it's not worthwhile to separate the declaration and definition for this very simple function. My original thought was to extract it from dwio::common and move it to common, but it's better to be done in a separate PR or commit. And to avoid future rebase conflicts I left it in this .h file because this file is new. But I just moved it to VectorUtil.h for now.
velox/vector/PartitionedVector.h
Outdated
| BufferPtr swappingBuffer; | ||
|
|
||
| /// Optional starting row offset (used when partitioning a subset of rows). | ||
| vector_size_t firstRow{0}; |
There was a problem hiding this comment.
I suppose for future use? This and the other member? Currently they are not used at all.
There was a problem hiding this comment.
Yes they are for complex types. I can remove them now but that makes it's unjustifiable to have a PartitionBuildContext. Do you prefer removing them for now?
velox/vector/PartitionedVector.cpp
Outdated
| std::memcpy( | ||
| &beginPartitionOffsets->asMutable<vector_size_t>()[1], | ||
| endPartitionOffsets->as<vector_size_t>(), | ||
| sizeof(uint32_t) * (numPartitions - 1)); |
There was a problem hiding this comment.
This should be sizeof(vector_size_t). In the next line it uses sizeof(vector_size_t). As mentioned before I don't know why we switch between the types. It could cause problems.
There was a problem hiding this comment.
Yes agree. I updated the unit of partition to uint32_t and unit of rows to vector_size_t
02e249e to
667d8b5
Compare
|
@czentgr @xin-zhang2 Thank you very much for reviewing this PR! I have addressed your comments and did the following improvements:
Your second review is much appreciated! |
velox/vector/PartitionedVector.h
Outdated
| velox::memory::MemoryPool* pool); | ||
|
|
||
| /// Allow move constructor and move assignment operator. | ||
| PartitionedVector(PartitionedVector&& other) = default; |
There was a problem hiding this comment.
The declarations of move constructor and move assigment can be placed below the deleted copy assignment.
velox/vector/PartitionedVector.h
Outdated
| velox::memory::MemoryPool* pool); | ||
|
|
||
| PartitionedVector( | ||
| VectorPtr vector, |
There was a problem hiding this comment.
vector can be passed as const reference.
| velox::memory::MemoryPool* pool) | ||
| : PartitionedVector(flatVector, numPartitions, partitionOffsets, pool) {} | ||
|
|
||
| void partition( |
There was a problem hiding this comment.
It might make more sense to declare partition in PartitionedVector and override it here.
There was a problem hiding this comment.
Sure. Previously it was not in base class because the arguments were not the same for different type of vectors. Since I wrapped them up in PartitionBuildContext in this version, we can make it virtual.
| } | ||
| } | ||
|
|
||
| inline void prefixSum(vector_size_t* offsets, uint32_t numPartitions) { |
There was a problem hiding this comment.
countPartitionSizes and prefixSum are always called together, so it might be better to combine them into a single function. We could also call ensureCapacity for endPartitionOffsets inside the function to prevent potential out-of-bound issues.
There was a problem hiding this comment.
@xin-zhang2 prefixSum is used separately for Dictionary vectors. So I think we can use it this way for now.
| // partition by repeatedly swapping elements until the current element belongs | ||
| // to the current partition | ||
| template <typename T> | ||
| void partitionFixedWidthValuesInPlace( |
There was a problem hiding this comment.
We can also include the specialization for bool in this PR.
There's already an implementation in my draft PR for the benchmark.
There was a problem hiding this comment.
yeah we can, but it will make this PR longer to merge. How about send a new PR with this one as the first commit, then rebase after this one is merged?
velox/vector/PartitionedVector.cpp
Outdated
|
|
||
| vector_size_t destinationAddr = destinationOffset >> 3; | ||
| int8_t destinationBitInByte = destinationOffset & 7; | ||
| vector_size_t fromAddr = offset / kBitsPerByte; |
There was a problem hiding this comment.
Is there any reason to use kBitsPerByte here? Since it's always 8, we can also consider using bit operations when computing fromAddr and fromBitInByte.
| case VectorEncoding::Simple::BIASED: | ||
| case VectorEncoding::Simple::SEQUENCE: | ||
| case VectorEncoding::Simple::MAP: | ||
| case VectorEncoding::Simple::LAZY: |
There was a problem hiding this comment.
CONSTANT can be included in the unsupported encodings.
velox/vector/PartitionedVector.cpp
Outdated
| case VectorEncoding::Simple::MAP: | ||
| case VectorEncoding::Simple::LAZY: | ||
| VELOX_UNSUPPORTED( | ||
| "Unsupported vector encoding for OptimizedPartitionedOutput: {}", |
There was a problem hiding this comment.
OptimizedPartitionedOutput can be modified to PartitionedVector.
velox/vector/PartitionedVector.cpp
Outdated
| mapSimpleToName(encoding)); | ||
| default: | ||
| VELOX_UNREACHABLE( | ||
| "Invalid vector encoding for OptimizedPartitionedOutput: {}", |
velox/vector/PartitionedVector.cpp
Outdated
| case VectorEncoding::Simple::FLAT: { | ||
| // Print the addresses of vector's values and nulls buffers for debugging | ||
| auto nulls = vector->rawNulls(); | ||
| auto values = vector->values()->as<char>(); |
There was a problem hiding this comment.
nulls and values are not used can be removed.
velox/vector/PartitionedVector.cpp
Outdated
| PartitionBuildContext& ctx) { | ||
| auto valuesBuffer = vector_->as<FlatVector<T>>()->values(); | ||
|
|
||
| Byte* rawNulls = (Byte*)vector_->rawNulls(); |
There was a problem hiding this comment.
It would be better to call mutableRawNulls() as rawNulls is modified in partitionBitsInPlace, and use reinterpret_cast for the cast.
xin-zhang2
left a comment
There was a problem hiding this comment.
@yingsu00 Left a few comments. Please take a look. Thanks!
667d8b5 to
fc7b2d5
Compare
|
@xin-zhang2 Thank you for your thorough review! Comments addressed, please take a look. |
fc7b2d5 to
ff120fd
Compare
This commit is the first PR for optimized PartitionedOutput. It introduces the PartitionedVector, in which the values are partitioned according to a given partitionId list. It uses in place swapping algorithm and has very high throughput. It can also be used in aggregation, sorting, etc.