Introducing PartitionedVector by yingsu00 · Pull Request #1596 · IBM/velox

yingsu00 · 2026-01-14T08:06:28Z

This commit is the first PR for the optimized PartitionedOutput. It introduces PartitionedVector, in which the values are partitioned according to a given list of partition IDs. It uses an in-place swapping algorithm and has very high throughput. It can also be used in aggregation, sorting, etc.
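For readers unfamiliar with the technique, here is a minimal sketch of one common way to partition values in place by swapping. It is not the PR's implementation: `partitionInPlace` and its use of `std::vector` are assumptions made for illustration, and the real code works on Velox buffers instead.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical sketch: rearranges `values` so that all rows of partition 0
// come first, then partition 1, and so on. `partitionIds` is permuted
// together with `values`. Returns the end offset of each partition.
// The order of rows inside a partition is not preserved.
template <typename T>
std::vector<int32_t> partitionInPlace(
    std::vector<T>& values,
    std::vector<uint32_t>& partitionIds,
    uint32_t numPartitions) {
  assert(values.size() == partitionIds.size());

  // Count the rows of each partition.
  std::vector<int32_t> end(numPartitions, 0);
  for (uint32_t id : partitionIds) {
    assert(id < numPartitions);
    ++end[id];
  }
  // Inclusive prefix sum: end[p] is one past the last row of partition p.
  for (uint32_t p = 1; p < numPartitions; ++p) {
    end[p] += end[p - 1];
  }
  // cursor[p] is the next slot of partition p that is not settled yet.
  std::vector<int32_t> cursor(numPartitions, 0);
  for (uint32_t p = 1; p < numPartitions; ++p) {
    cursor[p] = end[p - 1];
  }

  for (uint32_t p = 0; p < numPartitions; ++p) {
    while (cursor[p] < end[p]) {
      const int32_t row = cursor[p];
      const uint32_t target = partitionIds[row];
      if (target == p) {
        // The row already belongs here; settle it.
        ++cursor[p];
      } else {
        // Send the row to its own partition and pull back whatever was
        // stored there, then re-examine the same slot.
        const int32_t dest = cursor[target]++;
        std::swap(values[row], values[dest]);
        std::swap(partitionIds[row], partitionIds[dest]);
      }
    }
  }
  return end;
}
```

Every swap settles one row in its final partition, so the number of swaps is bounded by the number of rows, and no second copy of the values is needed.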

velox/vector/tests/PartitionedVectorTest.cpp

velox/vector/PartitionedVector.cpp

czentgr

Looks pretty good. I suppose we need to see the subsequent changes to move them to the output buffer in the PartitionedOutput operator. Is this code available somewhere too?

czentgr · 2026-02-17T16:56:07Z

velox/vector/PartitionedVector.h

+  BufferPtr beginPartitionOffsets;
+
+  /// Optional reusable buffer for in-place row swapping.
+  BufferPtr swappingBuffer;


This needs initialization. Given it is optional I expect that this member won't always be set and you'd run into compiler errors on newer compilers.
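A minimal sketch of what explicit initialization could look like, assuming default member initializers are acceptable here. `FakeBufferPtr`, `BuildContextSketch`, and `scratchBuffer` are hypothetical stand-ins and not Velox types; `std::shared_ptr` is used in place of `BufferPtr` only to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for Velox's BufferPtr.
using FakeBufferPtr = std::shared_ptr<std::vector<uint8_t>>;

// Hypothetical context struct: every optional member has an explicit default,
// so a default-constructed context is in a well-defined "not set" state.
struct BuildContextSketch {
  FakeBufferPtr beginPartitionOffsets{nullptr};
  /// Optional reusable buffer for in-place row swapping.
  FakeBufferPtr swappingBuffer{nullptr};
  /// Optional starting row offset.
  int32_t firstRow{0};
};

// Returns the scratch buffer, allocating or growing it lazily on first use.
inline FakeBufferPtr& scratchBuffer(BuildContextSketch& ctx, size_t bytes) {
  if (!ctx.swappingBuffer) {
    ctx.swappingBuffer = std::make_shared<std::vector<uint8_t>>(bytes);
  } else if (ctx.swappingBuffer->size() < bytes) {
    ctx.swappingBuffer->resize(bytes);
  }
  assert(ctx.swappingBuffer->size() >= bytes);
  return ctx.swappingBuffer;
}
```

With explicit defaults, callers that never set the optional members can still pass the context around safely, and the code that needs the buffer decides when to allocate it.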

velox/vector/PartitionedVector.cpp

czentgr · 2026-02-17T19:48:19Z

velox/vector/PartitionedVector.cpp

+void initializeBeginPartitionOffsets(
+    BufferPtr& beginPartitionOffsets,
+    const BufferPtr& endPartitionOffsets,
+    int32_t numPartitions,


vector_size_t?
In the test I also see uint32_t being used for numPartitions. But isn't that always the same and as such should have the same type?

The PartitionedVector uses numPartitions uint32_t as storage but sets it from a const uint32_t. Why not use vector_size_t?

> vector_size_t?

Sure.

> The PartitionedVector uses numPartitions uint32_t as storage but sets it from a const uint32_t. Why not use vector_size_t?

In my understanding, this is a historical inconsistency. The PartitionFunction interface defines the partitions as uint32_t:

virtual std::optional<uint32_t> partition( const RowVector& input, std::vector<uint32_t>& partitions) = 0;

But Velox is trying to use vector_size_t for size units and vector indices everywhere. vector_size_t is defined as int32_t today, but I've seen places where it overflows and may need to be int64_t instead. I think the right way is to unify the partition representation with the vector index, so that both use vector_size_t. vector_size_t can easily be widened or overridden in the future by defining using vector_size_t = int64_t;. But for now, I changed all numPartitions to vector_size_t. Another option is to make them all uint32_t. @czentgr Which one do you think makes more sense?
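To make the type question concrete, here is a small sketch of a checked conversion between the two units. The alias mirrors the int32_t definition mentioned in the comment above; `toVectorSize` is a hypothetical helper and not part of Velox.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Mirrors the definition described in the comment above (int32_t today).
using vector_size_t = int32_t;

// Hypothetical helper: converts an unsigned partition count or ID into the
// signed row-index type, asserting that the value fits. The comparison is
// done in uint64_t so it stays correct if the alias is widened to int64_t.
inline vector_size_t toVectorSize(uint32_t value) {
  assert(
      static_cast<uint64_t>(value) <=
      static_cast<uint64_t>(std::numeric_limits<vector_size_t>::max()));
  return static_cast<vector_size_t>(value);
}
```

Funneling the conversions through one helper keeps the signed and unsigned worlds separate, which is the main risk when numPartitions and row offsets use different types.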

czentgr · 2026-02-17T19:49:39Z

velox/vector/PartitionedVector.h

+// TODO: This was copied from dwio::common::BufferUtil.h. However the vector
+// module should not depend on dwio. Move this to a common place
+template <typename T>
+void ensureCapacity(


Should go into the cpp file and be declared for use in the test.
Or as you suggest a new utility?

It is a template function, so the definition needs to be in the header file. IMHO it's not worthwhile to separate the declaration and definition for this very simple function. My original thought was to extract it from dwio::common and move it to common, but that is better done in a separate PR or commit. To avoid future rebase conflicts I had left it in this .h file because this file is new, but I have now moved it to VectorUtil.h.
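As a rough illustration of why such a helper lives in a header, here is a sketch of a templated capacity check. `ByteBuffer` and `ensureCapacitySketch` are hypothetical; the real ensureCapacity operates on a Velox BufferPtr and a memory pool, which this example deliberately leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a raw byte buffer.
using ByteBuffer = std::vector<uint8_t>;

// Grows `buffer` so that it can hold at least `numElements` values of type T.
// Existing bytes are preserved and the buffer never shrinks. Because this is
// a template, the definition has to be visible wherever it is instantiated,
// which in practice means it sits in a header.
template <typename T>
inline void ensureCapacitySketch(ByteBuffer& buffer, size_t numElements) {
  const size_t bytes = numElements * sizeof(T);
  if (buffer.size() < bytes) {
    buffer.resize(bytes);
  }
  assert(buffer.size() >= bytes);
}
```

For example, `ensureCapacitySketch<int32_t>(buffer, 4)` guarantees at least 16 bytes, and a later call that asks for less leaves the buffer untouched.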

czentgr · 2026-02-17T19:50:56Z

velox/vector/PartitionedVector.h

+  BufferPtr swappingBuffer;
+
+  /// Optional starting row offset (used when partitioning a subset of rows).
+  vector_size_t firstRow{0};


I suppose for future use? This and the other member? Currently they are not used at all.

Yes, they are for complex types. I can remove them now, but that makes it unjustifiable to have a PartitionBuildContext. Do you prefer removing them for now?

czentgr · 2026-02-17T23:49:20Z

velox/vector/PartitionedVector.cpp

+  std::memcpy(
+      &beginPartitionOffsets->asMutable<vector_size_t>()[1],
+      endPartitionOffsets->as<vector_size_t>(),
+      sizeof(uint32_t) * (numPartitions - 1));


This should be sizeof(vector_size_t). In the next line it uses sizeof(vector_size_t). As mentioned before I don't know why we switch between the types. It could cause problems.

Yes, agreed. I changed the unit of partitions to uint32_t and the unit of rows to vector_size_t.
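The size mismatch flagged above is easy to see in a stand-alone sketch. The version below assumes the offsets are row offsets of type vector_size_t and uses `std::vector` instead of Velox buffers; `initializeBeginOffsetsSketch` is a hypothetical name, not the function from the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

using vector_size_t = int32_t;

// Hypothetical sketch: the begin offset of partition p is the end offset of
// partition p - 1, and partition 0 starts at row 0. The element size passed
// to memcpy is sizeof(vector_size_t), matching the type actually stored.
inline void initializeBeginOffsetsSketch(
    std::vector<vector_size_t>& beginOffsets,
    const std::vector<vector_size_t>& endOffsets,
    uint32_t numPartitions) {
  assert(endOffsets.size() >= numPartitions);
  beginOffsets.resize(numPartitions);
  if (numPartitions == 0) {
    return;
  }
  beginOffsets[0] = 0;
  std::memcpy(
      beginOffsets.data() + 1,
      endOffsets.data(),
      sizeof(vector_size_t) * (numPartitions - 1));
}
```

With sizeof(uint32_t) the copy happens to work while vector_size_t is 32 bits wide, but it would silently copy too few bytes if the alias were widened to int64_t.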

velox/vector/PartitionedVector.h

yingsu00 · 2026-02-23T07:29:24Z

@czentgr @xin-zhang2 Thank you very much for reviewing this PR! I have addressed your comments and made the following improvements:

- Added VELOX_CHECKs for constructors and public-facing functions.
- Renamed beginPartitionOffsets to cursorPartitionOffsets.
- Unified the unit of partitions to uint32_t and the unit of rows to vector_size_t.
- Cleaned up the constructors and static create functions. Added a new public-facing create function that does not need to take endPartitionOffsets. The old one is moved to the protected section for future complex types.
- Enforced const wherever possible.
- Fixed a bug in the test where the write to expectedVector could go out of bounds.

Your second review is much appreciated!

xin-zhang2 · 2026-02-26T13:49:27Z

velox/vector/PartitionedVector.h

+      velox::memory::MemoryPool* pool);
+
+  /// Allow move constructor and move assignment operator.
+  PartitionedVector(PartitionedVector&& other) = default;


The declarations of the move constructor and move assignment operator can be placed below the deleted copy assignment.

xin-zhang2 · 2026-02-26T13:51:09Z

velox/vector/PartitionedVector.h

+      velox::memory::MemoryPool* pool);
+
+  PartitionedVector(
+      VectorPtr vector,


vector can be passed as const reference.

xin-zhang2 · 2026-02-26T13:54:30Z

velox/vector/PartitionedVector.h

+      velox::memory::MemoryPool* pool)
+      : PartitionedVector(flatVector, numPartitions, partitionOffsets, pool) {}
+
+  void partition(


It might make more sense to declare partition in PartitionedVector and override it here.

Sure. Previously it was not in the base class because the arguments were not the same for different types of vectors. Since I wrapped them up in PartitionBuildContext in this version, we can make it virtual.
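A minimal sketch of the shape being discussed, assuming a single context argument is enough for every subclass. All names ending in `Sketch` are hypothetical, and the flat implementation uses a simple out-of-place scatter only to keep the example short; it is not the in-place algorithm from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical context that bundles the per-call arguments.
struct PartitionContextSketch {
  std::vector<uint32_t> partitionIds;
  uint32_t numPartitions;
};

// Base class: one virtual entry point shared by all vector kinds.
class PartitionedVectorSketch {
 public:
  virtual ~PartitionedVectorSketch() = default;
  virtual void partition(const PartitionContextSketch& ctx) = 0;
};

// Flat specialization: groups values by partition, keeping row order within
// each partition.
template <typename T>
class PartitionedFlatVectorSketch : public PartitionedVectorSketch {
 public:
  explicit PartitionedFlatVectorSketch(std::vector<T> values)
      : values_(std::move(values)) {}

  void partition(const PartitionContextSketch& ctx) override {
    assert(ctx.partitionIds.size() == values_.size());
    // next[p] becomes the first output slot of partition p.
    std::vector<size_t> next(ctx.numPartitions + 1, 0);
    for (uint32_t id : ctx.partitionIds) {
      assert(id < ctx.numPartitions);
      ++next[id + 1];
    }
    for (uint32_t p = 1; p <= ctx.numPartitions; ++p) {
      next[p] += next[p - 1];
    }
    std::vector<T> out(values_.size());
    for (size_t row = 0; row < values_.size(); ++row) {
      out[next[ctx.partitionIds[row]]++] = values_[row];
    }
    values_ = std::move(out);
  }

  const std::vector<T>& values() const {
    return values_;
  }

 private:
  std::vector<T> values_;
};
```

Because the arguments travel in one struct, new vector kinds can add fields to the context without changing the virtual signature.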

xin-zhang2 · 2026-02-26T14:11:10Z

velox/vector/PartitionedVector.cpp

+  }
+}
+
+inline void prefixSum(vector_size_t* offsets, uint32_t numPartitions) {


countPartitionSizes and prefixSum are always called together, so it might be better to combine them into a single function. We could also call ensureCapacity for endPartitionOffsets inside the function to prevent potential out-of-bounds issues.

@xin-zhang2 prefixSum is used separately for Dictionary vectors. So I think we can use it this way for now.
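For context, the two steps can be sketched as separate functions like this. The `prefixSum` signature follows the diff line quoted above; the `countPartitionSizes` signature and both bodies are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

using vector_size_t = int32_t;

// Assumed signature: writes the number of rows of each partition into
// counts[0 .. numPartitions).
inline void countPartitionSizes(
    const uint32_t* partitionIds,
    vector_size_t numRows,
    uint32_t numPartitions,
    vector_size_t* counts) {
  std::fill(counts, counts + numPartitions, 0);
  for (vector_size_t row = 0; row < numRows; ++row) {
    assert(partitionIds[row] < numPartitions);
    ++counts[partitionIds[row]];
  }
}

// Turns per-partition counts into inclusive end offsets, in place.
inline void prefixSum(vector_size_t* offsets, uint32_t numPartitions) {
  for (uint32_t p = 1; p < numPartitions; ++p) {
    offsets[p] += offsets[p - 1];
  }
}
```

Keeping prefixSum separate lets a caller that already has counts from another source reuse it without recounting.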

xin-zhang2 · 2026-02-26T14:20:36Z

velox/vector/PartitionedVector.cpp

+// partition by repeatedly swapping elements until the current element belongs
+// to the current partition
+template <typename T>
+void partitionFixedWidthValuesInPlace(


We can also include the specialization for bool in this PR.
There's already an implementation in my draft PR for the benchmark.

Yeah, we can, but it will make this PR take longer to merge. How about sending a new PR with this one as the first commit, then rebasing after this one is merged?

xin-zhang2 · 2026-02-26T14:27:39Z

velox/vector/PartitionedVector.cpp

+
+        vector_size_t destinationAddr = destinationOffset >> 3;
+        int8_t destinationBitInByte = destinationOffset & 7;
+        vector_size_t fromAddr = offset / kBitsPerByte;


Is there any reason to use kBitsPerByte here? Since it's always 8, we can also consider using bit operations when computing fromAddr and fromBitInByte.
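A small sketch of the bit arithmetic in question, using plain byte arrays. The helper names are hypothetical; for non-negative offsets, `offset >> 3` and `offset & 7` give the same results as `offset / kBitsPerByte` and `offset % kBitsPerByte`.

```cpp
#include <cassert>
#include <cstdint>

constexpr int32_t kBitsPerByte = 8;

// Index of the byte that holds the given bit.
inline int32_t byteIndex(int32_t bitOffset) {
  assert(bitOffset >= 0);
  return bitOffset >> 3;
}

// Position of the bit inside its byte (0 is the least significant bit).
inline int32_t bitInByte(int32_t bitOffset) {
  assert(bitOffset >= 0);
  return bitOffset & 7;
}

inline bool getBit(const uint8_t* bytes, int32_t bitOffset) {
  return (bytes[byteIndex(bitOffset)] >> bitInByte(bitOffset)) & 1;
}

inline void setBit(uint8_t* bytes, int32_t bitOffset, bool value) {
  const uint8_t mask = static_cast<uint8_t>(1u << bitInByte(bitOffset));
  uint8_t& byte = bytes[byteIndex(bitOffset)];
  byte = value ? static_cast<uint8_t>(byte | mask)
               : static_cast<uint8_t>(byte & ~mask);
}

// Swaps two bits, which is the basic move needed to partition a bitmap
// (for example a nulls buffer) in place.
inline void swapBits(uint8_t* bytes, int32_t a, int32_t b) {
  const bool bitA = getBit(bytes, a);
  const bool bitB = getBit(bytes, b);
  setBit(bytes, a, bitB);
  setBit(bytes, b, bitA);
}
```

Whichever spelling is chosen, using the same one for both the source and destination address keeps the code easier to read.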

xin-zhang2 · 2026-02-26T14:38:08Z

velox/vector/PartitionedVector.cpp

+    case VectorEncoding::Simple::BIASED:
+    case VectorEncoding::Simple::SEQUENCE:
+    case VectorEncoding::Simple::MAP:
+    case VectorEncoding::Simple::LAZY:


CONSTANT can be included in the unsupported encodings.

xin-zhang2 · 2026-02-26T14:38:51Z

velox/vector/PartitionedVector.cpp

+    case VectorEncoding::Simple::MAP:
+    case VectorEncoding::Simple::LAZY:
+      VELOX_UNSUPPORTED(
+          "Unsupported vector encoding for OptimizedPartitionedOutput: {}",


OptimizedPartitionedOutput can be modified to PartitionedVector.

xin-zhang2 · 2026-02-26T14:38:59Z

velox/vector/PartitionedVector.cpp

+          mapSimpleToName(encoding));
+    default:
+      VELOX_UNREACHABLE(
+          "Invalid vector encoding for OptimizedPartitionedOutput: {}",


xin-zhang2 · 2026-02-26T14:42:02Z

velox/vector/PartitionedVector.cpp

+    case VectorEncoding::Simple::FLAT: {
+      // Print the addresses of vector's values and nulls buffers for debugging
+      auto nulls = vector->rawNulls();
+      auto values = vector->values()->as<char>();


nulls and values are not used and can be removed.

xin-zhang2 · 2026-02-26T14:53:54Z

velox/vector/PartitionedVector.cpp

+    PartitionBuildContext& ctx) {
+  auto valuesBuffer = vector_->as<FlatVector<T>>()->values();
+
+  Byte* rawNulls = (Byte*)vector_->rawNulls();


It would be better to call mutableRawNulls() as rawNulls is modified in partitionBitsInPlace, and use reinterpret_cast for the cast.
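To illustrate the difference, here is a sketch with a stand-in class that exposes a const and a mutable accessor. `NullsHolderSketch` and `setAllBits` are hypothetical and only mimic the accessor pair; a C-style cast on the const pointer would compile but silently drop the const qualifier.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a vector that owns a nulls bitmap.
class NullsHolderSketch {
 public:
  explicit NullsHolderSketch(size_t numWords) : words_(numWords, 0) {}

  // Read-only view: callers must not write through this pointer.
  const uint64_t* rawNulls() const {
    return words_.data();
  }

  // Writable view: the right accessor when the bits are going to change.
  uint64_t* mutableRawNulls() {
    return words_.data();
  }

 private:
  std::vector<uint64_t> words_;
};

// Sets every bit by writing byte by byte through the mutable accessor.
// reinterpret_cast makes the type change explicit and cannot remove const.
inline void setAllBits(NullsHolderSketch& holder, size_t numWords) {
  auto* bytes = reinterpret_cast<uint8_t*>(holder.mutableRawNulls());
  for (size_t i = 0; i < numWords * sizeof(uint64_t); ++i) {
    bytes[i] = 0xFF;
  }
  assert(holder.rawNulls()[0] == ~uint64_t{0});
}
```

Trying `reinterpret_cast<uint8_t*>(holder.rawNulls())` fails to compile, which is exactly the safety net a C-style cast gives up.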

xin-zhang2

@yingsu00 Left a few comments. Please take a look. Thanks!

yingsu00 · 2026-02-27T08:43:44Z

@xin-zhang2 Thank you for your thorough review! Comments addressed, please take a look.

xin-zhang2

LGTM, Thanks!

yingsu00 requested a review from xin-zhang2 January 14, 2026 08:06

xin-zhang2 reviewed Feb 11, 2026

yingsu00 force-pushed the PartitionedOutput1.0 branch 3 times, most recently from bf73a19 to aded22e Compare February 13, 2026 05:42

yingsu00 marked this pull request as ready for review February 13, 2026 05:47

yingsu00 requested a review from majetideepak as a code owner February 13, 2026 05:47

yingsu00 removed the request for review from majetideepak February 13, 2026 05:47

czentgr reviewed Feb 18, 2026

xin-zhang2 reviewed Feb 18, 2026

xin-zhang2 reviewed Feb 18, 2026

yingsu00 force-pushed the PartitionedOutput1.0 branch 2 times, most recently from 02e249e to 667d8b5 Compare February 23, 2026 07:13

xin-zhang2 reviewed Feb 26, 2026

yingsu00 force-pushed the PartitionedOutput1.0 branch from 667d8b5 to fc7b2d5 Compare February 27, 2026 08:42

yingsu00 force-pushed the PartitionedOutput1.0 branch from fc7b2d5 to ff120fd Compare February 27, 2026 08:48

xin-zhang2 approved these changes Mar 3, 2026

prestodb-ci mentioned this pull request Mar 14, 2026

Rebase branch staging-1e3b969d8-rebase with staging-1e3b969d8-head (1e3b969) #1812

Closed

1 task

5 participants