Add column-based stripe boundary cutting to VeloxWriter#563

Open
tanjialiang wants to merge 2 commits into facebookincubator:main from tanjialiang:export-D96025164

Conversation

@tanjialiang
Contributor

Summary:
For user sequence storage with cluster index, having one user per stripe
reduces read amplification by ~21x. Currently, callers must manually split
input batches at column value boundaries and call `flush()`, which is fragile
and duplicates logic across every caller.

This diff embeds stripe boundary detection into VeloxWriter via a new
`stripeBoundaryColumnCount` option. When set, the writer automatically
detects value transitions in the leading N index columns during `write()`
and flushes stripes at boundaries. This handles both intra-batch transitions
(slicing within a single `write()` call) and cross-batch transitions (between
consecutive `write()` calls).

The normal write path has zero overhead: when no boundary columns are
configured, it calls `writeBatch()` directly without any boundary checks.
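The mechanics can be sketched as follows. This is a minimal, hypothetical illustration (the struct and method names are not from the actual diff), assuming a single leading index column of integer keys: intra-batch boundaries become slice ranges, and a remembered last key detects the cross-batch case.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical sketch of column-based stripe boundary detection.
// lastKey_ carries state across write() calls so a transition between
// consecutive batches also triggers a flush.
struct BoundaryDetector {
  std::optional<int64_t> lastKey_;

  // Returns [begin, end) row ranges; each range maps to one stripe.
  // flushBeforeBatch is set when the batch's first key differs from the
  // last key of the previous batch (cross-batch transition).
  std::vector<std::pair<size_t, size_t>> split(
      const std::vector<int64_t>& keys,
      bool& flushBeforeBatch) {
    flushBeforeBatch = lastKey_.has_value() && !keys.empty() &&
        keys.front() != *lastKey_;
    std::vector<std::pair<size_t, size_t>> slices;
    size_t start = 0;
    for (size_t i = 1; i < keys.size(); ++i) {
      if (keys[i] != keys[i - 1]) {
        slices.emplace_back(start, i); // intra-batch transition
        start = i;
      }
    }
    if (!keys.empty()) {
      slices.emplace_back(start, keys.size());
      lastKey_ = keys.back();
    }
    return slices;
  }
};
```

With `stripeBoundaryColumnCount > 1`, the same comparison would span a tuple of the leading N column values instead of a single key.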

Differential Revision: D96025164

…ations (facebookincubator#562)

Summary:

X-link: facebookincubator/velox#16695

MallocAllocator currently uses mmap/munmap for contiguous allocations despite
its name suggesting it delegates everything to malloc. The mmap/munmap syscall
pair on every contiguous alloc/free cycle adds overhead.

This diff:
1. Adds an `Options` struct to `MallocAllocator` (mirroring `MmapAllocator::Options`)
   for cleaner configuration.
2. Adds a `mallocContiguousEnabled` option that, when true, uses
   `aligned_alloc`/`free` for contiguous allocations instead of `mmap`/`munmap`.
3. Wires the option through `MemoryManager::Options`.
4. Updates `CachedBufferedInputTest` to use the new `Options` struct.

The option defaults to false to preserve existing behavior. When enabled,
contiguous allocations use `aligned_alloc(kPageSize, maxBytes)` which
immediately commits physical memory (vs mmap's lazy page faulting), but this
is acceptable for MallocAllocator's use case.

`growContiguous` requires no changes since both mmap and malloc allocate
`maxPages` upfront, so growth stays within already-allocated memory.

Reviewed By: duxiao1212

Differential Revision: D95875673
@meta-cla bot added the CLA Signed label (managed by the Meta Open Source bot) on Mar 11, 2026.
@meta-codesync

meta-codesync bot commented Mar 11, 2026

@tanjialiang has exported this pull request. If you are a Meta employee, you can view the originating Diff in D96025164.
