Skip to content

Commit 0c69981

Browse files
tanjialiangfacebook-github-bot
authored andcommitted
Skip memmove in prepareInputBuffer when trailing space suffices
Summary: Re-lands D95740314 with a bug fix. In `prepareInputBuffer()`, when `inputData_` already points into `inputBuffer_` and there is enough trailing capacity for the requested size, we can skip the memmove that compacts data to the front of the buffer. This avoids unnecessary data movement in the common case where chunks are consumed incrementally. The original D95740314 had a bug: `ensureInput()` appended new data at `inputBuffer_->asMutable<char>() + inputSize_` (buffer start + inputSize_), which is wrong when `inputData_` is mid-buffer after the memmove skip. Fixed by changing the memcpy destination to `const_cast<char*>(inputData_) + inputSize_`, which correctly appends relative to wherever `inputData_` currently points. All usage sites of `inputData_`/`inputBuffer_` were audited: - `loadNextChunk()`: reads from `inputData_`, no writes to buffer — safe - `ensureInputIncremental_hack()`: arithmetic relative to `inputData_` — safe - `prepareInputBuffer()` realloc path: copies from `inputData_`, updates pointer — safe - `prepareInputBuffer()` memmove path: resets `inputData_` to buffer start — safe - `seekToChunk()`: clean reset of `inputData_` and `inputSize_` — safe Differential Revision: D97329725
1 parent b021c53 commit 0c69981

File tree

2 files changed

+61
-4
lines changed

2 files changed

+61
-4
lines changed

dwio/nimble/velox/selective/ChunkedDecoder.cpp

Lines changed: 15 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -80,7 +80,7 @@ bool ChunkedDecoder::ensureInput(int size) {
8080
} else {
8181
prepareInputBuffer(inputSize_ + len);
8282
// Append after the previous content.
83-
memcpy(inputBuffer_->asMutable<char>() + inputSize_, buf, len);
83+
memcpy(const_cast<char*>(inputData_) + inputSize_, buf, len);
8484
}
8585
inputSize_ += len;
8686
}
@@ -98,15 +98,26 @@ bool ChunkedDecoder::ensureInputIncremental_hack(int size, const char*& pos) {
9898

9999
// After this function is called, we ensure these:
100100
// 1. `inputBuffer_' is allocated and at least `size' bytes large.
101-
// 2. The first `inputSize_' bytes in `inputData_' before the call are copied to
102-
// the beginning of `inputBuffer_'.
103-
// 3. `inputData_' is pointing to `inputBuffer_'.
101+
// 2. The first `inputSize_' bytes in `inputData_' before the call are in
102+
// `inputBuffer_' (either at their current position or compacted to the
103+
// front).
104+
// 3. `inputData_' points into `inputBuffer_' with at least `size - inputSize_'
105+
// bytes of trailing space available for appending.
104106
void ChunkedDecoder::prepareInputBuffer(int32_t size) {
105107
NIMBLE_DCHECK_LE(inputSize_, size);
106108
if (inputBuffer_ && size <= inputBuffer_->capacity()) {
107109
if (inputData_ == inputBuffer_->as<char>()) {
108110
return;
109111
}
112+
// Check if we already point into the buffer and the remaining data plus
113+
// the new data fit without compacting to the front.
114+
const char* bufStart = inputBuffer_->as<char>();
115+
const char* bufEnd = bufStart + inputBuffer_->capacity();
116+
if (inputData_ >= bufStart && inputData_ < bufEnd &&
117+
inputData_ + size <= bufEnd) {
118+
// Enough trailing space — skip the memmove.
119+
return;
120+
}
110121
char* newInputData = inputBuffer_->asMutable<char>();
111122
if (inputSize_ > 0) {
112123
memmove(newInputData, inputData_, inputSize_);

dwio/nimble/velox/selective/tests/ChunkedDecoderTest.cpp

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -148,6 +148,52 @@ TEST_F(ChunkedDecoderTest, ensureInput) {
148148
ASSERT_FALSE(helper.ensureInput(1));
149149
}
150150

151+
// Verify that ensureInput correctly appends data when inputData_ is mid-buffer
152+
// (the memmove-skip optimization in prepareInputBuffer fires).
153+
// Regression test: the old code used inputBuffer_->asMutable<char>() +
154+
// inputSize_ for the memcpy destination, which is wrong when inputData_ doesn't
155+
// point to the start of inputBuffer_. The fix uses
156+
// const_cast<char*>(inputData_)
157+
// + inputSize_.
158+
TEST_F(ChunkedDecoderTest, ensureInputMemmoveSkip) {
159+
// Build 32 bytes of deterministic data.
160+
std::string data(32, '\0');
161+
for (size_t i = 0; i < data.size(); ++i) {
162+
data[i] = 'a' + (i % 26);
163+
}
164+
165+
// block_size = 3: Next() returns 3 bytes at a time, forcing ensureInput to
166+
// loop and accumulate data in inputBuffer_.
167+
ChunkedDecoder decoder(
168+
std::make_unique<dwio::common::SeekableArrayInputStream>(
169+
data.data(), data.size(), /*block_size=*/3),
170+
false,
171+
nullptr,
172+
&pool());
173+
ChunkedDecoderTestHelper helper(&decoder);
174+
175+
// Step 1: Fill the internal buffer with 9 bytes ("abcdefghi").
176+
// This forces inputBuffer_ allocation; Velox AlignedBuffer rounds capacity
177+
// up (typically to 64 bytes), so there is plenty of trailing space.
178+
helper.ensureInput(9);
179+
ASSERT_EQ(helper.inputData().substr(0, 9), data.substr(0, 9));
180+
181+
// Step 2: Consume 3 bytes. inputData_ advances to bufStart+3, inputSize_=6.
182+
// Now inputData_ no longer points to the start of inputBuffer_.
183+
helper.advanceInputData(3);
184+
ASSERT_EQ(helper.inputData(), data.substr(3, 6));
185+
186+
// Step 3: Request 9 bytes total (have 6, need 3 more from the stream).
187+
// prepareInputBuffer sees inputData_ mid-buffer with enough trailing space
188+
// and skips the memmove. Then Next() returns 3 more bytes that must be
189+
// appended right after the existing 6 bytes — i.e. at inputData_+6
190+
// (= bufStart+9), NOT at bufStart+6.
191+
// With the old bug, the memcpy destination was bufStart+inputSize_ =
192+
// bufStart+6, which overwrote the tail of the existing data, corrupting it.
193+
helper.ensureInput(9);
194+
ASSERT_EQ(helper.inputData().substr(0, 9), data.substr(3, 9));
195+
}
196+
151197
// Test fixture for ChunkedDecoder data operations with parameterized
152198
// stream index support.
153199
class ChunkedDecoderDataTest : public index::test::TabletIndexTestBase,

0 commit comments

Comments
 (0)