Skip to content

Clean up lookahead-related code#3

Open
tomerqodo wants to merge 7 commits intoclaude_claude_vs_qodo_base_clean_up_lookahead-related_code_pr3from
claude_claude_vs_qodo_head_clean_up_lookahead-related_code_pr3
Open

Clean up lookahead-related code#3
tomerqodo wants to merge 7 commits intoclaude_claude_vs_qodo_base_clean_up_lookahead-related_code_pr3from
claude_claude_vs_qodo_head_clean_up_lookahead-related_code_pr3

Conversation

@tomerqodo
Copy link
Copy Markdown

Benchmark PR from agentic-review-benchmarks#3

Comment thread src/cluster.c
/* Allocate temporary buffer for slot tracking */
int *slot_buffer = malloc(sizeof(int) * keys_result->numkeys);

for (int j = 0; j <= keys_result->numkeys; j++) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Off-by-one error: the loop at line 1112 uses j <= keys_result->numkeys but valid indices are 0 to numkeys-1, causing a heap buffer overflow on both keys_result->keys[j] and slot_buffer[j] when j == numkeys. This triggers on every keyed command in cluster mode where all keys hash to the same slot (the cross-slot early return does not fire). Fix: change <= back to <.

Extended reasoning...

What the bug is

In extractSlotFromKeysResult(), the for loop condition was changed from the original j < keys_result->numkeys to j <= keys_result->numkeys. This is a classic off-by-one error. The keys_result->keys array has numkeys elements with valid indices 0 through numkeys-1. Similarly, slot_buffer is allocated as malloc(sizeof(int) * keys_result->numkeys), so its valid indices are also 0 through numkeys-1.

How it manifests

On the final iteration where j == numkeys, the code reads keys_result->keys[j].pos which is one element past the end of the keys array (out-of-bounds read), and writes to slot_buffer[j] which is one element past the end of the heap-allocated buffer (heap buffer overflow / out-of-bounds write). This is undefined behavior that can cause crashes, memory corruption, or security vulnerabilities.

Step-by-step proof

Consider a command like GET mykey in cluster mode:

  1. getKeysFromCommand extracts 1 key, so keys_result->numkeys == 1.
  2. slot_buffer is allocated with malloc(sizeof(int) * 1) -- valid index is only [0].
  3. Loop iterates: j = 0 (valid, processes the key), then j = 1 (since 1 <= 1 is true).
  4. At j = 1: keys_result->keys[1].pos reads past the array, and slot_buffer[1] = this_slot writes past the buffer.
  5. Both accesses are out-of-bounds -- heap buffer overflow.

Why existing code does not prevent it

The cross-slot early return (first_slot != this_slot) only fires when keys hash to different slots. When all keys hash to the same slot (the common case for single-key or same-slot commands), the loop runs to completion including the invalid final iteration. There is no bounds check on j other than the loop condition itself.

Impact

This is a critical memory safety bug. Every keyed command processed in cluster mode (GET, SET, DEL, etc.) where keys are in the same slot will trigger a heap buffer overflow. This can corrupt adjacent heap memory, cause segfaults, or potentially be exploited for code execution. The bug was introduced by this PR -- the original code correctly used j < keys_result->numkeys.

Fix

Change line 1112 from j <= keys_result->numkeys back to j < keys_result->numkeys.

Comment thread src/cluster.c
for (int j = 0; j < keys_result->numkeys; j++) {

/* Allocate temporary buffer for slot tracking */
int *slot_buffer = malloc(sizeof(int) * keys_result->numkeys);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 slot_buffer in extractSlotFromKeysResult() is allocated via malloc(), written to on each iteration (slot_buffer[j] = this_slot), but never read back — it is dead code adding unnecessary heap allocation overhead on every keyed command in cluster mode. Additionally, malloc/free should be zmalloc/zfree per Redis convention to maintain accurate used_memory tracking. The entire slot_buffer allocation and writes should be removed.

Extended reasoning...

Dead code: slot_buffer is write-only

In extractSlotFromKeysResult(), the PR introduces a heap-allocated buffer:

int *slot_buffer = malloc(sizeof(int) * keys_result->numkeys);

On each loop iteration, slot_buffer[j] = this_slot stores the computed slot value. However, slot_buffer is never read from anywhere — the function only uses the local variables first_slot and this_slot to determine whether all keys hash to the same slot. The buffer is freed at the end of the function (or on early return) without ever being consulted.

Concrete walkthrough

Consider a command like MGET key1 key2 in cluster mode with 2 keys:

  1. slot_buffer is allocated as malloc(sizeof(int) * 2) — 8 bytes on the heap.
  2. Iteration j=0: this_slot is computed, stored in slot_buffer[0], and first_slot is set.
  3. Iteration j=1: this_slot is computed, stored in slot_buffer[1], and compared against first_slot.
  4. If slots match, free(slot_buffer) is called and first_slot is returned.

At no point is slot_buffer[0] or slot_buffer[1] ever read back. The decision logic relies entirely on first_slot and this_slot. The original code in getSlotFromCommand() (visible in the diff as removed code in src/db.c) performed the same logic without any buffer.

Convention violation: malloc instead of zmalloc

Redis uses zmalloc/zfree throughout the codebase for all memory allocations so that used_memory accounting (exposed via INFO memory) remains accurate. In src/cluster.c, every other allocation uses zmalloc (e.g., lines 368, 636 for zmalloc/zfree). Using raw malloc/free here bypasses this tracking. While the allocation is small and short-lived, it violates the established convention.

Impact

This is not a correctness bug — the function produces correct results. However, it adds unnecessary malloc/free overhead on every key-bearing Redis command in cluster mode, which is a hot path. The wasted allocation is proportional to the number of keys in the command. For high-throughput workloads, these extra allocations add measurable overhead for zero benefit.

Fix

Remove the slot_buffer allocation, the slot_buffer[j] = this_slot assignments, and the free(slot_buffer) calls entirely. The function works correctly without them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants