Redis 8.2.1#37
…s#14274)

Fix redis#14267
This bug was introduced by redis#13495

### Summary

When a replica clears a large database, it periodically calls processEventsWhileBlocked() in the replicationEmptyDbCallback() callback during the key deletion process. If defragmentation is enabled, this means that active defrag can be triggered while the database is being deleted. The defragmentation process may also modify the database at this time, which could lead to crashes when the database is accessed after defragmentation.

Code Path:
```
replicationEmptyDbCallback() -> processEventsWhileBlocked() -> whileBlockedCron() -> defragWhileBlocked()
```

### Solution

This PR temporarily disables active defrag before emptying the database, then restores the active defrag setting after the empty is complete.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…fter reload (redis#14276)

This bug was introduced by redis#14130, found by @oranagra.

### Summary

`s->cgroups_ref` is created at runtime the first time a consumer group is linked with a message, and it is not released when all references are removed. However, after `debug reload` or a restart, if the PEL is empty (meaning no consumer group references any message), `s->cgroups_ref` is not recreated. As a result, when XADD or XTRIM runs with the `ACKED` option and checks whether a message that has been read but not yet ACKed can be deleted, the NULL `cgroups_ref` causes a crash.

### Code Path

```
xaddCommand -> streamTrim -> streamEntryIsReferenced
```

### Solution

Check if `s->cgroups_ref` is NULL in streamEntryIsReferenced().
Review Summary by Qodo
Redis 8.2.1 - Bug fixes for defrag and stream operations
Walkthroughs
Description
• Prevent active defrag triggering during replica database flush
• Fix crash when cgroups_ref is null in stream operations
• Refactor test utilities for reply discarding efficiency
• Add comprehensive test coverage for defrag and stream scenarios
Diagram
flowchart LR
A["Replica DB Flush"] -->|"Disable defrag"| B["emptyData()"]
B -->|"Restore defrag"| C["Flush Complete"]
D["Stream Entry Check"] -->|"Check cgroups_ref"| E["Prevent Null Crash"]
F["Test Utilities"] -->|"Extract helper"| G["discard_replies_every()"]
File Changes
1. src/replication.c
Code Review by Qodo
1. version.h lacks tri-license
    /* Version information */
    #define REDIS_VERSION "8.2.1"
    #define REDIS_VERSION_NUM 0x00080201
1. version.h lacks tri-license 📘 Rule violation ✓ Correctness
src/version.h does not contain the required Redis tri-license header text, creating a licensing attribution compliance gap. This violates the requirement that all src/*.h files include the standard tri-license notice.
Agent Prompt
## Issue description
`src/version.h` is missing the required Redis tri-license header text.
## Issue Context
Compliance requires every `.h` file under `src/` to include the standard tri-license notice (RSALv2/SSPLv1/AGPLv3).
## Fix Focus Areas
- src/version.h[1-3]
    /* Version information */
    #define REDIS_VERSION "8.2.1"
    #define REDIS_VERSION_NUM 0x00080201
2. version.h missing include guards 📘 Rule violation ⛯ Reliability
src/version.h has no include guards, violating the required double-underscore include-guard convention. This can lead to multiple-inclusion issues and fails the header guard standardization requirement.
Agent Prompt
## Issue description
`src/version.h` lacks include guards required by the project convention.
## Issue Context
Compliance requires `.h` files to use the standardized double-underscore include guard pattern (e.g., `#ifndef __FILENAME_H` / `#define __FILENAME_H` / `#endif`).
## Fix Focus Areas
- src/version.h[1-3]
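
As a rough illustration of the guard pattern quoted above, here is a single-file sketch. The guard name `__VERSION_H` is only inferred from the `__FILENAME_H` pattern in the prompt, the license header is deliberately not reproduced, and `version_split` is a hypothetical helper added to show how the `0x00080201` value breaks down into major, minor, and patch bytes.

```c
#include <assert.h>

/* Hypothetical guarded form of src/version.h, inlined so the sketch is one
 * file. The guard name is an assumption based on the quoted convention. */
#ifndef __VERSION_H
#define __VERSION_H

/* Version information */
#define REDIS_VERSION "8.2.1"
#define REDIS_VERSION_NUM 0x00080201

#endif /* __VERSION_H */

/* Hypothetical helper: unpack a 0x00MMmmpp value into its three bytes. */
void version_split(unsigned v, int *major, int *minor, int *patch) {
    assert(major && minor && patch);
    *major = (int)((v >> 16) & 0xff);
    *minor = (int)((v >> 8) & 0xff);
    *patch = (int)(v & 0xff);
}
```

Calling `version_split(REDIS_VERSION_NUM, &major, &minor, &patch)` yields the three components of the `"8.2.1"` string. Note that identifiers beginning with a double underscore are reserved by the C standard, so the guard name here follows the project convention cited in the review rather than general C practice.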
    int orig_active_defrag = server.active_defrag_enabled;
    server.active_defrag_enabled = 0;

    emptyData(-1, empty_db_flags, replicationEmptyDbCallback);

    /* Restore the original active defragmentation setting. */
    server.active_defrag_enabled = 1;
3. orig_active_defrag unused variable 📘 Rule violation ⛯ Reliability
The new variable orig_active_defrag is assigned but never used, which typically triggers -Wunused-variable and can fail builds when -Werror is enabled. This violates the requirement to compile cleanly with -Werror.
Agent Prompt
## Issue description
`orig_active_defrag` is introduced but not used, which can trigger `-Wunused-variable` and fail compilation with `-Werror`.
## Issue Context
The code comment states the original active defrag setting should be restored; restoring should either use the saved value or the saved variable should be removed to avoid warnings.
## Fix Focus Areas
- src/replication.c[1956-1962]
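
One way to read the suggested remediation is the save, disable, restore pattern sketched below. This is not the Redis source: `server_stub`, `empty_data_stub`, `empty_with_defrag_paused`, and `run_flush` are hypothetical stand-ins, and only the `active_defrag_enabled` field name comes from the snippet under review.

```c
#include <assert.h>

/* Stand-in for the single server field involved; the real struct is far larger. */
struct server_stub { int active_defrag_enabled; };
static struct server_stub server;

/* Stand-in for emptyData(): defrag must be off while the database is emptied. */
static void empty_data_stub(void) {
    assert(server.active_defrag_enabled == 0);
}

/* Save the setting, disable it for the flush, then restore the saved value. */
void empty_with_defrag_paused(void) {
    int orig_active_defrag = server.active_defrag_enabled;
    server.active_defrag_enabled = 0;

    empty_data_stub();

    /* Restore the original active defragmentation setting. */
    server.active_defrag_enabled = orig_active_defrag;
}

/* Hypothetical driver: returns the setting observed after the flush. */
int run_flush(int initially_enabled) {
    server.active_defrag_enabled = initially_enabled;
    empty_with_defrag_paused();
    return server.active_defrag_enabled;
}
```

Restoring from the saved variable both uses `orig_active_defrag` (so the unused-variable warning does not arise) and keeps defrag off for a server that had it disabled before the flush, which a hard-coded `1` would not.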
        return 1;

    /* Check if the message is in any consumer group's PEL */
    if (!s->cgroups_ref) return 1;
4. Acked trim always blocked 🐞 Bug ✓ Correctness
streamEntryIsReferenced() returns 1 when s->cgroups_ref is NULL, incorrectly treating entries as referenced even though there is no PEL reference index. This breaks MAXLEN ... ACKED trimming because callers only delete entries when streamEntryIsReferenced()==0.
Agent Prompt
### Issue description
`streamEntryIsReferenced()` currently returns `1` when `s->cgroups_ref` is NULL, which makes ACKED-based trimming treat entries as always referenced and prevents MAXLEN+ACKED deletions.
### Issue Context
`cgroups_ref` is created on demand only when linking PEL references; NULL implies there is no PEL reference index.
### Fix Focus Areas
- src/t_stream.c[2707-2711]
### Expected change
- Change `if (!s->cgroups_ref) return 1;` to `if (!s->cgroups_ref) return 0;` (so NULL means 'not referenced in PEL').
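
The expected change can be sketched in isolation as follows. Only the PEL lookup step is modeled; the types `stream_id`, `ref_index`, and `stream_stub`, and the function name `entry_is_referenced`, are hypothetical, and a flat array stands in for whatever index structure the real code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t ms, seq; } stream_id;

/* Hypothetical stand-in for the reference index: a flat array of IDs that
 * some consumer group still holds in its PEL. */
typedef struct { const stream_id *ids; size_t len; } ref_index;

/* Only the field relevant to the check is modeled. */
typedef struct { ref_index *cgroups_ref; } stream_stub;

/* Returns 1 if `id` is still pending in some group, 0 otherwise. */
int entry_is_referenced(const stream_stub *s, const stream_id *id) {
    assert(s != NULL && id != NULL);

    /* No index exists (for example, an empty PEL after a reload), so
     * nothing can be referencing the entry. */
    if (!s->cgroups_ref) return 0;

    for (size_t i = 0; i < s->cgroups_ref->len; i++) {
        const stream_id *p = &s->cgroups_ref->ids[i];
        if (p->ms == id->ms && p->seq == id->seq) return 1;
    }
    return 0;
}
```

With the NULL case returning 0, a caller that deletes entries only when the function reports "not referenced" can still trim a stream whose index was never created.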
    proc discard_replies_every {rd count frequency discard_num} {
        if {$count % $frequency != 0} {
            for {set k 0} {$k < $discard_num} {incr k} {
                $rd read ; # Discard replies
            }
        }
    }
5. Reply discard condition inverted 🐞 Bug ⛯ Reliability
discard_replies_every drains replies when count % frequency != 0, which triggers immediately (e.g., at count=1) and tries to read thousands of replies when only 1–2 commands were issued. This will block/hang multiple unit tests using redis_deferring_client.
Agent Prompt
### Issue description
`discard_replies_every` currently discards replies when `count % frequency != 0`, which causes immediate blocking because it tries to read far more replies than have been queued.
### Issue Context
Call sites increment `count` once per loop iteration and call this helper each iteration; with the current condition, iteration 1 will attempt to read `discard_num` replies.
### Fix Focus Areas
- tests/unit/memefficiency.tcl[70-76]
### Expected change
- Change the condition to `if {$count % $frequency == 0}` so replies are drained only every `frequency` iterations.
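
To make the effect of the two conditions concrete, the small C model below tracks how many replies are outstanding and reports the first iteration at which the helper would ask for more replies than exist, which is where a blocking read would hang. It is a simulation only: `first_blocking_iteration` and the `cmds_per_iter` parameter are hypothetical, and the numbers used with it are illustrative rather than taken from the test suite.

```c
#include <assert.h>

/* Returns the first iteration (1-based) at which the helper would try to read
 * more replies than are outstanding, or 0 if that never happens within
 * `iterations`. `inverted` selects the `!= 0` condition instead of `== 0`. */
int first_blocking_iteration(int iterations, int frequency, int discard_num,
                             int cmds_per_iter, int inverted) {
    assert(frequency > 0);
    long outstanding = 0;
    for (int count = 1; count <= iterations; count++) {
        outstanding += cmds_per_iter; /* commands sent, replies now pending */
        int fire = inverted ? (count % frequency != 0)
                            : (count % frequency == 0);
        if (!fire) continue;
        if (discard_num > outstanding) return count; /* read would block */
        outstanding -= discard_num;
    }
    return 0;
}
```

For example, with 2 commands per iteration, a frequency of 10, and 20 replies drained per firing, the `== 0` form never over-reads, while the inverted form over-reads on the very first iteration.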
Benchmark PR from agentic-review-benchmarks#6