Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
48 changes: 26 additions & 22 deletions src/object.c
Original file line number Diff line number Diff line change
Expand Up @@ -1202,48 +1202,51 @@ size_t streamRadixTreeMemoryUsage(rax *rax) {
return size;
}

/* Returns the size in bytes consumed by the key's value in RAM.
/* Returns the size in bytes consumed by the object header, key and value in RAM.
* Note that the returned value is just an approximation, especially in the
* case of aggregated data types where only "sample_size" elements
* are checked and averaged to estimate the total size. */
#define OBJ_COMPUTE_SIZE_DEF_SAMPLES 5 /* Default sample size. */
size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) {
size_t kvobjComputeSize(robj *key, kvobj *o, size_t sample_size, int dbid) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. kvobjcomputesize not static 📘 Rule violation ⛯ Reliability

kvobjComputeSize is a file-local helper but is declared without static, unnecessarily exporting
the symbol. This violates the requirement to declare internal helper functions as static to avoid
symbol pollution.
Agent Prompt
## Issue description
`kvobjComputeSize` is a helper function defined in `src/object.c` but it is not declared `static`, which exports the symbol unnecessarily.

## Issue Context
Per the compliance requirement, helper/internal functions that are only used within a single source file should be `static` to prevent symbol pollution.

## Fix Focus Areas
- src/object.c[1210-1210]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

dict *d;
dictIterator di;
struct dictEntry *de;
size_t asize = 0, elesize = 0, elecount = 0, samples = 0;
size_t elesize = 0, elecount = 0, samples = 0;

/* All kv-objects has at least kvobj header and embedded key */
size_t asize = malloc_usable_size((void *)o);
Comment on lines +1216 to +1217
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

2. Non-portable usable-size call 🐞 Bug ⛯ Reliability

kvobjComputeSize() uses malloc_usable_size(o) directly, bypassing Redis’ allocator abstraction;
under USE_JEMALLOC/USE_TCMALLOC this can be undefined (wrong symbol / wrong allocator) or fail to
compile on platforms without malloc_usable_size.
Agent Prompt
### Issue description
`kvobjComputeSize()` calls `malloc_usable_size()` directly, bypassing the project allocator abstraction. This can break builds (symbol not available) or return invalid sizes / crash when memory was allocated via jemalloc/tcmalloc APIs.

### Issue Context
This repo uses `zmalloc_size()` / `zmalloc_usable_size()` to map to `je_malloc_usable_size`, `tc_malloc_size`, `malloc_size`, etc. `kvobjComputeSize()` should use that abstraction.

### Fix Focus Areas
- src/object.c[1216-1218]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


if (o->type == OBJ_STRING) {
if(o->encoding == OBJ_ENCODING_INT) {
asize = sizeof(*o);
/* Value already counted (reuse the "ptr" in header to store int) */
} else if(o->encoding == OBJ_ENCODING_RAW) {
asize = sdsZmallocSize(o->ptr)+sizeof(*o);
asize += sdsZmallocSize(o->ptr);
} else if(o->encoding == OBJ_ENCODING_EMBSTR) {
asize = zmalloc_size((void *)o);
/* Value already counted (Value embedded in the object as well) */
} else {
serverPanic("Unknown string encoding");
}
} else if (o->type == OBJ_LIST) {
if (o->encoding == OBJ_ENCODING_QUICKLIST) {
quicklist *ql = o->ptr;
quicklistNode *node = ql->head;
asize = sizeof(*o)+sizeof(quicklist);
asize += sizeof(quicklist);
do {
elesize += sizeof(quicklistNode)+zmalloc_size(node->entry);
elecount += node->count;
samples++;
} while ((node = node->next) && samples < sample_size);
asize += (double)elesize/elecount*ql->count;
asize += (double)elesize/samples*ql->count;
Comment on lines 1234 to +1239
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

3. Quicklist average miscomputed 🐞 Bug ✓ Correctness

For OBJ_LIST quicklist encoding, kvobjComputeSize() divides sampled bytes by sampled nodes (samples)
but multiplies by total elements (ql->count), overestimating list memory usage by roughly the
average elements-per-node factor.
Agent Prompt
### Issue description
The quicklist memory estimation computes an average per *node* (`elesize/samples`) but multiplies it by total *elements* (`ql->count`), causing inflated MEMORY USAGE for lists.

### Issue Context
Within the sampling loop, `samples` increments once per node, while `elecount` accumulates elements across sampled nodes (`node->count`). `ql->count` is the total number of elements.

### Fix Focus Areas
- src/object.c[1231-1240]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

} else if (o->encoding == OBJ_ENCODING_LISTPACK) {
asize = sizeof(*o)+zmalloc_size(o->ptr);
asize += zmalloc_size(o->ptr);
} else {
serverPanic("Unknown list encoding");
}
} else if (o->type == OBJ_SET) {
if (o->encoding == OBJ_ENCODING_HT) {
d = o->ptr;
dictInitIterator(&di, d);
asize = sizeof(*o)+sizeof(dict)+(sizeof(struct dictEntry*)*dictBuckets(d));
asize += sizeof(dict) + (sizeof(struct dictEntry*) * dictBuckets(d));
while((de = dictNext(&di)) != NULL && samples < sample_size) {
sds ele = dictGetKey(de);
elesize += dictEntryMemUsage(0) + sdsZmallocSize(ele);
Expand All @@ -1252,20 +1255,20 @@ size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) {
dictResetIterator(&di);
if (samples) asize += (double)elesize/samples*dictSize(d);
} else if (o->encoding == OBJ_ENCODING_INTSET) {
asize = sizeof(*o)+zmalloc_size(o->ptr);
asize += zmalloc_size(o->ptr);
} else if (o->encoding == OBJ_ENCODING_LISTPACK) {
asize = sizeof(*o)+zmalloc_size(o->ptr);
asize += zmalloc_size(o->ptr);
} else {
serverPanic("Unknown set encoding");
}
} else if (o->type == OBJ_ZSET) {
if (o->encoding == OBJ_ENCODING_LISTPACK) {
asize = sizeof(*o)+zmalloc_size(o->ptr);
asize += zmalloc_size(o->ptr);
} else if (o->encoding == OBJ_ENCODING_SKIPLIST) {
d = ((zset*)o->ptr)->dict;
zskiplist *zsl = ((zset*)o->ptr)->zsl;
zskiplistNode *znode = zsl->header->level[0].forward;
asize = sizeof(*o)+sizeof(zset)+sizeof(zskiplist)+sizeof(dict)+
asize += sizeof(zset) + sizeof(zskiplist) + sizeof(dict) +
(sizeof(struct dictEntry*)*dictBuckets(d))+
zmalloc_size(zsl->header);
while(znode != NULL && samples < sample_size) {
Expand All @@ -1280,14 +1283,14 @@ size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) {
}
} else if (o->type == OBJ_HASH) {
if (o->encoding == OBJ_ENCODING_LISTPACK) {
asize = sizeof(*o)+zmalloc_size(o->ptr);
asize += zmalloc_size(o->ptr);
} else if (o->encoding == OBJ_ENCODING_LISTPACK_EX) {
listpackEx *lpt = o->ptr;
asize = sizeof(*o) + zmalloc_size(lpt) + zmalloc_size(lpt->lp);
asize += zmalloc_size(lpt) + zmalloc_size(lpt->lp);
} else if (o->encoding == OBJ_ENCODING_HT) {
d = o->ptr;
dictInitIterator(&di, d);
asize = sizeof(*o)+sizeof(dict)+(sizeof(struct dictEntry*)*dictBuckets(d));
asize += sizeof(dict) + (sizeof(struct dictEntry*) * dictBuckets(d));
while((de = dictNext(&di)) != NULL && samples < sample_size) {
hfield ele = dictGetKey(de);
sds ele2 = dictGetVal(de);
Expand All @@ -1302,7 +1305,7 @@ size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) {
}
} else if (o->type == OBJ_STREAM) {
stream *s = o->ptr;
asize = sizeof(*o)+sizeof(*s);
asize += sizeof(*s);
asize += streamRadixTreeMemoryUsage(s->rax);

/* Now we have to add the listpacks. The last listpack is often non
Expand All @@ -1312,7 +1315,8 @@ size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) {
raxIterator ri;
raxStart(&ri,s->rax);
raxSeek(&ri,"^",NULL,0);
size_t lpsize = 0, samples = 0;
size_t lpsize = 0;
size_t samples = 0;
while(samples < sample_size && raxNext(&ri)) {
unsigned char *lp = ri.data;
/* Use the allocated size, since we overprovision the node initially. */
Expand All @@ -1323,7 +1327,7 @@ size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) {
asize += lpsize;
} else {
if (samples) lpsize /= samples; /* Compute the average. */
asize += lpsize * (s->rax->numele-1);
asize += lpsize * s->rax->numele;
/* No need to check if seek succeeded, we enter this branch only
* if there are a few elements in the radix tree. */
raxSeek(&ri,"$",NULL,0);
Comment on lines 1327 to 1333
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

4. Stream listpacks double-counted 🐞 Bug ✓ Correctness

For streams with more listpacks than the sampling budget, kvobjComputeSize() adds average listpack
size multiplied by s->rax->numele and then adds the last listpack’s real size again, overcounting
stream memory usage.
Agent Prompt
### Issue description
The stream listpack estimation counts `numele` listpacks via the average and then adds the last listpack size again, inflating MEMORY USAGE for streams.

### Issue Context
The code explicitly adds the last listpack after doing an average-based multiplication, and the surrounding comment describes estimating N-1 and then adding the last.

### Fix Focus Areas
- src/object.c[1311-1337]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

Expand Down Expand Up @@ -1364,7 +1368,7 @@ size_t objectComputeSize(robj *key, robj *o, size_t sample_size, int dbid) {
raxStop(&ri);
}
} else if (o->type == OBJ_MODULE) {
asize = moduleGetMemUsage(key, o, sample_size, dbid);
asize += moduleGetMemUsage(key, o, sample_size, dbid);
} else {
serverPanic("Unknown object type");
}
Expand Down Expand Up @@ -1780,7 +1784,7 @@ NULL
addReplyNull(c);
return;
}
size_t usage = objectComputeSize(c->argv[2], (robj *)kv, samples, c->db->id);
size_t usage = kvobjComputeSize(c->argv[2], kv, samples, c->db->id);
addReplyLongLong(c,usage);
} else if (!strcasecmp(c->argv[1]->ptr,"stats") && c->argc == 2) {
struct redisMemOverhead *mh = getMemoryOverheadData();
Expand Down
2 changes: 1 addition & 1 deletion tests/unit/type/hash.tcl
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ start_server {tags {"hash"}} {
create_hash myhash $contents
assert_encoding $type myhash

# coverage for objectComputeSize
# coverage for kvobjComputeSize
assert_morethan [memory_usage myhash] 0

test "HRANDFIELD - $type" {
Expand Down
2 changes: 1 addition & 1 deletion tests/unit/type/list.tcl
Original file line number Diff line number Diff line change
Expand Up @@ -2241,7 +2241,7 @@ foreach {pop} {BLPOP BLMPOP_RIGHT} {
set k [r lrange k 0 -1]
set dump [r dump k]

# coverage for objectComputeSize
# coverage for kvobjComputeSize
assert_morethan [memory_usage k] 0

config_set sanitize-dump-payload no mayfail
Expand Down
27 changes: 27 additions & 0 deletions tests/unit/type/string.tcl
Original file line number Diff line number Diff line change
Expand Up @@ -684,6 +684,33 @@ if {[string match {*jemalloc*} [s mem_allocator]]} {
lappend res [r get bar]
} {12 12}

# coverage for kvobjComputeSize
test {MEMORY USAGE - STRINGS} {
set sizes {1 5 8 15 16 17 31 32 33 63 64 65 127 128 129 255 256 257}
set hdrsize [expr {[s arch_bits] == 32 ? 12 : 16}]

foreach ksize $sizes {
set key [string repeat "k" $ksize]
# OBJ_ENCODING_EMBSTR, OBJ_ENCODING_RAW
foreach vsize $sizes {
set value [string repeat "v" $vsize]
r set $key $value
set memory_used [r memory usage $key]
set min [expr $hdrsize + $ksize + $vsize]
assert_lessthan_equal $min $memory_used
set max [expr {32 > $min ? 64 : [expr $min * 2]}]
assert_morethan_equal $max $memory_used
}

# OBJ_ENCODING_INT
foreach value {1 100 10000 10000000} {
r set $key $value
set min [expr $hdrsize + $ksize]
assert_lessthan_equal $min [r memory usage $key]
}
}
}

if {[string match {*jemalloc*} [s mem_allocator]]} {
test {Check MEMORY USAGE for embedded key strings with jemalloc} {

Expand Down
Loading