Skip to content

Reachable assertion in ZSTD_rescaleFreqs (bitCost < scaleLog) when CDict built with optimal-parser strategy + tiny dictionary #4681

@OwenSanzas

Description

@OwenSanzas

Summary

ZSTD_rescaleFreqs() in lib/compress/zstd_opt.c contains the invariant
assert(bitCost < scaleLog) on the FSE-derived matchlength / literal-length /
offset frequency tables, with scaleLog = 10. When a caller builds a CDict
via the public API ZSTD_createCDict_advanced() from a small custom
dictionary buffer and a ZSTD_compressionParameters configuration that
selects an optimal-parser strategy (ZSTD_btopt, ZSTD_btultra, or
ZSTD_btultra2), the first compression call can reach a state where
FSE_getMaxNbBits() returns tableLog + 1 == 10 for at least one symbol
(documented as the "fake cost" path for zero-frequency symbols in
lib/common/fse.h:474). That value equals scaleLog, so the strict-less-than
assertion fires and abort() is called.

Affected version: zstd 1.6.0 / HEAD 5233c58e6ca0b1c4c6b353ad79649191ed195bdc.

PoC

Trigger file

crash_input — 446 bytes. The harness consumes 11 control bytes from the
tail of the input (flags, level, dict/source selectors, individual
ZSTD_compressionParameters fields). The remaining bytes are split into a
dictionary prefix and a source buffer. For this specific input the harness
ends up calling ZSTD_createCDict_advanced with cParams whose strategy
falls in the optimal-parser range and with a small dictionary buffer; the
first call to ZSTD_compress_usingCDict then enters ZSTD_rescaleFreqs and
aborts.

How to generate

Any test driver that exercises libzstd's public CDict API with:

  • a small dictionary buffer (tens to a few hundred bytes is sufficient — the
    point is that the dictionary's encoded FSE NCount has zero-frequency
    symbols within 0..MaxML),
  • ZSTD_compressionParameters with strategy in {ZSTD_btopt, ZSTD_btultra, ZSTD_btultra2} (or equivalently compressionLevel >= 19 via
    ZSTD_getCParams()),
  • a non-empty source buffer to actually trigger compression.

Equivalent minimal pseudocode:

ZSTD_compressionParameters cp = ZSTD_getCParams(22 /*=btultra2 territory*/,
                                                 /*srcSize=*/ src_len,
                                                 /*dictSize=*/ dict_len);
ZSTD_CDict* cdict = ZSTD_createCDict_advanced(dict, dict_len,
                                              ZSTD_dlm_byCopy, ZSTD_dct_auto,
                                              cp, ZSTD_defaultCMem);
ZSTD_CCtx* cctx = ZSTD_createCCtx();
ZSTD_compress_usingCDict(cctx, dst, dst_cap, src, src_len, cdict);
/* aborts inside ZSTD_rescaleFreqs */

The attached crash_input is a concrete 446-byte realization of this.


Trigger Method 1: AGF fuzz harness (libFuzzer driver, reliable repro)

dictionary_loader.c is the harness:

#define ZSTD_STATIC_LINKING_ONLY

#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#include "zstd.h"

#define MAX_DICT_SIZE ((size_t)65536)
#define MAX_SOURCE_SIZE ((size_t)4096)
#define MAX_WINDOW_LOG 17U
#define MAX_TABLE_LOG 17U

static uint8_t readByteFromEnd(const uint8_t *data, size_t *end) {
    if (*end == 0) return 0;
    *end -= 1;
    return data[*end];
}
static unsigned read16FromEnd(const uint8_t *data, size_t *end) {
    unsigned b0 = readByteFromEnd(data, end);
    unsigned b1 = readByteFromEnd(data, end);
    return b0 | (b1 << 8);
}
static size_t minSize(size_t a, size_t b) { return a < b ? a : b; }
static unsigned boundedByte(unsigned v, unsigned lo, unsigned hi) {
    return lo + (v % (hi - lo + 1));
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    size_t end = size;
    uint8_t emptyBuffer[1] = {0};

    unsigned flags            = readByteFromEnd(data, &end);
    unsigned levelByte        = readByteFromEnd(data, &end);
    unsigned dictSelector     = read16FromEnd(data, &end);
    unsigned sourceSelector   = read16FromEnd(data, &end);
    unsigned windowLogByte    = readByteFromEnd(data, &end);
    unsigned hashLogByte      = readByteFromEnd(data, &end);
    unsigned chainLogByte     = readByteFromEnd(data, &end);
    unsigned searchLogByte    = readByteFromEnd(data, &end);
    unsigned minMatchByte     = readByteFromEnd(data, &end);
    unsigned targetLengthByte = readByteFromEnd(data, &end);
    unsigned strategyByte     = readByteFromEnd(data, &end);

    size_t payloadSize     = end;
    size_t dictLimit       = minSize(payloadSize, MAX_DICT_SIZE);
    size_t dictSize        = dictLimit == 0 ? 0 : (size_t)(dictSelector % (unsigned)(dictLimit + 1));
    size_t sourceAvailable = payloadSize - dictSize;
    size_t sourceLimit     = minSize(sourceAvailable, MAX_SOURCE_SIZE);
    size_t sourceSize      = sourceLimit == 0 ? 0 : (size_t)(sourceSelector % (unsigned)(sourceLimit + 1));

    const uint8_t *dict   = data;
    const uint8_t *source = data + dictSize;
    const void *dictPtr   = dictSize   == 0 ? (const void *)emptyBuffer : (const void *)dict;
    const void *sourcePtr = sourceSize == 0 ? (const void *)emptyBuffer : (const void *)source;

    ZSTD_dictLoadMethod_e   dictLoadMethod  = (flags & 1U) ? ZSTD_dlm_byRef : ZSTD_dlm_byCopy;
    ZSTD_dictContentType_e  dictContentType = (ZSTD_dictContentType_e)((flags >> 1) % 3U);

    ZSTD_compressionParameters cParams;
    if ((flags & 8U) != 0) {
        int compressionLevel = (int)(levelByte % 16U) - 3;
        cParams = ZSTD_getCParams(compressionLevel, (unsigned long long)sourceSize, dictSize);
    } else {
        cParams.windowLog    = boundedByte(windowLogByte,    ZSTD_WINDOWLOG_MIN, MAX_WINDOW_LOG);
        cParams.hashLog      = boundedByte(hashLogByte,      ZSTD_HASHLOG_MIN,   MAX_TABLE_LOG);
        cParams.chainLog     = boundedByte(chainLogByte,     ZSTD_CHAINLOG_MIN,  MAX_TABLE_LOG);
        cParams.searchLog    = boundedByte(searchLogByte,    ZSTD_SEARCHLOG_MIN, 6U);
        cParams.minMatch     = boundedByte(minMatchByte,     ZSTD_MINMATCH_MIN,  ZSTD_MINMATCH_MAX);
        cParams.targetLength = targetLengthByte;
        cParams.strategy     = (ZSTD_strategy)(ZSTD_fast + (strategyByte % (unsigned)(ZSTD_btultra2 - ZSTD_fast + 1)));
        cParams = ZSTD_adjustCParams(cParams, (unsigned long long)sourceSize, dictSize);
    }

    ZSTD_CDict *cdict = ZSTD_createCDict_advanced(dictPtr, dictSize,
                                                  dictLoadMethod, dictContentType,
                                                  cParams, ZSTD_defaultCMem);
    if (cdict != NULL) {
        ZSTD_CCtx *cctx = ZSTD_createCCtx();
        if (cctx != NULL) {
            size_t dstCapacity = ZSTD_compressBound(sourceSize);
            void *dst = malloc(dstCapacity);
            if (dst != NULL) {
                (void)ZSTD_compress_usingCDict(cctx, dst, dstCapacity,
                                               sourcePtr, sourceSize, cdict);
                free(dst);
            }
            (void)ZSTD_freeCCtx(cctx);
        }
    }
    (void)ZSTD_freeCDict(cdict);
    return 0;
}

Build (libFuzzer driver, ASan, asserts enabled):

# In the zstd source tree, after building libzstd.a with -DDEBUGLEVEL=1
clang -O1 -g \
      -fsanitize=address,fuzzer \
      -I lib -DZSTD_STATIC_LINKING_ONLY \
      dictionary_loader.c lib/libzstd.a \
      -o fuzz_dict
./fuzz_dict crash_input

Output:

zstd_opt.c:195: void ZSTD_rescaleFreqs(...): Assertion `bitCost < scaleLog' failed.
==ERROR: libFuzzer: deadly signal
    #11 ZSTD_rescaleFreqs              lib/compress/zstd_opt.c:195:21
    #12 ZSTD_compressBlock_opt_generic lib/compress/zstd_opt.c:1115:5
    #13 ZSTD_compressBlock_opt2        lib/compress/zstd_opt.c:1451:12
    #14 ZSTD_buildSeqStore             lib/compress/zstd_compress.c:3444:26
    #19 LLVMFuzzerTestOneInput         dictionary_loader.c:138:23

This reproduces on every run.


Trigger Method 2: Standalone driver (no libFuzzer)

app.c is a minimal main() that forwards the crash file straight into
LLVMFuzzerTestOneInput:

#include <stdio.h>
#include <stdlib.h>
extern int LLVMFuzzerTestOneInput(const unsigned char*, size_t);
int main(int argc, char**argv) {
    FILE *f = fopen(argv[1], "rb");
    fseek(f, 0, SEEK_END); long n = ftell(f); fseek(f, 0, SEEK_SET);
    unsigned char *b = malloc(n); fread(b, 1, n, f); fclose(f);
    fprintf(stderr, "[*] %ld bytes\n", n);
    LLVMFuzzerTestOneInput(b, (size_t)n);
    fprintf(stderr, "[*] no crash\n");
    return 0;
}

Build (ASan must match how libzstd.a was built — our libzstd.a is
ASan-instrumented, so the standalone driver must be linked with
-fsanitize=address too):

clang -O1 -g \
      -fsanitize=address \
      -I lib -DZSTD_STATIC_LINKING_ONLY \
      dictionary_loader.c app.c lib/libzstd.a \
      -o repro
ASAN_OPTIONS=detect_leaks=0 ./repro crash_input

Output: identical assertion at zstd_opt.c:195. In our environment this
reproduces 10/10 runs.

Honesty note (carried over from VERIFY.md): an earlier internal verification
pass reported this standalone reproducer as intermittent. On a fresh
rebuild we cannot reproduce the intermittence; we believe it was a build /
state artifact. If maintainers see flakiness anyway, the following loop
should reliably surface the crash within a few iterations:

for i in $(seq 1 50); do
    ASAN_OPTIONS=detect_leaks=0 ./repro crash_input && break
done

Suggested Fix

Two complementary options; either alone would resolve the crash, both
together would be the cleanest.

(a) Tighten the guard, or clamp. Inside ZSTD_rescaleFreqs (zstd_opt.c:176-210),
the FSE branches iterate over the full symbol range even though
FSE_getMaxNbBits() is documented to return tableLog + 1 for
zero-frequency symbols. Since the max FSE tableLog is MLFSELog == 9 == scaleLog - 1, that "fake cost" exactly hits scaleLog. The minimal patch
is to clamp:

    for (ml=0; ml<=MaxML; ml++) {
        U32 const scaleLog = 10;
        U32 bitCost = FSE_getMaxNbBits(mlstate.symbolTT, ml);
        /* FSE_getMaxNbBits() returns tableLog+1 as a "fake cost" for
         * zero-frequency symbols; tableLog can be MLFSELog == scaleLog-1,
         * which violates the strict-less-than invariant of this scale. */
        if (bitCost >= scaleLog) bitCost = scaleLog - 1;
        optPtr->matchLengthFreq[ml] = bitCost ? 1 << (scaleLog-bitCost) : 1;
        optPtr->matchLengthSum += optPtr->matchLengthFreq[ml];
    }

Apply the same clamp at lines 183 (litLength) and 207 (offCode). The
behavior on bitCost == scaleLog becomes matchLengthFreq[ml] = 1 — i.e.
identical to today's release-build behavior, but now also valid in debug
builds.

(b) Gate the dict-derived FSE path on the per-stream repeat modes, not just HUF.
The outer guard at zstd_opt.c:158 only checks huf.repeatMode == HUF_repeat_valid. The FSE branches should additionally check
fse.matchlength_repeatMode == FSE_repeat_valid (and likewise for litlength
/ offcode) before iterating the FSE table. When the per-stream repeat mode
is FSE_repeat_check (i.e. the dict NCount had zero-frequency entries), the
code should fall through to the "no dictionary stats" branch at line 212
for that specific stream instead of trusting the FSE table to cover the
full symbol range.

Option (b) is the more principled fix because it matches the design intent
of ZSTD_dictNCountRepeat() in zstd_compress.c:5071-5083, which already
labels these tables as "not fully valid" — ZSTD_rescaleFreqs is just
ignoring that label today.

Happy to send a PR for either approach if maintainers indicate a preference.

Credit

Aisle Research (Ze Sheng, Dmitrijs Trizna, Luigino Camastra, Guido Vranken)

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions