fuzz: add initial fuzzing harness to flux-security using AFL++ and fix detected issues #221

Merged
mergify[bot] merged 16 commits into flux-framework:master from grondo:fuzzer on Apr 16, 2026
Contributor

@grondo grondo commented Apr 12, 2026

This PR adds an initial AFL++ fuzzing infrastructure to flux-security and fixes bugs discovered during an initial fuzzing campaign.

Status: WIP to let the initial campaign run a bit longer and to ensure we want to actually merge all the infrastructure.

The initial fuzzing was performed on 4 high-value functions that are used in IMP privileged code with possibly untrusted user or system data:

  • flux_sign_unwrap() / flux_sign_unwrap_noverify() - IMP validates signatures from untrusted guest users
  • kv_decode() - IMP decodes data from unprivileged child processes
  • cf_*() (libutil/cf.h) - Configuration parser. While config files are permission-checked, defense-in-depth requires robustness against malformed input

What's Included

Fuzzing Infrastructure

  • AFL++ build integration with AddressSanitizer
  • fuzzing harnesses for parsers and crypto functions in src/fuzz
  • Orchestration tooling (scripts/fuzz.py) for managing multi-fuzzer campaigns
  • Documentation and corpus management

Security Fixes

  • 21 parser hang vectors eliminated (TOML validation). Where possible, fixes were made in the tomltk wrapper since the upstream libtomlc99 parser is no longer maintained.
  • 15 heap corruption bugs fixed (use-after-free, buffer overflows)
  • NULL pointer crash fixed (empty payload handling)
  • Additional hardening from static analysis

Results

The initial fuzzing campaign ran for >12 hrs and was stopped after no new edges had been detected for several hours.

  • All discovered crashes fixed
  • All parser hangs eliminated
  • No API changes required
  • Continuous fuzzing infrastructure in place for ongoing testing


codecov Bot commented Apr 12, 2026

Codecov Report

❌ Patch coverage is 92.85714% with 13 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.75%. Comparing base (c54f1e7) to head (a0df671).
⚠️ Report is 18 commits behind head on master.

Files with missing lines Patch % Lines
src/libutil/tomltk.c 92.72% 12 Missing ⚠️
src/imp/privsep.c 50.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #221      +/-   ##
==========================================
+ Coverage   83.91%   84.75%   +0.83%     
==========================================
  Files          38       38              
  Lines        4975     5137     +162     
==========================================
+ Hits         4175     4354     +179     
+ Misses        800      783      -17     
Files with missing lines Coverage Δ
src/imp/exec/safe_popen.c 64.61% <ø> (ø)
src/lib/sign.c 94.44% <100.00%> (+0.11%) ⬆️
src/libca/ca.c 80.53% <100.00%> (ø)
src/libca/sigcert.c 82.29% <100.00%> (ø)
src/libtomlc99/toml.c 84.41% <100.00%> (+1.83%) ⬆️
src/imp/privsep.c 75.90% <50.00%> (ø)
src/libutil/tomltk.c 87.28% <92.72%> (+9.51%) ⬆️

Comment thread src/libutil/tomltk.c Fixed
Comment thread src/libutil/tomltk.c Fixed
Comment thread src/libutil/tomltk.c Fixed
Comment thread src/libutil/tomltk.c Fixed
Comment thread src/libutil/tomltk.c Fixed
@grondo grondo force-pushed the fuzzer branch 9 times, most recently from 334d65e to 67b14e7 Compare April 15, 2026 20:17
@grondo grondo changed the title WIP: add initial fuzzing harness to flux-security using AFL++ and fix detected issues fuzz: add initial fuzzing harness to flux-security using AFL++ and fix detected issues Apr 16, 2026
Contributor Author

grondo commented Apr 16, 2026

Fuzzers have run for 72 hours and the last new path was found >6 hrs ago, so I think it is probably time to call the first fuzzing campaign complete. This PR is now ready for a review.

Member

@garlick garlick left a comment

This is really awesome. Great work!

grondo and others added 16 commits April 16, 2026 13:51
Problem: There is no mechanism to build AFL++ fuzzing harnesses
for security-critical parsers that process untrusted input in
privileged contexts.

Add --enable-fuzzing configure option to build AFL++ fuzzing
harnesses. Requires CC to be set to an AFL compiler
(afl-clang-fast, afl-gcc, etc.).

When enabled, adds src/fuzz/ subdirectory containing fuzzing
harnesses that target privilege escalation attack surfaces in
the IMP (Independent Minister of Privilege).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: The IMP processes untrusted data (KV format, signed
payloads) in privileged contexts, creating privilege escalation
attack surface. Parser bugs could allow unprivileged users to
compromise the system.

Add four AFL++ fuzzing harnesses targeting these surfaces:

- fuzz_sign_unwrap_noverify.c: Parser-only fuzzing of
  flux_sign_unwrap() with FLUX_SIGN_NOVERIFY flag. Speed: ~180k
  execs/sec. Primary fuzzer for HEADER.PAYLOAD.SIGNATURE format.

- fuzz_sign_unwrap.c: Full fuzzing with signature verification.
  Speed: ~20-50k execs/sec. Tests crypto integration paths.

- fuzz_kv.c: Direct fuzzing of KV format parser used in privsep
  pipe communication between unprivileged child and privileged
  parent. Speed: ~200k execs/sec.

- fuzz_cf.c: Fuzzing of the libutil/cf interface which exercises
  TOML config parsing and TOML to JSON conversion.
  Speed: 90-150k execs/sec

All harnesses use AFL++ persistent mode (__AFL_LOOP) for high
throughput and include configuration auto-detection with fallback
paths. Input size limited to 1MB to prevent memory exhaustion.

The sign.toml configuration enables the "none" mechanism for
parser-focused fuzzing without crypto overhead.
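
For readers unfamiliar with persistent mode, the harness shape described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/fuzz: `target_parse()` and `fuzz_one_input()` are hypothetical stand-ins for a real target such as kv_decode(), and the 1MB cap mirrors the limit stated in the commit message. The `main()` is compiled only when an AFL++ compiler that provides the persistent-mode macros is in use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_INPUT_SIZE (1024 * 1024) /* 1MB cap, as stated in the commit message */

/* Hypothetical stand-in for the real target (e.g. kv_decode()):
 * returns 0 if the input contains a '=' byte, -1 otherwise. */
static int target_parse(const uint8_t *data, size_t len)
{
    return memchr(data, '=', len) ? 0 : -1;
}

/* Run one test case. Empty and oversized inputs are skipped, not parsed. */
int fuzz_one_input(const uint8_t *data, size_t len)
{
    if (len == 0 || len > MAX_INPUT_SIZE)
        return -1;
    return target_parse(data, len);
}

#ifdef __AFL_HAVE_MANUAL_CONTROL
/* Only compiled by AFL++ compilers that support persistent mode. */
__AFL_FUZZ_INIT();

int main(void)
{
    __AFL_INIT();
    unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;

    /* Reuse one process for many inputs instead of fork/exec per input. */
    while (__AFL_LOOP(10000)) {
        size_t len = (size_t)__AFL_FUZZ_TESTCASE_LEN;
        fuzz_one_input(buf, len);
    }
    return 0;
}
#endif
```

Keeping the per-input logic in its own function means the same code can be exercised by ordinary unit tests when the file is built without AFL++.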

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: Running AFL++ requires manual setup of multiple parallel
fuzzer instances, corpus generation, monitoring dashboards, and
crash triage. This complexity creates barriers to effective
security testing.

Add comprehensive documentation for AFL++ fuzzing infrastructure:

- FUZZING.md: Reference to src/fuzz/README.md for getting started.

- src/fuzz/README.md: Full fuzzing documentation.

- src/fuzz/FUZZING-COVERAGE-ANALYSIS.md: Attack surface inventory
  analyzing all code paths where untrusted input reaches the
  privileged IMP parent. Documents that CLI arguments and environment
  variables are processed in unprivileged child and cross privilege
  boundary only as structured KV/signed data formats.

- src/fuzz/COVERAGE-NOTES.md: Explains AFL++ coverage metrics and
  why 3-5% coverage is expected for parser-only fuzzing (it exercises
  100% of parser code, which is only a fraction of total linked code).

Add unified Python CLI tool (fuzz.py) for managing AFL++ fuzzing:

- fuzz.py start: Launch 4 parallel fuzzers in single tmux session.
  Generates corpus if missing.
- fuzz.py stop: Stop all fuzzers
- fuzz.py watch: Live dashboard showing fuzzer stats
- fuzz.py triage: Interactive crash triage with ASAN/UBSAN support

The tool auto-detects AFL++ installation and project root, requires
no external dependencies (Python 3.6+ stdlib only), and handles
fuzzer lifecycle including session management and crash analysis.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: When parsing signed payloads with an empty PAYLOAD section
(e.g., "HEADER..SIGNATURE" with double dots), the base64 decoder
is called with a NULL pointer, causing a UBSAN violation. This bug
was discovered by AFL++ fuzzing with the input "A..none".

In payload_decode_cpy(), when srclen=0, dstlen=0, grow_buf()
returns success without allocating, leaving *buf=NULL. Then
sodium_base642bin() is called with NULL destination pointer.

Similarly, header_decode() should reject empty headers since
they cannot contain required version and mechanism fields.

Handle empty payload sections by returning early before base64
decoding. This is valid behavior as payloads can be empty.
Reject empty headers as invalid since headers must contain
metadata.
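
The policy described above (empty header rejected, empty payload accepted without invoking the decoder) might look roughly like the following. This is an illustrative sketch only: `split_signed()` and `decode_section()` are hypothetical names, and the "decoder" here just copies bytes rather than calling sodium_base642bin().

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Locate the sections of "HEADER.PAYLOAD.SIGNATURE".
 * Returns 0 on success, -1 if a dot is missing or the header is empty.
 * An empty PAYLOAD is accepted and reported as *payload_len == 0. */
int split_signed(const char *input, size_t *header_len, size_t *payload_len)
{
    const char *dot1 = strchr(input, '.');
    if (!dot1)
        return -1;
    const char *dot2 = strchr(dot1 + 1, '.');
    if (!dot2)
        return -1;
    if (dot1 == input)          /* empty header cannot carry metadata */
        return -1;
    *header_len = (size_t)(dot1 - input);
    *payload_len = (size_t)(dot2 - dot1 - 1);
    return 0;
}

/* Hypothetical stand-in for the base64 step (copies bytes unchanged).
 * Returns early on empty input so the decoder never sees a NULL buffer. */
int decode_section(const char *src, size_t srclen, char **buf, size_t *len)
{
    *len = 0;
    if (srclen == 0)
        return 0;               /* empty is valid: *buf is left untouched */
    char *p = realloc(*buf, srclen);
    if (!p)
        return -1;
    memcpy(p, src, srclen);
    *buf = p;
    *len = srclen;
    return 0;
}
```

With this shape, the fuzzer input "A..none" splits into a one-byte header and a zero-length payload, and decoding the payload succeeds without allocating.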

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: There is no test coverage for empty PAYLOAD sections in
signed payloads, which previously caused a NULL pointer bug.

Add test case in test_badpayload() for empty PAYLOAD section
handling (double dots in "HEADER..SIGNATURE" format).

This regression test verifies the fix for a NULL pointer bug
discovered by AFL++ fuzzing, where empty base64 sections cause
sodium_base642bin() to be called with NULL destination pointer.

Test both verification and NOVERIFY code paths to ensure empty
payloads are handled correctly in all cases.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: Code formatting in a couple of spots in imp/privsep.c does
not conform to modern project coding norms.

Adjust whitespace to modernize code.
Problem: The header guard for config.h uses `#ifndef HAVE_CONFIG_H`
instead of `#if HAVE_CONFIG_H`.

Fix it.
Problem: File descriptors created by open() in ca_revoke() and
fopen_mode() do not have O_CLOEXEC set, allowing them to leak
to child processes across exec().

Add O_CLOEXEC flag to both open() calls. This prevents file
descriptor leakage of CA certificate files (which contain secrets)
and revocation files to exec'd processes, completing the defense-in-
depth improvements identified in the FD leak audit.
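
As a small illustration of the flag in question, the sketch below opens a file with O_CLOEXEC and then reads the descriptor flags back with fcntl(). The helper names `open_cloexec()` and `is_cloexec()` are made up for this example and are not functions in flux-security.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Open a file read-only with close-on-exec set atomically at open time,
 * so there is no window in which a concurrent fork/exec can inherit it. */
int open_cloexec(const char *path)
{
    return open(path, O_RDONLY | O_CLOEXEC);
}

/* Returns 1 if fd is marked close-on-exec, 0 if not, -1 on error. */
int is_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);
    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Setting the flag in the open() call itself, rather than with a later fcntl(F_SETFD), avoids a race in multi-threaded programs.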

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: toml_rtoi() reads one byte before the allocated string
buffer without bounds checking. When parse_array() calls STRNDUP()
to allocate a new string and then valtype() calls toml_rtoi(), the
expression `s[-1]` can read before the heap buffer into allocator
metadata, triggering ASan errors.

The bug is at the start of toml_rtoi():

    if (s[-1] == '_') return -1;  // No bounds check

Add bounds check before accessing s[-1]:

    if (s > src && s[-1] == '_') return -1;

This ensures we don't read before the original src pointer.

This issue was discovered by AFL++ fuzzing of the cf interface.
7 unique crash inputs were found, all triggering SIGABRT.
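
The guarded look-behind pattern can be shown in isolation. The function below is a hypothetical helper written for this note, not part of libtomlc99; it checks that every underscore in a numeric string sits between two digits, and it tests `s > src` before ever reading `s[-1]`.

```c
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if every '_' in src sits between two digits, else 0.
 * The `s > src` test runs before s[-1] is read, so the look-behind
 * never touches memory in front of the buffer. */
int underscores_ok(const char *src)
{
    for (const char *s = src; *s; s++) {
        if (*s != '_')
            continue;
        if (!(s > src && isdigit((unsigned char)s[-1])))
            return 0;           /* leading '_' or '_' after a non-digit */
        if (!isdigit((unsigned char)s[1]))
            return 0;           /* trailing '_' (s[1] is at worst the NUL) */
    }
    return 1;
}
```

Because `&&` short-circuits, the pointer comparison acts as the bounds check for the dereference that follows it.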

Note: libtomlc99 is unmaintained. Long-term migration to tomlc17
(the maintained successor) should be considered.
Problem: In both array_to_json() and table_to_json(), the code
incorrectly calls json_decref(val) after json_array_append_new()
or json_object_set_new() fails. These jansson functions take
ownership of the val parameter and decrement its reference count
whether they succeed or fail. Calling json_decref() after a failed
call results in a use-after-free.

Remove the incorrect json_decref(val) calls on the error path.
The jansson *_new functions handle cleanup automatically on both
success and failure paths.

This issue was discovered by AFL++ fuzzing, which found 6 unique
crash inputs triggering this bug with malformed TOML containing
duplicate keys or keys causing JSON object insertion failures.
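
The ownership rule behind this fix can be modeled without jansson. The toy types below are assumptions made purely for illustration (`value_t`, `list_append_new()`, and `add_value()` are not jansson APIs); they mimic a reference-stealing call that releases the value itself when it fails, which is why the caller must not release it again.

```c
#include <stdlib.h>

/* Toy refcounted value, NOT jansson: just enough to model ownership. */
typedef struct { int refcount; } value_t;

static int live_values = 0;     /* values allocated and not yet freed */

value_t *value_new(void)
{
    value_t *v = calloc(1, sizeof(*v));
    if (v) {
        v->refcount = 1;
        live_values++;
    }
    return v;
}

void value_decref(value_t *v)
{
    if (v && --v->refcount == 0) {
        live_values--;
        free(v);
    }
}

/* Models a reference-stealing "*_new" style call: the callee consumes the
 * caller's reference on both success and failure. `fail` forces failure. */
int list_append_new(value_t **slot, value_t *v, int fail)
{
    if (fail || *slot != NULL) {
        value_decref(v);        /* callee cleans up on failure */
        return -1;
    }
    *slot = v;                  /* container now owns the reference */
    return 0;
}

/* Correct caller: no decref after a failed call. */
int add_value(value_t **slot, int fail)
{
    value_t *v = value_new();
    if (!v)
        return -1;
    if (list_append_new(slot, v, fail) < 0)
        return -1;              /* v is already gone; do not touch it */
    return 0;
}
```

Adding a `value_decref(v)` on the error path of `add_value()` would reproduce the bug class described above, since the value has already been freed by the callee.
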
Problem: The scan_string() function in toml.c has a heap buffer
overflow when parsing unquoted timestamps. When processing
timestamp literals like "1979-05-27T07:32:00Z", the parser has
two bugs:

1. At line 1661, the loop continues when *p is NUL (end of input):

   for ( ; strchr("0123456789.:+-T Z", toupper(*p)); p++);

   The bug: strchr(haystack, '\0') returns a pointer to the NUL
   terminator at the end of haystack, not NULL. So when *p == '\0',
   strchr() returns non-NULL, and the loop increments p past the
   end of the buffer, reading out-of-bounds memory.

2. At line 1663, the backward loop can read before buffer start:

   for ( ; p[-1] == ' '; p--);

   The loop never checks p > orig, so if p == orig or every
   preceding byte is a space, it reads p[-1] before the buffer start.

Add explicit bounds checks:
1. Check *p before calling strchr() to stop at NUL
2. Check p > orig before accessing p[-1] to prevent underflow

This issue was discovered by AFL++ fuzzing of the cf interface.
9 unique crash inputs were found.
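
Both bounds checks can be demonstrated in a standalone function. The sketch below is not the libtomlc99 code; `timestamp_span()` is a hypothetical helper that applies the same two guards to measure the run of timestamp characters at the start of a string.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length of the leading run of timestamp characters in s, with trailing
 * spaces trimmed. Testing *p first matters: strchr(set, '\0') matches the
 * set's own terminator and returns non-NULL, so without the test the loop
 * would walk past the end of s. */
size_t timestamp_span(const char *s)
{
    const char *p = s;

    while (*p && strchr("0123456789.:+-T Z", toupper((unsigned char)*p)))
        p++;

    /* Check p > s before reading p[-1] to avoid reading before the buffer. */
    while (p > s && p[-1] == ' ')
        p--;

    return (size_t)(p - s);
}
```

For the example literal from the commit message, "1979-05-27T07:32:00Z", the function stops at the terminating NUL and returns 20.
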
Problem: AFL++ fuzzing uncovered inputs that cause libtomlc99 to hang
(running for 5+ seconds without returning). The patterns include:
embedded NUL bytes, invalid UTF-8 sequences (0x80-0xFF not in valid
multi-byte patterns), control characters (0x01-0x1F except \t, \n, \r)
outside strings, deeply nested brackets (40+ levels), adjacent
triple-quote sequences ('''''' or """"""), escaped quotes in multi-line
strings, backslashes in single-quote (literal) strings, and excessively
large inputs (>10,000 lines).

Add validate_toml_syntax() function that scans input before passing to
libtomlc99. Track state for strings (single/double quote, multi-line),
brackets (nesting depth, balance), arrays, and UTF-8 validity. Reject
patterns that trigger hangs. Track backslash escapes only in
double-quote strings (basic strings), not single-quote strings (literal
strings) per TOML spec - literal strings don't support escapes. Limits:
MAX_NESTING=32, MAX_LINES=10000. Validation completes in <0.5ms even
for pathological inputs.
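
A much-reduced sketch of such a pre-parse scan is shown below. It is not validate_toml_syntax(): the function name `prevalidate()` is hypothetical, and this version deliberately omits string, comment, and UTF-8 tracking, so brackets and control characters inside strings are treated the same as those outside. Only the two limits (MAX_NESTING=32, MAX_LINES=10000) come from the commit message.

```c
#include <stddef.h>

#define MAX_NESTING 32
#define MAX_LINES   10000

/* Simplified pre-parse scan. Returns 0 if the input passes, -1 if rejected.
 * Does not track strings, comments, or UTF-8 (unlike the real validator). */
int prevalidate(const char *buf, size_t len)
{
    int depth = 0;
    size_t lines = 1;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];

        if (c == '\0')
            return -1;                  /* embedded NUL byte */
        if (c < 0x20 && c != '\t' && c != '\n' && c != '\r')
            return -1;                  /* other control character */
        if (c == '\n' && ++lines > MAX_LINES)
            return -1;                  /* too many lines */
        if (c == '[' && ++depth > MAX_NESTING)
            return -1;                  /* nesting too deep */
        if (c == ']' && --depth < 0)
            return -1;                  /* close without matching open */
    }
    return depth == 0 ? 0 : -1;         /* unbalanced at end of input */
}
```

A single linear pass like this keeps the cost proportional to input size, which is the property that lets a validator run ahead of a parser with pathological worst cases.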

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Problem: AFL++ fuzzing discovered multiple inputs that cause
libtomlc99 to hang indefinitely. These patterns need regression
tests to ensure validation fixes remain effective.

Add comprehensive test function test_afl_hangs() with 14 test
cases covering all hang patterns discovered across both
findings-cf and fuzzer04 directories:

- Embedded NUL bytes (findings-cf id:000000, 000003, 000006)
- Invalid UTF-8 sequences: 0x92, 0x81, 0xD1, 0xFF, 0x7F
  (all findings-cf hangs plus fuzzer04 id:000011)
- Control characters: 0x04 outside strings
  (findings-cf id:000006)
- Excessive bracket nesting: 40 levels
  (fuzzer04 id:000004 had 618!)
- Adjacent triple-quote sequences: '''''' and """"""
  (fuzzer04 id:000000)
- Long repetitive content: 200+ 'J' characters
  (fuzzer04 id:000012)
- Repetitive malformed timestamp patterns
  (fuzzer04 id:000013)
- Combined multiple issues from real AFL++ findings
- Valid UTF-8 multi-byte chars (positive test: élève)
- Excessive input size >10,000 lines

Each test verifies that inputs which previously caused infinite
loops are now rejected quickly with errno=EINVAL and clear error
messages. Tests document exact hang patterns and AFL++ finding IDs
for future reference.

Changes:
- Add #include <stdlib.h> for malloc/free
- Add test_afl_hangs() function (~200 lines, 14 assertions)
- Add test_afl_hangs() call in main()
Problem: The tomltk unit tests contain intentionally malformed TOML
inputs derived from AFL++ fuzzer findings. These test strings
trigger false positive typo warnings in CI checks.

Add src/libutil/test/tomltk.c to typo checker ignore list to
suppress warnings on fuzzer-generated test data.
Problem: toml_rtod_ex() has the same heap buffer underflow bugs as
toml_rtoi(). It checks s[-1] and s[-2] without verifying bounds,
allowing reads before the buffer start when parsing floats starting
with underscores or dots.

At line 2055, the code checks s[-2] after s++, which can read
before the buffer if the float starts with a dot. At line 2070,
the code checks s[-1] without verifying s > src.

Add bounds checks:
- Line 2055: Check s - 2 >= src before accessing s[-2]
- Line 2070: Check s > src before accessing s[-1]

This matches the fix applied to toml_rtoi() in an earlier commit.
Problem: The heap-use-after-free fix in JSON conversion, the
filename preservation fix in tomltk_parse_file(), the buffer
overflow fixes in libtomlc99 number/timestamp parsing, and the
validation layer (bracket balance, comment handling, multi-line
string tracking, UTF-8 truncation detection, escape handling, literal
string handling) lack test coverage at the public API level.

Add tests through the tomltk interface (not libtomlc99 directly,
since it's vendored code we'll replace):

test_json_conversion() (5 tests):
- Nested arrays and tables to exercise array_to_json() and
  table_to_json() code paths
- Array of arrays to test recursion in array_to_json()
- Mixed type array to ensure error paths don't crash
- Verifies the heap-use-after-free fix (removed incorrect
  json_decref() calls) doesn't break valid conversions

test_parse_file_errors() (2 tests):
- Creates temp file with invalid UTF-8
- Verifies tomltk_parse_file() preserves filename in error struct
  when validation fails

test_number_parsing() (12 tests):
- Integer patterns that triggered toml_rtoi() s[-1] buffer
  underflow (underscore at start, after sign, trailing)
- Float patterns that triggered toml_rtod_ex() s[-2] and s[-1]
  underflows (underscore after dot, at start, trailing)
- Timestamp edge cases that triggered scan_string() buffer
  overflow (no trailing newline, trailing spaces, minimal)
- Tests verify no crash occurs (fixes work) without depending on
  libtomlc99 validation behavior

test_multiline_strings() (6 tests):
- Multi-line double-quote strings (opening and closing)
- Multi-line single-quote strings (opening and closing)
- Nested quotes within multi-line strings
- Both types in one file
- Unterminated multi-line strings (both types)
- Exercises the triple-quote tracking logic added to prevent
  parser hangs

test_afl_hangs() additional coverage (9 tests):
- Comment character inside array value (tests in_array check)
- Unbalanced brackets, missing close (tests square_count balance)
- Unbalanced brackets, extra close (tests negative square_count)
- Truncated UTF-8 2-byte sequence (0xC2 without continuation)
- Truncated UTF-8 3-byte sequence (0xE0 with only 1 continuation)
- Truncated UTF-8 4-byte sequence (0xF0 with only 2 continuations)
- Escaped quotes in multi-line string (fuzzer04 id:000015) - tests
  escape_next tracking inside """ strings
- Backslash in literal string (fuzzer04 id:000018) - tests that
  single-quote strings don't treat \ as escape
- Complex literal string patterns (fuzzer04 id:000019) - tests
  multiple single quotes with backslashes
- Exercises validation layer bracket/comment/UTF-8/escape tracking
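
The truncated-sequence cases listed above can be illustrated with a structural UTF-8 check. This is a simplified sketch, not the tomltk validation code: `utf8_ok()` is a hypothetical name, and it checks only lead bytes and continuation counts, so overlong encodings and surrogates are not rejected.

```c
#include <stddef.h>

/* Returns 1 if buf[0..len) is structurally valid UTF-8, else 0.
 * Simplified: overlong forms and surrogates are not rejected. */
int utf8_ok(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;                    /* continuation bytes expected */

        if (c < 0x80)
            need = 0;
        else if ((c & 0xE0) == 0xC0)
            need = 1;
        else if ((c & 0xF0) == 0xE0)
            need = 2;
        else if ((c & 0xF8) == 0xF0)
            need = 3;
        else
            return 0;                   /* stray continuation or bad lead */

        if (need > len - i - 1)
            return 0;                   /* sequence truncated by end of input */
        for (size_t k = 1; k <= need; k++)
            if ((buf[i + k] & 0xC0) != 0x80)
                return 0;               /* expected a 10xxxxxx byte */
        i += need + 1;
    }
    return 1;
}
```

Checking the remaining length before reading continuation bytes is what turns a truncated sequence into a clean rejection instead of an out-of-bounds read.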

This tests the security fixes through the public tomltk API that
users actually call, so tests will remain valid when we replace
libtomlc99.

Increases test count from 40 to 73 tests (+82.5%).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor Author

grondo commented Apr 16, 2026

Thanks! I've set MWP.

@mergify mergify Bot added the queued label Apr 16, 2026
Contributor

mergify Bot commented Apr 16, 2026

Merge Queue Status

  • Entered queue 2026-04-16 21:07 UTC · Rule: default
  • Checks skipped · PR is already up-to-date
  • Merged 2026-04-16 21:07 UTC · at a0df671e0638fd06ba3cb60d480c15904caccbb7

This pull request spent 12 seconds in the queue, including 1 second running CI.

Required conditions to merge

@mergify mergify Bot merged commit 60f81e2 into flux-framework:master Apr 16, 2026
25 of 26 checks passed
@mergify mergify Bot removed the queued label Apr 16, 2026
@grondo grondo deleted the fuzzer branch April 16, 2026 21:07
3 participants