fuzz: add initial fuzzing harness to flux-security using AFL++ and fix detected issues#221
Merged
mergify[bot] merged 16 commits into flux-framework:master from Apr 16, 2026
Codecov Report❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #221 +/- ##
==========================================
+ Coverage 83.91% 84.75% +0.83%
==========================================
Files 38 38
Lines 4975 5137 +162
==========================================
+ Hits 4175 4354 +179
+ Misses 800 783 -17
334d65e to
67b14e7
Compare
Contributor
Author
|
Fuzzers have run for 72 hours and the last new path was found >6 hrs ago, so I think it is probably time to call the first fuzzing campaign complete. This PR is now ready for a review. |
garlick
approved these changes
Apr 16, 2026
Member
garlick
left a comment
This is really awesome. Great work!
Problem: There is no mechanism to build AFL++ fuzzing harnesses for security-critical parsers that process untrusted input in privileged contexts.

Add --enable-fuzzing configure option to build AFL++ fuzzing harnesses. Requires CC to be set to an AFL compiler (afl-clang-fast, afl-gcc, etc.). When enabled, adds src/fuzz/ subdirectory containing fuzzing harnesses that target privilege escalation attack surfaces in the IMP (Independent Minister of Privilege).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: The IMP processes untrusted data (KV format, signed payloads) in privileged contexts, creating privilege escalation attack surface. Parser bugs could allow unprivileged users to compromise the system.

Add four AFL++ fuzzing harnesses targeting these surfaces:

- fuzz_sign_unwrap_noverify.c: Parser-only fuzzing of flux_sign_unwrap() with FLUX_SIGN_NOVERIFY flag. Speed: ~180k execs/sec. Primary fuzzer for HEADER.PAYLOAD.SIGNATURE format.
- fuzz_sign_unwrap.c: Full fuzzing with signature verification. Speed: ~20-50k execs/sec. Tests crypto integration paths.
- fuzz_kv.c: Direct fuzzing of KV format parser used in privsep pipe communication between unprivileged child and privileged parent. Speed: ~200k execs/sec.
- fuzz_cf.c: Fuzzing of the libutil/cf interface which exercises TOML config parsing and TOML to JSON conversion. Speed: 90-150k execs/sec.

All harnesses use AFL++ persistent mode (__AFL_LOOP) for high throughput and include configuration auto-detection with fallback paths. Input size limited to 1MB to prevent memory exhaustion.

The sign.toml configuration enables the "none" mechanism for parser-focused fuzzing without crypto overhead.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
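The persistent-mode pattern and the 1MB input cap described in this commit can be sketched roughly as follows. This is an illustration rather than the harness code from the PR: `target_parse()` and `fuzz_one()` are hypothetical stand-ins, and the shared-memory test case macros (`__AFL_FUZZ_INIT`, `__AFL_FUZZ_TESTCASE_BUF`, `__AFL_FUZZ_TESTCASE_LEN`) are an assumption about how input is delivered; the actual harnesses may read input differently. The `main()` is only compiled when an AFL++ compiler wrapper defines those macros.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors the 1MB input cap described in the commit message. */
#define MAX_INPUT_SIZE (1024 * 1024)

/* Hypothetical stand-in for the parser under test: accepts input with
 * exactly two '.' separators, as in HEADER.PAYLOAD.SIGNATURE.
 */
static int target_parse (const uint8_t *data, size_t len)
{
    int dots = 0;

    for (size_t i = 0; i < len; i++) {
        if (data[i] == '.')
            dots++;
    }
    return dots == 2 ? 0 : -1;
}

/* Run one test case. Returns -2 if the input is skipped for being too
 * large, otherwise the result of the parser.
 */
static int fuzz_one (const uint8_t *data, size_t len)
{
    if (len > MAX_INPUT_SIZE)
        return -2;
    return target_parse (data, len);
}

#ifdef __AFL_FUZZ_TESTCASE_LEN
/* Only compiled under an AFL++ compiler wrapper such as afl-clang-fast. */
__AFL_FUZZ_INIT();

int main (void)
{
    __AFL_INIT();
    /* Must be fetched after __AFL_INIT() and before __AFL_LOOP(). */
    unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;

    /* Process up to 10000 inputs per process before AFL++ restarts it. */
    while (__AFL_LOOP(10000)) {
        int len = __AFL_FUZZ_TESTCASE_LEN;
        fuzz_one (buf, (size_t)len);
    }
    return 0;
}
#endif
```

Keeping the per-input logic in a separate function such as `fuzz_one()` also makes it possible to replay a crashing input from an ordinary unit test without AFL++.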
Problem: Running AFL++ requires manual setup of multiple parallel fuzzer instances, corpus generation, monitoring dashboards, and crash triage. This complexity creates barriers to effective security testing.

Add comprehensive documentation for AFL++ fuzzing infrastructure:

- FUZZING.md: Reference to src/fuzz/README.md for getting started.
- src/fuzz/README.md: Full fuzzing documentation.
- src/fuzz/FUZZING-COVERAGE-ANALYSIS.md: Attack surface inventory analyzing all code paths where untrusted input reaches the privileged IMP parent. Documents that CLI arguments and environment variables are processed in unprivileged child and cross privilege boundary only as structured KV/signed data formats.
- src/fuzz/COVERAGE-NOTES.md: Explains AFL++ coverage metrics and why 3-5% coverage is correct for parser-only fuzzing (tests 100% of parser code, which is only a fraction of total linked code)

Add unified Python CLI tool (fuzz.py) for managing AFL++ fuzzing:

- fuzz.py start: Launch 4 parallel fuzzers in single tmux session. Generates corpus if missing.
- fuzz.py stop: Stop all fuzzers
- fuzz.py watch: Live dashboard showing fuzzer stats
- fuzz.py triage: Interactive crash triage with ASAN/UBSAN support

The tool auto-detects AFL++ installation and project root, requires no external dependencies (Python 3.6+ stdlib only), and handles fuzzer lifecycle including session management and crash analysis.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: When parsing signed payloads with empty PAYLOAD section (e.g., "HEADER..SIGNATURE" with double dots), the base64 decoder is called with NULL pointer, causing UBSAN violation. This bug is discovered by AFL++ fuzzing with input: "A..none"

In payload_decode_cpy(), when srclen=0, dstlen=0, grow_buf() returns success without allocating, leaving *buf=NULL. Then sodium_base642bin() is called with NULL destination pointer.

Similarly, header_decode() should reject empty headers since they cannot contain required version and mechanism fields.

Handle empty payload sections by returning early before base64 decoding. This is valid behavior as payloads can be empty. Reject empty headers as invalid since headers must contain metadata.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
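The early-return approach described in this commit might look something like the sketch below. All names here are hypothetical: `decode_into()` is a stand-in for a real decoder that requires a non-NULL destination (it just copies bytes to keep the example self-contained), and `payload_decode()` and `header_check()` only illustrate the guard, not the actual flux-security functions.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a real decoder that must never be handed a
 * NULL destination. It copies bytes unchanged to keep the sketch small.
 */
static int decode_into (char *dst, size_t dstlen,
                        const char *src, size_t srclen)
{
    if (!dst || dstlen < srclen) {
        errno = EINVAL;
        return -1;
    }
    memcpy (dst, src, srclen);
    return 0;
}

/* Decode a payload section. An empty payload is valid, so report a
 * length of zero and return before the decoder can see dst.
 */
static int payload_decode (const char *src, size_t srclen,
                           char *dst, size_t dstlen, size_t *outlen)
{
    if (srclen == 0) {
        *outlen = 0;
        return 0;
    }
    if (decode_into (dst, dstlen, src, srclen) < 0)
        return -1;
    *outlen = srclen;
    return 0;
}

/* An empty header cannot carry the required version and mechanism
 * fields, so it is rejected rather than decoded.
 */
static int header_check (size_t srclen)
{
    if (srclen == 0) {
        errno = EINVAL;
        return -1;
    }
    return 0;
}
```

The asymmetry is deliberate: an empty payload returns success with zero length, while an empty header is an error.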
Problem: There is no test coverage for empty PAYLOAD sections in signed payloads, which previously caused a NULL pointer bug.

Add test case in test_badpayload() for empty PAYLOAD section handling (double dots in "HEADER..SIGNATURE" format). This regression test verifies the fix for a NULL pointer bug discovered by AFL++ fuzzing, where empty base64 sections cause sodium_base642bin() to be called with NULL destination pointer.

Test both verification and NOVERIFY code paths to ensure empty payloads are handled correctly in all cases.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Problem: Code formatting in a couple spots in imp/privsep.c does not conform to modern project coding norms. Adjust whitespace to modernize code.
Problem: The header guard for config.h uses `#ifndef HAVE_CONFIG_H` instead of `#if HAVE_CONFIG_H`. Fix it.
Problem: File descriptors created by open() in ca_revoke() and fopen_mode() do not have O_CLOEXEC set, allowing them to leak to child processes across exec().

Add O_CLOEXEC flag to both open() calls. This prevents file descriptor leakage of CA certificate files (which contain secrets) and revocation files to exec'd processes, completing the defense-in-depth improvements identified in the FD leak audit.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
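As a minimal illustration of the O_CLOEXEC change, the sketch below opens a file with the flag set at open() time and then confirms it with fcntl(). The helper names `open_cloexec()` and `fd_is_cloexec()` are hypothetical and are not taken from the PR.

```c
/* Request POSIX.1-2008 definitions (O_CLOEXEC) under strict -std modes. */
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Open read-only with close-on-exec set atomically at open() time, so
 * there is no window in which a concurrent fork/exec could inherit it.
 */
static int open_cloexec (const char *path)
{
    return open (path, O_RDONLY | O_CLOEXEC);
}

/* Return 1 if fd has FD_CLOEXEC set, 0 if it does not, -1 on error. */
static int fd_is_cloexec (int fd)
{
    int flags = fcntl (fd, F_GETFD);

    if (flags < 0)
        return -1;
    return (flags & FD_CLOEXEC) ? 1 : 0;
}
```

Setting the flag in the open() call is preferable to a later fcntl(F_SETFD), which leaves a short window where the descriptor can still leak.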
Problem: toml_rtoi() reads one byte before the allocated string
buffer without bounds checking. When parse_array() calls STRNDUP()
to allocate a new string and then valtype() calls toml_rtoi(), the
expression `s[-1]` can read before the heap buffer into allocator
metadata, triggering ASan errors.
The bug is at the start of toml_rtoi():
    if (s[-1] == '_') return -1; // No bounds check
Add bounds check before accessing s[-1]:
    if (s > src && s[-1] == '_') return -1;
This ensures we don't read before the original src pointer.
This issue was discovered by AFL++ fuzzing of the cf interface.
7 unique crash inputs were found, all triggering SIGABRT.
Note: libtomlc99 is unmaintained. Long-term migration to tomlc99
(the maintained replacement) should be considered.
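The guarded look-behind from this commit can be generalized into a small helper, sketched here under the assumption that the caller knows the start of the buffer. `lookback()` and `prev_is_underscore()` are hypothetical names used for illustration; they are not part of libtomlc99.

```c
#include <stddef.h>

/* Hypothetical helper: return the character n positions before s, or
 * '\0' if that position would fall before src (the start of the buffer).
 */
static char lookback (const char *src, const char *s, size_t n)
{
    if (s < src || (size_t)(s - src) < n)
        return '\0';
    return *(s - n);
}

/* Guarded equivalent of the `s[-1] == '_'` test quoted above. */
static int prev_is_underscore (const char *src, const char *s)
{
    return lookback (src, s, 1) == '_';
}
```

With `s == src` the helper never dereferences `s[-1]`, which is the case that previously read into allocator metadata.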
Problem: In both array_to_json() and table_to_json(), the code incorrectly calls json_decref(val) after json_array_append_new() or json_object_set_new() fails. These jansson functions take ownership of the val parameter and decrement its reference count whether they succeed or fail. Calling json_decref() after a failed call results in a use-after-free.

Remove the incorrect json_decref(val) calls on the error path. The jansson *_new functions handle cleanup automatically on both success and failure paths.

This issue was discovered by AFL++ fuzzing, which found 6 unique crash inputs triggering this bug with malformed TOML containing duplicate keys or keys causing JSON object insertion failures.
Problem: The scan_string() function in toml.c has a heap buffer
overflow when parsing unquoted timestamps. When processing
timestamp literals like "1979-05-27T07:32:00Z", the parser has
two bugs:
1. At line 1661, the loop continues when *p is NUL (end of input):
       for ( ; strchr("0123456789.:+-T Z", toupper(*p)); p++);
   The bug: strchr(haystack, '\0') returns a pointer to the NUL
   terminator at the end of haystack, not NULL. So when *p == '\0',
   strchr() returns non-NULL, and the loop increments p past the
   buffer end, reading uninitialized memory.
2. At line 1663, the backward loop can read before buffer start:
       for ( ; p[-1] == ' '; p--);
   If there are no trailing spaces, this reads p[-1] before
   checking if p > orig, potentially reading before the buffer.
Add explicit bounds checks:
1. Check *p before calling strchr() to stop at NUL
2. Check p > orig before accessing p[-1] to prevent underflow
This issue was discovered by AFL++ fuzzing of the cf interface.
9 unique crash inputs were found.
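Both bounds checks from this commit can be seen together in the following sketch. `scan_timestamp()` is a hypothetical function written for illustration, not the patched scan_string(); it reuses the character set quoted in the commit message.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return the length of the timestamp-like run at the start of s, with
 * any trailing spaces trimmed off.
 */
static size_t scan_timestamp (const char *s)
{
    const char *p = s;

    /* Test *p first: strchr (set, '\0') matches the set's own
     * terminator and would otherwise keep the loop running past the
     * end of the input.
     */
    while (*p && strchr ("0123456789.:+-T Z", toupper ((unsigned char)*p)))
        p++;
    /* Trim trailing spaces without stepping before the start of s. */
    while (p > s && p[-1] == ' ')
        p--;
    return (size_t)(p - s);
}
```

An input consisting only of spaces exercises the second guard: the backward loop stops at the start of the buffer and the function returns 0.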
Problem: AFL++ fuzzing uncovered inputs that cause libtomlc99 to hang
indefinitely (5+ seconds). The patterns include: embedded NULL bytes,
invalid UTF-8 sequences (0x80-0xFF not in valid multi-byte patterns),
control characters (0x01-0x1F except \t,\n,\r) outside strings, deeply
nested brackets (40+ levels), adjacent triple-quote sequences (''''''
or """"""), escaped quotes in multi-line strings, backslashes in
single-quote (literal) strings, and excessively large inputs (>10,000
lines).
Add validate_toml_syntax() function that scans input before passing to
libtomlc99. Track state for strings (single/double quote, multi-line),
brackets (nesting depth, balance), arrays, and UTF-8 validity. Reject
patterns that trigger hangs. Track backslash escapes only in
double-quote strings (basic strings), not single-quote strings (literal
strings) per TOML spec - literal strings don't support escapes. Limits:
MAX_NESTING=32, MAX_LINES=10000. Validation completes in <0.5ms even
for pathological inputs.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
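A heavily simplified pre-scan in the spirit of this commit is sketched below. It is not the PR's validate_toml_syntax(): `prescan()` and `utf8_tail_len()` are hypothetical names, the sketch does no string or quote tracking (so brackets inside strings are counted too), and its UTF-8 check covers only lead bytes and continuation counts. The MAX_NESTING and MAX_LINES values are the ones stated in the commit message.

```c
#include <stddef.h>

#define MAX_NESTING 32
#define MAX_LINES 10000

/* Number of continuation bytes expected after lead byte c, or -1 if c
 * cannot start a UTF-8 sequence.
 */
static int utf8_tail_len (unsigned char c)
{
    if (c < 0x80)
        return 0;
    if (c >= 0xC2 && c <= 0xDF)
        return 1;
    if (c >= 0xE0 && c <= 0xEF)
        return 2;
    if (c >= 0xF0 && c <= 0xF4)
        return 3;
    return -1; /* stray continuation byte, 0xC0, 0xC1, or 0xF5..0xFF */
}

/* Return 0 if buf passes the pre-scan, -1 if it should be rejected. */
static int prescan (const char *buf, size_t len)
{
    int depth = 0;
    size_t lines = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)buf[i];
        int tail = utf8_tail_len (c);

        if (tail < 0)
            return -1;
        /* Control characters other than tab, LF, CR (includes NUL). */
        if (c < 0x20 && c != '\t' && c != '\n' && c != '\r')
            return -1;
        if (c == '\n' && ++lines > MAX_LINES)
            return -1;
        if (c == '[' && ++depth > MAX_NESTING)
            return -1;
        if (c == ']' && depth > 0)
            depth--;
        /* Consume continuation bytes, rejecting truncated sequences. */
        while (tail-- > 0) {
            if (++i >= len || ((unsigned char)buf[i] & 0xC0) != 0x80)
                return -1;
        }
    }
    return 0;
}
```

Because the scan is a single pass with constant work per byte, its cost grows linearly with input size, which is what makes a pre-validation layer cheap relative to the parser hangs it prevents.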
Problem: AFL++ fuzzing discovered multiple inputs that cause libtomlc99 to hang indefinitely. These patterns need regression tests to ensure validation fixes remain effective.

Add comprehensive test function test_afl_hangs() with 14 test cases covering all hang patterns discovered across both findings-cf and fuzzer04 directories:

- Embedded NULL bytes (findings-cf id:000000, 000003, 000006)
- Invalid UTF-8 sequences: 0x92, 0x81, 0xD1, 0xFF, 0x7F (all findings-cf hangs plus fuzzer04 id:000011)
- Control characters: 0x04 outside strings (findings-cf id:000006)
- Excessive bracket nesting: 40 levels (fuzzer04 id:000004 had 618!)
- Adjacent triple-quote sequences: '''''' and """""" (fuzzer04 id:000000)
- Long repetitive content: 200+ 'J' characters (fuzzer04 id:000012)
- Repetitive malformed timestamp patterns (fuzzer04 id:000013)
- Combined multiple issues from real AFL++ findings
- Valid UTF-8 multi-byte chars (positive test: élève)
- Excessive input size >10,000 lines

Each test verifies that inputs which previously caused infinite loops are now rejected quickly with errno=EINVAL and clear error messages. Tests document exact hang patterns and AFL++ finding IDs for future reference.

Changes:

- Add #include <stdlib.h> for malloc/free
- Add test_afl_hangs() function (~200 lines, 14 assertions)
- Add test_afl_hangs() call in main()
Problem: The tomltk unit tests contain intentionally malformed TOML inputs derived from AFL++ fuzzer findings. These test strings trigger false positive typo warnings in CI checks. Add src/libutil/test/tomltk.c to typo checker ignore list to suppress warnings on fuzzer-generated test data.
Problem: toml_rtod_ex() has the same heap buffer underflow bugs as toml_rtoi(). It checks s[-1] and s[-2] without verifying bounds, allowing reads before the buffer start when parsing floats starting with underscores or dots.

At line 2055, the code checks s[-2] after s++, which can read before the buffer if the float starts with a dot. At line 2070, the code checks s[-1] without verifying s > src.

Add bounds checks:

- Line 2055: Check s - 2 >= src before accessing s[-2]
- Line 2070: Check s > src before accessing s[-1]

This matches the fix applied to toml_rtoi() in an earlier commit.
Problem: The heap-use-after-free fix in JSON conversion, the filename preservation fix in tomltk_parse_file(), the buffer overflow fixes in libtomlc99 number/timestamp parsing, and the validation layer (bracket balance, comment handling, multi-line string tracking, UTF-8 truncation detection, escape handling, literal string handling) lack test coverage at the public API level.

Add tests through the tomltk interface (not libtomlc99 directly, since it's vendored code we'll replace):

test_json_conversion() (5 tests):
- Nested arrays and tables to exercise array_to_json() and table_to_json() code paths
- Array of arrays to test recursion in array_to_json()
- Mixed type array to ensure error paths don't crash
- Verifies the heap-use-after-free fix (removed incorrect json_decref() calls) doesn't break valid conversions

test_parse_file_errors() (2 tests):
- Creates temp file with invalid UTF-8
- Verifies tomltk_parse_file() preserves filename in error struct when validation fails

test_number_parsing() (12 tests):
- Integer patterns that triggered toml_rtoi() s[-1] buffer underflow (underscore at start, after sign, trailing)
- Float patterns that triggered toml_rtod_ex() s[-2] and s[-1] underflows (underscore after dot, at start, trailing)
- Timestamp edge cases that triggered scan_string() buffer overflow (no trailing newline, trailing spaces, minimal)
- Tests verify no crash occurs (fixes work) without depending on libtomlc99 validation behavior

test_multiline_strings() (6 tests):
- Multi-line double-quote strings (opening and closing)
- Multi-line single-quote strings (opening and closing)
- Nested quotes within multi-line strings
- Both types in one file
- Unterminated multi-line strings (both types)
- Exercises the triple-quote tracking logic added to prevent parser hangs

test_afl_hangs() additional coverage (9 tests):
- Comment character inside array value (tests in_array check)
- Unbalanced brackets, missing close (tests square_count balance)
- Unbalanced brackets, extra close (tests negative square_count)
- Truncated UTF-8 2-byte sequence (0xC2 without continuation)
- Truncated UTF-8 3-byte sequence (0xE0 with only 1 continuation)
- Truncated UTF-8 4-byte sequence (0xF0 with only 2 continuations)
- Escaped quotes in multi-line string (fuzzer04 id:000015) - tests escape_next tracking inside """ strings
- Backslash in literal string (fuzzer04 id:000018) - tests that single-quote strings don't treat \ as escape
- Complex literal string patterns (fuzzer04 id:000019) - tests multiple single quotes with backslashes
- Exercises validation layer bracket/comment/UTF-8/escape tracking

This tests the security fixes through the public tomltk API that users actually call, so tests will remain valid when we replace libtomlc99.

Increases test count from 40 to 73 tests (+82.5%).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor
Author
|
Thanks! I've set MWP. |
Contributor
Merge Queue Status
This pull request spent 12 seconds in the queue, including 1 second running CI. Required conditions to merge
|
This PR adds an initial AFL++ fuzzing infrastructure to flux-security and fixes bugs discovered during an initial fuzzing campaign.
Status: WIP to let the initial campaign run a bit longer and to ensure we want to actually merge all the infrastructure.
The initial fuzzing was performed on 4 high-value functions that are used in IMP privileged code with possibly untrusted user or system data:
- flux_sign_unwrap()/flux_sign_unwrap_noverify() - IMP validates signatures from untrusted guest users
- kv_decode() - IMP decodes data from unprivileged child processes
- cf_*() (libutil/cf.h) - Configuration parser. While config files are permission-checked, defense-in-depth requires robustness against malformed input

What's Included
Fuzzing Infrastructure
src/fuzzscripts/fuzz.py) for managing multi-fuzzer campaigns
Security Fixes
Results
The initial fuzzing campaign ran for more than 12 hours and was stopped once no new edges had been detected for several hours.