Bound non-null-terminated value scanners against end of input #2636

Merged: stephenberry merged 2 commits into main from nnt-bounds-hardening on Jun 17, 2026
@stephenberry stephenberry commented Jun 16, 2026

Owner

Summary

A class of JSON value scanners dereferences *it without an end check, relying on the trailing '\0' sentinel that only exists for null-terminated buffers. When opts.null_terminated = false (no sentinel), truncated or boundary input reads past the end of the buffer, which ASAN reports as a heap-buffer-overflow on an exact-size buffer.

This is the same bug class as the recent minified skip_ws end-guard work, but in scanners that never route through skip_ws, so they were not covered. Each fix preserves null-terminated behavior — the gated guards compile out entirely, and the few loop restructures are behavior-equivalent — and bounds only the non-null-terminated path.

Fixes

| Scanner | File | Reached via |
| --- | --- | --- |
| skip_number (non-validating) | util/parse.hpp | glz::raw_json field / value skip |
| skip_number_with_validation | util/parse.hpp | get_view_json / validate_json numbers |
| number_of_array_elements | json/read.hpp | resizable non-emplace_back arrays (e.g. std::forward_list) |
| skip_string (non-padded, validating) | util/parse.hpp | get_view_json / validate_json strings |
| NDJSON read_new_lines | json/ndjson.hpp | read_ndjson with null_terminated = false |
| handle_slice (JMESPath array slices) | json/jmespath.hpp | read_jmespath slices with null_terminated = false |
| enum-by-name reader (reflective) | json/read.hpp | reflect_enums enum values (C++26 P2996) with null_terminated = false |

  • skip_number / skip_string: gated via if constexpr (not Opts.null_terminated) (their opts structs gain a null_terminated field); zero overhead on the default path.
  • skip_number_with_validation: unconditional it != end guards on the standalone *it reads (the find_if_not scans were already end-bounded).
  • number_of_array_elements: if constexpr (not Opts.null_terminated) end-checks before each dereference; compiles out by default.
  • NDJSON read_new_lines: gated; the null-terminated path is unchanged.
  • handle_slice (both the tuple and resizable-array overloads): a local at_end() guard — if constexpr (not Opts.null_terminated), so it compiles out by default — runs before each structural ] / , scan. Covers the partial-read path, the read-all fallback (negative step / negative indices), and the skip-to-slice-start pre-scan.
  • enum-by-name reader: the key scan had its operands reversed (*it dereferenced before the it != end bound); swapped, plus a guard on the post-scan ++it for an unterminated key. Gated behind GLZ_REFLECTION26 + reflect_enums.
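
The operand-order fix in the last bullet comes down to short-circuit evaluation: the bound must be the left operand of `&&` so the dereference is only evaluated when it is safe. A minimal sketch, assuming a hypothetical `scan_key` helper (the actual reader is structured differently):

```cpp
// Hypothetical helper for illustration: scans a key body up to its closing quote.
// Returns a pointer one past the closing quote, or nullptr when the input ends
// before a closing quote is found (an unterminated key).
inline const char* scan_key(const char* it, const char* end) noexcept
{
   // Bound first, dereference second. Writing `*it != '"' && it != end`
   // instead would read *it even when it == end.
   while (it != end && *it != '"') {
      ++it;
   }
   if (it == end) {
      return nullptr; // guard the post-scan step: there is no quote to step over
   }
   return it + 1; // step past the closing quote
}
```

Returning an error for the unterminated case is what lets a test assert on the result rather than depend on a sanitizer to notice the over-read.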

Scope note

This hardens the null_terminated = false contract. Reading a non-null-terminated buffer under default options (null_terminated = true, e.g. a std::string_view/std::span over an exact-size region with no terminator) remains unsupported, as it is for the rest of the reader — the value parsers rely on the sentinel there.

Tests

  • non_null_terminated_scanner_bounds (tests/json_test): exercises each scanner above at its buffer boundary (truncated and complete inputs over exact-size buffers).
  • non_null_terminated_slice_bounds (tests/jmespath): feeds every truncation prefix of a complete array through an exact-size buffer (tuple and vector targets, partial-read and read-all paths); the complete array still resolves.
  • reflect_enums non-null-terminated bounds (tests/p2996_test): reads an unterminated reflective enum key over an exact-size buffer and asserts an error. With the unbounded scan, a valid name lacking its closing quote over-reads and wrongly succeeds, so the error assertion pins the bounded behavior even where that CI job runs without sanitizers; it runs in the dedicated C++26 reflection CI.

The JSON suites run under the existing ASAN CI job, which is what catches the over-reads. Verified locally: json_test passes under ASAN (685 tests, 0 failures), and the jmespath suite passes under ASAN (18 tests, 136 asserts); each pre-fix over-read was reproduced and is resolved.
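
For readers unfamiliar with the truncation-prefix technique used by these suites, here is a rough, self-contained sketch. Both `scan_quoted` (a stand-in bounded scanner) and `count_accepted_prefixes` are hypothetical names made up for this example; the real tests drive the library's readers instead.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>

// Hypothetical bounded scanner standing in for the code under test:
// returns true only if [it, end) starts with a complete quoted string.
inline bool scan_quoted(const char* it, const char* end) noexcept
{
   if (it == end || *it != '"') {
      return false;
   }
   ++it;
   while (it != end && *it != '"') {
      if (*it == '\\') {
         ++it; // move onto the escaped character
         if (it == end) {
            return false; // input ended right after the backslash
         }
      }
      ++it;
   }
   return it != end; // true only when a closing quote was found
}

// Runs `scan` over every prefix of `complete`. Each prefix is copied into a heap
// buffer of exactly that length (no terminator), so a sanitizer can flag any read
// at or past `end`. Returns how many prefixes the scanner accepted.
template <class Scan>
std::size_t count_accepted_prefixes(std::string_view complete, Scan&& scan)
{
   std::size_t accepted = 0;
   for (std::size_t n = 0; n <= complete.size(); ++n) {
      auto buf = std::make_unique<char[]>(n); // exact-size allocation
      if (n != 0) {
         std::memcpy(buf.get(), complete.data(), n);
      }
      if (scan(buf.get(), buf.get() + n)) {
         ++accepted;
      }
   }
   return accepted;
}
```

For a single complete value, the expectation is that only the full-length prefix is accepted and every shorter prefix returns an error without touching memory outside its buffer.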

Several JSON value scanners dereference *it without an end check, relying on
the trailing '\0' sentinel that only exists for null-terminated buffers. On a
non-null-terminated buffer (opts.null_terminated = false) there is no
sentinel, so truncated or boundary input reads one or more bytes past the end
(heap-buffer-overflow under ASAN). Each fix leaves the null-terminated fast
path byte-for-byte unchanged and bounds only the non-null-terminated path:

- skip_number (non-validating): gate the digit scan on it < end when not
  null-terminated (skip_number_opts gains a null_terminated field).
- skip_number_with_validation: guard each standalone *it read with it != end
  (the find_if_not scans were already end-bounded).
- number_of_array_elements: bound the element pre-scan loop on it != end.
- skip_string (non-padded, validating): bound the scan loop and the
  post-backslash read (skip_string_opts gains a null_terminated field).
- NDJSON read_new_lines: bound the inter-record newline scan on it != end.

Adds a non_null_terminated_scanner_bounds suite exercising each scanner at its
buffer boundary; these run under the ASAN CI job.
…end of input

Two more value scanners share the bug class fixed in the previous commit: they
dereference *it without an end check, relying on the trailing '\0' sentinel that
only exists for null-terminated buffers. On a non-null-terminated buffer
(opts.null_terminated = false) truncated input reads past the end of the buffer
(heap-buffer-overflow under ASAN).

- jmespath handle_slice (both the tuple and resizable-array overloads): the
  structural ']' / ',' scans now route through a local at_end() guard that
  compiles out entirely when null_terminated. Covers the partial-read path, the
  read-all fallback (negative step / negative indices), and the
  skip-to-slice-start pre-scan.
- enum-by-name reader (GLZ_REFLECTION26 reflect_enums path): the key scan had
  its operands reversed (*it dereferenced before the it != end bound); swap them
  and guard the post-scan ++it against an unterminated key.

Tests:
- non_null_terminated_slice_bounds (tests/jmespath): feeds every truncation
  prefix of a complete array through an exact-size buffer (tuple and vector
  targets, partial-read and read-all paths). Runs under the ASAN CI job; each
  pre-fix over-read was reproduced at jmespath.hpp and is resolved.
- reflect_enums non-null-terminated bounds (tests/p2996_test): reads an
  unterminated reflective enum key over an exact-size buffer and asserts an
  error. With the unbounded scan a valid name lacking its closing quote
  over-reads and wrongly succeeds, so this pins the bounded behavior; it runs in
  the dedicated C++26 reflection CI.
@stephenberry stephenberry merged commit 6352691 into main Jun 17, 2026
53 checks passed
@stephenberry stephenberry deleted the nnt-bounds-hardening branch June 17, 2026 14:57