Cap recursion depth in text-format readers and unify depth_guard#2662
Merged
Conversation
The variant-reader depth guard does not cover every recursive path into a generic value: dotted keys (a.a.a...=1) recurse through resolve_nested_map and table headers ([a.a.a...]) through ensure_map_path, one native frame per path segment, with nothing bounding the descent. Both overflow the stack on adversarial input reachable from any untrusted TOML. Take the same depth_guard at the top of each helper so all recursive paths share one depth budget. Adds tests for the dotted-key and table-header cases plus a within-limit dotted-key round-trip.
The existing recursion tests sit far from the limit or are trivially shallow, and only exercise the single-category direct flow path. Adds a boundary test tied to glz::max_recursive_depth_limit (exactly the limit parses, one deeper errors) and a block-mapping value test that routes through the speculative block-mapping probe, with a within-limit round-trip.
skip_flow_content recurses one frame per delimiter on nested flow
collections with alternating brackets ("[{[{..."); same-delimiter nesting
("[[[[") loops without recursing, so only the alternating form was
unbounded. Skipping an unknown key's value (error_on_unknown_keys=false)
or scanning a tagged-variant discriminator reaches this, so untrusted
input could overflow the stack.
Take the shared depth_guard at the top of skip_flow_content. Adds a
regression test that skips an alternating-delimiter value.
The BSON and JSONB readers each carried their own depth_guard struct, byte-identical to glz::depth_guard (core/context.hpp) apart from the JSONB copy's non-noexcept constructor. Drop both local copies and use the shared guard so there is a single implementation. No behavior change: the bson_to_json call sites already resolve unqualified depth_guard to the shared type via enclosing-namespace lookup; the read call sites switch from qualified to unqualified. The bson and jsonb depth tests continue to pass.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Builds on #2660 by @uwezkhan (its commit is included here). That PR caps recursion depth in the YAML and TOML generic/variant readers so deeply nested flow collections (
[[[…,{{{…) and TOML inline arrays/tables can't overflow the stack on untrusted input. This branch extends the same protection to the recursion paths #2660 did not reach, and removes duplication.Added on top of #2660
resolve_nested_map(a.a.a…=1) andensure_map_path([a.a.a…]) recurse one frame per path segment with nothing bounding the descent — both still overflowed the stack. Guarded with the shareddepth_guard.skip_flow_content. Skipping an unknown key (error_on_unknown_keys=false) or scanning a tagged discriminator whose value is alternating[{[{…recursed one frame per delimiter and overflowed. Now bounded. (Same-delimiter[[[[already looped without recursing.)depth_guard. The BSON and JSONB readers each carried a near-identical copy of the guard; both now use the singleglz::depth_guardincore/context.hpp. No behavior change.glz::max_recursive_depth_limit, block-mapping speculative-path coverage, plus TOML dotted-key / table-header and a YAML skip regression test.Verification
All local (Apple Clang, C++23): yaml 668/668, toml 389/389, bson 120/120, jsonb 151/151. Each new guard was confirmed against a standalone repro that crashed (SIGSEGV) before and returns
exceeded_max_recursive_depthafter.