guard eetf_to_json reads against end of input #2659

Merged
stephenberry merged 3 commits into stephenberry:main from uwezkhan:eetf-json-end-guards on Jun 21, 2026

@uwezkhan

Contributor

term_to_json_value reads each map key and the list tail through get_type, which forwards the raw iterator to ei_get_type with no bounds check. The list and tuple paths run through write_sequence, which calls invalid_end right after its advance, but the ERL_MAP_EXT branch and the ERL_LIST_EXT tail read skip that check, and eetf_to_json calls decode_version before confirming the buffer is non-empty. A truncated map whose header declares an arity but carries no entries, a list missing its NIL tail, or an empty buffer each leave the iterator at end and the next ei_* read runs past it.

Before, those inputs returned whatever error the over-read happened to produce (or faulted against an unmapped page); after, each ei read is gated on it < end and they return unexpected_end / no_read_input deterministically. The guards sit in eetf_to_json next to the reads they protect rather than in the shared ei wrappers, because ei_get_type and ei_decode_version must read the tag byte to do their work and cannot validate end on their own. read_eetf already rejects empty input upstream, so only the eetf_to_json entry point needed the version guard.
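
As a rough illustration of the guards described above, here is a minimal sketch of gating a tag read and the version read on `it < end`. It does not use the actual glaze or ei code: `error_code`, `guarded_tag`, `check_version`, and the `version_mismatch` value are hypothetical names made up for this example, and only the error names `no_read_input` and `unexpected_end` come from the description.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical error codes; only the names no_read_input and unexpected_end
// come from the description above.
enum class error_code { none, no_read_input, unexpected_end, version_mismatch };

// Returns the tag byte at `it` only when a byte is actually available.
inline std::optional<std::uint8_t> guarded_tag(const std::uint8_t* it, const std::uint8_t* end,
                                               error_code& ec)
{
   assert(it <= end); // precondition: the iterator never passes end
   if (it == end) {
      ec = error_code::unexpected_end; // nothing left to read
      return std::nullopt;
   }
   return *it;
}

// Entry-point check: reject an empty buffer before decoding the version byte.
inline error_code check_version(const std::uint8_t* it, const std::uint8_t* end)
{
   if (it == end) return error_code::no_read_input;
   constexpr std::uint8_t version_magic = 131; // external term format version byte
   return *it == version_magic ? error_code::none : error_code::version_mismatch;
}
```

The point of the sketch is only that the check happens at the call site, before the byte is dereferenced, so a truncated buffer yields a fixed error instead of an over-read.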

The list-tail and map-key guards added in this branch only prove one byte (the
tag) is present, but get_type -> ei_get_type then reads a 2-4 byte length header
off the raw pointer for header-bearing tags, over-reading past end when the tag
is the final byte (verified with a guard page: SIGSEGV on a truncated improper
tail or map key). Read the tag with a single-byte peek instead: a proper list
tail only accepts ERL_NIL_EXT, and is_string/is_atom classify the raw map-key
tag while term_to_json_value re-reads and bounds-checks the full key.
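
The single-byte peek for the list tail could look roughly like the sketch below. It is an assumption-laden illustration rather than the code from this branch: `tail_status` and `classify_list_tail` are hypothetical names, and the mapping from each status to an error code is taken from the regression tests listed in this message.

```cpp
#include <cassert>
#include <cstdint>

// Tag values from the Erlang external term format.
constexpr std::uint8_t nil_ext = 106;
constexpr std::uint8_t list_ext = 108;

// Hypothetical result type for this sketch.
enum class tail_status { proper, improper, truncated };

// Looks at exactly one byte, so a header-bearing tag sitting on the final
// byte of the buffer is never followed into its (missing) length header.
inline tail_status classify_list_tail(const std::uint8_t* it, const std::uint8_t* end)
{
   assert(it <= end); // precondition: the iterator never passes end
   if (it == end) return tail_status::truncated; // maps to unexpected_end
   // A proper list tail is only ever NIL_EXT; anything else is improper
   // (reported as array_element_not_found in the tests).
   return *it == nil_ext ? tail_status::proper : tail_status::improper;
}
```

Because the function never reads past the tag byte, a buffer that ends on a `LIST_EXT` tag is classified as improper without touching the bytes where its length header would be.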

Also fix decode_number casting the scratch value through the forwarding-reference
type T instead of the decayed value type V; static_cast<T> forms a reference cast
that fails to compile where int64_t is long long while long is the same width
(macOS/LLP64). Matches the existing float branch.
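
A reduced sketch of the cast problem, assuming a forwarding-reference parameter and a `long` scratch value. `store_number` and `usage` are hypothetical names for illustration and are not the real decode_number signature.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// T deduces to an lvalue reference (for example std::int64_t&) when the
// caller passes an lvalue, so T itself is not a usable cast target.
template <class T>
void store_number(T&& value, long scratch)
{
   using V = std::decay_t<T>; // the plain value type, e.g. std::int64_t
   // static_cast<T>(scratch) would ask for a reference to the scratch
   // variable, which is ill-formed when std::int64_t is long long and the
   // scratch is long. Casting to V performs a value conversion instead.
   value = static_cast<V>(scratch);
}

inline void usage()
{
   std::int64_t out = 0;
   store_number(out, 42L); // T deduces to std::int64_t&
   assert(out == 42);
}
```

On platforms where `std::int64_t` is `long`, both forms compile, which is why the problem only shows up where `std::int64_t` is `long long`.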

Add regression tests: empty buffer -> no_read_input; truncated map header and
truncated key tag -> unexpected_end; list missing its NIL tail -> unexpected_end
(with a valid-list counterpart); improper list tail -> array_element_not_found.
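
For orientation, the byte-level shapes of such inputs might look like the sketch below. These buffers are assumptions built from the external term format layout, not the fixtures used in the actual tests, and the variable names are hypothetical. The truncated map follows the PR description's case of a header that declares an arity but carries no entries.

```cpp
#include <array>
#include <cstdint>

// Tag values from the Erlang external term format.
constexpr std::uint8_t version_magic = 131;
constexpr std::uint8_t small_integer_ext = 97;
constexpr std::uint8_t nil_ext = 106;
constexpr std::uint8_t list_ext = 108;
constexpr std::uint8_t map_ext = 116;

// Version byte, MAP_EXT tag, 4-byte big-endian arity of 1, then no entries.
constexpr std::array<std::uint8_t, 6> truncated_map{version_magic, map_ext, 0, 0, 0, 1};

// One-element list [1] whose trailing NIL_EXT has been cut off.
constexpr std::array<std::uint8_t, 8> list_missing_nil{
   version_magic, list_ext, 0, 0, 0, 1, small_integer_ext, 1};

// The valid counterpart: the same list with its NIL_EXT tail.
constexpr std::array<std::uint8_t, 9> valid_list{
   version_magic, list_ext, 0, 0, 0, 1, small_integer_ext, 1, nil_ext};

// Improper list [1 | 2]: the tail is a small integer rather than NIL_EXT.
constexpr std::array<std::uint8_t, 10> improper_list{
   version_magic, list_ext, 0, 0, 0, 1, small_integer_ext, 1, small_integer_ext, 2};
```

The empty-buffer case needs no bytes at all: a zero-length input is expected to return no_read_input.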

@stephenberry merged commit 5668dd5 into stephenberry:main on Jun 21, 2026
54 checks passed