[BUG] Scanner API: infinite-loop risk for some clients... #318

@rootkid19

Summary
After emitting YAML_STREAM_END_TOKEN, subsequent calls to yaml_parser_scan() can return success (rc==1) with token.type == YAML_NO_TOKEN. Scanner clients that terminate only on rc==0 and never check for STREAM_END can therefore spin indefinitely (DoS). Clients that break on STREAM_END (the typical pattern) and users of the parser/event API are not affected.


Minimal input yaml file

---
simple: test
...

Minimal probe (C)
Logs (rc, token.type) on each iteration and aborts as soon as rc==1 with YAML_NO_TOKEN is seen after STREAM_END.

// scan_probe.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "yaml.h"

int main(void) {
    const char *yaml = "---\nsimple: test\n...";
    yaml_parser_t parser;
    yaml_token_t token;

    if (!yaml_parser_initialize(&parser)) return 1;
    yaml_parser_set_input_string(&parser, (const unsigned char*)yaml, strlen(yaml));

    int iter = 0, saw_end = 0;
    while (iter < 20) {
        int rc = yaml_parser_scan(&parser, &token);
        printf("[iter %d] rc=%d, token.type=%d\n", iter, rc, token.type);

        if (token.type == YAML_STREAM_END_TOKEN) saw_end = 1;

        /* Success + NO_TOKEN after end should not occur */
        if (rc == 1 && token.type == YAML_NO_TOKEN && saw_end) {
            fprintf(stderr, "*** rc=1 with YAML_NO_TOKEN after STREAM_END ***\n");
            abort();
        }
        if (rc == 0) break;

        yaml_token_delete(&token);
        iter++;
    }

    yaml_parser_delete(&parser);
    return 0;
}

Build & run (Linux, built-from-source libyaml)

# from libyaml repo root
./bootstrap
CC=clang CFLAGS="-g -O1 -fsanitize=address,undefined -fno-omit-frame-pointer" \
LDFLAGS="-fsanitize=address,undefined" ./configure
make -j"$(nproc)"

clang -g -O1 -fsanitize=address,undefined \
  -Iinclude harness/scan_probe.c \
  -L./src/.libs -Wl,-rpath,"$PWD"/src/.libs -lyaml \
  -o harness/scan_probe

ASAN_OPTIONS=verbosity=1:abort_on_error=1 ./harness/scan_probe

Output observed:

[iter 9]  rc=1, token.type=2   (STREAM_END)
[iter 10] rc=1, token.type=0   (YAML_NO_TOKEN)  <-- unexpected success with no token

Expected
After STREAM_END, yaml_parser_scan() should return rc==0 (or at most re-emit a terminal token once, then return 0). It should not return success with YAML_NO_TOKEN.


Scope / impact
Availability only. Affects scanner-API clients that ignore STREAM_END and rely on rc==0 to stop. Clients that break on STREAM_END are not impacted.
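
To make the difference between the two client patterns concrete, here is a minimal sketch that does not link against libyaml. It uses a hypothetical stand-in, stub_scan(), that only reproduces the return sequence reported above (ordinary tokens, then STREAM_END once, then rc==1 with no token forever). The names stub_scanner, stub_scan, drain_naive and drain_guarded are illustrative assumptions, not part of the libyaml API; the numeric values 0 and 2 mirror the token.type values in the observed output.

```c
#include <stddef.h>

/* Token kinds for the model. 0 and 2 mirror the token.type values logged
 * in the report (YAML_NO_TOKEN and YAML_STREAM_END_TOKEN); TOK_OTHER is a
 * placeholder for any ordinary token. */
enum { TOK_NONE = 0, TOK_STREAM_END = 2, TOK_OTHER = 100 };

/* Hypothetical stand-in for a scanner, NOT libyaml itself. */
typedef struct {
    int remaining; /* ordinary tokens still to emit */
    int ended;     /* set once STREAM_END has been emitted */
} stub_scanner;

/* Mimics the reported sequence: always returns 1 (success); after
 * STREAM_END it keeps succeeding but produces no token. */
static int stub_scan(stub_scanner *s, int *type) {
    if (s->ended) { *type = TOK_NONE; return 1; }
    if (s->remaining > 0) { s->remaining--; *type = TOK_OTHER; return 1; }
    s->ended = 1;
    *type = TOK_STREAM_END;
    return 1;
}

/* Affected pattern: stops only on rc == 0. The cap exists solely so the
 * demonstration terminates; without it this loop would never exit. */
static int drain_naive(stub_scanner *s, int cap) {
    int iters = 0;
    while (iters < cap) {
        int type;
        if (!stub_scan(s, &type)) break;
        iters++;
    }
    return iters;
}

/* Unaffected pattern: stops on STREAM_END, and defensively on "no token".
 * Returns the number of tokens consumed, or -1 on scanner failure. */
static int drain_guarded(stub_scanner *s) {
    int count = 0;
    for (;;) {
        int type;
        if (!stub_scan(s, &type)) return -1;
        if (type == TOK_NONE) break;       /* nothing was produced */
        count++;
        if (type == TOK_STREAM_END) break; /* normal end of stream */
    }
    return count;
}
```

With a real parser the same guard would compare token.type against YAML_STREAM_END_TOKEN and YAML_NO_TOKEN after each yaml_parser_scan() call, and call yaml_token_delete() before leaving the loop.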


Environment

  • Reproduced on macOS (clang + ASAN) and Nobara Linux (clang + ASAN).
  • Built from this repository, linked against ./src/.libs/libyaml.
  • Public commits tested:
    • Linux: git rev-parse HEAD = 840b65c40675e2d06bf40405ad3f12dec7f35923
  • Probe uses only yaml_parser_scan() (no API mixing).
  • Logging occurs before yaml_token_delete().
