
Add OSS-Fuzz harnesses for HLO parser and proto deserialization #42055

Open

ricaskew wants to merge 1 commit into openxla:main from ricaskew:add-xla-oss-fuzz-harnesses

Add OSS-Fuzz harnesses for HLO parser and proto deserialization#42055
ricaskew wants to merge 1 commit into
openxla:mainfrom
ricaskew:add-xla-oss-fuzz-harnesses

Conversation

ricaskew commented May 5, 2026

Add OSS-Fuzz harnesses for HLO parser and proto deserialization

This PR adds two libFuzzer harnesses to xla/fuzz/ as part of an
OSS-Fuzz integration for openxla/xla. A companion PR in google/oss-fuzz
will be opened concurrently to wire the build and register the project.

The HLO text-format parser and proto deserialization path were selected
as the primary fuzzing surfaces because they accept arbitrary external
input, have significant parsing complexity, and currently have no
OSS-Fuzz coverage.

Harnesses

hlo_parser_fuzz — exercises xla::ParseAndReturnUnverifiedModule against arbitrary input bytes presented as text via absl::string_view, targeting the HLO text-format parser surface. A sketch of the harness shape follows.
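
A minimal sketch of that shape, assuming a standard libFuzzer entry point (the include path for the parser header is an assumption, not a quote from the PR's source):

    // Sketch only: the parser header has lived under both xla/service/
    // and xla/hlo/parser/ across XLA versions, so this path is a guess.
    #include <cstddef>
    #include <cstdint>

    #include "absl/strings/string_view.h"
    #include "xla/hlo/parser/hlo_parser.h"

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
      // Treat the raw fuzz bytes as HLO text. Malformed input should come
      // back as a non-OK status from the parser, not a crash.
      absl::string_view text(reinterpret_cast<const char*>(data), size);
      auto module_or = xla::ParseAndReturnUnverifiedModule(text);
      (void)module_or;  // Only crashes and sanitizer reports matter here.
      return 0;
    }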

hlo_proto_fuzz — exercises xla::HloModule::CreateFromProto
against arbitrary byte sequences. Stage 1 deserializes raw bytes into
HloModuleProto via ParseFromArray; stage 2 converts the proto into
an HloModule. The harness rejects any input larger than
std::numeric_limits<int>::max() (from <limits>) before the size_t-to-int
cast that ParseFromArray's size argument requires, guarding against
integer overflow. A sketch follows.
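
A sketch of the two-stage shape under the same caveats. CreateFromProto also requires an HloModuleConfig; deriving one via HloModule::CreateModuleConfigFromProto is an assumption here, since the PR text does not say how the config is obtained:

    // Sketch only: include paths and the config-derivation step are
    // assumptions.
    #include <cstddef>
    #include <cstdint>
    #include <limits>

    #include "xla/hlo/ir/hlo_module.h"
    #include "xla/service/hlo.pb.h"  // HloModuleProto
    #include "xla/xla.pb.h"          // DebugOptions

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
      // ParseFromArray takes an int size; reject anything that would
      // overflow the size_t-to-int cast.
      if (size > static_cast<size_t>(std::numeric_limits<int>::max()))
        return 0;

      // Stage 1: raw bytes -> HloModuleProto. Most fuzz inputs fail here.
      xla::HloModuleProto proto;
      if (!proto.ParseFromArray(data, static_cast<int>(size))) return 0;

      // Stage 2: proto -> HloModule. Structurally invalid protos should
      // surface as a non-OK status rather than a crash.
      auto config_or = xla::HloModule::CreateModuleConfigFromProto(
          proto, xla::DebugOptions());
      if (!config_or.ok()) return 0;
      auto module_or = xla::HloModule::CreateFromProto(proto, *config_or);
      (void)module_or;
      return 0;
    }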

Build

Both harnesses build via Bazel under //xla/fuzz/. The BUILD file uses
cc_binary with fuzz_target and nobuilder tags, following standard
OSS-Fuzz practice; a sketch of the shape appears below. Dependencies
are minimal and scoped per target.
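
For concreteness, a hypothetical sketch of one such target (the tags match the description above; the dependency labels are assumptions, not a copy of the PR's BUILD file):

    # Hypothetical sketch of a target in xla/fuzz/BUILD; dep labels are
    # assumptions.
    cc_binary(
        name = "hlo_parser_fuzz",
        srcs = ["hlo_parser_fuzz.cc"],
        tags = [
            "fuzz_target",  # lets the OSS-Fuzz build script discover it
            "nobuilder",    # keeps it out of default builds
        ],
        deps = [
            "//xla/hlo/parser:hlo_parser",
            "@com_google_absl//absl/strings:string_view",
        ],
    )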

The OSS-Fuzz build uses XLA's hermetic LLVM 18 toolchain, which
dynamically links libc++. The companion oss-fuzz PR handles packaging by
embedding --linkopt=-Wl,-rpath,$ORIGIN and copying the required
libc++ shared objects into $OUT/ at build time.

Testing

Both binaries were smoke-tested inside the OSS-Fuzz Docker base image:

  • hlo_parser_fuzz: 100 runs, zero crashes, 2,371 coverage PCs with
    growth into xla::HloLexer and related parser code paths
  • hlo_proto_fuzz: 100 runs, zero crashes, 819 coverage PCs with
    growth into xla::HloModuleConfig and protobuf TcParser family

ricaskew (Author) commented May 5, 2026

Companion OSS-Fuzz integration PR: google/oss-fuzz#15464

seantalts requested a review from GleasonK on May 7, 2026 at 17:07
seantalts (Member) commented:
@GleasonK are you a good person to look at this? I am curious if we even want a fuzzer on HLO, since we mostly just care about the subset of HLO that JAX emits (AFAIK).

ricaskew (Author) commented May 7, 2026

Two thoughts on this. The text parser's fuzz value comes specifically from exploring inputs outside the JAX-emitted subset — malformed and edge-case inputs are where parser bugs tend to live, and PR #37766's recursion-depth limit is a recent example of that class. An in-repo harness also gives the team a regression signal for future grammar or recursion changes. Separately, the proto harness covers a distinct surface — HloModule::CreateFromProto, the binary deserialization path used by any consumer of serialized HLO modules — which is independent of the text-parser scope question entirely.

seantalts (Member) commented:

Isn't the reason bugs still live there because they are never exercised by real users or real code? ;)
But I will let @GleasonK triage further.

DavidKorczynski left a comment:

From the OSS-Fuzz perspective (google/oss-fuzz#15464) we'd be happy to integrate this, but I'm waiting for the maintainers to be on board.
