Skip to content

Conversation

@fellair
Copy link
Contributor

@fellair fellair commented Aug 13, 2025

This change is the first step in addressing issue #6206.

Failure during presubmit tests is expected due to an issue detailed below.

The goal is to describe syscalls as variants of generic syz_usb_connect() to improve fuzzing for specific usb drivers. Also, add seeds for such calls.
The problem is that syzkaller-generated programs (as part of its corpus) seem to differ syntax-wise from ones that are expected considering descriptions layed out for them. First, examples:

Good example, seed for ath9k (a little bit different since it's a separate syscall but still):
r0 = syz_usb_connect_ath9k(0x3, 0x5a, &(0x7f0000000080)={{0x12, 0x1, 0x200, 0xff, 0xff, 0xff, 0x40, 0xcf3, 0x9271, 0x108, 0x1, 0x2, 0x3, 0x1, [{{0x9, 0x2, 0x48, 0x1, 0x1, 0x0, 0x80, 0xfa, {{0x9, 0x4, 0x0, 0x0, 0x6, 0xff, 0x0, 0x0, 0x0, "", {{0x9, 0x5, 0x1, 0x2, 0x200, 0x0, 0x0, 0x0, ""}, {0x9, 0x5, 0x82, 0x2, 0x200, 0x0, 0x0, 0x0, ""}, {0x9, 0x5, 0x83, 0x3, 0x40, 0x1, 0x0, 0x0, ""}, {0x9, 0x5, 0x4, 0x3, 0x40, 0x1, 0x0, 0x0, ""}, {0x9, 0x5, 0x5, 0x2, 0x200, 0x0, 0x0, 0x0, ""}, {0x9, 0x5, 0x6, 0x2, 0x200, 0x0, 0x0, 0x0, ""}}}}}}]}}, 0x0)

We have a call that has arguments which correspond to all inner structures encapsulated in usb_device_descriptor_t (or usb_device_descriptor_fixed_t) type. Not only it is right from a syntax point of view, it's also useful in a limited sense for a reviewer to glean what attributes and where are used.

Bad example, seed for lan78xx:
r0 = syz_usb_connect$lan78xx(0x5, 0x3f, &(0x7f0000000000)={{0x12, 0x1, 0x200, 0xff, 0xff, 0xff, 0x40, 0x424, 0x7850, 0x0, 0x1, 0x2, 0x3, 0x1, [{{0x9, 0x2, 0x2d}}]}}, 0x0)

Device info is fine enough, but all data concerning usb_config_descriptor_t (and others inside it) is not there. The call is seemingly incomplete. Despite the lack of data about usb endpoints, we still pass all lan78xx endpoint checks and proceed further along the code. I've tested this program (see vusb_lan78xx) with syz-execprog to confirm its correctness.

I've yet to determine why it happens. So far, I blame the corpus minimization process. It's important to note that this issue was present in other variant descriptions. In vusb_sierra_net, I had to manually add "" as a missing 9th struct member of usb_endpoint_descriptor_verbose_t. In vusb_rtl8150, I had to do the same and also switch from "verbose" types to generic ones, otherwise other struct members end up missing as well.

I could just drop erroneous seeds, but for some drivers like lan78xx (with a bunch of CTRL requests) it is imperative to use them, as syzkaller may not accomplish much on its own. Also, this behaviour possibly looks a like a legitimate flaw.

@a-nogikh
Copy link
Collaborator

Thanks!

Regarding the seeds - I think what you describe is the intended behavior of program serialisation in syzkaller. We omit the fields that have their default values, which is the case both for zeroes and for the const fields.

syzkaller/prog/encoding.go

Lines 188 to 193 in 5d8c2ac

if !ctx.verbose && a.fixedInnerSize() {
for ; lastNonDefault >= 0; lastNonDefault-- {
if !isDefault(a.Inner[lastNonDefault]) {
break
}
}

If parsing is done in a non-strict mode, syzkaller then replaces the missing arguments with the default values and the program is restored exactly as is.

The sys/*/test seeds, however, are parsed in a strict mode where syzkaller expects all fields to be present.

I don't remember if we have any ready-to-use tool to convert this short representation into the verbose one, but it should be easy to craft - you first do prog, err := target.Deserialize(data, prog.NonStrict), then prog.SerializeVerbose().

@fellair
Copy link
Contributor Author

fellair commented Aug 14, 2025

Thanks!

Regarding the seeds - I think what you describe is the intended behavior of program serialisation in syzkaller. We omit the fields that have their default values, which is the case both for zeroes and for the const fields.

syzkaller/prog/encoding.go

Lines 188 to 193 in 5d8c2ac

if !ctx.verbose && a.fixedInnerSize() {
for ; lastNonDefault >= 0; lastNonDefault-- {
if !isDefault(a.Inner[lastNonDefault]) {
break
}
}

If parsing is done in a non-strict mode, syzkaller then replaces the missing arguments with the default values and the program is restored exactly as is.

The sys/*/test seeds, however, are parsed in a strict mode where syzkaller expects all fields to be present.

I don't remember if we have any ready-to-use tool to convert this short representation into the verbose one, but it should be easy to craft - you first do prog, err := target.Deserialize(data, prog.NonStrict), then prog.SerializeVerbose().

Thank you very much, I think this might be the answer. I'll try to do just that and report back.

@xairy
Copy link
Contributor

xairy commented Aug 15, 2025

Hi @fellair! Thank you for working on this!

Could you annotate the runtests with comments explaining what each request does (e.g. reads/writes specific register, etc.); see e.g. the io_uring or 80211_setup_station runtests as an example. This would be helpful for extending/altering them in the future.

Might also make sense to add a comment into the descriptions explaining that we don't describe specific control requests in all detail (i.e. don't describe the specific register commands) and rely on runtests instead.

The descriptions themselves look great! (But I didn't read deep into the drivers' source code.)

This change is the first step in addressing issue [1].

Despite syzkaller's best efforts, some usb drivers are proving
resistant to attempts to probe them. Specifically, crafted
devices are not accurate enough to bypass checks in probe().
These checks mostly deal with usb interfaces and endpoints.

One way to address this issue is to define syz_connect_* calls
variants to help syzkaller succeed probing by describing in detail
various device attributes.

Start by describing such calls for select drivers, each representing
its own category of sorts. At the moment, code coverage for these
drivers is unimpressive:

- rtl8150
Used to succeed probing until a better usb endpoint check was implemented.

- sierra_net
Same as rtl8150. Depends on usbnet API for bind() and usb ep checks.

- lan78xx
Requires numerous control requests between driver and device DURING probe.
Extra descriptions are helpful but are not enough to fully complete
probing process.

Also, add a seed for each such example.

This is only a stepping stone to improve usb fuzzing results and most
likely will be subject to change in the future.

[1] google#6206
@fellair fellair force-pushed the fellair_basic_usb_variants_descs branch from 60a854d to 522d67b Compare August 18, 2025 13:30
@fellair
Copy link
Contributor Author

fellair commented Aug 18, 2025

Details on v2:

  • Fix incomplete seeds by using tools/syz-expand to properly serialize the calls in test cases (props to @a-nogikh).
  • Switch descriptions of vusb descriptors back to fixed_t instead of generic types where appropriate.
  • Annotate as much as possible all the calls in test seeds (thanks to @xairy for suggestion).

Presubmit tests should run smoothly now. Testing done with restricted enable_syscalls[], so far now new crashes found (which is expected).

@fellair fellair marked this pull request as ready for review August 18, 2025 13:56
@a-nogikh a-nogikh added this pull request to the merge queue Aug 20, 2025
Merged via the queue into google:master with commit c3b8db6 Aug 20, 2025
23 checks passed
@xairy
Copy link
Contributor

xairy commented Sep 5, 2025

3 new bugs reported so far thanks to the added descriptions 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants