Skip to content

Fix parallel fuzzing premature termination #540

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
139 changes: 107 additions & 32 deletions cargo-test-fuzz/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -960,7 +960,10 @@ fn fuzz(opts: &TestFuzz, executable_targets: &[(Executable, String)]) -> Result<
let status = command
.status()
.with_context(|| format!("Could not get status of `{command:?}`"))?;
ensure!(status.success(), "Command failed: {:?}", command);
if !status.success() {
eprintln!("Warning: Command failed: {:?}", command);
return Ok(());
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the code does not involve parallel fuzzing, actually. Hence, this part of the code should not need to be modified.

return Ok(());
}

Expand Down Expand Up @@ -999,14 +1002,29 @@ fn fuzz(opts: &TestFuzz, executable_targets: &[(Executable, String)]) -> Result<
.map(|()| None::<Child>)
.collect::<Vec<_>>();
let mut i_target_prev = executable_targets.len();


// Track failed targets to detect when all targets fail
let mut failed_targets = std::collections::HashSet::new();

loop {
if n_children < n_cpus && (i_task < executable_targets.len() || !config.sufficient_cpus) {
// If all targets have failed, terminate gracefully
if failed_targets.len() == executable_targets.len() {
eprintln!("All targets failed to start. Terminating fuzzing process.");
break;
}

let Some((executable, target)) = executable_targets_iter.next() else {
unreachable!();
};

let i_target = i_task % executable_targets.len();

// Skip targets that have already failed
if failed_targets.contains(&i_target) {
i_task += 1;
continue;
}

config.first_run = i_task < executable_targets.len();

Expand All @@ -1018,20 +1036,43 @@ fn fuzz(opts: &TestFuzz, executable_targets: &[(Executable, String)]) -> Result<

let exec = format!("{command:?}");
command.stdout(Stdio::piped());
let mut popen = command
.spawn()
.with_context(|| format!("Could not spawn `{exec:?}`"))?;
let stdout = popen
.stdout
.take()
.ok_or_else(|| anyhow!("Could not get output of `{exec:?}`"))?;
let popen_result = command.spawn();

// Handle spawn failures gracefully
if let Err(e) = popen_result {
eprintln!("Warning: Could not spawn `{exec:?}`: {}", e);
failed_targets.insert(i_target);
i_task += 1;
continue;
}

let mut popen = popen_result.unwrap();
let stdout_result = popen.stdout.take();

if stdout_result.is_none() {
eprintln!("Warning: Could not get output of `{exec:?}`");
failed_targets.insert(i_target);
i_task += 1;
continue;
}

let stdout = stdout_result.unwrap();
let mut receiver = Receiver::from(stdout);
receiver
.set_nonblocking(true)
.with_context(|| "Could not make receiver non-blocking")?;
poll.registry()
.register(&mut receiver, Token(i_target), Interest::READABLE)
.with_context(|| "Could not register receiver")?;

if let Err(e) = receiver.set_nonblocking(true) {
eprintln!("Warning: Could not make receiver non-blocking: {}", e);
failed_targets.insert(i_target);
i_task += 1;
continue;
}

if let Err(e) = poll.registry().register(&mut receiver, Token(i_target), Interest::READABLE) {
eprintln!("Warning: Could not register receiver: {}", e);
failed_targets.insert(i_target);
i_task += 1;
continue;
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all of the code from the spawn to here can remain unchanged.

If an error were to occur here, it would mean that something was wrong with cargo-test-fuzz or the system on which it was running, not the fuzzing target. For errors of this kind, I think it is okay to bail out (at least for now).

The same comment applies to several more places below, which I will mark with just '^'.

The cases where I think an error could indicate a problem in the fuzzing target are:

Hence, I have not marked those with '^' below.

children[i_target] = Some(Child {
exec,
popen,
Expand All @@ -1047,13 +1088,29 @@ fn fuzz(opts: &TestFuzz, executable_targets: &[(Executable, String)]) -> Result<
}

if n_children == 0 {
assert!(config.sufficient_cpus);
assert!(i_task >= executable_targets.len());
break;
// If we have no children and all targets have failed, terminate
if failed_targets.len() == executable_targets.len() {
eprintln!("All targets failed to start. Terminating fuzzing process.");
break;
}

// If we have no children because we've cycled through all targets,
// but not all have failed, this is normal completion
if config.sufficient_cpus && i_task >= executable_targets.len() {
break;
}

// If we have no children but still have targets that haven't failed,
// something went wrong. Wait a bit and try again.
std::thread::sleep(std::time::Duration::from_millis(100));
continue;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this n_children == 0 should not need to be modified.

This case is for when there are sufficiently many cpus to fuzz all targets simultaneously (hence, the assertion).

In particular, if all children have exited (e.g., because --run-until-crash was passed and they all found crashes), then there is no more work to do, and we can simply break out of the loop.

I appreciate the way you commented your changes here, though.


poll.poll(&mut events, None)
.with_context(|| "`poll` failed")?;
let poll_result = poll.poll(&mut events, None);
if let Err(e) = poll_result {
eprintln!("Warning: Poll failed: {}", e);
continue;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^


for event in &events {
let Token(i_target) = event.token();
Expand All @@ -1063,7 +1120,14 @@ fn fuzz(opts: &TestFuzz, executable_targets: &[(Executable, String)]) -> Result<
.as_mut()
.unwrap_or_else(|| panic!("Child for token {i_target} should exist"));

let s = child.read_lines()?;
let s = match child.read_lines() {
Ok(s) => s,
Err(e) => {
eprintln!("Warning: Could not read lines from child process: {}", e);
continue;
}
};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^


for line in s.lines() {
if line.contains("Time limit was reached") {
child.time_limit_was_reached = true;
Expand All @@ -1083,25 +1147,36 @@ fn fuzz(opts: &TestFuzz, executable_targets: &[(Executable, String)]) -> Result<
let mut child = children[i_target]
.take()
.unwrap_or_else(|| panic!("Child for token {i_target} should exist"));
poll.registry()
.deregister(&mut child.receiver)
.with_context(|| "Could not deregister receiver")?;

if let Err(e) = poll.registry().deregister(&mut child.receiver) {
eprintln!("Warning: Could not deregister receiver: {}", e);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^


n_children -= 1;

let status = child
.popen
.wait()
.with_context(|| format!("`wait` failed for `{:?}`", child.popen))?;
let status_result = child.popen.wait();

if let Err(e) = status_result {
eprintln!("Warning: `wait` failed for `{:?}`: {}", child.popen, e);
failed_targets.insert(i_target);
continue;
}

let status = status_result.unwrap();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^


if !status.success() {
bail!("Command failed: {:?}", child.exec);
eprintln!("Warning: Command failed for target {}: {:?}", target, child.exec);
failed_targets.insert(i_target);
continue;
}

if !child.testing_aborted_programmatically {
bail!(
r#"Could not find "Testing aborted programmatically" in command output: {:?}"#,
child.exec
eprintln!(
r#"Warning: Could not find "Testing aborted programmatically" in command output for target {}: {:?}"#,
target, child.exec
);
failed_targets.insert(i_target);
continue;
}

if opts.exit_code && !child.time_limit_was_reached {
Expand Down