chore: both fuzz jobs need to use s3 corpus files. #5714
Conversation
- ./codebuild/bin/fuzz_corpus_download.sh
# -L: Restrict tests to names matching the pattern 'fuzz'
- cmake --build build/ --target test -- ARGS="-L fuzz --output-on-failure -j $(nproc)"
- ./codebuild/bin/fuzz_corpus_upload.sh
Would this cause some synchronization issue if there are multiple PRs downloading and uploading the corpus files in parallel?
E.g., PR #A finds an actual fuzz test failure and uploads the corpus, but PR #B, which ran around the same time, overwrites the corpus store in S3. Would we still be able to reproduce the failure in PR #A?
True, although this is still an improvement over today, where we're discarding updates entirely; we could rework this to handle the case you're describing.
Yeah, I think we need to handle this case before we can switch to using S3. We previously treated per-PR fuzz as more of a sanity check, but now that we're switching to use the improved corpus, we should ensure we have a reliable way to reproduce fuzz failures.
For that, we probably just need to ensure that on fuzz failure, we upload the crash-inducing corpus to a separate S3 path (e.g., s3://bucket/fuzz-failures/<PR-number>/<timestamp>/). That way failures are always reproducible, and we should be able to proceed with this change.
Another issue to consider: if a PR makes large changes that affect many code paths, the fuzzer will generate corpus entries tailored to that PR's version of the code. If uploaded, that corpus could overwrite our carefully accumulated corpus that's optimized for main. I think we should only update the fuzz corpus on main branch merges, not from individual PRs. Perhaps the easiest solution here is to make PRs download the corpus from S3 but skip the upload step?
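The "download but skip the upload" idea could be sketched as a small guard in the build phase. This is a hypothetical sketch, not the actual s2n buildspec: the `should_upload_corpus` helper and the `BRANCH` variable are assumptions; only the two script paths come from the diff above.

```shell
# Hypothetical sketch: PR builds pull the accumulated corpus from S3 but
# only main-branch builds are allowed to write back, so a PR's
# code-path-specific corpus never overwrites the one tuned for main.
should_upload_corpus() {
  # $1 = the branch this build is running against
  [ "$1" = "main" ]
}

# The build phase might then look like (script paths taken from the diff;
# the BRANCH variable is an assumption about the CI environment):
#   ./codebuild/bin/fuzz_corpus_download.sh
#   cmake --build build/ --target test -- ARGS="-L fuzz --output-on-failure"
#   should_upload_corpus "$BRANCH" && ./codebuild/bin/fuzz_corpus_upload.sh
```

Gating on the branch name keeps the accumulated corpus append-only from the PR jobs' point of view, which also sidesteps most of the parallel-upload races discussed above.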
It makes sense to download from the accumulated corpus and then upload into a per-PR S3 path to deal with the synchronization issue. This check is only useful if we can reproduce the failure for a given PR.
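The per-PR failure path suggested above (s3://bucket/fuzz-failures/&lt;PR-number&gt;/&lt;timestamp&gt;/) could be built like this. A minimal sketch, assuming nothing beyond the proposed layout: the function name, bucket, and crash directory are all made up for illustration.

```shell
# Hypothetical sketch: compose a unique S3 destination for crash-inducing
# inputs, so two PRs failing at the same time never overwrite each other's
# artifacts and every failure stays reproducible.
failure_s3_path() {
  # $1 = bucket, $2 = PR number, $3 = UTC timestamp (sortable)
  printf 's3://%s/fuzz-failures/%s/%s/' "$1" "$2" "$3"
}

# On fuzz failure, the job could copy the crash files out with the AWS CLI
# (./corpus/crashes/ and my-bucket are placeholder names):
#   aws s3 cp ./corpus/crashes/ \
#     "$(failure_s3_path my-bucket "$PR_NUMBER" "$(date -u +%Y%m%dT%H%M%SZ)")" \
#     --recursive
```

Keying the path on PR number plus timestamp means retries of the same PR also get distinct paths, so nothing is ever clobbered.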
Should we remove the corpus files as part of this PR to verify this'll unblock the removal, or are there more changes needed before they can be removed?
Agree with @jouho that we should handle the synchronization issue, so that we can ensure that a fuzz failure from a particular PR can be reproduced, before merging.
Goal
Pull corpus files from S3 instead of GitHub.
Why
We need to finish the work to remove corpus files from GitHub.
How
Carry over the S3 upload/download of corpus files from the scheduled fuzz check to the PR fuzz job.
Callouts
Until the Rust CI issues are sorted out, CI will fail.
Testing
The fuzz job running in CI is using this new buildspec.
Related
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.