-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Prototype for the new syz-verifier using the new rpcserver and fuzzer #6292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
|
@a-nogikh any chance to review? |
|
I've skimmed through, but haven't prepared the comments yet. Will take care of it tomorrow. |
a-nogikh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the totally right direction, thanks for taking the time to resurrect the tool!
6ed977b to
0d8642c
Compare
b68ac08 to
5376734
Compare
a-nogikh
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(first pass, there's a lot of code)
| }, | ||
| } | ||
|
|
||
| mgr.corpusDBMu.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move locking to the top of the function.
pkg/manager/corpus.go
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please extract the CorpusMinimizer refactoring changes into a separate commit. We'll likely merge it first (as a separate PR) to reduce the overall code diff here.
pkg/manager/corpus.go
Outdated
| // During corpus triage it would happen very often since we are actively adding inputs, | ||
| // and presumably the persistent corpus was reasonably minimal, and we don't use it for fuzzing yet. | ||
| if cm.PhaseCheck != nil && !cm.PhaseCheck() { | ||
| return cm.LastMinCorpus |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of taking and returning cm.LastMinCorpus value over and over again, let's just return error from Minimize() (instead of doing log.Fatalf here). Or return nothing and keep the log.Fatalf.
The caller can always determine the latest corpus size anyway.
pkg/manager/corpus.go
Outdated
| LastMinCorpus int | ||
| SaturatedCalls map[string]bool | ||
| DisabledHashes map[string]struct{} | ||
| PhaseCheck func() bool // returns true if we should proceed with minimization |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's keep this logic on the caller side. We don't really gain anything with this callback here.
syz-verifier/README.md
Outdated
|
|
||
| ## Development Status | ||
|
|
||
| This is a **prototype** showcasing the verifier design with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: the style is a bit too ChatGPT, especially below :)
Let's keep it in line with other syzkaller docs.
syz-verifier/verifier.go
Outdated
| } | ||
| func (vrf *Verifier) fuzzingLoop(ctx context.Context) { | ||
| log.Logf(0, "starting fuzzing loop") | ||
| select { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we're getting here almost immediately, so there's no reason to do such a select at the beginning of the function.
Much better to transform other potentially blocking operations into a select block, e.g. enabledSyscalls := <-kernel.enabledSyscalls below.
syz-verifier/verifier.go
Outdated
| } | ||
| return | ||
| } | ||
| func (vrf *Verifier) fuzzingLoop(ctx context.Context) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is already huge, and, judging by TODOs, it's going to become even bigger over time. Please split it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought - maybe we want to split out the verifier logic into some class in pkg/fuzzer? Or create pkg/verifier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comparison of several execution results should definitely be extracted somewhere - we'll also need it later when we get to the generation of reproducers.
syz-verifier/verifier.go
Outdated
| "golang.org/x/sync/errgroup" | ||
| ) | ||
|
|
||
| // Constants for kernel phases |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if these phases are actually simplifying the implementation here.
You seem to be already incorporating the steps sequence in your code and synchronizing it via channels.
Will it be bette if we get rid of them completely here?
syz-verifier/verifier.go
Outdated
| // This enables coverage-guided fuzzing | ||
| feedbackResult := responses[0] | ||
| if feedbackResult == nil { | ||
| // If kernel 0 failed, try to find any successful result |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the kernels are different (and we must assume that they are different here in syz-verifier), we should not mix resulting coverage feedback from them. It breaks all assumptions of the fuzzer.
syz-verifier/verifier.go
Outdated
| kernel.http.Corpus.Store(kernel.corpus) | ||
| } | ||
| // TODO: Aggregate to main HTTP server instead of per-kernel | ||
| if kernel.cfg.Snapshot { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rest of the code does not support the snapshot mode, so let's not include this logic here at all.
This commit removes the old syz-verifier's rpcserver, executor, stats, monitor and utils as we are going to rewrite the verifier to use the new rpcserver and executor. Part of the fix for issue: google#5976
…uzzer This commit introduces a prototype for the new syz-verifier using the new rpcserver and fuzzer. An in-depth explanation is available in README.md. Remaining work includes connecting the reproduction loop, HTTP servers, enabling snapshots, and the comparison phase. The prototype already implements a working differential fuzzing loop.
…d improve verifier Extract corpus minimization logic into shared pkg/manager/corpus.go for reuse by both syz-manager and syz-verifier. Improve syz-verifier fuzzing implementation: - Update request distribution strategy for better workload management - Change result comparison logic to more accurately detect mismatches - Print mismatches in more readable format for easier debugging - Enforce restarting executor between program executions to avoid state pollution - Fix coverage-guided fuzzing to properly track code coverage - Fix maxSignal phase management for accurate signal tracking - Simplify crash handling logic - Refactor corpus management to use single goroutine with one channel Enhance HTTP server monitoring: - Add log, coverage statistics, and VM information to HTTP server views - Enable HTTP server logging for better observability Fix race condition where multiple executor goroutines call MaxSignal() simultaneously from RPC server connection loops.
This commit fixes 2 small bugs founded: - wrong use of kernel.runInstance() function - the return values of the function has been changed in the master branch so we had to update our calls to the function - Fix wrong print of call.Flags
Split the large fuzzingLoop function into focused helpers to improve readability and maintainability: - fetchNextRequest: handle request retrieval with context cancellation - isDistributableRequest: check if request should be compared across kernels - buildRequestCopies: construct kernel-specific request copies with callbacks - dispatchAndCollect: submit requests to all kernels and collect results The main fuzzing loop is now cleaner with clear separation of concerns: fetch → filter → build → dispatch → feedback → compare. No functional changes, purely refactoring for code clarity.
…o kernel comparison tool for pre-learned corpus. No longer learns or generates programs. Instead: - Takes existing saturated corpus.db as input (-corpus flag) - Executes each program across multiple kernel versions in parallel - Compares syscall errno per call to detect behavioral differences - Logs mismatches with full program sequence and call-by-call results
84d9b9d to
a1a73a9
Compare
|
@a-nogikh, @dvyukov and @tarasmadan |
No description provided.