Skip to content

Conversation

@dBransky
Copy link

No description provided.

@google-cla
Copy link

google-cla bot commented Aug 15, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@dvyukov dvyukov requested a review from a-nogikh August 15, 2025 22:33
@dBransky
Copy link
Author

@a-nogikh any chance to review?

@a-nogikh
Copy link
Collaborator

I've skimmed through, but haven't prepared the comments yet. Will take care of it tomorrow.

Copy link
Collaborator

@a-nogikh a-nogikh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the totally right direction, thanks for taking the time to resurrect the tool!

@dBransky dBransky force-pushed the verifier-dev branch 6 times, most recently from 6ed977b to 0d8642c Compare December 7, 2025 13:59
@dBransky dBransky force-pushed the verifier-dev branch 4 times, most recently from b68ac08 to 5376734 Compare December 7, 2025 16:26
@dBransky
Copy link
Author

dBransky commented Dec 7, 2025

@a-nogikh and @dvyukov
Please see the updated PR, all comments fixed and missing features in the verifier were added.
I think that the current version is OK as an MVP

Copy link
Collaborator

@a-nogikh a-nogikh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(first pass, there's a lot of code)

},
}

mgr.corpusDBMu.Lock()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move locking to the top of the function.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please extract the CorpusMinimizer refactoring changes into a separate commit. We'll likely merge it first (as a separate PR) to reduce the overall code diff here.

// During corpus triage it would happen very often since we are actively adding inputs,
// and presumably the persistent corpus was reasonably minimal, and we don't use it for fuzzing yet.
if cm.PhaseCheck != nil && !cm.PhaseCheck() {
return cm.LastMinCorpus
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of taking and returning cm.LastMinCorpus value over and over again, let's just return error from Minimize() (instead of doing log.Fatalf here). Or return nothing and keep the log.Fatalf.

The caller can always determine the latest corpus size anyway.

LastMinCorpus int
SaturatedCalls map[string]bool
DisabledHashes map[string]struct{}
PhaseCheck func() bool // returns true if we should proceed with minimization
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep this logic on the caller side. We don't really gain anything with this callback here.


## Development Status

This is a **prototype** showcasing the verifier design with:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the style is a bit too ChatGPT, especially below :)

Let's keep it in line with other syzkaller docs.

}
func (vrf *Verifier) fuzzingLoop(ctx context.Context) {
log.Logf(0, "starting fuzzing loop")
select {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we're getting here almost immediately, so there's no reason to do such a select at the beginning of the function.

Much better to transform other potentially blocking operations into a select block, e.g. enabledSyscalls := <-kernel.enabledSyscalls below.

}
return
}
func (vrf *Verifier) fuzzingLoop(ctx context.Context) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is already huge, and, judging by TODOs, it's going to become even bigger over time. Please split it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a thought - maybe we want to split out the verifier logic into some class in pkg/fuzzer? Or create pkg/verifier?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparison of several execution results should definitely be extracted somewhere - we'll also need it later when we get to the generation of reproducers.

"golang.org/x/sync/errgroup"
)

// Constants for kernel phases
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these phases are actually simplifying the implementation here.

You seem to be already incorporating the steps sequence in your code and synchronizing it via channels.

Will it be bette if we get rid of them completely here?

// This enables coverage-guided fuzzing
feedbackResult := responses[0]
if feedbackResult == nil {
// If kernel 0 failed, try to find any successful result
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the kernels are different (and we must assume that they are different here in syz-verifier), we should not mix resulting coverage feedback from them. It breaks all assumptions of the fuzzer.

kernel.http.Corpus.Store(kernel.corpus)
}
// TODO: Aggregate to main HTTP server instead of per-kernel
if kernel.cfg.Snapshot {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rest of the code does not support the snapshot mode, so let's not include this logic here at all.

dBransky and others added 5 commits January 13, 2026 13:57
This commit removes the old syz-verifier's rpcserver, executor, stats, monitor and utils as we are going to rewrite the verifier to use the new rpcserver and executor.
Part of the fix for issue: google#5976
…uzzer

This commit introduces a prototype for the new syz-verifier using the new rpcserver and fuzzer. An in-depth explanation is available in README.md. Remaining work includes connecting the reproduction loop, HTTP servers, enabling snapshots, and the comparison phase. The prototype already implements a working differential fuzzing loop.
…d improve verifier

Extract corpus minimization logic into shared pkg/manager/corpus.go
for reuse by both syz-manager and syz-verifier.

Improve syz-verifier fuzzing implementation:
- Update request distribution strategy for better workload management
- Change result comparison logic to more accurately detect mismatches
- Print mismatches in more readable format for easier debugging
- Enforce restarting executor between program executions to avoid state pollution
- Fix coverage-guided fuzzing to properly track code coverage
- Fix maxSignal phase management for accurate signal tracking
- Simplify crash handling logic
- Refactor corpus management to use single goroutine with one channel

Enhance HTTP server monitoring:
- Add log, coverage statistics, and VM information to HTTP server views
- Enable HTTP server logging for better observability

Fix race condition where multiple executor goroutines call MaxSignal()
simultaneously from RPC server connection loops.
This commit fixes 2 small bugs founded:
- wrong use of kernel.runInstance() function - the return values of the function has been changed in the master branch so we had to update our calls to the function
- Fix wrong print of call.Flags
Split the large fuzzingLoop function into focused helpers to improve
readability and maintainability:

- fetchNextRequest: handle request retrieval with context cancellation
- isDistributableRequest: check if request should be compared across kernels
- buildRequestCopies: construct kernel-specific request copies with callbacks
- dispatchAndCollect: submit requests to all kernels and collect results

The main fuzzing loop is now cleaner with clear separation of concerns:
fetch → filter → build → dispatch → feedback → compare.

No functional changes, purely refactoring for code clarity.
…o kernel comparison tool for pre-learned corpus.

No longer learns or generates programs. Instead:
- Takes existing saturated corpus.db as input (-corpus flag)
- Executes each program across multiple kernel versions in parallel
- Compares syscall errno per call to detect behavioral differences
- Logs mismatches with full program sequence and call-by-call results
@dBransky dBransky force-pushed the verifier-dev branch 2 times, most recently from 84d9b9d to a1a73a9 Compare January 13, 2026 12:07
@dBransky
Copy link
Author

@a-nogikh, @dvyukov and @tarasmadan
Please review updated PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants