Simplify challenge sampling to use full file content#301
Merged
Conversation
… samples generateSample descended into the first non-empty subdirectory and only fell back to the current directory's files when every subtree failed, so files outside leaf directories were almost never sampled. Treat each subdirectory and each file as equal candidates at every level so coverage spans the whole tree. Replace the random line substring (which could be a single near-worthless character) with the whole file's contents as the sample. FileStack now carries just the file index. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019AKujXgp59pHmUDM4uaiTF
|
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
Add deterministic tests on an in-memory tree (fstest.MapFS) for the two fixed bugs: every sample is a whole file (not a substring), files at every directory depth are sampled (not just leaves), the generate/resolve round-trip holds, the challenge equals SHA256(wholeFile + nonce), and the index/empty-stack/empty-codebase error paths. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019AKujXgp59pHmUDM4uaiTF
Consolidate the whole-file/all-level traversal tests into the existing challenge_test.go instead of a separate file. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019AKujXgp59pHmUDM4uaiTF
io/fs paths (embed.FS, fstest.MapFS) are always slash-separated. filepath.Join produces backslashes on Windows, so traversal below the top level failed and only root files were sampled - breaking sampling on Windows nodes and cross-platform challenges. Switch to path.Join. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019AKujXgp59pHmUDM4uaiTF
Contributor
There was a problem hiding this comment.
Pull request overview
This PR refactors the security challenge sampling/resolution flow to sample entire file contents (instead of random line substrings), simplifying SampleLocation.FileStack and removing now-unneeded line/substring logic and error cases.
Changes:
- Simplified sampling to read whole file contents via
fs.ReadFile, and updatedSampleLocation.FileStackto only store a file index. - Removed line-based helpers/errors and updated stack-size validation/error messaging accordingly.
- Added targeted tests to lock in the new whole-file sampling and round-trip behavior; bumped the repo version.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| version | Bumps application/version identifier. |
| security/challenge.go | Implements whole-file sampling, updates location stack structure, and simplifies resolution accordingly. |
| security/challenge_test.go | Adds regression/contract tests for whole-file sampling and Generate/Resolve round-trips. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
findSample resolves a SampleLocation that arrives over the wire (JSON) from a peer; a negative file_stack/dir_stack value passed the upper-bound-only check and panicked on slice indexing. Guard both lower bounds and return ErrSampleIndexOutOfBounds. Addresses a PR review comment. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> Claude-Session: https://claude.ai/code/session_019AKujXgp59pHmUDM4uaiTF
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Refactored the challenge sampling mechanism to read entire file contents instead of extracting random lines and substrings. This simplifies the sampling logic and reduces the number of error cases.
Key Changes
getRandomLine()andreadLines()functions that filtered and parsed files line-by-lineSampleLocation.FileStackfrom storing 4 values (file index, line index, left/right bounds) to just 1 (file index)generateSample()to read full file content viafs.ReadFile()instead of extracting random substringsErrEmptySampleLine,ErrNoNonEmptySampleLines,ErrInvalidSubstringBoundsfindSample()to retrieve and return complete file contentbufioimport (no longer needed)ErrInvalidStackSizemessage to reflect new stack structureImplementation Details
FileStackhas at least 1 element and that the file index is in boundshttps://claude.ai/code/session_019AKujXgp59pHmUDM4uaiTF