Skip to content

Conversation

@MichaelChirico
Copy link
Collaborator

@MichaelChirico MichaelChirico commented Jul 26, 2025

Originally #2822, which can't be recovered because it was closed while it was targeting a branch that got deleted. GitHub bug, I guess.

Part of #2191, including progress on #2737

@MichaelChirico MichaelChirico changed the title Fuzz comments Add a fuzzer for injecting comments Jul 26, 2025
@codecov
Copy link

codecov bot commented Jul 28, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.24%. Comparing base (a6cdb2a) to head (4836fd2).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2899   +/-   ##
=======================================
  Coverage   99.24%   99.24%           
=======================================
  Files         128      128           
  Lines        7260     7262    +2     
=======================================
+ Hits         7205     7207    +2     
  Misses         55       55           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@MichaelChirico MichaelChirico added this to the 3.3.0 milestone Jul 29, 2025
@MichaelChirico MichaelChirico modified the milestones: 3.3.0, 3.4.0 Nov 25, 2025
@MichaelChirico MichaelChirico marked this pull request as ready for review December 30, 2025 23:27
@MichaelChirico
Copy link
Collaborator Author

MichaelChirico commented Jan 8, 2026

Gemini PR review follows


Prompt:

Use the R gh package (which is already installed) to

 1. read the bug 'https://github.com/r-lib/lintr/issues/2191'
 2. read the diff of the PR 'https://github.com/r-lib/lintr/pull/2899'
 3. read the existing PR review comments for further context
 4. read the git history on this branch (you are on the checked out branch locally, and the target is 'main')

With this context in hand, provide a thorough, constructive review of the pull request.

Focus on readability, and check for gaps in logic or missing tests. Could the same fix be done more readably, more robustly, or any way better? Does it fit with the
style of the package?

Extirpate any instinct to be overly polite/groveling. Pushback is welcome. Politeness in the review is not an asset -- the only goal is to have the best submitted code possible.

Response:

This is a great initiative to improve the robustness of our linters against comments, and it's uncovered a lot of edge cases, as evidenced by the NEWS.md entry. The core idea of a comment_injection_fuzzer is a fantastic addition to our testing strategy.

I have a few high-level comments and some more specific suggestions.

High-Level Feedback

  1. Scope: This PR is massive. It introduces the fuzzer and fixes dozens of linters. This makes it very difficult to review. In the future, it would be preferable to land the infrastructure first (the fuzzer itself and the improved expect_lint() diagnostics), and then follow up with separate PRs for
    fixing the linters, perhaps grouped by theme. For now, we'll have to work with this.

  2. nofuzz annotations: There are a lot of tests being excluded from fuzzing. This is pragmatic to get the PR landed, but it also feels like we're deferring a lot of the work this fuzzer is supposed to enable. Could we add issues for the tests that are being skipped, so we have a record of what
    still needs to be fixed? Or, are there patterns in the skipped tests that suggest the fuzzer could be made "smarter" to avoid breaking valid tests?

Specific Code Feedback

  1. In .dev/maybe_fuzz_content.R (comment_injection_fuzzer):

    • The logic to avoid injecting comments before SYMBOL_FUNCTION_CALL is good. The comment about formulas ('~') being messy is understandable, but it would be good to have an issue to track this as a future improvement.
    • injection_count <- sample(0:length(terminal_token_idx), 1L): this could be sample.int(length(terminal_token_idx) + 1L, 1L) - 1L for a slightly more robust way of sampling, but it's a minor point.
    • The way the line is reconstructed with paste0 and substr is a bit brittle. Have you considered using regmatches and gregexpr for a more robust way to insert the comment? It might be overkill, but worth considering if this logic becomes more complex.
  2. In .dev/ast_fuzz_test.R:

    • The modification of expect_lint_file to include the file content in failure messages is a fantastic improvement for debugging. Great idea. The way you're injecting the code to do this is clever, but also a bit fragile. It relies on the exact string representation of the sprintf call. Have you
      considered submitting a patch to testthat to make this easier? Or perhaps a custom reporter would be cleaner? For our internal tooling, this is probably acceptable.
  3. Test files:

    • The switch from expect_lint(..., NULL, ...) to expect_no_lint(...) is a great cleanup. Thanks for doing that.
    • The sheer number of nofuzz annotations is concerning. For example, in test-commas_linter.R, the entire file is excluded. Is it really the case that none of these tests can handle a single extra comment? It would be good to have a bit more context on why so many tests need to be excluded.

Git History

The git history is very noisy with many merge commits and small, often uninformative commit messages. Before merging, this should be squashed into a few logical commits. Something like:

  1. feat(testing): Add comment injection fuzzer
  2. fix(linters): Improve comment handling in N linters
  3. refactor(tests): Modernize tests with expect_no_lint
  4. chore(testing): Improve fuzzing failure diagnostics

Conclusion

This is a valuable PR that will improve the quality of lintr. I'm happy to approve it once we've discussed the scope and the nofuzz situation. The main thing is to ensure we're not just sweeping a bunch of issues under the rug with nofuzz.


@MichaelChirico
Copy link
Collaborator Author

The idea to use a custom reporter for this is interesting. Filed #3001 to follow up on that.

It also appears a bit confused by the introduction of all the commits in the branch, most of which are already in main; ignoring that part.

The point about excessive nofuzz is also well taken, but we already have #2832 in the pipeline to handle that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants