Improve username URL encoding and fix Reddit verification page false positives#2947
Open
lipeize689-prog wants to merge 2 commits into
Open
Conversation
…rom being misclassified as claimed
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR improves username URL handling and fixes a real false-positive issue on Reddit.
Safer URL interpolation
I added [encode_username_for_url()] and used it for URL-based username interpolation to avoid breaking URL structure when usernames contain special characters (like '/').
Reddit verification/challenge page detection
Reddit sometimes returns a verification/challenge page instead of a normal user-not-found response.
Previously, this could be misclassified as [CLAIMED], causing a false positive.
To address this, I added [is_reddit_verification_page()] and integrated it into the main detection flow so these pages are classified as [QueryStatus.WAF] with a clearer context string: Reddit verification page.
I also updated the [--dump-response] output and WAF terminal output so the detection reason is easier to understand.
Tests
Added coverage for:
URL encoding behavior
Reddit verification-page detection
Main-flow WAF classification for Reddit