-
Notifications
You must be signed in to change notification settings - Fork 2
IT-692: Stop false positives on substrings #10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis PR refactors the Changes
Sequence DiagramsequenceDiagram
participant Client
participant ServeHTTP
participant shouldBlockAgent
participant Handler
Client->>ServeHTTP: Request with User-Agent header
Note over ServeHTTP: Extract IP & User-Agent
ServeHTTP->>shouldBlockAgent: userAgent string
alt Agent matches block-list
shouldBlockAgent->>shouldBlockAgent: Regex verification
shouldBlockAgent-->>ServeHTTP: (true, "matched-agent", nil)
Note over ServeHTTP: Invoke timer on block
Note over ServeHTTP: Log blocked request with quoted values
ServeHTTP-->>Client: Block response
else Agent does not match
shouldBlockAgent-->>ServeHTTP: (false, "", nil)
Note over ServeHTTP: Invoke timer before delegating
ServeHTTP->>Handler: Delegate to next handler
Handler-->>Client: Continue processing
else Error during check
shouldBlockAgent-->>ServeHTTP: (_, "", error)
Note over ServeHTTP: Invoke timer on error
ServeHTTP-->>Client: Internal Server Error (500)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes The changes involve a method signature update affecting the public API contract, new error handling paths, regex verification logic, and corresponding test adaptations. While cohesive to a single feature area, the modifications span logic changes, error semantics, and logging adjustments requiring careful review of each concern. Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsErrors were encountered while retrieving linked issues. Errors (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
botblocker.go (3)
202-207: Consider fail-open on UA-check error to avoid availability impact.A malformed blocklist entry would currently yield 500 for requests. Prefer logging the error and allowing the request to proceed.
Optional diff:
- if err != nil { - timer() - http.Error(rw, "internal error", http.StatusInternalServerError) - return - } + if err != nil { + log.Errorf("user-agent check failed: %v", err) + timer() + // Fail-open: continue the request pipeline + b.next.ServeHTTP(rw, req) + return + }
195-196: Tighten log message: grammar + quoting.Minor cleanup: fix wording and use %q for quoting.
- log.Infof("blocked request with from IP \"%v\"", remoteAddrPort.Addr()) + log.Infof("blocked request from IP %q", remoteAddrPort.Addr().String())
208-211: Clarify UA log and use %q.“contained” is misleading with boundary matching; prefer “matched”.
- log.Infof("blocked request with user agent \"%v\" because it contained \"%v\"", agent, badAgent) + log.Infof("blocked request with user agent %q matched bad-agent %q", agent, badAgent)botblocker_test.go (2)
168-170: Fix mismatched failure message.The assertion checks for
blockedAgent == badAgent, but the message says “want ""”.- if blockedAgent != badAgent { - t.Fatalf("botBlocker.shouldBlockAgent(%s) = %s; want \"\"", requestAgent, blockedAgent) - } + if blockedAgent != badAgent { + t.Fatalf("botBlocker.shouldBlockAgent(%s) blockedAgent=%q; want %q", requestAgent, blockedAgent, badAgent) + }
193-211: Good regression test for substring false positives.This guards the original bug. Consider also adding a case where the blocklist entry contains regex metacharacters (e.g.,
curl/7.64) to ensure escaping works.Example additional test:
func TestShouldBlockUserAgentWithRegexMeta(t *testing.T) { botBlocker := BotBlocker{ userAgentBlockList: []string{"curl/7.64"}, } ua := "curl/7.64.1" blocked, bad, err := botBlocker.shouldBlockAgent(ua) if err != nil { t.Fatalf("unexpected error: %v", err) } if blocked { t.Fatalf("should not block %q by %q with word boundaries", ua, bad) } ua2 := "curl/7.64" blocked, bad, err = botBlocker.shouldBlockAgent(ua2) if err != nil { t.Fatalf("unexpected error: %v", err) } if !blocked || bad != "curl/7.64" { t.Fatalf("expected block of %q by %q", ua2, bad) } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Jira integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
botblocker.go(3 hunks)botblocker_test.go(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
botblocker_test.go (1)
botblocker.go (1)
BotBlocker(33-40)
botblocker.go (1)
utils/log.go (1)
Infof(99-103)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: test (1.20)
🔇 Additional comments (4)
botblocker.go (3)
190-193: LGTM on timer coverage for error path.Calling timer() before returning on parse error is correct.
215-216: LGTM on final timer before delegating.Accurately measures only the blocker check, not downstream handler time.
234-241: Fix critical regex compilation error and correctness issue:(?:...)unsupported in Go RE2 andbadAgentunescaped.Verification confirms the issues:
- Go RE2 does not support
(?:...)non-capturing groups — the pattern at line 234 will fail with a regex parse error at runtime, triggering 500s.badAgentis unescaped — if list entries contain regex metacharacters (.,*,+,[,], etc.), they will be interpreted as regex syntax instead of literal text, breaking the matching logic.- Error format verb
%eis invalid — should use%wfor error wrapping.Only one regex pattern exists in the file (line 234), and it has all these issues.
Apply this diff:
- // verify with regex - pattern := fmt.Sprintf(`(?:\b)%s(?:\b)`, badAgent) - matched, err := regexp.Match(pattern, []byte(userAgent)) - if err != nil { - return false, "", fmt.Errorf("failed to check user agent %s: %e", userAgent, err) - } + // verify with regex word boundaries; escape list entry to avoid regex meta + pattern := `\b` + regexp.QuoteMeta(badAgent) + `\b` + matched, err := regexp.MatchString(pattern, userAgent) + if err != nil { + return false, "", fmt.Errorf("failed to check user agent %q: %w", userAgent, err) + }Likely an incorrect or invalid review comment.
botblocker_test.go (1)
181-191: Tests align with new API; assertions read well.Covers allow-path and verifies empty badAgent.
Fixes an issue where a bad user agent that is a substring of a good bot was
blocking the good bot. Fixed by verifying with regex and word boundaries.
Also fixed: