Open
Labels: bug, python
Description
During the refactoring of the Cowrie extraction strategy (PR #639), a bug was identified in greedybear/regex.py regarding how URL protocols are matched.
Problem
The current regex REGEX_URL_PROTOCOL uses the pattern ps?, which expects the protocol to be immediately followed by // (e.g., http//). This causes the regex to miss standard URLs that include a colon, such as http:// or https://, which are commonly found in payloads.
Proposed Solution
Update the regex pattern to optionally match the colon character by changing ps? to ps?:?.
Location
greedybear/regex.py
Example
- Current behavior: Matches http//example.com, misses http://example.com
- Expected behavior: Should match both.
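A minimal sketch of the difference, for anyone who wants to reproduce it locally. The two patterns below are hypothetical, simplified stand-ins (only the `s?` versus `s?:?` part is taken from this issue); the actual REGEX_URL_PROTOCOL in greedybear/regex.py may match more protocols and have a different tail.

```python
import re

# Hypothetical stand-ins for REGEX_URL_PROTOCOL, not the real pattern
# from greedybear/regex.py. Only the "s?" -> "s?:?" change comes from the issue.
OLD_PATTERN = re.compile(r"https?//\S+")    # no colon allowed before "//"
NEW_PATTERN = re.compile(r"https?:?//\S+")  # colon is optional before "//"

samples = ["http//example.com", "http://example.com", "https://example.com"]


def matching(pattern, candidates):
    """Return the candidates in which the pattern is found."""
    return [s for s in candidates if pattern.search(s)]


old_hits = matching(OLD_PATTERN, samples)
new_hits = matching(NEW_PATTERN, samples)

print("old:", old_hits)
print("new:", new_hits)
```

With the optional colon, the colon-less form still matches, so the change only widens what the pattern accepts.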
Related PRs
Discovered and fixed in #639.