-
Notifications
You must be signed in to change notification settings - Fork 28
feat(malware-check): add whitespace check to detect excessive spacing and invisible characters #1086
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
src/macaron/malware_analyzer/pypi_heuristics/sourcecode/white_spaces.py
Outdated
Show resolved
Hide resolved
…d invisible characters Signed-off-by: Amine <[email protected]>
Signed-off-by: Amine <[email protected]>
db0e35c
to
6978bd7
Compare
@@ -600,3 +600,5 @@ major_threshold = 20 | |||
epoch_threshold = 3 | |||
# The number of days +/- the day of publish the calendar versioning day may be. | |||
day_publish_error = 4 | |||
# THe threshold for the number of repeated spaces in a line from the source code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small typo, "The" instead of "THe"
@@ -381,6 +383,10 @@ def run_check(self, ctx: AnalyzeContext) -> CheckResultData: | |||
failed({Heuristics.CLOSER_RELEASE_JOIN_DATE.value}), | |||
forceSetup. | |||
|
|||
% Package released with excessive whitespace in the code . | |||
{Confidence.HIGH.value}::trigger(malware_high_confidence_4) :- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you walk me through the rationale of why we should combine WHITE_SPACES
failing with the forceSetup
rule and why these rules together are a malicious indicator with HIGH
confidence?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because it implies, not only that harmful code may be executed during the installation process, but also that there appears to be an effort to hide the form of this code which strongly suggests malicious motives.
Summary
This PR adds a new heuristic that analyzes code to detect suspicious use of excessive spaces and invisible characters. It checks whether the amount of spacing and invisible Unicode characters exceeds a defined threshold.
Description of changes
WhiteSpaces
heuristic in a new Python module.heuristics.py
file.WhiteSpacesAnalyzer
heuristic.detect_malicious_metadata_check.py
to integrate and execute the new heuristic logic during analysis.Related issues
None
Checklist
verified
label should appear next to all of your commits on GitHub.