Skip to content

Machofile analyzer#3268

Merged
mlodic merged 22 commits intointelowlproject:developfrom
IshaanXCoder:feature/analyzer-MachoFile
Feb 9, 2026
Merged

Machofile analyzer#3268
mlodic merged 22 commits intointelowlproject:developfrom
IshaanXCoder:feature/analyzer-MachoFile

Conversation

@IshaanXCoder
Copy link
Contributor

Add MachoFile analyzer

Resolves issue: #2948

Description

Implemented a new file analyzer MachoFile to parse and extract information from Mach-O binaries

Used https://github.com/pstirparo/machofile library to extract the details. It fully supports both single-architecture binaries (ARM64, x86_64) and Universal (FAT) binaries.

Screenshot 2026-02-03 at 4 06 41 PM

Type of change

  • New feature (non-breaking change which adds functionality).

Checklist

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch develop
  • A new plugin (analyzer, connector, visualizer, playbook, pivot or ingestor) was added or changed, in which case:
    • I strictly followed the documentation "How to create a Plugin"
    • I have dumped the configuration from Django Admin using the dumpplugin command and added it in the project as a data migration.
    • If a File analyzer was added and it supports a mimetype which is not already supported, you added a sample of that type inside the archive test_files.zip and you added the default tests for that mimetype in test_classes.py.
    • I have provided the resulting raw JSON of a finished analysis and a screenshot of the results.
    • If a new analyzer has beed added, I have created a unittest for it in the appropriate dir.
    • I have created the corresponding DataModel for the new analyzer.
  • I have inserted the copyright banner at the start of the file.
  • Linters (Black, Flake, Isort) gave 0 errors.
  • I have added tests for the feature/bug I solved (see tests folder). All the tests (new and old ones) gave 0 errors.

Copilot AI review requested due to automatic review settings February 3, 2026 11:04
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a new Mach-O file analyzer (MachoFile/MachoInfo) so IntelOwl can parse macOS/iOS Mach-O binaries, similar to how existing analyzers handle PE and ELF files. It wires the analyzer into the plugin system, playbooks, dependencies, and test suite (unit + integration) for the new Mach-O mimetype.

Changes:

  • Introduced MachoInfo file analyzer using the machofile library to extract headers, architectures, segments, dylibs, imports/exports, hashes, and code-signature data from Mach-O binaries.
  • Registered the new MachoFile analyzer via Django data migrations, added it to the FREE_TO_USE_ANALYZERS playbook, and added Mach-O support to the shared file-analyzer test base and integration tests.
  • Added the machofile dependency from its GitHub repository to project-requirements.txt.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
api_app/analyzers_manager/file_analyzers/macho_info.py Implements the MachoInfo FileAnalyzer with robust parsing of Mach-O binaries using machofile, including logging and report error handling.
api_app/analyzers_manager/migrations/0175_analyzer_config_macho_info.py Adds the MachoFile analyzer configuration (module path, description, supported application/x-mach-binary mimetype, and settings) into the analyzers registry.
api_app/playbooks_manager/migrations/0062_add_machofile_to_free_to_use.py Ensures the MachoFile analyzer is included in the FREE_TO_USE_ANALYZERS playbook via a Django migration.
requirements/project-requirements.txt Introduces the machofile dependency via a Git URL so the analyzer can parse Mach-O binaries.
tests/api_app/analyzers_manager/unit_tests/file_analyzers/base_test_class.py Extends the shared mapping of mimetype-to-sample to include application/x-mach-binary for unit tests.
tests/api_app/analyzers_manager/unit_tests/file_analyzers/test_macho_info.py Adds unit tests for MachoInfo, mocking machofile to validate behavior over supported structures.
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py Adds an integration test running MachoFile against a real Mach-O sample and asserting key report fields (header, architectures, segments, dylibs, hashes, exports, code_signature).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings February 3, 2026 12:40
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 80 to 82
python_path = plugin.pop("model")
Model = apps.get_model(*python_path.split("."))
if not Model.objects.filter(name=plugin["name"]).exists():
Copy link

Copilot AI Feb 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plugin.pop("model") is called both in migrate and again in reverse_migrate, so after the forward migration has run once in a given process the second pop will raise a KeyError because the model key has already been removed from the shared plugin dict. To avoid this, avoid mutating plugin here (e.g. use python_path = plugin["model"]) or cache the model path in a separate module-level variable that both functions can read without popping.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 3, 2026 13:28
@IshaanXCoder
Copy link
Contributor Author

IshaanXCoder commented Feb 3, 2026

@mlodic could you please review this PR? I've made multiple commits trying to fix these CICD Fails. ruff is passing all checks locally but fails here. I'll try to fix this in the meanwhile. Will make a clean PR later post your review fixes and cicd fixes.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@mlodic
Copy link
Member

mlodic commented Feb 3, 2026

pull from develop and re-install pre-commit then try again please

Copilot AI review requested due to automatic review settings February 3, 2026 15:52
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated no new comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot AI review requested due to automatic review settings February 4, 2026 05:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

else:
no_mtm[field] = value
try:
o = Model.objects.get(**no_mtm)
Copy link

Copilot AI Feb 4, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable o is not used.

Suggested change
o = Model.objects.get(**no_mtm)
Model.objects.get(**no_mtm)

Copilot uses AI. Check for mistakes.
self.job_id, self.analyzer_name, self.md5, error_msg
)
)
raise Exception(error_msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnalyzerRunException must be raised here. Also, this should not be a warning, it is too noisy. It should only be added to the report in the errors list. see other analyzers as a reference

else:
results["architectures"] = []

if hasattr(macho, "uuid"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are just converting an object into a dict. refactor all of this in a more convenient way. If the supported tool changes, we should change this too here...it is unfeasible...also this code is unreadable. please find a workaround

results["hashes"] = macho.get_similarity_hashes(formatted=True)

except Exception as e:
warning_message = "job_id:{} analyzer:{} md5:{} filename:{} MachoFile parsing error: {}".format(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

f-strings everywhere

try:
macho = self._parse_macho()
if macho is None:
raise Exception("Failed to create MachO object")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this check can be removed as _parse_macho() raises only on failure. works?

            if macho is None:
                raise Exception("Failed to create MachO object")

Copilot AI review requested due to automatic review settings February 6, 2026 15:06
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

"80cc133a33786ceeacbd3acacd025dfc",
# not hardcoded, MD5 of tested data
"application/x-mach-binary",
"MachoFile",
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The analyzer config name in the migration is "MachoInfo" but the integration test is trying to retrieve an AnalyzerConfig with name "MachoFile". This mismatch will cause the test to fail with a DoesNotExist error. Please update line 17 to use "MachoInfo" instead of "MachoFile" to match the analyzer configuration name.

Suggested change
"MachoFile",
"MachoInfo",

Copilot uses AI. Check for mistakes.
except AnalyzerRunException:
raise
except Exception as e:
error_msg = f"job_id:{self.job_id} analyzer:{self.analyzer_name} md5:{self.md5} filename:{self.filename} MachoFile parsing error: {e}"
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message refers to "MachoFile" but the analyzer class is named "MachoInfo". For consistency, consider changing "MachoFile parsing error" to "MachoInfo parsing error" or use a generic term like "Mach-O parsing error".

Suggested change
error_msg = f"job_id:{self.job_id} analyzer:{self.analyzer_name} md5:{self.md5} filename:{self.filename} MachoFile parsing error: {e}"
error_msg = f"job_id:{self.job_id} analyzer:{self.analyzer_name} md5:{self.md5} filename:{self.filename} Mach-O parsing error: {e}"

Copilot uses AI. Check for mistakes.
Comment on lines 12 to 21
pc.analyzers.add(AnalyzerConfig.objects.get(name="MachoFile").id)
pc.full_clean()
pc.save()


def reverse_migrate(apps, schema_editor):
playbook_config = apps.get_model("playbooks_manager", "PlaybookConfig")
AnalyzerConfig = apps.get_model("analyzers_manager", "AnalyzerConfig")
pc = playbook_config.objects.get(name="FREE_TO_USE_ANALYZERS")
pc.analyzers.remove(AnalyzerConfig.objects.get(name="MachoFile").id)
Copy link

Copilot AI Feb 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The playbook migration is trying to retrieve an AnalyzerConfig with name "MachoFile", but the analyzer config migration (0176_analyzer_config_macho_info.py) creates it with name "MachoInfo". This mismatch will cause the migration to fail. Please update "MachoFile" to "MachoInfo" on both lines 12 and 21 to match the analyzer configuration name.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 7, 2026 06:44
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 7 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +25 to +30
"supported_filetypes": [
"application/x-mach-binary",
"application/mac-binary",
"application/x-binary",
"application/x-executable",
],
Copy link

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The supported_filetypes list includes very broad MIME types like "application/x-executable" and "application/x-binary", which are commonly used for non–Mach-O binaries (e.g., ELF). This will cause MachoInfo to run (and likely fail) on unrelated files; restrict supported_filetypes to Mach-O-specific MIME types (and/or add explicit signature checks + early exit) to avoid noisy failures and unnecessary work.

Copilot uses AI. Check for mistakes.
Copilot AI review requested due to automatic review settings February 7, 2026 09:39
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +31 to +32
"application/x-binary",
"application/x-executable",
Copy link

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supported_filetypes includes very generic executable mimetypes (e.g. application/x-executable, application/x-binary). Those often match ELF/Linux binaries too, which conflicts with the “mac samples only” intent and will cause the analyzer to be offered/run on non‑Mach‑O files (likely failing at runtime). Consider restricting this list to Mach-O specific mimetypes only, or add an early magic-byte check in run() that exits with a clear "not a Mach-O" error before attempting to parse.

Suggested change
"application/x-binary",
"application/x-executable",

Copilot uses AI. Check for mistakes.
@IshaanXCoder
Copy link
Contributor Author

@mlodic pls review this, fixed yours and copilot recommendations

@IshaanXCoder
Copy link
Contributor Author

I checked out the cause behind failing backend tests,
image

The Mach-O sample added in this PR caused BlintAnalyzer tests to run on application/x-mach-binary revealing a recursion limit bug during metadata export
I've patched blint_scan.py to skip export_metadata, which resolves the crash

Copilot AI review requested due to automatic review settings February 7, 2026 19:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +28 to +33
"supported_filetypes": [
"application/x-mach-binary",
"application/mac-binary",
"application/x-binary",
"application/x-executable",
],
Copy link

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

supported_filetypes includes generic MIME types like application/x-binary and application/x-executable. Since this analyzer is Mach-O specific (macOS/iOS), this configuration will make MachoInfo run on many non–Mach-O samples (e.g., ELF binaries commonly labeled application/x-executable), producing avoidable failures/noise in default playbooks. Consider restricting supported_filetypes to Mach-O specific MIME types (e.g., application/x-mach-binary and/or application/mac-binary) and removing the generic ones, or moving generic types to a separate opt-in config.

Copilot uses AI. Check for mistakes.
macho = machofile.MachO(self.filepath)
try:
macho.parse()
except AttributeError:
Copy link

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Copilot uses AI. Check for mistakes.
macho = machofile.UniversalMachO(self.filepath)
try:
macho.parse()
except AttributeError:
Copy link

Copilot AI Feb 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'except' clause does nothing but pass and there is no explanatory comment.

Copilot uses AI. Check for mistakes.
@IshaanXCoder
Copy link
Contributor Author

@mlodic i made the changes, ptal whenever free

@mlodic mlodic merged commit e8dcbff into intelowlproject:develop Feb 9, 2026
14 of 15 checks passed
Abhishek9639 pushed a commit to Abhishek9639/IntelOwl that referenced this pull request Feb 17, 2026
* feature/analyzer : machofile analyzer

* fix/added unit tests, freeToUse, test_files.zip

* fix/cicd and copilot recommendations

* fix: precommit using ruff

* fix: precommit using ruff(again)

* fix: precommit using ruff(again)

* fix/cicd

* fix/cicd

* fix/cicd

* fix/migration-conflicts

* fix/migration-conflicts

* fix/cicd migrations

* fix/cicd migrations due to zip

* fix/add macho_sample

* fix/copilot and maintainer recommendations

* fix/cicd

* fix/zip-pass

* fix/cicd

* minor fixes : copilot recommendations

* fix/maintainer-feedback

* fix/cicd failure due to BlintAnalyzer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments