Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a new Mach-O file analyzer (MachoFile/MachoInfo) so IntelOwl can parse macOS/iOS Mach-O binaries, similar to how existing analyzers handle PE and ELF files. It wires the analyzer into the plugin system, playbooks, dependencies, and test suite (unit + integration) for the new Mach-O mimetype.
Changes:
- Introduced
MachoInfofile analyzer using themachofilelibrary to extract headers, architectures, segments, dylibs, imports/exports, hashes, and code-signature data from Mach-O binaries. - Registered the new
MachoFileanalyzer via Django data migrations, added it to theFREE_TO_USE_ANALYZERSplaybook, and added Mach-O support to the shared file-analyzer test base and integration tests. - Added the
machofiledependency from its GitHub repository toproject-requirements.txt.
Reviewed changes
Copilot reviewed 7 out of 8 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
api_app/analyzers_manager/file_analyzers/macho_info.py |
Implements the MachoInfo FileAnalyzer with robust parsing of Mach-O binaries using machofile, including logging and report error handling. |
api_app/analyzers_manager/migrations/0175_analyzer_config_macho_info.py |
Adds the MachoFile analyzer configuration (module path, description, supported application/x-mach-binary mimetype, and settings) into the analyzers registry. |
api_app/playbooks_manager/migrations/0062_add_machofile_to_free_to_use.py |
Ensures the MachoFile analyzer is included in the FREE_TO_USE_ANALYZERS playbook via a Django migration. |
requirements/project-requirements.txt |
Introduces the machofile dependency via a Git URL so the analyzer can parse Mach-O binaries. |
tests/api_app/analyzers_manager/unit_tests/file_analyzers/base_test_class.py |
Extends the shared mapping of mimetype-to-sample to include application/x-mach-binary for unit tests. |
tests/api_app/analyzers_manager/unit_tests/file_analyzers/test_macho_info.py |
Adds unit tests for MachoInfo, mocking machofile to validate behavior over supported structures. |
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py |
Adds an integration test running MachoFile against a real Mach-O sample and asserting key report fields (header, architectures, segments, dylibs, hashes, exports, code_signature). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py
Outdated
Show resolved
Hide resolved
api_app/analyzers_manager/migrations/0175_analyzer_config_macho_info.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| python_path = plugin.pop("model") | ||
| Model = apps.get_model(*python_path.split(".")) | ||
| if not Model.objects.filter(name=plugin["name"]).exists(): |
There was a problem hiding this comment.
plugin.pop("model") is called both in migrate and again in reverse_migrate, so after the forward migration has run once in a given process the second pop will raise a KeyError because the model key has already been removed from the shared plugin dict. To avoid this, avoid mutating plugin here (e.g. use python_path = plugin["model"]) or cache the model path in a separate module-level variable that both functions can read without popping.
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py
Show resolved
Hide resolved
|
@mlodic could you please review this PR? I've made multiple commits trying to fix these CICD Fails. ruff is passing all checks locally but fails here. I'll try to fix this in the meanwhile. Will make a clean PR later post your review fixes and cicd fixes. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py
Show resolved
Hide resolved
|
pull from develop and re-install pre-commit then try again please |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| else: | ||
| no_mtm[field] = value | ||
| try: | ||
| o = Model.objects.get(**no_mtm) |
There was a problem hiding this comment.
Variable o is not used.
| o = Model.objects.get(**no_mtm) | |
| Model.objects.get(**no_mtm) |
| self.job_id, self.analyzer_name, self.md5, error_msg | ||
| ) | ||
| ) | ||
| raise Exception(error_msg) |
There was a problem hiding this comment.
AnalyzerRunException must be raised here. Also, this should not be a warning, it is too noisy. It should only be added to the report in the errors list. see other analyzers as a reference
| else: | ||
| results["architectures"] = [] | ||
|
|
||
| if hasattr(macho, "uuid"): |
There was a problem hiding this comment.
you are just converting an object into a dict. refactor all of this in a more convenient way. If the supported tool changes, we should change this too here...it is unfeasible...also this code is unreadable. please find a workaround
| results["hashes"] = macho.get_similarity_hashes(formatted=True) | ||
|
|
||
| except Exception as e: | ||
| warning_message = "job_id:{} analyzer:{} md5:{} filename:{} MachoFile parsing error: {}".format( |
| try: | ||
| macho = self._parse_macho() | ||
| if macho is None: | ||
| raise Exception("Failed to create MachO object") |
There was a problem hiding this comment.
this check can be removed as _parse_macho() raises only on failure. works?
if macho is None:
raise Exception("Failed to create MachO object")
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tests/api_app/analyzers_manager/unit_tests/file_analyzers/base_test_class.py
Outdated
Show resolved
Hide resolved
| "80cc133a33786ceeacbd3acacd025dfc", | ||
| # not hardcoded, MD5 of tested data | ||
| "application/x-mach-binary", | ||
| "MachoFile", |
There was a problem hiding this comment.
The analyzer config name in the migration is "MachoInfo" but the integration test is trying to retrieve an AnalyzerConfig with name "MachoFile". This mismatch will cause the test to fail with a DoesNotExist error. Please update line 17 to use "MachoInfo" instead of "MachoFile" to match the analyzer configuration name.
| "MachoFile", | |
| "MachoInfo", |
| except AnalyzerRunException: | ||
| raise | ||
| except Exception as e: | ||
| error_msg = f"job_id:{self.job_id} analyzer:{self.analyzer_name} md5:{self.md5} filename:{self.filename} MachoFile parsing error: {e}" |
There was a problem hiding this comment.
The error message refers to "MachoFile" but the analyzer class is named "MachoInfo". For consistency, consider changing "MachoFile parsing error" to "MachoInfo parsing error" or use a generic term like "Mach-O parsing error".
| error_msg = f"job_id:{self.job_id} analyzer:{self.analyzer_name} md5:{self.md5} filename:{self.filename} MachoFile parsing error: {e}" | |
| error_msg = f"job_id:{self.job_id} analyzer:{self.analyzer_name} md5:{self.md5} filename:{self.filename} Mach-O parsing error: {e}" |
| pc.analyzers.add(AnalyzerConfig.objects.get(name="MachoFile").id) | ||
| pc.full_clean() | ||
| pc.save() | ||
|
|
||
|
|
||
| def reverse_migrate(apps, schema_editor): | ||
| playbook_config = apps.get_model("playbooks_manager", "PlaybookConfig") | ||
| AnalyzerConfig = apps.get_model("analyzers_manager", "AnalyzerConfig") | ||
| pc = playbook_config.objects.get(name="FREE_TO_USE_ANALYZERS") | ||
| pc.analyzers.remove(AnalyzerConfig.objects.get(name="MachoFile").id) |
There was a problem hiding this comment.
The playbook migration is trying to retrieve an AnalyzerConfig with name "MachoFile", but the analyzer config migration (0176_analyzer_config_macho_info.py) creates it with name "MachoInfo". This mismatch will cause the migration to fail. Please update "MachoFile" to "MachoInfo" on both lines 12 and 21 to match the analyzer configuration name.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py
Show resolved
Hide resolved
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py
Show resolved
Hide resolved
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py
Show resolved
Hide resolved
| "supported_filetypes": [ | ||
| "application/x-mach-binary", | ||
| "application/mac-binary", | ||
| "application/x-binary", | ||
| "application/x-executable", | ||
| ], |
There was a problem hiding this comment.
The supported_filetypes list includes very broad MIME types like "application/x-executable" and "application/x-binary", which are commonly used for non–Mach-O binaries (e.g., ELF). This will cause MachoInfo to run (and likely fail) on unrelated files; restrict supported_filetypes to Mach-O-specific MIME types (and/or add explicit signature checks + early exit) to avoid noisy failures and unnecessary work.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "application/x-binary", | ||
| "application/x-executable", |
There was a problem hiding this comment.
supported_filetypes includes very generic executable mimetypes (e.g. application/x-executable, application/x-binary). Those often match ELF/Linux binaries too, which conflicts with the “mac samples only” intent and will cause the analyzer to be offered/run on non‑Mach‑O files (likely failing at runtime). Consider restricting this list to Mach-O specific mimetypes only, or add an early magic-byte check in run() that exits with a clear "not a Mach-O" error before attempting to parse.
| "application/x-binary", | |
| "application/x-executable", |
|
@mlodic pls review this, fixed yours and copilot recommendations |
api_app/playbooks_manager/migrations/0063_add_machofile_to_free_to_use.py
Show resolved
Hide resolved
tests/api_app/analyzers_manager/unit_tests/file_analyzers/base_test_class.py
Outdated
Show resolved
Hide resolved
tests/api_app/analyzers_manager/unit_tests/file_analyzers/base_test_class.py
Show resolved
Hide resolved
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| "supported_filetypes": [ | ||
| "application/x-mach-binary", | ||
| "application/mac-binary", | ||
| "application/x-binary", | ||
| "application/x-executable", | ||
| ], |
There was a problem hiding this comment.
supported_filetypes includes generic MIME types like application/x-binary and application/x-executable. Since this analyzer is Mach-O specific (macOS/iOS), this configuration will make MachoInfo run on many non–Mach-O samples (e.g., ELF binaries commonly labeled application/x-executable), producing avoidable failures/noise in default playbooks. Consider restricting supported_filetypes to Mach-O specific MIME types (e.g., application/x-mach-binary and/or application/mac-binary) and removing the generic ones, or moving generic types to a separate opt-in config.
tests/api_app/analyzers_manager/integration_tests/file_analyzers/test_macho_info.py
Show resolved
Hide resolved
| macho = machofile.MachO(self.filepath) | ||
| try: | ||
| macho.parse() | ||
| except AttributeError: |
There was a problem hiding this comment.
'except' clause does nothing but pass and there is no explanatory comment.
| macho = machofile.UniversalMachO(self.filepath) | ||
| try: | ||
| macho.parse() | ||
| except AttributeError: |
There was a problem hiding this comment.
'except' clause does nothing but pass and there is no explanatory comment.
|
@mlodic i made the changes, ptal whenever free |
* feature/analyzer : machofile analyzer * fix/added unit tests, freeToUse, test_files.zip * fix/cicd and copilot recommendations * fix: precommit using ruff * fix: precommit using ruff(again) * fix: precommit using ruff(again) * fix/cicd * fix/cicd * fix/cicd * fix/migration-conflicts * fix/migration-conflicts * fix/cicd migrations * fix/cicd migrations due to zip * fix/add macho_sample * fix/copilot and maintainer recommendations * fix/cicd * fix/zip-pass * fix/cicd * minor fixes : copilot recommendations * fix/maintainer-feedback * fix/cicd failure due to BlintAnalyzer

Add MachoFile analyzer
Resolves issue: #2948
Description
Implemented a new file analyzer
MachoFileto parse and extract information from Mach-O binariesUsed https://github.com/pstirparo/machofile library to extract the details. It fully supports both single-architecture binaries (ARM64, x86_64) and Universal (FAT) binaries.
Type of change
Checklist
developdumpplugincommand and added it in the project as a data migration.test_files.zipand you added the default tests for that mimetype in test_classes.py.DataModelfor the new analyzer.Black,Flake,Isort) gave 0 errors.testsfolder). All the tests (new and old ones) gave 0 errors.