High P-code Normalization strategies for Cross-Arch Analysis #8874
Replies: 2 comments
Thanks for the insights! I observed that INDIRECT and MULTIEQUAL account for ~70% of operations in my MIPS High P-code samples, versus a much lower percentage in x86. Given this large distribution shift, I am considering filtering these two opcodes out entirely to improve cross-arch alignment. Is this a valid way to reduce noise? Or is there a recommended normalization approach (e.g., edge contraction or specific analyzer settings) that would better preserve the underlying data-flow logic while still reducing this architectural artifact? I appreciate any ideas.
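To make the edge-contraction idea concrete, here is a minimal sketch in plain Python. It assumes the High P-code ops have already been exported from Ghidra into a dict of node id to opcode mnemonic plus a list of def-to-use edges; that export format and the name `contract_pass_through` are hypothetical, not part of the Ghidra API. Instead of deleting INDIRECT and MULTIEQUAL nodes (which would cut the data flow), each one is bypassed so that its kept predecessors connect directly to its kept successors.

```python
from collections import defaultdict

# Opcodes treated as pass-through nodes (an assumption for this sketch).
PASS_THROUGH = {"INDIRECT", "MULTIEQUAL"}


def contract_pass_through(ops, edges, drop=PASS_THROUGH):
    """Bypass every node whose opcode is in `drop`.

    ops   -- dict: node id -> opcode mnemonic (hypothetical export format)
    edges -- iterable of (src, dst) data-flow edges, definition -> use
    Returns (kept_ops, new_edges) with dropped nodes contracted away.
    """
    succs = defaultdict(set)
    for src, dst in edges:
        succs[src].add(dst)

    def kept_successors(node):
        # Walk forward through dropped nodes until kept nodes are reached.
        # The `seen` set guards against cycles (MULTIEQUAL appears in loops).
        found, seen = set(), set()
        stack = list(succs[node])
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            if ops[current] in drop:
                stack.extend(succs[current])
            else:
                found.add(current)
        return found

    kept_ops = {n: op for n, op in ops.items() if op not in drop}
    new_edges = set()
    for node in kept_ops:
        for target in kept_successors(node):
            new_edges.add((node, target))
    return kept_ops, new_edges


if __name__ == "__main__":
    ops = {1: "CALL", 2: "INDIRECT", 3: "INDIRECT", 4: "MULTIEQUAL",
           5: "INT_ADD", 6: "STORE", 7: "COPY"}
    edges = [(1, 2), (2, 3), (3, 5), (7, 4), (5, 4), (4, 5), (5, 6)]
    kept, contracted = contract_pass_through(ops, edges)
    print(sorted(kept))
    print(sorted(contracted))
```

Note that a loop-carried value routed through a MULTIEQUAL turns into a self-edge on the kept node after contraction (node 5 in the example), so the loop structure is still visible in the graph even though the merge op itself is gone.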
Hi, the Discussions page can be a graveyard at times.
My guess is that you'll need to enable the undocumented Decompiler Rules and tailor them for each processor.
Hi all,
I'm researching cross-architecture malware classification and using High P-code for feature extraction. I've noticed a significant structural divergence between x86 and MIPS even for the exact same C source code, which is causing a major accuracy drop in my models (MIPS accuracy is ~20% lower than x86).
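For anyone who wants to put a number on this divergence, a rough sketch follows. It assumes the opcode mnemonics of each function have been dumped to a plain list of strings (for example via `PcodeOp.getMnemonic()` in a Ghidra script); the helper names `opcode_distribution` and `js_divergence` are my own and not part of any library. It compares the two opcode frequency distributions with the Jensen-Shannon divergence in bits, where 0 means identical distributions and 1 means no overlap at all.

```python
import math
from collections import Counter


def opcode_distribution(mnemonics):
    """Map each opcode mnemonic to its relative frequency in the list."""
    counts = Counter(mnemonics)
    total = sum(counts.values())
    return {op: count / total for op, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency dicts."""
    keys = set(p) | set(q)
    # Mixture distribution: the pointwise average of p and q.
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mix(dist):
        # KL divergence from `dist` to the mixture; zero-probability
        # entries contribute nothing and are skipped.
        return sum(v * math.log2(v / mix[k]) for k, v in dist.items() if v > 0)

    return 0.5 * kl_to_mix(p) + 0.5 * kl_to_mix(q)


if __name__ == "__main__":
    # Made-up toy sequences for illustration only, not real decompiler output.
    x86_ops = ["CALL", "COPY", "INT_ADD", "RETURN"]
    mips_ops = ["CALL", "INDIRECT", "INDIRECT", "INDIRECT", "PTRSUB", "RETURN"]
    p = opcode_distribution(x86_ops)
    q = opcode_distribution(mips_ops)
    print(js_divergence(p, q))
```

Running this before and after a normalization step gives a quick check on whether the step actually brings the two architectures closer together.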
The Problem: Despite using High P-code (via DecompInterface), MIPS retains many architectural artifacts compared to x86.
Example: A simple function call appears as a clean CALL in x86, but manifests as multiple INDIRECT operations or complex pointer arithmetic in MIPS High P-code.
(See attached screenshots for the comparison of signal_handler / sheet_hash_1).
What I'm looking for: Is there a specific decompiler optimization pass or a recommended normalization strategy within the Ghidra API to abstract these architectural differences further? Or is High P-code expected to retain this level of architectural dependency?
Any insights or pointers to relevant classes/analyzers would be greatly appreciated. Thanks!