Speed optimizations #18
Draft
meck-gd wants to merge 7 commits into
NOTE: This branch is based on #17 and depends on it being merged first, thus the draft status.
We've had to deal with huge scripts (9.5 MB) that use lots of arithmetic, duplicated variables, etc. These scripts have millions of nodes, which proved really challenging for deobshell to handle. Initially, I stopped the process after multiple hours. I made quite a few changes to the code that brought the runtime down to ~17 minutes for 3 million nodes (excluding AST generation, which is also really resource- and time-intensive).
In particular:
Smaller/micro optimizations:
- `iter("node tag")` instead of tag comparisons in the loops. This pushes the comparison into native code, which is quite a bit faster.
- Replaced `in` expressions with a single string on the right-hand side with `==`.
- For `in` checks, use tuples instead.

I also added some more barewords and operators, including a new test script.
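To illustrate the micro-optimizations above, here is a minimal sketch using `xml.etree.ElementTree` from the standard library. The XML snippet, the tag names, and the helper functions (`find_by_comparison`, `find_by_iter`, `is_constant`) are made up for this example and are not taken from the deobshell code; the point is only to show the pattern.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature AST; the element names are placeholders for illustration.
AST_XML = """
<ScriptBlockAst>
  <BinaryExpressionAst Operator="Plus">
    <ConstantExpressionAst>1</ConstantExpressionAst>
    <ConstantExpressionAst>2</ConstantExpressionAst>
  </BinaryExpressionAst>
  <StringConstantExpressionAst>foo</StringConstantExpressionAst>
</ScriptBlockAst>
"""

root = ET.fromstring(AST_XML)


def find_by_comparison(tree_root, tag):
    """Visit every node and compare the tag in Python code."""
    return [node for node in tree_root.iter() if node.tag == tag]


def find_by_iter(tree_root, tag):
    """Let ElementTree filter by tag while it walks the tree."""
    return list(tree_root.iter(tag))


# A tuple is used for the membership check; a single candidate would use == instead.
CONSTANT_TAGS = ("ConstantExpressionAst", "StringConstantExpressionAst")


def is_constant(node):
    """Return True if the node's tag is one of the constant expression tags."""
    return node.tag in CONSTANT_TAGS


constants = find_by_iter(root, "ConstantExpressionAst")
print([node.text for node in constants])
```

Both lookup functions return the same nodes in document order; the second one simply moves the tag filter out of the Python-level loop, which is where the savings on trees with millions of nodes would come from.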
I understand that this changeset is pretty big. If you review the commits, I recommend using "Hide whitespace", especially for Code speed optimizations. I ran the updated code against all scripts in the `data` folder and there were no changes.

If you wish, I can add a (malware) script that exercises these changes and ones I've done in the past. I have one that is 1.1 MB, so not outrageously large. I won't add its AST because that's over 40 MB, but the `.deob.ps1` and `deob.xml` are ok. Let me know if you'd like me to do that.