Skip to content

Add a WildcardAssembler Plugin. #6118

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

plucia-mitre
Copy link
Contributor

@plucia-mitre plucia-mitre commented Jan 15, 2024

Allows assembling instructions containing wildcards. Includes support for filtering assembly results to only include some completions for each wildcard.

This starts to address issue #5800 which may also address several other existing issues including #679 and #2516.

This pull request builds on the assembler changes merged in e7458ed and was developed alongside @nsadeveloper789 (to whom we are very thankful for his support!)

This WildcardAssembler is not specific to Pickled Canary (though certainly useful for it). In addition to Pickled Canary and other assembly search tools, this assembler would be useful for tools which could now more easily surface architectural details about assembly instructions to end-users or other tools.

For more background on this pull request, see the now-closed #5801

Allows assembling instructions containing wildcards. Includes support for filtering assembly results to only include some completions for each wildcard.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me if I need to add something else to this file to make the documentation work properly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you would need to add some entries to make this work. I'd say look at any project except Features Base as an example. Before you go too far, though:

  • Usually everything in Help is meant to describe things in the GUI. I suppose, in anticipation of a GUI, it might make sense to include this.
  • For API-only features, I'd recommend instead including example GhidraScripts. You can often borrow code from your unit tests, and this gives you an opportunity to demonstrate one or two use cases in prototypical fashion. I think a basic instruction pattern search could be demoed? Keep it as simple as possible, and remember what seems simple to you, the developers, may not be simple others. Even skilled users seeing this for the first time will appreciate simple examples to start.
  • It's true, there are parts of our Help that describe the internals of Ghidra, e.g., some of the decompiler and Sleigh documentation. If this is the approach you intend to take, please indicate in the text (html) that the feature is currently only available as an API for plugins and scripts to use. If you add example scripts, you can refer to them in the help. (Scripts are things the user can definitely see from the GUI.)

If you continue down the "Help" road, the way to validate your help is:

  1. Run the help builder:
    gradle buildHelp
  2. Refresh all projects in Eclipse so that it copies the resources over into the build.
  3. Run Ghidra from Eclipse, hit F1 to open help, and search for your topic.
  4. Verify your HTML displays correctly in Ghidra's help viewer. (Sadly, Swing is stuck in HTML 3, CSS 2 times.)

I might recommend formatting your help as well. Beware that Tidy (which our formatter uses) may mangle your nested list, though, so commit first, then:

gradle WildcardAssembler:tidyHelp

Because it's Gradle, you can do the build and the tidy in the same command. Don't omit the WildcardAssembler: part, though. Doing just gradle buildHelp tidyHelp will mangle all the other projects' help. Tidying is a newer practice that not all modules follow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll do some combo of the above and we can edit from there :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added two example scripts and tweaked the documentation. I've verified that the documentation looks as expected in the generated help (Help > Contents).

I certainly could be missing something, but the gradle WildcardAssembler:tidyHelp gradle command gives me:

Cannot locate tasks that match 'WildcardAssembler:tidyHelp' as task 'tidyHelp' not found in project ':WildcardAssembler'.

Furthermore, I'm not able to find any references to 'tidyHelp' in the repo unless GitHub search is letting me down! Thanks for your ongoing help!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tidyHelp task is simply a call to an HTML beautification library. It appears as though it is not in the repo, but instead inside of a build file that we use internally. It is not necessary for you to run this task.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense; thanks for letting me know I'm not completely crazy! 😄

@plucia-mitre
Copy link
Contributor Author

For reference (though certainly not critical to this request), there's a pull request for Pickled Canary here (mitre/pickled-canary#3) which refactors Pickled Canary to use this Wildcard Assembler.

@ryanmkurtz ryanmkurtz added the Status: Triage Information is being gathered label Jan 16, 2024
Copy link
Contributor

@nsadeveloper789 nsadeveloper789 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scripts and the help are a nice improvement! Thank you. Many coding style comments, then a few usability comments for the scripts. These aren't hard a fast "you musts," but strong suggestions. If you've already tried some of the suggestions and ran into issues, let me know.

</DL>
</BLOCKQUOTE>

<P class="providedbyplugin">Provided by: <I>Wildcard Assembler Plugin</I></P>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A Plugin is an actual configurable object in the Ghidra GUI. Because this is API only, I'd just say "Wildcard Assembler Module"

<BLOCKQUOTE>

<P><B>This feature is currently only available as an API for Ghidra
scripts and plugins.</B></P>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd go ahead and name your example scripts here, and even direct the user to find them in Ghidra's Script Manager. Something like:

For an example of how to use the API, see the FindInstructionWithWildcard and WildSleighAssemblerInfo scripts in the Script Manager.

}

/**
* Use an x86_64 WildSleighAssembler to assemble the given {@code wildcardedInstruction}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should use the {@link ...} syntax on WildSleighAssembler.

// Get our current program or build a new program if our current binary isn't x86_64
Program baseProgram = currentProgram;
if (!baseProgram.getLanguageCompilerSpecPair().languageID.getIdAsString()
.equals("x86:LE:64:default")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should just use the current program, whatever it is. We can always warn (at least in general) that the user's mileage may vary depending on the processor language.

// either "R12D" or "R13D". The instruction and/or wildcard here can be modified
// to an instruction found in your x86_64 binary, or patch your binary to have
// the bytes "0x45 0x31 0xed" somewhere.
var allValidResults = getAllResolvedPatterns("XOR R13D,`Q1/R1(2|3)D`");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not askString()? You can always include some examples in the comments above, or even include some examples in the prompt.


println("Hit at address: " + matchAddress.toString());

// Search over all the resolutions we were searching for and find the one which matches and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Search" is overloaded here. Consider rewording.

// Search over all the resolutions we were searching for and find the one which matches and
// use it determine what the wildcard values are for the given hit.
//
// It'd likely be much faster the deduplicate similar WildAssemblyResolvedPatterns based on
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

faster the to deduplicate

* {@code WildAssemblyResolvedPatterns}
*
* @param results
* @return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing descriptions for param and return

// Make sure that if one of the example instructions was chosen the current binary has the
// correct architecture.
if (sampleInstructions.contains(wildcardedInstruction) &&
!language.getLanguageID().toString().equals("x86:LE:64:default")) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto from the other script.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different from the other script because here I am just raising a warning if the user is using one of the example instructions with an unexpected architecture). I've removed the "return" statement right after this to allow us to try anyways (and updated the warning text slightly). Do you think I should do something different here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh. Honestly, I missed the if statement on Line 75. I'll re-read as I check your other revisions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice script. Even though it's a demo, for something provided out of the box, we should strive to make it as generally useful as possible without requiring the user to modify it. I marked particulars below as I was reading through it, but some usablility tips:

  • Just use the current program and its language. The assembler is supposed to accommodate all Ghidra's languages, anyway (even though we know it's not perfect.)
  • Allow the user to input the instruction. I see you use askString() in one script but not the other....
  • I had suggested using a table, but IIRC, addresses become links in the script Console? If so, then don't worry about it. Otherwise, consider just finding the next result and calling goTo, or using the table.

Ideally, the user would only need to modify the script to add a feature or change its operation, not just as a means of providing input.

@nsadeveloper789 nsadeveloper789 added Type: Enhancement New feature or request Feature: Assembler Status: Internal This is being tracked internally by the Ghidra team and removed Status: Triage Information is being gathered labels Jan 25, 2024
<CODE>!</CODE> character. For example, in a x86:LE:32:default binary:</P>
<BLOCKQUOTE>
<DL>
<DT>Assembles (no wildcard):</DT>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably remove the "Assembles" and parentheses for these. Just, e.g.:

No wildcard:

For the examples that doesn't work:

Single token (does not assemble):

@plucia-mitre
Copy link
Contributor Author

I think I've addressed most of the above feedback but I've left a few comments on several of the points. Let me know if you'd like me to mark the other threads as resolved or leave them open for someone else to confirm my fixes.

Copy link
Contributor

@nsadeveloper789 nsadeveloper789 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Thanks for the revisions! At this point, I think any remaining revisions should be minor, and we can take care of them during our internal review process. No guarantees, but we should be good to go.

@nsadeveloper789 nsadeveloper789 added Status: Prioritize This is currently being prioritized Status: Internal This is being tracked internally by the Ghidra team and removed Status: Internal This is being tracked internally by the Ghidra team Status: Prioritize This is currently being prioritized labels Feb 1, 2024
@plucia-mitre
Copy link
Contributor Author

Great! Thanks! Let me know if there's anything else here, else I'll be working on cleaning up the search engine and pattern parser portions of Pickled Canary as time permits.

@ryanmkurtz ryanmkurtz added this to the 11.1 milestone Feb 6, 2024
@ryanmkurtz ryanmkurtz closed this in 8abc0ce Feb 7, 2024
@astrelsky
Copy link
Contributor

8f0cos.jpg

I couldn't help myself, someone had to do it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature: Assembler Status: Internal This is being tracked internally by the Ghidra team Type: Enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants