Skip to content

Feature Request: Add INT_ROTATE PCode operation #7377

Open
@LukeSerne

Description

@LukeSerne

Is your feature request related to a problem? Please describe.
While reverse engineering a PowerPC program, I regularly come across the rlwinm instruction. This instruction (documented on page 501 of 6xx_pem.pdf) can perform a bit-rotate followed by a binary AND. This instruction is quite heavily used, for example for bit-rotating an integer, extracting bits from a bitfield and bit-shifting an integer. Additionally, the instruction frequently features in compiler optimisations to avoid branches.

Unfortunately, the current pcode implementation of this instruction uses a combination of INT_OR, INT_SUB, INT_LEFT and INT_RIGHT. While this is functionally correct, when the instruction is used as part of a larger structure that can be optimised, the decompiler often fails to recognise the rotate and thus the opportunity for optimisation.

I noticed bit-rotation is quite a common operation across the architectures currently supported by Ghidra. For instance, PowerPC has rlwinm, ARM has ror, x86 has ROR, MIPS32 has ROTR, z80 has RL and 68k has ROR. While many architectures include separate instructions for left-rotate and right-rotate, only one rotate instruction is necessary, since the other instruction can be modeled by negating the shift amount. Most of these architectures have several not just one, but several instructions that perform a bit rotate. As such, this new PCode instruction would be widely applicable.

Additionally, having a separate instruction for bit rotation allows multiple decompiler rules to simplify patterns involving bit rotations without each having to attempt to recognise the relatively complex structure that arises from the sequence. Then, there could be a single rule to convert the sequence of shifts to the new rotate instruction, which avoids several different rules having to detect bit-rotations. This would avoid code duplication, and it would also be more efficient.

Finally, INT_ROTATE(x, y) is arguably much clearer to human analysts than the equivalent (x << y) | (x >> (32 - y)), especially if x and y are more complicated expressions.

Describe the solution you'd like
I would like a new PCode op to be added, that takes 2 varnodes as parameters. I propose the name INT_ROTATE( V, n ). The first parameter is the value being rotated, and the second varnode specifies the rotation amount. This would be equivalent to (V << n) | (V >> (8 * |V| - n)), where |V| denotes the size of the varnode V in bytes.

Describe alternatives you've considered
An alternative is to not introduce this new PCode operation, and let the decompiler rules that need to deal with bit rotations just detect those cases themselves. However, I think that will lead to code duplication and less efficient decompilation. The decompiled output will also be harder to understand for analysts whenever bit rotations are not simplified away.

Additional context
An example of some PowerPC instructions that Ghidra currently struggles to simplify:

7c 00 00 34     cntlzw     r0, r0
38 60 00 01     li         r3, 0x1
5c 63 07 fe     rlwnm      r3, r3, r0, 0x1f, 0x1f
4e 80 00 20     blr

This corresponds to return r0 < 0 (edit: oops, it's actually return r0 <= 0 - just goes to show how tricky these expressions are for humans to understand), but ghidra currently decompiles this to:

  return (bool)(((byte)(1 << (LZCOUNT(r0) & 0x1fU))
                | (byte)(1 >> 0x20 - (LZCOUNT(r0) &
                                     0x1fU))) & 1);

While the decompiled code is functionally correct, it is nearly impossible to see that the code is testing for the sign bit of r0. The decompiler misses a few simplification steps:

  1. The rotate of 1 by LZCOUNT(r0) & 0x1fU, followed by the & 1, which is then cast to a bool, can be simplified to LZCOUNT(r0) & 0x1fU == 0, since only a rotate amount of 0 can ensure a 1 bit ends up in the ones-position of the result.
  2. LZCOUNT(r0) & 0x1fU can be simplified to just LZCOUNT(r0) == 0 || LZCOUNT(r0) == 32, since the varnode representing r0 is 4 bytes big, so LZCOUNT can never return a value that has any bits masked off.
  3. LZCOUNT(r0) == 0 implies that r0 s>> 31 != 0, since r0 is 32 bits wide.
  4. LZCOUNT(r0) == 32 implies r0 == 0, since r0 is 32 bits wide.

The decompiler can simplify r0 s>> 31 != 0 to r0 < 0 by RuleTestSign. The expression r0 < 0 || r0 == 0 is then simplified to r0 <= 0 by RuleLessEqual.

Note that step 1 would be much simpler to recognise and implement a rule for if there was a dedicated INT_ROTATE PCode instruction. There would be no need for step 2 if the semantics of INT_ROTATE were that it automatically masks the rotation amount by the number of bits in the value to be rotated, since the & 0x1f is introduced by the Pcode translation of rlwnm only because this masking is required for the rlwnm instruction. And if there is no INT_AND in the raw Pcode, there is also no need for the decompiler to simplify it away.

Metadata

Metadata

Assignees

Labels

Status: InternalThis is being tracked internally by the Ghidra team

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions