Feature Request: Add `INT_ROTATE` PCode operation

**Is your feature request related to a problem? Please describe.**
While reverse engineering a PowerPC program, I regularly come across the `rlwinm` instruction. This instruction (documented on page 501 of [6xx_pem.pdf](http://kib.kiev.ua/x86docs/POWER/6xx_pem.pdf)) can perform a bit-rotate followed by a binary `AND`. This instruction is quite heavily used, for example for bit-rotating an integer, extracting bits from a bitfield and bit-shifting an integer. Additionally, the instruction frequently features in compiler optimisations to avoid branches.

Unfortunately, the [current pcode implementation of this instruction](https://github.com/NationalSecurityAgency/ghidra/blob/21382506445918f49516512e7332f47c3cceba05/Ghidra/Processors/PowerPC/data/languages/ppc_instructions.sinc#L3058-L3070) uses a combination of `INT_OR`, `INT_SUB`, `INT_LEFT` and `INT_RIGHT`. While this is functionally correct, when the instruction is used as part of a larger structure that can be optimised, the decompiler often fails to recognise the rotate and thus the opportunity for optimisation.

I noticed bit rotation is quite a common operation across the architectures currently supported by Ghidra. For instance, PowerPC has [`rlwinm`](http://kib.kiev.ua/x86docs/POWER/6xx_pem.pdf), ARM has [`ror`](https://developer.arm.com/documentation/dui0379/e/arm-and-thumb-instructions/ror), x86 has [`ROR`](https://c9x.me/x86/html/file_module_x86_id_273.html), MIPS32 has [`ROTR`](https://www.cs.tau.ac.il/~afek/MipsInstructionSetReference.pdf), z80 has [`RL`](https://www.zilog.com/docs/z80/um0080.pdf) and 68k has [`ROR`](http://68k.hax.com/ROR). While many architectures include separate instructions for left-rotate and right-rotate, only one rotate operation is necessary, since a rotate in the other direction can be modelled by negating the rotate amount. Most of these architectures have not just one, but several instructions that perform a bit rotate. As such, this new PCode operation would be widely applicable.
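As a rough illustration of why one rotate direction is enough, here is a minimal C++ sketch for 32-bit values. The helper names `rotl32`, `rotr32`, `rotr32_direct` and `check_rotate_identity` are made up for this example, and the sketch assumes the rotate amount is reduced modulo 32.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical helper: rotate a 32-bit value left by n (n taken modulo 32).
inline std::uint32_t rotl32(std::uint32_t v, std::uint32_t n) {
    const std::uint32_t s = n & 31u;
    // The second "& 31u" avoids an undefined shift by 32 when s == 0.
    return (v << s) | (v >> ((32u - s) & 31u));
}

// Right rotate expressed through the left rotate by negating the amount.
// Unsigned negation wraps modulo 2^32, which is a multiple of 32.
inline std::uint32_t rotr32(std::uint32_t v, std::uint32_t n) {
    return rotl32(v, 0u - n);
}

// Reference right rotate written out directly, used only for comparison.
inline std::uint32_t rotr32_direct(std::uint32_t v, std::uint32_t n) {
    const std::uint32_t s = n & 31u;
    return (v >> s) | (v << ((32u - s) & 31u));
}

// Checks that both right-rotate formulations agree for a range of amounts.
inline void check_rotate_identity(std::uint32_t v) {
    for (std::uint32_t n = 0; n < 64; ++n) {
        assert(rotr32(v, n) == rotr32_direct(v, n));
    }
}
```

Calling `check_rotate_identity(0xDEADBEEFu)` passes, which is consistent with the claim that a single rotate operation can cover both directions.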

Additionally, a dedicated operation for bit rotation would let multiple decompiler rules simplify patterns involving rotations without each having to recognise the relatively complex sequence of shifts, subtraction and `OR` that currently encodes a rotate. A single rule could convert that sequence into the new rotate operation, so the detection logic would live in one place. This would avoid code duplication, and it would also be more efficient.

Finally, `INT_ROTATE(x, y)` is arguably much clearer to human analysts than the equivalent `(x << y) | (x >> (32 - y))`, especially if `x` and `y` are more complicated expressions.

**Describe the solution you'd like**
I would like a new PCode op to be added that takes two varnodes as inputs. I propose the name `INT_ROTATE( V, n )`. The first input is the value being rotated, and the second specifies the rotation amount. This would be equivalent to `(V << n) | (V >> (8 * |V| - n))`, where `|V|` denotes the size of the varnode `V` in bytes.
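To make the proposed formula concrete, here is a small C++ sketch that emulates it on varnodes of 1 to 8 bytes held in a `uint64_t`. The names `size_mask`, `shl`, `shr` and `int_rotate` are invented for this example. It assumes that a shift by at least the full varnode width yields zero, which is how I understand PCode shifts to behave, and it is not taken from Ghidra's code.

```c++
#include <cassert>
#include <cstdint>

// Mask covering the low 8*size bits, for a varnode size of 1..8 bytes.
inline std::uint64_t size_mask(unsigned size) {
    return size >= 8 ? ~0ull : ((1ull << (8 * size)) - 1);
}

// Left shift that yields zero once the amount reaches the varnode width
// (assumed semantics), truncated to the varnode size.
inline std::uint64_t shl(std::uint64_t v, std::uint64_t n, unsigned size) {
    return n >= 8ull * size ? 0 : (v << n) & size_mask(size);
}

// Logical right shift with the same assumption about large amounts.
inline std::uint64_t shr(std::uint64_t v, std::uint64_t n, unsigned size) {
    return n >= 8ull * size ? 0 : (v & size_mask(size)) >> n;
}

// Hypothetical INT_ROTATE(V, n), written exactly as the proposed formula:
// (V << n) | (V >> (8 * |V| - n)).
inline std::uint64_t int_rotate(std::uint64_t v, std::uint64_t n, unsigned size) {
    const std::uint64_t bits = 8ull * size;
    return shl(v, n, size) | shr(v, bits - n, size);
}
```

Under these assumptions the formula behaves as a left rotate for `0 <= n <= 8 * |V|`. For larger `n` both shifts produce zero, so a full definition would need to say whether the amount is reduced modulo the bit width.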

**Describe alternatives you've considered**
An alternative is to not introduce this new PCode operation, and let the decompiler rules that need to deal with bit rotations just detect those cases themselves. However, I think that will lead to code duplication and less efficient decompilation. The decompiled output will also be harder to understand for analysts whenever bit rotations are not simplified away.

**Additional context**
An example of some PowerPC instructions that Ghidra currently struggles to simplify:

```
7c 00 00 34     cntlzw     r0, r0
38 60 00 01     li         r3, 0x1
5c 63 07 fe     rlwnm      r3, r3, r0, 0x1f, 0x1f
4e 80 00 20     blr
```

This corresponds to `return r0 < 0` (edit: oops, it's actually `return r0 <= 0` - just goes to show how tricky these expressions are for humans to understand), but Ghidra currently decompiles this to:
```c++
  return (bool)(((byte)(1 << (LZCOUNT(r0) & 0x1fU))
                | (byte)(1 >> 0x20 - (LZCOUNT(r0) &
                                     0x1fU))) & 1);
```

While the decompiled code is functionally correct, it is nearly impossible to see that the code is testing whether `r0` is negative or zero. The decompiler misses a few simplification steps:

1. The rotate of `1` by `LZCOUNT(r0) & 0x1fU`, followed by the `& 1` and the cast to a bool, can be simplified to `(LZCOUNT(r0) & 0x1fU) == 0`, since only a rotate amount of `0` leaves the `1` bit in the ones-position of the result.
2. `(LZCOUNT(r0) & 0x1fU) == 0` can be simplified to `LZCOUNT(r0) == 0 || LZCOUNT(r0) == 32`, since the varnode representing `r0` is 4 bytes big, so `LZCOUNT` returns a value between 0 and 32, and 0 and 32 are the only values in that range whose low five bits are all zero.
3. `LZCOUNT(r0) == 0` implies that `r0 s>> 31 != 0`, since `r0` is 32 bits wide.
4. `LZCOUNT(r0) == 32` implies `r0 == 0`, since `r0` is 32 bits wide.

The decompiler can simplify `r0 s>> 31 != 0` to `r0 < 0` by `RuleTestSign`. The expression `r0 < 0 || r0 == 0` is then simplified to `r0 <= 0` by `RuleLessEqual`.
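The chain of simplifications above can be sanity-checked with a small C++ sketch that models the instruction sequence and compares it against the simplified form. The names `lzcount32`, `original_expr`, `simplified_expr` and `check_equivalent` are made up for this example. `lzcount32` is a plain loop standing in for `cntlzw`, and it is assumed to return 32 for an input of zero.

```c++
#include <cassert>
#include <cstdint>

// Count leading zeros of a 32-bit value; returns 32 for zero (like cntlzw).
inline std::uint32_t lzcount32(std::uint32_t v) {
    std::uint32_t n = 0;
    for (std::uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1) {
        ++n;
    }
    return n;
}

// Models li r3, 1 followed by rlwnm r3, r3, r0, 31, 31 (after cntlzw r0, r0):
// rotate 1 left by (LZCOUNT(r0) & 0x1f), then keep only the lowest bit.
inline bool original_expr(std::uint32_t r0) {
    const std::uint32_t s = lzcount32(r0) & 0x1fu;
    const std::uint32_t rotated = (1u << s) | (1u >> ((32u - s) & 31u));
    return (rotated & 1u) != 0;
}

// The simplified form: r0 <= 0 when r0 is read as a signed 32-bit integer,
// written as "zero, or sign bit set" to stay within unsigned arithmetic.
inline bool simplified_expr(std::uint32_t r0) {
    return r0 == 0 || (r0 & 0x80000000u) != 0;
}

// Compares both forms on a handful of boundary values.
inline void check_equivalent() {
    const std::uint32_t samples[] = {0u, 1u, 2u, 0x7FFFFFFFu,
                                     0x80000000u, 0x80000001u, 0xFFFFFFFFu};
    for (std::uint32_t r0 : samples) {
        assert(original_expr(r0) == simplified_expr(r0));
    }
}
```

The two forms agree on these samples, which matches the `r0 <= 0` reading: the rotate amount is zero only when `LZCOUNT(r0)` is 0 or 32.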

Note that step 1 would be much simpler to recognise and implement a rule for if there was a dedicated `INT_ROTATE` PCode operation. There would be no need for step 2 if the semantics of `INT_ROTATE` were that it automatically reduces the rotation amount modulo the number of bits in the value to be rotated, since the `& 0x1f` is introduced by the PCode translation of `rlwnm` only because this masking is required for the `rlwnm` instruction. And if there is no `INT_AND` in the raw PCode, there is also no need for the decompiler to simplify it away.
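A minimal sketch of that masking variant for 32-bit values, assuming the rotation amount is reduced modulo 32 inside the operation itself. The names `int_rotate32_masked` and `check_mask_redundant` are hypothetical and only serve this example.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical masking variant of INT_ROTATE for 4-byte varnodes:
// the amount is reduced modulo 32 as part of the operation.
inline std::uint32_t int_rotate32_masked(std::uint32_t v, std::uint32_t n) {
    const std::uint32_t s = n & 31u;
    // The second "& 31u" avoids an undefined shift by 32 when s == 0.
    return (v << s) | (v >> ((32u - s) & 31u));
}

// With these semantics an explicit "& 0x1f" on the amount changes nothing,
// so the PCode translation of rlwnm would not need to emit an INT_AND.
inline void check_mask_redundant(std::uint32_t v) {
    for (std::uint32_t n = 0; n <= 96; ++n) {
        assert(int_rotate32_masked(v, n) == int_rotate32_masked(v, n & 0x1fu));
    }
}
```

For instance, `int_rotate32_masked(1u, 32u)` returns `1`, the same as a rotate by `0`.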
