Sifting instruction encodings on ARM64, many capstone unsupported encodings discovered #2150

Closed
@watbulb

Description

Hello,

I am working on a project to locate undefined instructions on various ARM64 processors, and attempt to attribute them to hardware.

In my code, I do a naïve masked increment to search the encoding space from 00 00 00 00 to ff ff ff ff. Before I run each candidate as an instruction, I first pass it to capstone to check whether the encoding is known to some disassembler. Only then do I attempt to execute the instruction and check various pieces of the processor state to see whether it was executed/decoded.
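For illustration, here is a minimal sketch of that kind of loop, not the actual code from my project. The names `masked_increment`, `sweep`, and `unknown_to` are hypothetical, and the disassembler check is passed in as a plain callable (`is_known`) so the sketch stays independent of any particular capstone binding.

```python
from typing import Callable, Iterator

WORD_MASK = 0xFFFFFFFF  # A64 instructions are fixed-width 32-bit words


def masked_increment(word: int, mask: int) -> int:
    """Advance only the bits of `word` selected by `mask`.

    Bits outside the mask are preserved. The masked bits wrap to zero
    once they are all ones.
    """
    fixed = word & ~mask & WORD_MASK
    # Force the unmasked bits to 1 so the carry ripples straight through them.
    stepped = ((word | (~mask & WORD_MASK)) + 1) & mask
    return fixed | stepped


def sweep(start: int, mask: int) -> Iterator[int]:
    """Yield candidates from `start` until the masked bits wrap around."""
    word = start & WORD_MASK
    while True:
        yield word
        nxt = masked_increment(word, mask)
        if nxt & mask == 0:  # masked bits wrapped: the space is exhausted
            return
        word = nxt


def unknown_to(is_known: Callable[[bytes], bool], start: int, mask: int) -> Iterator[int]:
    """Yield the words that the supplied disassembler check rejects.

    `is_known` is an assumption of this sketch: any callable that takes the
    four instruction bytes in memory (little-endian) order and returns a bool.
    """
    for word in sweep(start, mask):
        if not is_known(word.to_bytes(4, "little")):
            yield word
```

Sweeping the whole space corresponds to `sweep(0, 0xFFFFFFFF)`; a narrower mask restricts the search to selected fields while the remaining bits stay fixed.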

This increment, disassemble, check loop has produced a corpus of instructions that decode properly using LLVM 16.0.6 objdump but that capstone does not recognize. Some of these are due to missing extension support in capstone, which is fine; I can filter and work around that. The instructions I am concerned about are those in the base AArch64 ISA that LLVM handles but capstone does not.

I wanted to start a discussion about how I should go about working with the capstone contributors, and what the best way would be to report these decoding inconsistencies. I can upload a corpus of instructions that are not part of an extension set for AArch64 and that capstone does not decode but LLVM does. Would this be the best way forward? Unfortunately, I'm not terribly familiar with the capstone codebase, but I'm quite familiar with TableGen, and I'd be happy to try to diagnose this if it's indeed an issue and I'm not crazy or doing something stupid 😆. I apologize if this is just a bunch of noise that will be fixed in #2026. I can also try @Rot127's auto-sync-aarch64 branch now and report whether these have been fixed, if that is at all helpful.

Thank you!

Below I'll include a couple examples of these instructions:

LDRSB
LLVM objdump 16.0.6

1809d38: 38de27de      ldrsb   w30, [x30], #-0x1e

cstool 5.0.1:

./cstool -d arm64 '38de27de'
ERROR: invalid assembly code
./cstool -d arm64 'de27de38'
ERROR: invalid assembly code

LDXRB
LLVM objdump 16.0.6

2324: 0d 02 40 08   ldxrb   w13, [x16]

cstool 5.0.1:

./cstool -d arm64 '0d024008'
ERROR: invalid assembly code
./cstool -d arm64 '0840020d'
ERROR: invalid assembly code

LDTR
LLVM objdump 16.0.6

60121e4: 42 f8 5e f8   ldtr    x2, [x2, #-17]

cstool 5.0.1:

./cstool arm64 '42f85ef8'
ERROR: invalid assembly code
./cstool arm64 'f85ef842'
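
Since each encoding above is tried in both byte orders, here is a small sketch of how the two hex strings relate to the 32-bit instruction word, plus a field check against the objdump output. The helper names are hypothetical. The bit positions (Rt in [4:0], Rn in [9:5], signed imm9 in [20:12]) are my reading of the A64 load/store register encodings with a 9-bit immediate, so treat them as an assumption to verify against the architecture manual.

```python
from typing import Dict, Tuple


def hex_forms(word: int) -> Tuple[str, str]:
    """Return (memory-order hex, word-order hex) for a 32-bit instruction word.

    AArch64 instructions are stored little-endian, so the memory-order
    string is the byte-reversed form of the word as usually written.
    """
    return word.to_bytes(4, "little").hex(), word.to_bytes(4, "big").hex()


def ldst_imm9_fields(word: int) -> Dict[str, int]:
    """Extract Rt, Rn and the sign-extended imm9 from a load/store word."""
    imm9 = (word >> 12) & 0x1FF
    if imm9 & 0x100:  # sign-extend the 9-bit immediate
        imm9 -= 0x200
    return {"rt": word & 0x1F, "rn": (word >> 5) & 0x1F, "imm9": imm9}


# LDXRB example: objdump prints the bytes in memory order (0d 02 40 08).
print(hex_forms(0x0840020D))
# LDRSB example: Rt = 30, Rn = 30, offset = -0x1e.
print(ldst_imm9_fields(0x38DE27DE))
# LDTR example: Rt = 2, Rn = 2, offset = -17.
print(ldst_imm9_fields(0xF85EF842))
```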
