Add eBPF ISA v4 instructions #7982
According to the first blog post, v4 adds 7 new instructions: the signed operations, an unconditional jump, and a byte-swapping operation. If there are only two additional instructions, I think we can add those to this PR.
3ae65b8 to 9346fe2
In 2023, the eBPF instruction set was modified to add several instructions related to signed operations (load with sign-extension, signed division, etc.), a 32-bit jump instruction and some byte-swap instructions. This became version 4 of eBPF ISA.

Here are some references about this change:

- https://pchaigno.github.io/bpf/2021/10/20/ebpf-instruction-sets.html (a blog post about eBPF instruction set extensions)
- https://lore.kernel.org/bpf/[email protected]/ (documentation sent to Linux Kernel mailing list)
- https://www.rfc-editor.org/rfc/rfc9669.html#name-sign-extension-load-operati (IETF's BPF Instruction Set Architecture standard defined the new instructions)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n1859 (implementation of signed division and remainder in Linux kernel. This shows that 32-bit signed DIV and signed MOD are zero-extending the result in DST)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n2135 (implementation of signed memory load in Linux kernel)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f9a1ea821ff25353a0e80d971e7958cd55b47a3 (commit which added signed memory load instructions in Linux kernel)

This can be tested with a recent enough version of clang and LLVM (this works with clang 19.1.4 on Alpine 3.21). For example for signed memory load instructions:

    signed int sext_8bit(signed char x) {
        return x;
    }

produces:

    $ clang -O0 -target bpf -mcpu=v4 -c test.c -o test.ebpf
    $ llvm-objdump -rd test.ebpf
    ...
    0000000000000000 <sext_8bit>:
           0: 73 1a ff ff 00 00 00 00 *(u8 *)(r10 - 0x1) = r1
           1: 91 a1 ff ff 00 00 00 00 r1 = *(s8 *)(r10 - 0x1)
           2: bc 10 00 00 00 00 00 00 w0 = w1
           3: 95 00 00 00 00 00 00 00 exit

(The second instruction is a signed memory load)

Instruction MOVS (Sign extend register MOV) uses offset to encode the conversion (whether the source register is to be considered as signed 8-bit, 16-bit or 32-bit integer). The mnemonic for these instructions is quite unclear:

- They are all named MOVS in the proposal https://lore.kernel.org/bpf/[email protected]/
- LLVM and Linux disassemblers only display pseudo-code (`r0 = (s8)r1`)
- RFC 9669 (https://datatracker.ietf.org/doc/rfc9669/) uses MOVSX for all instructions.
- GCC uses MOVS for all instructions: https://github.com/gcc-mirror/gcc/blob/releases/gcc-14.1.0/gcc/config/bpf/bpf.md?plain=1#L326-L365

To make the disassembled code clearer, decode such instructions with a size suffix: MOVSB, MOVSH, MOVSW.

The decoding of instructions 32-bit JA, BSWAP16, BSWAP32 and BSWAP64 is straightforward.
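As a rough illustration of the MOVS decoding described above, here is a minimal C sketch of how the offset field could select a size-suffixed mnemonic and what the 64-bit move computes. The helper names `movs_mnemonic` and `movs64` are hypothetical and are not taken from the PR or from the kernel; the mapping 8/16/32 to B/H/W is my assumption based on the suffixes named above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: choose a mnemonic from the offset field of a
 * sign-extending register move. Assumes offset is 8, 16 or 32; any other
 * value is treated here as "not a sign-extending move". */
static const char *movs_mnemonic(int16_t offset)
{
    switch (offset) {
    case 8:  return "MOVSB";
    case 16: return "MOVSH";
    case 32: return "MOVSW";
    default: return NULL;
    }
}

/* Hypothetical model of the 64-bit variant: sign-extend the low `offset`
 * bits of src to 64 bits. The narrowing casts rely on the usual
 * two's-complement behaviour of mainstream compilers. */
static uint64_t movs64(uint64_t src, int16_t offset)
{
    switch (offset) {
    case 8:  return (uint64_t)(int64_t)(int8_t)src;
    case 16: return (uint64_t)(int64_t)(int16_t)src;
    case 32: return (uint64_t)(int64_t)(int32_t)src;
    default: return src; /* plain move in this sketch */
    }
}
```

With this model, `r0 = (s8)r1` with `r1 = 0x80` would be shown as MOVSB and would leave `0xFFFFFFFFFFFFFF80` in `r0`.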
9346fe2 to ed8b5cc
Thanks for your quick reply! I added the 32-bit-offset jump and byte-swap instructions. I tested the decoding with code compiled from:

    unsigned short do_bswap16(unsigned short x) {
        return __builtin_bswap16(x);
    }
    unsigned int do_bswap32(unsigned int x) {
        return __builtin_bswap32(x);
    }
    unsigned long do_bswap64(unsigned long x) {
        return __builtin_bswap64(x);
    }

By the way, I believe instructions

EDITED TO ADD: I opened another Pull Request to tackle this other issue: #7985
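For reviewers who want to compare decoded byte-swap instructions against expected values, the following is a small C sketch of what I understand BSWAP16, BSWAP32 and BSWAP64 to compute on a 64-bit register. The function name `bswap_insn` is hypothetical, and the zeroing of the upper bits for the 16-bit and 32-bit forms is my reading of the Linux interpreter rather than something stated in this thread.

```c
#include <stdint.h>

/* Hypothetical model of the unconditional byte-swap instructions.
 * `width` is 16, 32 or 64. For 16 and 32, only the low bits are swapped
 * and the rest of the register is assumed to be cleared. */
static uint64_t bswap_insn(uint64_t dst, int width)
{
    switch (width) {
    case 16: return (uint64_t)__builtin_bswap16((uint16_t)dst);
    case 32: return (uint64_t)__builtin_bswap32((uint32_t)dst);
    case 64: return __builtin_bswap64(dst);
    default: return dst; /* unsupported width: left unchanged in this sketch */
    }
}
```

The sketch uses the same GCC/clang builtins as the test functions above, so it needs one of those compilers.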
Hello,
In 2023, "version 4" of the eBPF instruction set added several instructions related to signed operations (load with sign-extension, signed division, etc.).
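One detail of signed division that is easy to miss when writing the semantics: in the Linux interpreter, the 32-bit signed DIV zero-extends its result into the 64-bit destination register instead of sign-extending it. A minimal C sketch of that behaviour follows; the name `sdiv32` is hypothetical, and returning 0 on division by zero is my understanding of RFC 9669, not something taken from this PR.

```c
#include <stdint.h>

/* Hypothetical model of 32-bit signed division on 64-bit registers.
 * Only the low 32 bits of each operand are used; the 32-bit quotient is
 * zero-extended into the result. */
static uint64_t sdiv32(uint64_t dst, uint64_t src)
{
    int32_t a = (int32_t)(uint32_t)dst;
    int32_t b = (int32_t)(uint32_t)src;

    if (b == 0)
        return 0; /* assumed: division by zero yields 0 */

    /* Divide in 64 bits so that INT32_MIN / -1 is not undefined in C. */
    int64_t q = (int64_t)a / (int64_t)b;

    /* Truncate to 32 bits, then zero-extend. */
    return (uint64_t)(uint32_t)q;
}
```

For example, dividing -6 by 3 in this model gives `0x00000000FFFFFFFE`, not `0xFFFFFFFFFFFFFFFE`.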
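Here are some references about this change:
- https://pchaigno.github.io/bpf/2021/10/20/ebpf-instruction-sets.html (a blog post about eBPF instruction set extensions)
- https://lore.kernel.org/bpf/[email protected]/ (documentation sent to Linux Kernel mailing list)
- https://www.rfc-editor.org/rfc/rfc9669.html#name-sign-extension-load-operati (IETF's BPF Instruction Set Architecture standard defined the new instructions)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n1859 (implementation of signed division and remainder in Linux kernel. This shows that 32-bit signed DIV and signed MOD are zero-extending the result in DST)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/core.c?h=v6.14#n2135 (implementation of signed memory load in Linux kernel)
- https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1f9a1ea821ff25353a0e80d971e7958cd55b47a3 (commit which added signed memory load instructions in Linux kernel)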
This can be tested with a recent enough version of clang and LLVM (this works with clang 19.1.4 on Alpine 3.21).
For example, compiling a function that returns its `signed char` argument as a `signed int` with `clang -O0 -target bpf -mcpu=v4` produces a signed memory load, `r1 = *(s8 *)(r10 - 0x1)`, as its second instruction.
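To make the difference between a signed and an unsigned memory load concrete, here is a short C sketch of an 8-bit load in both flavours. The names `load_s8` and `load_u8` are hypothetical; the sketch assumes that the signed load sign-extends to the full 64-bit register, which is how I read the kernel implementation referenced in this PR.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a signed 8-bit load: the byte is sign-extended
 * to 64 bits. */
static uint64_t load_s8(const void *addr)
{
    int8_t v;
    memcpy(&v, addr, sizeof v);
    return (uint64_t)(int64_t)v;
}

/* Hypothetical model of the pre-existing unsigned 8-bit load: the byte is
 * zero-extended to 64 bits. */
static uint64_t load_u8(const void *addr)
{
    uint8_t v;
    memcpy(&v, addr, sizeof v);
    return (uint64_t)v;
}
```

Loading the byte `0xFF` gives `0xFFFFFFFFFFFFFFFF` with the signed variant and `0xFF` with the unsigned one, which is the difference a decompiler has to preserve.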
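Instruction MOVS (Sign extend register MOV) uses the offset field to encode the conversion (whether the source register is to be considered as a signed 8-bit, 16-bit or 32-bit integer). The mnemonic for these instructions is quite unclear:
- They are all named MOVS in the proposal https://lore.kernel.org/bpf/[email protected]/
- LLVM and Linux disassemblers only display pseudo-code (`r0 = (s8)r1`)
- RFC 9669 (https://datatracker.ietf.org/doc/rfc9669/) uses MOVSX for all instructions.
- GCC uses MOVS for all instructions: https://github.com/gcc-mirror/gcc/blob/releases/gcc-14.1.0/gcc/config/bpf/bpf.md?plain=1#L326-L365

To make the disassembled code clearer, this PR decodes such instructions with a size suffix: MOVSB, MOVSH, MOVSW. This deviation is my own choice and if you prefer to stick with what GCC does (MOVS) or what IETF's RFC standardized (MOVSX), I can change this.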
To test the new instructions, I wrote a C program with several small functions and compiled it with `clang -O0 -target bpf -mcpu=v4 -c` and `clang -O2 -target bpf -mcpu=v4 -c` on Alpine 3.21. This archive contains the source code and the 2 compiled programs: ebpf_v4_signed_op.zip.

For information, eBPF ISA v4 contains other new instructions (for example the 32-bit JA instruction). I chose to restrict the scope of this Pull Request to the signed operations only, to make reviewing it easier. Please let me know if you prefer a single Pull Request with all instructions from ISA v4.