Inline assembly with ymm regs fails in external assembler #1616

@ctz

MRE:

fn main() {
    unsafe { yo() };
}

#[target_feature(enable = "avx,avx2")]
unsafe fn yo() {
    unsafe {
        core::arch::asm!(
            "       vpxor   {zero}, {zero}, {zero}",
            // clobbers
            zero = out(ymm_reg) _,
        )
    }
}

This fails with:

   Compiling huh9 v0.1.0 (/home/jbp/tmp/huh9)
error: invalid operand for instruction
   |
note: instantiated into assembly here
  --> <inline asm>:12:8
   |
12 |        vpxor   y0, y0, y0
   |        ^

error: aborting due to 1 previous error

error: Failed to assemble `.globl _RNvCs7XiDh2GTzAB_4huh92yo__inline_asm_4x7668az5195ogw9gyg60zd30_n0
       .type _RNvCs7XiDh2GTzAB_4huh92yo__inline_asm_4x7668az5195ogw9gyg60zd30_n0,@function
       .section .text._RNvCs7XiDh2GTzAB_4huh92yo__inline_asm_4x7668az5195ogw9gyg60zd30_n0,"ax",@progbits
       _RNvCs7XiDh2GTzAB_4huh92yo__inline_asm_4x7668az5195ogw9gyg60zd30_n0:
       .intel_syntax noprefix
           push rbp
           mov rbp,rsp
           push rbx
           mov rbx,rdi
              vpxor   y0, y0, y0
           pop rbx
           pop rbp
           ret
       .att_syntax
       .size _RNvCs7XiDh2GTzAB_4huh92yo__inline_asm_4x7668az5195ogw9gyg60zd30_n0, .-_RNvCs7XiDh2GTzAB_4huh92yo__inline_asm_4x7668az5195ogw9gyg60zd30_n0
       .text


       `

error: could not compile `huh9` (bin "huh9") due to 1 previous error

I believe the issue is here:

    if reg as u32 >= X86InlineAsmReg::xmm0 as u32
        && reg as u32 <= X86InlineAsmReg::xmm15 as u32 =>
{
    // rustc emits x0 rather than xmm0
    let class = match *modifier {
        None | Some('x') => "xmm",
        Some('y') => "ymm",
        Some('z') => "zmm",
        _ => unreachable!(),
    };
    write!(
        generated_asm,
        "{class}{}",
        reg as u32 - X86InlineAsmReg::xmm0 as u32
    )
    .unwrap();

While this seems intended to fix up ymm and zmm registers as well, it doesn't:

  • first, the guard only matches xmm0 through xmm15, so, as far as I can tell, an operand allocated as a ymm or zmm register never reaches the match on modifier.
  • second, even if the guard were widened, the reconstruction of the register number wouldn't work for ymm and zmm registers: for example, X86InlineAsmReg::ymm0 as u32 - X86InlineAsmReg::xmm0 as u32 is 16, so ymm0 would be written as ymm16.
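
To make the second point concrete, here is a minimal standalone sketch of the offset arithmetic and of one possible shape for a fix. It is not the backend's code: Reg, the base constants, buggy_name and reg_name are hypothetical names, and the layout (xmm0..xmm15, then ymm0..ymm15, then zmm0..zmm31, contiguous) is an assumption extrapolated from the offset of 16 observed above.

```rust
// Hypothetical stand-in for X86InlineAsmReg: only the numeric layout matters.
// Assumption: xmm0..xmm15, ymm0..ymm15, zmm0..zmm31 are contiguous, in that order.
#[derive(Clone, Copy)]
struct Reg(u32);

const XMM0: u32 = 0;
const YMM0: u32 = 16; // matches "ymm0 - xmm0 is 16" from the report
const ZMM0: u32 = 32; // assumed
const ZMM_END: u32 = 64; // one past zmm31, assumed

// Mirrors the arithmetic in the quoted snippet: the index is always taken
// relative to xmm0, whatever bank the register actually belongs to.
fn buggy_name(reg: Reg, class: &str) -> String {
    format!("{class}{}", reg.0 - XMM0)
}

// One possible fix: pick the base of the register's own bank, and let the
// bank supply the class name when no modifier is given.
fn reg_name(reg: Reg, modifier: Option<char>) -> Option<String> {
    let (base, default_class) = match reg.0 {
        n if (XMM0..YMM0).contains(&n) => (XMM0, "xmm"),
        n if (YMM0..ZMM0).contains(&n) => (YMM0, "ymm"),
        n if (ZMM0..ZMM_END).contains(&n) => (ZMM0, "zmm"),
        _ => return None, // not a vector register in this sketch
    };
    let class = match modifier {
        None => default_class,
        Some('x') => "xmm",
        Some('y') => "ymm",
        Some('z') => "zmm",
        Some(_) => return None, // unknown modifier
    };
    Some(format!("{class}{}", reg.0 - base))
}
```

Under these assumptions, buggy_name(Reg(YMM0), "ymm") produces ymm16, while reg_name(Reg(YMM0), None) produces ymm0. Whether a modifier should be allowed to override the bank's default class is a design choice left open here.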

Potential root cause of ctz/graviola#127

Labels: C-bug (Category: This is a bug.), O-x86_64 (Target: x64 processors)
