[ARM64/Linux] Register allocator doesn't seem to always track liveness across basic blocks #12737

Open
@TamarChristinaArm

In the following case, with QuickJIT turned off:

static int[] Test(int[] a, int[] b)
{
  unsafe {
    fixed (int *a_ptr = a, b_ptr = b)
    {
    }
  }
  return a;
}

The generated code contains some dead stores and repeated loads of the same values.

For instance, take the initialization code together with the next basic block, which seems to do a null check:

G_M32566_IG01:
        A9BE7BFD          stp     fp, lr, [sp,#-32]!
        910003FD          mov     fp, sp
        F9000FBF          str     xzr, [fp,#24] // [V05 loc2]
        F9000BBF          str     xzr, [fp,#16] // [V06 loc3]

G_M32566_IG02:
        F9000FA0          str     x0, [fp,#24]  // [V05 loc2]
        B4000120          cbz     x0, G_M32566_IG04
        F9400FA2          ldr     x2, [fp,#24]  // [V05 loc2]
        B9800842          ldrsw   x2, [x2,#8]
        340000C2          cbz     w2, G_M32566_IG04

fp+24 is stored to twice without being read in between. Is this due to an ABI requirement that all variables be initialized to 0? And are the stack slots used by the GC to track pinned memory?

Either way, the first store of xzr to fp+24 is dead, since G_M32566_IG02 is always reached from G_M32566_IG01. I don't know how much DSE (dead store elimination) the CLR does, but with QuickJIT off I would have expected some here.
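To make the dead-store claim concrete, here is a minimal sketch of a straight-line dead store check. It is not how the JIT works internally: `DeadStoreSketch`, `FindDeadStores`, and the tuple-based instruction model are hypothetical names made up for illustration. It also assumes nothing other than explicit loads can observe a slot, which may not hold if the GC scans pinned locals at a safe point.

```csharp
using System;
using System.Collections.Generic;

public static class DeadStoreSketch
{
    // Hypothetical model: each instruction is either a store to or a load from
    // a stack slot, identified by its fp-relative offset.
    // Returns the indices of stores that are overwritten before any load of
    // the same slot. Only valid for straight-line code.
    public static List<int> FindDeadStores(IReadOnlyList<(bool IsStore, int Slot)> code)
    {
        var dead = new List<int>();
        // slot -> index of the most recent store that has not been read yet
        var pendingStore = new Dictionary<int, int>();

        for (int i = 0; i < code.Count; i++)
        {
            var access = code[i];
            if (access.IsStore)
            {
                // A second store with no load in between kills the earlier one.
                if (pendingStore.TryGetValue(access.Slot, out int previous))
                {
                    dead.Add(previous);
                }
                pendingStore[access.Slot] = i;
            }
            else
            {
                // A load observes the pending store, so that store is live.
                pendingStore.Remove(access.Slot);
            }
        }
        return dead;
    }
}
```

Feeding it the four slot accesses from G_M32566_IG01 and G_M32566_IG02 (store 24, store 16, store 24, load 24) reports only index 0, the `str xzr, [fp,#24]`.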

The copying of the pointer via the stack also seems unneeded:

        F9000FA0          str     x0, [fp,#24]  // [V05 loc2]
        B4000120          cbz     x0, G_M32566_IG04
        F9400FA2          ldr     x2, [fp,#24]  // [V05 loc2]
        B9800842          ldrsw   x2, [x2,#8]

is equivalent to the following, where the store of x0 can be moved into the initialization block above it:

        B4000120          cbz     x0, G_M32566_IG04
        B9800842          ldrsw   x2, [x0,#8]

This saves a load and a store, since x0 is still live.

Looking at the larger sequence:

G_M32566_IG02:
        F9000FA0          str     x0, [fp,#24]  // [V05 loc2]
        B4000120          cbz     x0, G_M32566_IG04
        F9400FA2          ldr     x2, [fp,#24]  // [V05 loc2]
        B9800842          ldrsw   x2, [x2,#8]
        340000C2          cbz     w2, G_M32566_IG04

G_M32566_IG03:
        52800002          mov     w2, #0
        F9400FA3          ldr     x3, [fp,#24]  // [V05 loc2]
        B9800863          ldrsw   x3, [x3,#8]
        6B03005F          cmp     w2, w3
        54000202          bhs     G_M32566_IG08

G_M32566_IG02 seems to be doing an initialization check on the pointer and G_M32566_IG03 a null pointer check? But G_M32566_IG03 can only be reached by falling through from G_M32566_IG02, and it repeats the same load that G_M32566_IG02 already did.

This sequence can be:

G_M32566_IG02:
        B4000120          cbz     x0, G_M32566_IG04
        B9800842          ldrsw   x2, [x0,#8]
        340000C2          cbz     w2, G_M32566_IG04

G_M32566_IG03:
        6B03005F          cmp     wzr, w2
        54000202          bhs     G_M32566_IG08

This also exposes an oddity: since G_M32566_IG03 can only be reached when w2 is non-zero, the check in G_M32566_IG03 seems superfluous.

The failure actions in G_M32566_IG04 and G_M32566_IG08 differ, though: the former seems to just skip the range check and go on to the initialization check of b_ptr, whereas the latter hits RNGCHKFAIL.

So does it actually need the first check?
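For reference, the C# language specification says that a `fixed` initializer on an array yields a null pointer when the array is null or has zero elements, and otherwise the address of element 0. The sketch below spells out the three checks in the order they appear in the disassembly. It is only my reading of the code: `FixedChecksSketch` and `FirstPinnedIndex` are hypothetical names, and it assumes the `ldrsw` from offset 8 is the array length.

```csharp
using System;

public static class FixedChecksSketch
{
    // Returns the index of the element whose address `fixed` would take,
    // or -1 in the cases where `fixed` would produce a null pointer.
    public static int FirstPinnedIndex(int[] a)
    {
        if (a == null) return -1;       // cbz x0, G_M32566_IG04

        int length = a.Length;          // ldrsw x2, [x0,#8] (assumed to be the length)
        if (length == 0) return -1;     // cbz w2, G_M32566_IG04

        // Range check for a[0]: cmp / bhs G_M32566_IG08.
        // length != 0 was established just above, so this can never fire.
        if (0u >= (uint)length) throw new IndexOutOfRangeException();

        return 0;
    }
}
```

Read this way, the null and zero-length tests are the ones the language semantics require, and the range check for index 0 is the one made redundant by them.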

This seems to be the pattern for pinning any array for use with Vector Create, which gives a lot of overhead even before you load the vector.

/CC @CarolEidt @tannergooding

category:cq
theme:pinning
skill-level:intermediate
cost:large

Labels: JitUntriaged, arch-arm64, area-CodeGen-coreclr