Description
At the moment, on Arm64 the JIT always establishes a frame pointer when allocating a stack frame. Given that, and based on the logic in Compiler::lvaFrameAddress(), the JIT always uses fp as the base register for addressing locals and temps. This can lead to suboptimal codegen when the JIT is required to use frameType=5 (i.e., when fp points to a location above locals) instead of frameType=3 (i.e., when fp points to a location below locals). The following snippet shows the suboptimal codegen produced when such a restriction on the fp value is imposed:
-; V285 tmp284 [V285 ] ( 2, 4 ) struct ( 8) [fp+0x1C50] do-not-enreg[XSB] addr-exposed "Inlining Arg"
+; V285 tmp284 [V285 ] ( 2, 4 ) struct ( 8) [fp-0x7E8] do-not-enreg[XSB] addr-exposed "Inlining Arg"
-; TEMP_01 byref -> [fp+0x10]
+; TEMP_01 byref -> [fp-0x2428]
- ldrh w11, [fp,#0xd1ffab1e] // [V285 tmp284]
+ movn xip1, #0xd1ffab1e
+ ldrh w11, [fp, xip1] // [V285 tmp284]
- str x14, [fp,#16] // [TEMP_01]
+ movn xip1, #0xd1ffab1e
+ str x14, [fp, xip1] // [TEMP_01]
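To illustrate why the negative offsets in the diff above need an extra movn, here is a minimal sketch of the Arm64 load/store immediate ranges. It is not the JIT's actual check; the function name fits_ldst_immediate and its parameters are hypothetical, and it only models the two single-instruction forms: the scaled unsigned 12-bit offset (LDR/STR) and the unscaled signed 9-bit offset (LDUR/STUR).

```python
def fits_ldst_immediate(offset: int, access_size: int) -> bool:
    """Return True if [base, #offset] fits in one Arm64 load/store.

    Hypothetical helper for illustration only, not the JIT's own logic.
    access_size is the size of the memory access in bytes (1, 2, 4 or 8).
    """
    # Scaled form (LDR/STR, unsigned offset): a 12-bit unsigned immediate
    # that is multiplied by the access size, so it must be non-negative
    # and a multiple of the access size.
    if offset >= 0 and offset % access_size == 0 and offset // access_size <= 0xFFF:
        return True
    # Unscaled form (LDUR/STUR): a 9-bit signed byte offset.
    return -256 <= offset <= 255


# Offsets taken from the local/temp comments in the diff above.
cases = [
    ("V285, fp below locals", 0x1C50, 2),    # ldrh, 2-byte access
    ("V285, fp above locals", -0x7E8, 2),
    ("TEMP_01, fp below locals", 0x10, 8),   # str x14, 8-byte access
    ("TEMP_01, fp above locals", -0x2428, 8),
]
for name, offset, size in cases:
    print(f"{name}: offset={offset:#x} single instruction={fits_ldst_immediate(offset, size)}")
```

Under these assumptions, both positive offsets fit the scaled form, while both negative offsets fall outside the signed 9-bit range, which matches the extra instruction that materializes the offset in xip1.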
I noticed this while implementing stack probe helper support on Arm64 in #43250, where the JIT is required to store the fp,lr register pair earlier during prolog execution to be able to call a helper.
@dotnet/jit-contrib
category:cq
theme:register-allocator