Improve startup perf by avoiding JIT when invoking well-known signatures #115345
Tagging subscribers to this area: @dotnet/area-system-reflection
src/coreclr/System.Private.CoreLib/src/System/Reflection/MethodInvokerCommon.CoreCLR.cs
@@ -15,152 +14,188 @@ internal static unsafe class InstanceCalliHelper
// Zero parameter methods such as property getters:

[Intrinsic]
[MethodImpl(MethodImplOptions.NoInlining)]
Why do these need to be marked with NoInlining?
Without it, there is an AV during inlining.
So either the "dynamic method" that we created based on the intrinsic doesn't support inlining or there is a jit bug.
The exception is in the call to GetModule() here with call stack:
Assert failure(PID 60300 [0x0000eb8c], Thread: 40924 [0x9fdc]): Consistency check failed: AV in clr at this callstack:
------
CORECLR! CEEInfo::getArgType + 0x56E (0x00007ffe`20b74cfe)
CLRJIT! Compiler::impInlineInitVars + 0x322 (0x00007ffe`36f8f0e2)
CLRJIT! `Compiler::fgInvokeInlineeCompiler'::`2'::<lambda_1>::operator() + 0x24 (0x00007ffe`36edf994)
CORECLR! `CEEInfo::runWithErrorTrap'::`6'::__Body::Run + 0x86 (0x00007ffe`20b6c2c6)
CORECLR! CEEInfo::runWithErrorTrap + 0x2D (0x00007ffe`20ba27ed)
CLRJIT! Compiler::fgInvokeInlineeCompiler + 0x34C (0x00007ffe`36ee342c)
CLRJIT! Compiler::fgMorphCallInlineHelper + 0x2E3 (0x00007ffe`36ee3ae3)
CLRJIT! Compiler::fgMorphCallInline + 0x54 (0x00007ffe`36ee3664)
CLRJIT! Compiler::fgInline + 0x1C3 (0x00007ffe`36ee10d3)
CLRJIT! Phase::Run + 0x76 (0x00007ffe`3708da86)
CLRJIT! Compiler::compCompile + 0x57F (0x00007ffe`36e97cef)
CLRJIT! Compiler::compCompileHelper + 0xA03 (0x00007ffe`36e9bb93)
CLRJIT! Compiler::compCompile + 0xA18 (0x00007ffe`36e9a978)
CLRJIT! `jitNativeCode'::`8'::__Body::Run + 0xE2 (0x00007ffe`36e96762)
CLRJIT! jitNativeCode + 0x16E (0x00007ffe`36ea0f8e)
CLRJIT! CILJit::compileMethod + 0x169 (0x00007ffe`36ea8759)
CORECLR! invokeCompileMethodHelper + 0x151 (0x00007ffe`20b9cc01)
set XUNIT_HIDE_PASSING_OUTPUT_DIAGNOSTICS=1
CORECLR! invokeCompileMethod + 0x128 (0x00007ffe`20b9c9a8)
CORECLR! UnsafeJitFunctionWorker + 0x2EF (0x00007ffe`20b6dccf)
Yes, this looks like a bug in the intrinsic implementation. It would be a good idea to understand it (and fix it). It may have other silent manifestations.
I'll take a look and report back.
I took a better look at this and created issue #115429 to track. I can continue to investigate but thought I would create the issue now to see if someone has ideas or wants to pick it up.
If addressed, then we can either remove the NoInlining or replace it with AggressiveInlining.
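For reference, a minimal sketch of the two attribute options being discussed. The method names are hypothetical and the bodies are ordinary C# rather than the intrinsic calli helpers from this PR; the sketch only shows how each hint is spelled, not the AV itself.

```csharp
using System;
using System.Runtime.CompilerServices;

// Current workaround: tell the JIT never to inline this method into its callers.
[MethodImpl(MethodImplOptions.NoInlining)]
static int GetLengthNoInline(string s) => s.Length;

// Possible follow-up once the underlying bug is fixed: ask the JIT to inline if it can.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static int GetLengthInline(string s) => s.Length;

// Both hints only affect code generation; the observable result is the same.
Console.WriteLine(GetLengthNoInline("abc") + GetLengthInline("abc")); // prints 6
```

Both values are hints on the method's implementation flags, so switching between them should not require changes at the call sites.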
I am worried that this bug is some kind of memory corruption, and that it is still going to repro, just less frequently, with the NoInlining workaround.
@steveharter I will try and take a look this week.
Fix is #115639
… to use interpreted mode on first call to each method
This is the main contribution for tracking issue #112994.
For an R2R-enabled console "hello world" app, it halved the number of jitted methods during startup (24 to 12), saving ~9% or 7ms (locally from 78ms to 71ms).
For a non-R2R app, each well-known signature needs to be jitted only once; later uses of that signature hit the cache. They are cacheable because the IL uses calli with no coupling to specific MemberInfo tokens. Note that it is possible to add calli and caching to the emit-based path (the non-well-known cases), as was done in the prior PR for this. That would allow similar perf benefits by avoiding jit (+emit) for signatures that have already been through it, but it also adds the complexity and overhead of a cache that needs to key on each parameter.
The well-known method list is currently fairly short but will still cover many signatures because:
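The caching idea can be sketched in plain C#. This is an illustration under stated assumptions, not the PR's code: `GetSignatureKey`, `GetInvoker`, `invokerCache` and `stubsBuilt` are hypothetical names, and the cached lambda simply forwards to `MethodInfo.Invoke` as a stand-in for the calli-based stub that the runtime would actually JIT once per signature.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical cache: one shared invoker per signature shape, not per method.
var invokerCache = new Dictionary<string, Func<MethodInfo, object, object[], object>>();
int stubsBuilt = 0;

// The key uses only the return type and parameter types. It deliberately ignores
// which method is being called, so methods with the same shape share one entry.
string GetSignatureKey(MethodInfo method)
{
    var parts = new List<string> { method.ReturnType.ToString() };
    foreach (ParameterInfo p in method.GetParameters())
        parts.Add(p.ParameterType.ToString());
    return string.Join("|", parts);
}

Func<MethodInfo, object, object[], object> GetInvoker(MethodInfo method)
{
    string key = GetSignatureKey(method);
    if (!invokerCache.TryGetValue(key, out var invoker))
    {
        stubsBuilt++; // stand-in for the one-time cost (the JIT of the stub)
        invoker = (m, target, args) => m.Invoke(target, args);
        invokerCache[key] = invoker;
    }
    return invoker;
}

// Two unrelated getters with the same shape: instance method, no parameters, returns int.
MethodInfo stringLength = typeof(string).GetProperty("Length")!.GetGetMethod()!;
MethodInfo listCount = typeof(List<int>).GetProperty("Count")!.GetGetMethod()!;

object a = GetInvoker(stringLength)(stringLength, "hello", Array.Empty<object>());
object b = GetInvoker(listCount)(listCount, new List<int> { 1, 2, 3 }, Array.Empty<object>());

Console.WriteLine($"{a}, {b}, stubs built: {stubsBuilt}"); // prints "5, 3, stubs built: 1"
```

Keying on the whole parameter list is also where the overhead mentioned above comes from: every lookup has to walk each parameter to build or compare the key.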
Design \ implementation notes:
InvokerEmitUtil.cs for a total of 5. This also pushed the code to add more shared methods in that file.
IntPtr.Zero for the other cases; this avoids having additional delegates and conditionals to call a different delegate.
ForceInterpretedInvoke is set. This code should be removed shortly. Previously, the interpreted path was used for the first call to each method, then the method was emit'd on the second call, but since we now have the intrinsics for startup\warmup, we should no longer have to do this. This also simplified the code since we no longer need to switch dynamically, and we moved from 3 fields to hold the 3 delegates to 1 field with appropriate casts to one of the 5 delegates.
_allocator field.
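The "1 field with appropriate casts" point can be illustrated with a small sketch. Everything here is hypothetical: the three kinds (`ZeroArgs`, `OneArg`, `ManyArgs`), the `InvokeThrough` helper and the `Func<...>` shapes stand in for the PR's 5 delegate types, which are not shown in this description.

```csharp
using System;

// Hypothetical tags recording which delegate shape the single slot holds.
const int ZeroArgs = 0;
const int OneArg = 1;
const int ManyArgs = 2;

// One Delegate-typed slot plus a tag replaces a separate field per delegate type.
// The tag picks the cast, so no second field and no null checks are needed.
object InvokeThrough(Delegate invokeFunc, int kind, object target, object[] args) => kind switch
{
    ZeroArgs => ((Func<object, object>)invokeFunc)(target),
    OneArg => ((Func<object, object, object>)invokeFunc)(target, args[0]),
    _ => ((Func<object, object[], object>)invokeFunc)(target, args),
};

Delegate getLength = (Func<object, object>)(s => ((string)s).Length);
Delegate concat = (Func<object, object, object>)((s, arg) => string.Concat((string)s, (string)arg));

Console.WriteLine(InvokeThrough(getLength, ZeroArgs, "hello", Array.Empty<object>())); // prints 5
Console.WriteLine(InvokeThrough(concat, OneArg, "a", new object[] { "b" }));           // prints ab
```

Because the kind is fixed when the invoker is created, the cast target never changes afterwards, which matches the note that the code no longer needs to switch dynamically.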