-
Notifications
You must be signed in to change notification settings - Fork 5k
Move FLS init before EventPipeAdapter::FinishInitialize and the first SetupThread() #115546
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @mangod9 |
If I deliberately delay finalizer thread by adding If I run With this PR and with the deliberate delay like above, if I run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR reorders initialization routines in the EEStartupHelper to ensure that the FLS slot is set up by the finalizer thread before any threads are created for OS destruction. Key changes include:
- Removal of the SyncBlockCache::Attach() function.
- Reordering of profiling services initialization, exception handling, and thread setup.
- Moving the call to FinalizerThread::WaitForFinalizerThreadStart() earlier in the startup sequence under a TARGET_WINDOWS guard.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/coreclr/vm/syncblk.cpp | Removed the unused SyncBlockCache::Attach() implementation to streamline sync block cache initialization. |
src/coreclr/vm/ceemain.cpp | Reordered initialization calls: moved finalizer thread waiting earlier and adjusted profiling and thread setup. |
Comments suppressed due to low confidence (3)
src/coreclr/vm/syncblk.cpp:591
- [nitpick] Following the removal of the SyncBlockCache::Attach() implementation, consider cleaning up its declaration in the header files if it is no longer used anywhere else.
void SyncBlockCache::Attach()
src/coreclr/vm/ceemain.cpp:893
- The call to FinalizerThread::WaitForFinalizerThreadStart() is now guarded by TARGET_WINDOWS. Please confirm that non-Windows builds do not require this waiting, as it was previously executed unconditionally.
FinalizerThread::WaitForFinalizerThreadStart();
src/coreclr/vm/ceemain.cpp:872
- [nitpick] The profiling services initialization block has been moved after the exception filter setup; please confirm that this reordering does not affect profiling behavior or the overall startup sequencing.
hr = ProfilingAPIUtility::InitializeProfiling();
I think this is ready to be reviewed. |
} | ||
|
||
#ifdef TARGET_WINDOWS | ||
// g_pGCHeap->Initialize() above could take nontrivial time, so by now the finalizer thread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we end up creating the finalizer thread too soon on non-Windows?
For example, can the finalizer Thread
be registered in thread store before ProfilingAPIUtility::InitializeProfiling
and miss profiler notifications that are done as part of the registration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If InitializeProfiling
does not need Thread
object for the current thread, it can be moved earlier.
The comments inside InitializeProfiling
say that there are asserts somewhere that finalizer thread has started. So I wonder - are there tests for profiler attach? (or maybe I misunderstood the comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comments inside InitializeProfiling say that there are asserts somewhere that finalizer thread has started.
These comments refer to the old profiler attach implementation. I am deleting them in #115680
So I wonder - are there tests for profiler attach?
Yes, look for "Attach" under https://github.com/dotnet/runtime/tree/main/src/tests/profiler. (The tests won't hit every subtle race condition though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the profiler initializtion earlier. If it needs to attach a Thread, then it would hit an assert in OsAttachThread
. There is an assert there now to catch these cases deterministically.
I was mostly wondering if we test scenarios beyond "profiler is not involved at all".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With default test set there are no failures, so based on that we could say profiler initializtion does not create/attach Thread objects - directly or indirectly, then it is all good.
If the profiler tests you've pointed to run in default test set, it may be good enough. If we need to run some other leg - like outer loop, perhaps makes sense to do that. Maybe makes sense to do that regardless.
Co-authored-by: Jan Kotas <[email protected]>
src/coreclr/vm/ceemain.cpp
Outdated
@@ -839,33 +839,13 @@ void EEStartupHelper() | |||
|
|||
#ifdef PROFILING_SUPPORTED | |||
// Initialize the profiling services. | |||
// THis must happen before the finalizer thread is stopped on its first wait. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think this describes the required invariant accurately.
My concern is about making sure that the profilers receive these ThreadCreated notification for the finalize or any other thread reliably. I think it means that ProfilingAPIUtility::InitializeProfiling
is called before SetupThread
is on any thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am wondering whether it would make sense to start the profiler thread in a different place on Windows vs. non-Windows to make both cases about as optimal as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Start it earlier on Windows since we know we will need to wait for it and it won't run away to do other things.
Start it later on Windows once there is no problem with the finalizer thread running away to do other things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it means that ProfilingAPIUtility::InitializeProfiling is called before SetupThread is on any thread.
I think the place where I moved InitializeProfiling
would guarantee that any thread will do its SetupThread
later than InitializeProfiling
.
The place where InitializeProfiling
is now does not allow SetupThread
, at least on Windows - an assert will catch that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My main concern is that InitializeProfiling
itself would not try to set up a thread, directly or indirectly. It looks like it does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we may need to introduce a wait like that on the finalizer thread to fix the debugger initialization problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the debugger needs that the current thread is initialized and attached. We can, indeed, miss ThreadCreated
on non-windows.
Maybe wait on all platforms in the same way. Creating a thread is fast next to GC initializing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe wait on all platforms in the same way
I do not think we want to penalize other platforms with the extra context switches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think next to what GC does (reserves tons of VM, creates a few threads,..), it may be ok.
But maybe one day GC will do all that in pay-for-play fashion. Fundamentally, it does not have to do it upfront.
The alternative is to start finalizer thread later on non-windows - as you've suggested. After the debugger is fully initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, we may need to introduce a wait like that on the finalizer thread to fix the debugger initialization problem.
Yes, to be sure that finalizer thread on Windows waits for debugger init we need to make it wait for EE thread to tell it when it can proceed to HasStarted()
I've added that in the last change. Also in non-Windows case simply moved FinalizerThreadCreate()
to a place after g_pDebugInterface->StartupPhase2
// Notify debugger once the first thread is created to finish initialization. | ||
if (g_pDebugInterface != NULL) | ||
{ | ||
g_pDebugInterface->StartupPhase2(GetThread()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HasStarted
on the finalizer thread will send ThreadCreated notifications to the debugger. On non-Windows, are there going to be issues if this debugger notification on the finalizer thread is sent to debugger before this StartupPhase2 finishes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SetupThread
also calls ThreadCreated
and the EE initializing thread calls ThreadCreated
before StartupPhase2
is called.
Both SetupThread
and HasStarted
will check for if (CORDebuggerAttached())
before sending the notification though.
The functional part of ThreadCreated
(apart from notifications) is to check if debugger wants threads to stop, which cannot happen if debugger is not yet officially attached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there could be a race where finalizer thread checks if debugger is attached (sees that it is not) and then debugger asks threads to stop.
There is no such race for the thread that initializes the EE.
Co-authored-by: Jan Kotas <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks!
Co-authored-by: Jan Kotas <[email protected]>
@jkotas Re: backporting. I think we have not seen evidence that the bug is harmful in release builds. Possibly it just leads to leaks of Thread objects for the event pipe thread or for the thread that initializes EE (if it quits before the rest of the app). There is a possibility of effects that are worse, but not yet seen or attributed to this issue. It feels like there is no urgency in backporting and there is some risk that the fix may not be complete or brings up some other issue. I would suggest to at least wait couple weeks before considering backporting. |
Ok, sounds good. |
Thanks! |
Fixes: #114506
In the previous fix we tried to call
FinalizerThread::WaitForFinalizerThreadStart();
as late as possible inEEStartupHelper()
- to make sure finalizer has enough time to do its part in initializing COM and FLS slot.But if calls that happen before
FinalizerThread::WaitForFinalizerThreadStart();
can result inThread
objects created and registered for OS destruction, then those threads will seeg_flsIndex != FLS_OUT_OF_INDEXES
, that includes the call toSetupThread()
, whichEEStartupHelper()
itself does.The change moves
FinalizerThread::WaitForFinalizerThreadStart();
earlier, before anyThreads
could be created and registered for OS destruction.We still have GC heap initialization happening before that, which will typically result in finalizer thread completing its part before the EE init thread needs to check for the completion.