Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use FLS detach callback as a thread termination notification. Another try. #112809

Open
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Feb 22, 2025

Fixes: #110350

Switches the mechanism for OS notification about thread termination to use FLS (Fiber Local Storage) - the same mechanism as used on NativeAOT.

The advantage of FLS notification is that it does not run while holding the Loader Lock, thus it does not require that the termination callback never calls or waits for an OS call that may need to acquire that same Loader Lock. (It is hard for us to guarantee the latter)

This PR is similar to #110589 with additional mitigation for scenario where COM thread cleanups reinitialize the Thread object after we had our last chance to clean it up.
The mitigation is basically ensuring that COM is initialized and sets up its FLS slot before we set up ours.

@Copilot Copilot bot review requested due to automatic review settings February 22, 2025 01:55

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

@VSadov
Copy link
Member Author

VSadov commented Feb 22, 2025

@jkotas @mangod9 This is basically the same PR as before + mitigation for the COM reinitializing/attaching Thread objects or even running managed code after we had our last chance to detach Threads.

The idea is to call CoInitialize before setting up our FLS slot, which would make COM to set up its FLS slot, if not already initialized.

We call CoInitialize in most cases anyways, and fairly early, for example once the first method starts JIT-ing we launch the tiering helper, which is explicitly an MTA thread. Most user-launched threads are MTA as well. Alas, all that would be after we have our FLS slot set.

Why on the finalizer thread and not on the thread that initializes EE?
My concern is that we do not know the context of the thread that calls into EE initialization. In theory CoInitialize may be not allowed on some threads and initialize nothing. We control the state of the Finalizer thread though.

@VSadov
Copy link
Member Author

VSadov commented Feb 22, 2025

I have run quite a bit of winform tests with this + there is now a failfast in case we see an orphaned Thread in DLL_THREAD_DETACH, which is more deterministic than crashing on GC holes.

@jkotas
Copy link
Member

jkotas commented Feb 22, 2025

System.Threading.Tests.EtwTests.WaitHandleWaitEventTest failure looks related to this change.

Comment on lines 1416 to 1421
if (!IsFinalizerThread())
{
FinalizerThread::GetFinalizerThread()->SetRequiresCoInitialize();
// set the finalizer event
FinalizerThread::EnableFinalization();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this trying to do? Is it still needed now that we wait for the finalizer thread to CoInitialize COM during startup?

Copy link
Member Author

@VSadov VSadov Feb 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think all the stuff that tries to CoInitialize the current and Finalizer threads lazily can be removed if this change goes in.
I think quite a bit of cleanup might be possible even without this change.
Compared to NativeAOT there is a lot of extra complexity in Thread init/deinit. Some of that may still be needed (i.e. debugger/COM/syncblocks/threadstatics related), but some of that seems to be leftovers since we had hosting/fibers/appdomains and the like.

I did not want to go too far beyond what is needed for the fix - jut to minimize the diff, in case we backport this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that there is a lot of cruft that can be cleaned up that is better left for a different PR.

I am not able to reason about whether this if (!IsFinalizerThread()) { ... } block is just a useless code or whether it can be actively harmful in some corner cases.

For this PR, can we delete the if (!IsFinalizerThread()) block, fCoInitCurrentThread argument for EnsureComStarted and the one call to EnsureComStarted(FALSE)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added g_pFinalizerThread->SetApartmentOfUnstartedThread(Thread::AS_InMTA) to match manual call into CoInitialize, but that results in "lazy coinitialize" pattern forcing finalizer to set itself to MTA in response to setting finalizer thread to MTA.

I think I can remove both this change and g_pFinalizerThread->SetApartmentOfUnstartedThread(Thread::AS_InMTA) to raise fewer questions.
It is ok to use native API to set apartment state before Thread is attached. The rest is unnecessary and perhaps confusing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about performance overhead of this change. We are going to do extra thread switch during startup critical path. It is unavoidable.

However, EnsureComStarted that is going to be executed a bit later is going to spin the finalizer thread again by setting FinalizerThread::EnableFinalization.

Can we reduce the startup overhead by setting g_fComStarted on finalizer thread once we call CoInitialize and reduce EnsureComStarted to just:

    _ASSERTE(g_fComStarted); // COM is expected to be started on finalizer thread during startup

?

Copy link
Member Author

@VSadov VSadov Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am thinking about performance overhead of this change. We are going to do extra thread switch during startup critical path. It is unavoidable.

I've moved launching of finalizer thread a bit earlier. EE Initializing thread will do some other work before it needs to check if FLS/COM initialization is finished. Hopefully it will be at least partially done, so we'd not need to wait much, if at all.

Can we reduce the startup overhead by setting g_fComStarted on finalizer thread

Good point. Since we will be initializing anyways, why not get some benefit from it.

@jkotas jkotas requested a review from kouvel February 22, 2025 12:52
@VSadov
Copy link
Member Author

VSadov commented Feb 22, 2025

System.Threading.Tests.EtwTests.WaitHandleWaitEventTest failure looks related to this change.

That is on alpine-3.21-helix-arm32v7. I did not realize we run ceemain/DLL_THREAD_DETACH on linux as well, nor I am certain if the same invariant applies in such case.

It may be worth to investigate and add similar guards on other platforms (unix, NativeAOT), if possible and appropriate, but it should not be in this PR.

I will scope the failfast to Windows only.

_ASSERTE_ALL_BUILDS(!"Detaching a thread from the wrong fiber");
}

flsState = FLS_STATE_CLEAR;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this right? I do not think that we can handle re-initialization of the Thread after this point gracefully.

Copy link
Member Author

@VSadov VSadov Feb 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This happens in winforms. We currently rely on re-initialization in COM cleanup scenarios -

A thread returns from its entry point and we call DestroyThread, which comes here and detaches/destroys the Thread.

Then OS goes through FLS callbacks and COM callback may need to run something that needs Thread, so we attach a new instance of Thread (because the former is gone by then), and the new Thread gets later deinitialized in our FLS callback.
As long as our deinitialization happens last, things work ok. Seems like it has been like this for a long time.

Ideally, we should not call DestroyThread explicitly when thread exits, just rely on callback - so we would never need to reinitialize. Naively, making such change breaks other things, so probably not in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I agree that we want to fix this one in separate PR. The reinitialization sounds like a great source of mysterious corner-case bugs.

Co-authored-by: Jan Kotas <[email protected]>
@VSadov
Copy link
Member Author

VSadov commented Feb 25, 2025

I think the new failures in SslStream_ServerUntrustedCaWithCustomTrust_OK are unrelated.
(The remote certificate is invalid because of errors in the certificate chain: PartialChain)

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM otherwise. Thank you!

#define FLS_STATE_ARMED 1
#define FLS_STATE_INVOKED 2

static __declspec(thread) byte flsState;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
static __declspec(thread) byte flsState;
static __declspec(thread) byte t_flsState;

Naming nit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Process not responsive dump indicates garbage collection
2 participants