Handle processes whose main thread has exited #376

christos68k · 2025-02-27T22:02:00Z

Summary

This PR implements both steps described in #365 (comment).

Thanks to @korniltsev for suggesting disassociate_ctty. I ended up using a tracepoint, sched_process_free, instead, as it makes fewer assumptions and is more stable (see this comment for more context). It also simplifies the cleanup logic: there is no need for the extra periodic cleanups I had in the first prototype, because userspace gets a final PID notification when the process is freed by the kernel.

Essentially, when the main thread exits we no longer unload process information, so profiling of the remaining threads can continue. Processmanager can also track mapping changes triggered by one of the remaining threads.

I added some debug warning statements to ease review; I will remove the commit that introduced them before merging. I also added a C program that you can compile and run as a test workload while the profiling agent is running; it should exercise all the corner cases that this PR addresses. The warning logs I added, together with the generated flamegraph in devfiler, should make the timeline of processmanager operations very clear.

It's probably easier to review this commit-by-commit.

TODO:

DONE ~~Add test program~~
More testing

processmanager/processinfo.go

korniltsev · 2025-03-04T12:45:14Z

Thanks for looking into this.

This looks OK overall and should solve the issue from the user perspective.

One of the downsides I see is that while we do not unload the old mappings, we're also not loading new mappings, which may degrade profiling of such processes (I am still not sure whether there are legitimate applications with a dead main thread, or whether it is a highly infrequent corner case).

I personally would prefer if the processmanager "re-elected" a main thread by looking into the process threads, although I realize it may require more work and we may do this later.

Another thing to consider is to hook a kprobe on disassociate_ctty, which is called when the process group is dead:
https://github.com/torvalds/linux/blame/master/kernel/exit.c#L935-L936. This may help avoid a separate timer for this case.

It would be nice to have a unit test for this case regardless of the solution we choose.

christos68k · 2025-03-04T14:35:43Z

> One of the downsides I see is that while we do not unload the old mappings, we're also not loading new mappings, which may degrade profiling of such processes (I am still not sure whether there are legitimate applications with a dead main thread, or whether it is a highly infrequent corner case).
> I personally would prefer if the processmanager "re-elected" a main thread by looking into the process threads, although I realize it may require more work and we may do this later.

I'm currently working on this, will push new commits (implementing part 2 of the proposed solution in #365) today.

> Another thing to consider is to hook a kprobe on disassociate_ctty, which is called when the process group is dead: https://github.com/torvalds/linux/blame/master/kernel/exit.c#L935-L936. This may help avoid a separate timer for this case.

I think we can switch to sched_process_free tracepoint (instead of sched_process_exit) which should be more performant than a kprobe. I'll verify.

EDIT: sched_process_free fires for every kernel task so it's not suitable if we want to avoid notifying userspace of every thread exit. On the other hand, disassociate_ctty seemingly (no in-depth investigation done on my part) does what we want and also seemingly executes after task has been removed from /proc which eliminates a possible race in userspace that would otherwise be a (probably unlikely) concern.

EDIT2: Went back to sched_process_free which we can make work by checking whether PID is something we track or not.

support/ebpf/process_monitor.ebpf.c

christos68k · 2025-03-20T18:12:18Z

process/process.go

-			} else if path != "" {
-				// Ignore [vsyscall] and similar executable kernel
-				// pages we don't care about
+			} else {


No semantic change, I just inlined the logic from GetMappings here as this is the more appropriate place.

christos68k · 2025-03-20T18:19:07Z

processmanager/processinfo.go

@@ -538,7 +538,7 @@ func (pm *ProcessManager) synchronizeMappings(pr process.Process,
 // fast enough and this particular pid is reused again by the system.
 func (pm *ProcessManager) processPIDExit(pid libpf.PID) {
 	exitKTime := times.GetKTime()
-	log.Debugf("- PID: %v", pid)
+	log.Warnf("- PID: %v", pid)


I'll remove these newly added warnings before merging. They should help with reviewing the PR, as you don't need to run the agent with debug logs enabled and sort through a lot of irrelevant noise.

christos68k · 2025-03-20T18:20:14Z

processmanager/processinfo.go

@@ -626,22 +633,7 @@ func (pm *ProcessManager) SynchronizeProcess(pr process.Process) {
 			// return ESRCH. Handle it as if the process did not exist.
 			pm.mappingStats.errProcESRCH.Add(1)
 		}
-		return
-	}
-	if len(mappings) == 0 {


These comments are no longer relevant.

christos68k · 2025-03-20T18:54:15Z

I added some more information and notes on how to review/test to the description.

@korniltsev please take another look and review/test.

korniltsev · 2025-03-21T15:38:19Z

Great job. Thank you for looking into this.
I like the trick with sched_process_free, that we have no extra timers in userspace, and that the PM logic did not become more complicated.
I've run both my repro and your repro with libcrypto and the profiler works as expected. It keeps profiling the remaining threads, including new libraries (libcrypto).
I wish we could somehow create an integration test for it from the repro you've added, so that it runs with every test run instead of relying on me remembering to run it. But I understand writing a test may be hard and time-consuming, so we may do this later.
LGTM

support/tests/main_thread_exit.c

support/ebpf/sched_monitor.ebpf.c

tracer/tracer.go

christos68k · 2025-03-21T20:58:52Z

process/process.go

@@ -53,9 +58,10 @@ func init() {
 }

 // New returns an object with Process interface accessing it
-func New(pid libpf.PID) Process {
+func New(pid, tid libpf.PID) Process {


I didn't switch Process to accept libpf.PIDTID as the latter is only used with PID events, and I'd rather not couple it here too.
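
For context, a combined PID/TID value can be carried in a single 64-bit integer. The sketch below assumes one possible layout (PID in the upper 32 bits, TID in the lower 32, mirroring what the eBPF helper `bpf_get_current_pid_tgid` returns). The type name `pidTID` and its methods are hypothetical, and this is not necessarily how `libpf.PIDTID` is defined.

```go
package main

import "fmt"

// pidTID is a hypothetical packed representation: PID in the upper
// 32 bits, TID in the lower 32 bits. This layout is an assumption.
type pidTID uint64

func newPIDTID(pid, tid uint32) pidTID {
	return pidTID(uint64(pid)<<32 | uint64(tid))
}

// PID returns the process (thread group) ID.
func (v pidTID) PID() uint32 { return uint32(v >> 32) }

// TID returns the thread ID.
func (v pidTID) TID() uint32 { return uint32(v) }

// mainThread reports whether the event came from the main thread,
// whose TID equals the PID.
func (v pidTID) mainThread() bool { return v.PID() == v.TID() }

func main() {
	v := newPIDTID(1000, 1003)
	fmt.Println(v.PID(), v.TID(), v.mainThread())
}
```

Keeping such a type confined to the event path, and passing plain `pid, tid` values to `New`, matches the decoupling described above.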

florianl

Please remove the log.Warn(..) messages as mentioned in #376 (comment) before merging.

fabled

A quick review with a few questions. But looks good. Will look in more detail tomorrow.

fabled · 2025-03-30T08:56:52Z

process/process.go

+		// Neither /proc/sp.pid/map_files nor /proc/sp.pid/task/sp.tid/map_files
+		// exist if main thread has exited, so we use the mapping path directly.
+		return m.Path


Shouldn't this then be /proc/PID/root/FILE to make sure the file is opened from the right namespace?

/proc/PID/root doesn't exist after main thread exits, I switched the root to /proc/sp.pid/task/sp.tid/root instead.
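
A minimal sketch of resolving a mapping path through the per-thread root, assuming a hypothetical helper named `rootedMappingPath` (the actual function in the PR is not shown in this thread). It uses `path.Join`, in line with the later commit that prefers it over `filepath.Join`.

```go
package main

import (
	"fmt"
	"path"
	"strconv"
)

// rootedMappingPath is a hypothetical helper that prefixes a mapping
// path with /proc/PID/task/TID/root so the file is opened from the
// mount namespace of the target process. The per-task root is used
// because /proc/PID/root is gone once the main thread has exited.
func rootedMappingPath(pid, tid int, mappingPath string) string {
	return path.Join("/proc", strconv.Itoa(pid), "task",
		strconv.Itoa(tid), "root", mappingPath)
}

func main() {
	fmt.Println(rootedMappingPath(1000, 1003, "/usr/lib/libcrypto.so.3"))
}
```

`path.Join` cleans the result, so the leading slash of the mapping path does not produce a double slash.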

fabled · 2025-03-30T08:58:36Z

process/process.go

@@ -229,20 +234,45 @@ func (sp *systemProcess) GetMappings() ([]Mapping, uint32, error) {
 	defer mapsFile.Close()


Could this original path of checking the primary maps file be made conditional on if sp.mainThreadExit == false? (Or if this object is not kept between calls, perhaps this could be an argument?)

Hmm yes, I can add the check/skip-pid-maps logic but I'll also need to add the zombie check to ensure that we're only going to be reading tid-specific maps if it's absolutely the case that main thread has exited.

I added the zombie check for process exit but adding a conditional check for sp.mainThreadExit == false in GetMappings doesn't currently serve any purpose: GetMappings is only called once for any given systemProcess instance, so the check will never fail (caching mainThreadExit on the instance does serve a purpose however, as there are subsequent method calls after GetMappings that will leverage it).

So we can either have processmanager cache the mainThreadExit status for a particular PID and add this check (and then we'd need extra logic inside GetMappings to determine when the process has finally exited, currently the primary maps file path serves this purpose) or keep the logic as is now.

christos68k · 2025-04-16T20:52:37Z

I rebased this PR on top of current main.

fabled

Thanks! Looks pretty good already. I added a few questions and comments. But I'll pre-approve this already so we can go forward.

fabled · 2025-04-21T08:31:34Z

process/process.go

+		// Test for main thread exit by checking for Zombie state
+		pidStat, err := os.ReadFile(fmt.Sprintf("/proc/%d/stat", sp.pid))
+		if err != nil {
+			// Should never happen while process is alive
+			return nil, 0, err
+		}
+
+		var p int
+		var c string
+		var state rune
+		n, err := fmt.Sscanf(string(pidStat), "%d %s %c", &p, &c, &state)
+		if err != nil || n < 3 {
+			// Should never happen
+			return nil, 0, err
 		}
-		sp.fileToMapping = fileToMapping
+		if state != 'Z' {
+			return mappings, numParseErrors, ErrNoMappings
+		}
+
+		log.Warnf("PID: %v main thread exit", sp.pid)


Is this really needed? I think we can just remove the zombie check.

If the maps file exists, it means process is running.

If the maps file is empty, it means the main thread has exited. There is no other condition under which the main maps file is empty, because the main thread cannot be executing code if there are no mappings available. This is purely a side effect of the kernel having released the main-thread-specific resources.

Based on these two observations we can determine whether the process has exited (since eBPF sent the event) or whether only the main thread has exited.

Or are you aware of some condition where this makes a difference? I think it was the case where all mapping entries resulted in a parsing error? But I believe this should be handled as an error earlier. The reason is that reading the TID-specific maps should be identical to reading the PID-specific maps, as memory mappings are shared between all threads.

Perhaps the only check to do here is if pid == tid then return early with ErrNoMappings.

Removed the zombie check (we'd need it if we go back to walking /proc but we don't need it now).
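
Should a zombie check be needed again (for example if the agent goes back to walking /proc), the state field can be extracted without `fmt.Sscanf`. The sketch below is not code from the PR; `parseStatState` is a hypothetical helper. It relies on the documented `/proc/PID/stat` layout, where the command name is wrapped in parentheses and may contain spaces, which a `%s` verb would split on.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parseStatState extracts the single-character state field from the
// contents of /proc/PID/stat.
func parseStatState(stat string) (byte, error) {
	// comm is wrapped in parentheses and may itself contain spaces or
	// ')', so locate the last ')' rather than splitting on whitespace.
	end := strings.LastIndexByte(stat, ')')
	if end < 0 {
		return 0, errors.New("malformed stat line: no closing parenthesis")
	}
	fields := strings.Fields(stat[end+1:])
	if len(fields) == 0 || len(fields[0]) != 1 {
		return 0, errors.New("malformed stat line: missing state field")
	}
	return fields[0][0], nil
}

func main() {
	state, err := parseStatState("1000 (my worker) Z 1 1000 1000 0 -1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("zombie:", state == 'Z')
}
```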

fabled · 2025-04-21T08:40:02Z

process/process.go

-				continue
-			}
-			fileToMapping[m.Path] = m
+	if err != nil {


I think here we should also return early if err is nil and numParseErrors is non-zero. Or perhaps even better, parseMappings could return an err if it failed to find usable mappings (but it managed to read data). The idea is basically to distinguish here if mappings is empty or all lines were non-parseable.

Doesn't the len(mappings) == 0 check that follows cover this case?

If err == nil and len(mappings) != 0 then we simply continue and process the mappings. If err == nil and len(mappings) == 0 then we continue and try mappings from another thread. Essentially all branching logic depends on err and len(mappings), not numParseErrors which is purely advisory.

fabled · 2025-04-21T08:42:33Z

process/process.go

+		numParseErrorsAlt := uint32(0)
+		mappings, numParseErrorsAlt, err = parseMappings(mapsFileAlt)
+		numParseErrors += numParseErrorsAlt


I'd just overwrite numParseErrors instead of adding to it. It is only ever used for counters, and since the per-TID and per-PID maps should be identical, you are basically reporting a doubled error counter in this case.

christos68k requested review from a team as code owners February 27, 2025 22:02

christos68k mentioned this pull request Feb 27, 2025

Profiler incorrectly handles process exit when non-main threads are still running #365

Closed

christos68k marked this pull request as draft February 27, 2025 22:07

fabled reviewed Feb 28, 2025

christos68k force-pushed the ck/process-exit branch 3 times, most recently from 093a15f to 38f6e51 Compare March 5, 2025 04:18

christos68k mentioned this pull request Mar 5, 2025

processmanager: Don't synchronize a process that's waiting cleanup #379

Merged

fabled reviewed Mar 5, 2025

christos68k force-pushed the ck/process-exit branch 3 times, most recently from e11a0dc to 87e351e Compare March 20, 2025 18:46

christos68k marked this pull request as ready for review March 20, 2025 18:53

christos68k requested a review from fabled March 20, 2025 18:55

christos68k self-assigned this Mar 20, 2025

christos68k linked an issue Mar 20, 2025 that may be closed by this pull request

Profiler incorrectly handles process exit when non-main threads are still running #365

Closed

christos68k force-pushed the ck/process-exit branch from 87e351e to 48698d5 Compare March 21, 2025 00:46

korniltsev reviewed Mar 21, 2025

christos68k force-pushed the ck/process-exit branch from 94a86eb to 5905d6a Compare March 21, 2025 16:01

florianl reviewed Mar 21, 2025

florianl reviewed Mar 21, 2025

florianl approved these changes Mar 27, 2025

fabled reviewed Mar 30, 2025

christos68k force-pushed the ck/process-exit branch 2 times, most recently from c01c8b0 to 9f268e1 Compare March 31, 2025 19:26

christos68k force-pushed the ck/process-exit branch from 4f58051 to 7f8bea8 Compare April 16, 2025 20:45

fabled approved these changes Apr 21, 2025

christos68k added 15 commits April 25, 2025 16:40

Hook sched_process_free instead of sched_process_exit

a03325e

sched_process_free is called when the task is freed by the kernel, which allows for simpler cleanup of processes whose main thread has exited.

ebpf: report_pid now sends both pid and tid to userspace

6ebd504

Making TID available to processmanager allows the agent to keep profiling a process whose main thread calls pthread_exit while other threads continue to run.

Handle processes whose main thread has called pthread_exit

218cdd4

This allows the agent to continue profiling a process whose main thread has exited, but other threads continue to run. Mapping changes triggered by one of the remaining threads are also tracked.

processmanager: Better handling of missing mappings

9cb832a

support: Add main thread exit C test

75697c8

Minor doc update

8057b43

[Remove] Add debug warn logging to help with review

ba8e383

Add copyright

1efeee6

sched_monitor: Add struct layout comment

feba978

Add libpf.PIDTID

408cab3

process: Resolve mapping path through pid/tid root

fefdc8f

process: Add zombie check for main thread exit

c2ea120

process: Use path.Join instead of filepath.Join

195f78b

The latter is OS-agnostic, but the agent only runs on Linux.

Remove zombie check

f9a7e22

Update eBPF artifacts

ce6940a

christos68k force-pushed the ck/process-exit branch from 7eb3d80 to ce6940a Compare April 25, 2025 20:46

Remove warning logs

34f5656

christos68k merged commit 81ef3fd into main Apr 25, 2025
25 checks passed

christos68k deleted the ck/process-exit branch April 25, 2025 21:16

gnurizen mentioned this pull request May 19, 2025

go labels dd parca-dev/opentelemetry-ebpf-profiler#70

Closed
