Profiling: Improve tracking of greenlets #1949

Open
@Daverball

Problem Statement

Currently the profiler samples whichever greenlet happens to be active at the time of the sample, so you can end up with very confusing flame charts that mix the work of several greenlets, some of which may have nothing to do with the current transaction.

This is far from ideal, but even if you only track the greenlet that was active at the start of the transaction, you end up with a profile that's less helpful than the equivalent threading profile.

Solution Brainstorm

There are a few steps that need to happen to better support tracking greenlets.

  1. The backend needs to become aware of fibers as a general concept, since greenlets are not the only fiber implementation out there, especially once you look outside Python. It needs to be able to track them separately from threads, which means adding a second identifier. That identifier should generally be unique within a thread, but might collide with the identifier of a fiber on a different thread, so the tuple of thread and fiber identifier should uniquely identify a fiber.
  2. The profiler needs to keep track of all greenlets, so that greenlet.gr_frame can be sampled for every greenlet that is currently alive and not just the active one. In gevent you can register a callback on the hub that is called whenever a new greenlet is spawned, so for gevent this will be fairly straightforward. I've created a small POC that works: Daverball@e7d7dcd
  3. Since fiber context switches are cooperative and have less overhead, they tend to happen a lot more frequently than context switches between threads. This poses a problem for a sampling profiler, since there will be a lot more jitter between the samples, rendering the collected data less useful. The creators of greenlet are well aware of this, which is why there is a way to install a trace function using greenlet.settrace, similar to sys.settrace, which gets called on every context switch and exception. Obviously it would be far too much overhead to collect a stack frame on every context switch (otherwise you wouldn't have made it a sampling profiler in the first place), but I think there's at least an opportunity to collect timings with a lightweight trace function. That would let you visualize context switches and render the flame chart in a more helpful way: you could gray out the time where the greenlet wasn't actually running, or offer a condensed view where the timeline skips over every context switch to show how long your greenlet was actually doing work.
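
To make the three steps a bit more concrete, here is a rough sketch of how they could fit together. It is not the POC linked above and not how the SDK does things today. `FiberTracker`, `track`, `on_switch`, `sample` and `install` are made-up names for illustration, and using `id()` as the fiber identifier is an assumption (it is only unique among objects that are alive at the same time). The only greenlet pieces it relies on are `greenlet.settrace`, which calls the trace function as `callback(event, args)` with `args == (origin, target)` for `"switch"` and `"throw"` events, and the `gr_frame` attribute, which is `None` unless the greenlet is suspended.

```python
import threading
import time
import weakref


class FiberTracker:
    """Hypothetical helper: per-fiber run time plus frames of suspended fibers."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        # (thread id, fiber id) -> fiber; weak, so dead fibers drop out on their own.
        self._fibers = weakref.WeakValueDictionary()
        # (thread id, fiber id) -> seconds the fiber actually spent running.
        self.running_time = {}
        # thread id -> (key of the fiber running on that thread, start timestamp)
        self._running_since = {}

    def track(self, fiber):
        """Register a fiber and return its (thread id, fiber id) key.

        Assumes this is called on the thread that owns the fiber, which is
        the case inside a greenlet trace function.
        """
        key = (threading.get_ident(), id(fiber))
        self._fibers[key] = fiber
        return key

    def on_switch(self, event, args):
        """Trace function: only records timestamps, never walks a stack."""
        if event not in ("switch", "throw"):
            return
        origin, target = args
        now = self._clock()
        thread_id = threading.get_ident()
        origin_key = self.track(origin)
        started = self._running_since.pop(thread_id, None)
        if started is not None and started[0] == origin_key:
            elapsed = now - started[1]
            self.running_time[origin_key] = (
                self.running_time.get(origin_key, 0.0) + elapsed
            )
        self._running_since[thread_id] = (self.track(target), now)

    def sample(self):
        """Return {key: frame} for every tracked fiber that is suspended.

        The fiber that is currently running has no gr_frame and would still
        be sampled the usual way; cross-thread safety is not addressed here.
        """
        frames = {}
        for key, fiber in list(self._fibers.items()):
            frame = getattr(fiber, "gr_frame", None)
            if frame is not None:
                frames[key] = frame
        return frames


def install(tracker):
    """Install the trace function; returns the previously installed one."""
    import greenlet  # third-party, imported lazily so the sketch loads without it

    return greenlet.settrace(tracker.on_switch)
```

The idea is that `install(tracker)` runs once on each thread that runs greenlets, the sampler thread calls `tracker.sample()` on its normal interval, and `running_time` (or the raw switch timestamps, if you keep them) is what would let the UI gray out or collapse the time a greenlet spent suspended.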
