-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Add source location to task and tasktrace object #2707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Add source location to task and tasktrace object #2707
Conversation
Add an empty, default constructed std::source_location object to the task object and getter / setter.
Add calls to update `resume_point` variable with location of next resume location to all `await_suspend` functions and `then` functions.
02aa70a
to
c0149e3
Compare
@@ -35,6 +36,8 @@ class task { | |||
protected: | |||
scheduling_group _sg; | |||
private: | |||
std::source_location resume_point = {}; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is making a very heavily used object larger.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After adding std::source_location
coroutine_traits_base<>::promise_type
(which inherits from task) is 64 bytes,
continuation<Promise, Func, Wrapper, T>
is around 104-120, depending on exact version.
No difference except for that single unlucky value, who is "promoted" to the next 64 byte cache line in case of continuation
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's okay.
Aren't those addresses derivable from the stack trace? Or are those points lost? How large is std::source_location? |
Probably with a little bit of magic. You'd still have to store a pointer somewhere, because stacktrace is valid for an actually running task, while my patch adds "next instruction" to tasks, which are suspended.
8 bytes. |
Add resume point locations to `tasktrace` object. Update `formatter::format` to print source location of next resume alone with task type.
c0149e3
to
08d3c8a
Compare
Maybe we can tolerate it. @tgrabiec what's your opinion? Please run scylladb's perf-simple-query and report insn-per-op before and after (although we won't learn of the impact of the size change). |
my patch
master
My patch will run more instructions, as there are additional stores to the task object (std::source_location needs to be written). Why do we measure instructions per operation? We are not billed by instruction count, rather by time, we should measure total running time instead? Does it work as a sanity check or am i missing something? |
Time is very unstable (just try it) since it depends on temperature and cooling. Instructions is a more stable proxy for time (but inaccurate since instructions-per-clock can change). |
So we add 0.25% overhead. Hard to judge if it's worthwhile. |
Agree.
It's absolutely stable, but we're counting steps, which - in my opinion - sort-of blinds us. It's like two different walks, the same step count, but step sizes differ, so is distance. Here for example if we're running at max IPC, then we've 0.25% overhead. If otherwise we're waiting for memory, then the write is pretty much free (we've already waited for this cache line, because other stuff is read / written there) and proc is waiting for other memory pieces, so it can do our write. |
Writes are more-or-less free, except for the increase in instruction footprint (which are extra reads). We aren't close to running at max IPC. Typical for mini-benchmarks like perf-simple-query is 2, and for production ~1. The main killer is waiting for instruction fetch (although maybe in production mispredicts would also contribute since the the data is less regular). I'm sure an M4 would give much higher IPC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the performance impact could be a concern, can we make it an optional feature which is enabled only in the debug builds?
@@ -30,6 +30,7 @@ | |||
|
|||
#ifndef SEASTAR_MODULE | |||
#include <coroutine> | |||
#include <source_location> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it took me a while to check if it's safe to use <source_location>
instead of util/std-compat.hh
. the answer is yes. as we support the latest two major versions of the GCC and Clang compilers. in general, we use libstdc++, which added the support to source_location
in gcc-mirror/gcc@57d76ee, which is included by gcc 12 and up. and the latest stable release of GCC is GCC 14.2.
This has been discussed here (#2381) and consensus is that the release is where it's useful the most. |
Adds source location of next resume point to
task
objects. When constructingtasktrace
object, source location of resume points will be added to it. When printing, the source location of next resume point will be printed as well, hopefully improving debug experience.Example:
Partially fixes #2381