Modifying a span's start_time #157
Description
Team, I have encountered a relevant issue with measuring spans for other libraries. Disclaimer: I'm a new comer to the Erlang/Elixir community, so please, if you could point me in the right direction — I'd hugely appreciate it!
Context: I want to trace the time taken to run each query Ecto has ran using :ocp
.
The problem: Ecto's :telemetry
usage is constrained to a single event: :query
, this event is emitted upon query completion and includes details such as the query's duration. This in turn doesn't provide a feasible mechanism to create spans via telemetry events only. In other words, libraries don't always allow one to easily wrap their calls with ocp:with_child_span
.
To further illustrate the problem, see the screenshot below:
The image above shows a trace for a Phoenix request trace, that has a single query associated to it and relies on the telemetry :query
event from Ecto: select pg_sleep(1);
i.e. Postgres sleeps for 1 second before returning the query result.
The trace for elixir.duffel.web.air.airport.indexcontroller.index
is in fact correct: ~1.076s. However, the trace for the query itself reports 54µs
.
This happens because the telemetry event is emitted only at the query end, so we are forced to start a span and finish it immediately in order to get a trace.
Ideally we'd be able to properly create a span for the query, meaning that we'd have the actual query duration (~1s
).
Ecto's principles suggest1 that the maintainers do not plan to emit other query related events and we can't modify a Span's "start_time".
What I'd like from you: My question would be: what is the library's standpoint for problems like this i.e. would you push back that Ecto should emit more events? Would you suggest that we pursue a more manual approach to tracing (i.e. wrapping the library's calls manually)?
Personally, I can see some value in being able to easily alter the span's start_time — but I would love any thoughts on the matter before I create a pull-request, as I can also imagine potential issues with this approach.
Thank you!
Footnotes
- Ecto's maintainers have mentioned provided some reasoning for emitting a single event:
The rationale is that some libraries like Ecto only emit the "stop" event with the whole duration, which means you can track back, but not build it forward.