Description
Which middleware is the feature for?
@hono/prometheus
What is the feature you are proposing?
Hi. I added @hono/prometheus to a project of mine (one using Mastra AI) and noticed that streaming responses are not being measured the way I expected. For example, a stream that takes 1s shows up in `http_request_duration_seconds_sum` as having taken less than a millisecond. I have noticed this before in other middlewares, both in Hono and in other servers, that measure timing around `await next()` instead of listening for the `finish` event of `http.ServerResponse`. Here is an isolated reproduction with plain Hono on Node.js: https://github.com/moret/hono-prometheus-streaming . I have a couple of questions about this.
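To make the discrepancy concrete, here is a minimal, deterministic sketch of the two measurement points. This is illustrative only, not the @hono/prometheus implementation: the context object, `onFinish` hook, and fake clock are all hypothetical stand-ins for `await next()` timing vs. the `finish` event of `http.ServerResponse`.

```javascript
// Fake clock so the example is deterministic (no real timers needed).
let now = 0;
const clock = { now: () => now, advance: (ms) => { now += ms; } };

// Minimal response object with a 'finish' callback, standing in for
// Node's http.ServerResponse 'finish' event (names are illustrative).
function makeResponse() {
  const listeners = [];
  return {
    onFinish: (cb) => listeners.push(cb),
    end: () => listeners.forEach((cb) => cb()),
  };
}

const metrics = {};

// Middleware-style timing: it records one duration when the handler
// returns (the "await next()" point) and another when the response
// actually finishes flushing its body.
function timingMiddleware(res, handler) {
  const start = clock.now();
  res.onFinish(() => {
    metrics.finishSeconds = (clock.now() - start) / 1000; // full duration
  });
  handler(res);
  metrics.handlerSeconds = (clock.now() - start) / 1000;  // misses the stream
}

// Streaming handler: it sets up the stream and returns immediately;
// the body only completes later, once the stream is done.
function streamingHandler(res) {
  // nothing advances the clock before the handler returns
}

const res = makeResponse();
timingMiddleware(res, streamingHandler);

// Simulate the stream taking 1 second before the response finishes.
clock.advance(1000);
res.end();

console.log(metrics.handlerSeconds); // 0 — what the histogram records today
console.log(metrics.finishSeconds);  // 1 — the wall-clock streaming time
```

With a real streaming response the handler returns almost instantly, so timing around `await next()` reports sub-millisecond durations, while a `finish`-based measurement captures the whole second the stream was open.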
First: am I expecting something conceptually misguided? That is, if the "request duration" covers only the handling of the request, then this metric should indeed not be read as the full response time. For non-streaming responses the two coincide, so I had that expectation, but I can see the name might only imply handler time, in which case measuring streaming time would actually require a separate metric.
On that note, a second question / suggestion: do you see this as a new metric that could be added as an option to the middleware (at least on Node.js deployments), or as an adjustment to the current one?
Finally, I initially asked this on the community Discord and was advised to bring it here. Is this the best place to discuss or suggest this option or adjustment to the middleware? If not, could you help me route it to a better place?
Thank you all in advance.