new metric in postgres_driver to estimate payload stats #3596
Ivansete-status merged 1 commit into master
This PR may contain changes to database schema of one of the drivers. If you are introducing any changes to the schema, make sure the upgrade from the latest release to this change passes without any errors/issues. Please make sure the label
You can find the image built from this PR at Built from f3d4ca9
8688ed1 to 9fe8c6f
NagyZoltanPeter left a comment
Need some more clarification about this.
declarePublicGauge postgres_payload_size_bytes,
  "Payload size in bytes of correctly stored messages"
Sorry, I don't get the meaning of this gauge. Is it meant to show the last message's payload size, or should it somehow sum up?
How should it help in maintaining DB space?
Sure!
The original idea is to track the average message size stored in the databases over time. This serves as a small step toward better understanding overall database usage.
)

if ret.isOk():
  postgres_payload_size_bytes.set(message.payload.len)
It is always set to the size of the last message only. Was that the aim?
Unless I'm missing something, it writes all messages' sizes.
Yes, it writes them all, it just does not sum them up. That made me question the approach. But now that @fcecin has given some insight, I'm OK with it.
This is a "best-effort estimate" metric kept in RAM for free. The documentation provided is general and allows it to implement the estimate in different ways (sliding window/running average) in the future. There's no promise in the gauge description string that this has to be a running average of some sort, which is good. It's good to start with the simplest, N=1. I'm actually wondering how far we can go with just this.
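To make the "N=1 now, larger window later" idea concrete, here is a minimal, language-agnostic sketch in Python. It is not part of this PR: `PayloadSizeEstimator`, `window`, and `observe` are hypothetical names, and the driver itself simply calls `set` with the latest payload length. With `window=1` the estimate equals the last message's size, which matches the current behaviour of the gauge.

```python
from collections import deque


class PayloadSizeEstimator:
    """Hypothetical helper: mean payload size over the last `window` messages."""

    def __init__(self, window: int = 1) -> None:
        if window < 1:
            raise ValueError("window must be >= 1")
        # A bounded deque drops its oldest element when a new one is appended.
        self._sizes = deque(maxlen=window)
        self._total = 0

    def observe(self, payload_len: int) -> float:
        """Record one stored message and return the current estimate."""
        # Subtract the sample that is about to be evicted, if the window is full.
        if len(self._sizes) == self._sizes.maxlen:
            self._total -= self._sizes[0]
        self._sizes.append(payload_len)
        self._total += payload_len
        return self._total / len(self._sizes)


last_only = PayloadSizeEstimator(window=1)  # same as setting the gauge to the last size
last_two = PayloadSizeEstimator(window=2)   # a small sliding window
for size in (1000, 20, 40):
    print(size, last_only.observe(size), last_two.observe(size))
```

Under these assumptions, the value returned by `observe` is what would be written to the gauge, so the description string "Payload size in bytes of correctly stored messages" would stay valid for any window size.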
NagyZoltanPeter left a comment
Thanks for the explanation! It is good to go!
Another thing we can do (later, or roll in this PR) is add two other gauges: total message count and total message size, then use them to derive the average. For example, the gauges at T=1: count=0 bytes=0, T=2: count=1 bytes=1000, T=3: count=2 bytes=1020. With no sliding window the simple average is 510 (1020 / 2). With a sliding-window-size=1 that excludes T=2 and only considers T=3 (dumb example, but works), the average is now 20 (20 / 1). Whoever is polling the gauges (at whatever rates) decides what to do, instead of us ever solving that problem here.
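The arithmetic in the example above can be checked with a short poller-side sketch. It assumes the two proposed gauges exist and are sampled as plain integers; `Sample` and `average_between` are hypothetical names used only for illustration, not something this PR adds.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    """One poll of the two proposed (hypothetical) cumulative gauges."""

    count: int        # total messages stored so far
    total_bytes: int  # total payload bytes stored so far


def average_between(older: Sample, newer: Sample) -> float:
    """Mean payload size of the messages stored between two polls."""
    stored = newer.count - older.count
    if stored <= 0:
        # Nothing new was stored in the window, or the totals were reset.
        return 0.0
    return (newer.total_bytes - older.total_bytes) / stored


t1 = Sample(count=0, total_bytes=0)
t2 = Sample(count=1, total_bytes=1000)
t3 = Sample(count=2, total_bytes=1020)

print(average_between(t1, t3))  # whole history: 1020 / 2 = 510.0
print(average_between(t2, t3))  # window that excludes T=2's message: 20 / 1 = 20.0
```

The window is just the choice of which two samples to subtract, so the node only has to expose the two running totals and each consumer can pick its own polling rate and window.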
Thanks for the comments!
Original PR: #3544
The original PR got abruptly closed after a deep cleanup and refactor applied by this on 2025-09-30
Issue