tkornai commented Mar 26, 2024

Add a new committed offset metric per consumer group and per partition that can be used to judge if consumers are making progress on all partitions or if there is a poison pill on any of them.

Please let me know if exporting these metrics should be configurable.

CLAassistant commented Mar 26, 2024

CLA assistant check
All committers have signed the CLA.

Offset resets can decrease its value.
weeco (Contributor) commented Nov 3, 2024

Hey @tkornai,
thanks for your PR and sorry for the late review. I'm unsure whether this new metric is valuable, since we already have a consumer group lag metric for each partition. Even with this group offset metric, you'd need to compare it with the partition offset to see whether there are new messages it should have committed. That's exactly what the consumer group lag metric already does. Maybe I'm missing a case that can't be solved with the existing metrics?

My concern is that adding this metric could increase the partition cardinality quite a lot (depending on groups & partition count of course).

@mnovikov-mindbox

Hi! We have a case where we need such a metric:
Our consumer drops off one or two partitions due to a bug in our Kafka library. As a result, overall processing runs at 30-50% (kminion_kafka_consumer_group_topic_offset_sum still increases over time), so alerts for a total consumption stop don't catch this case. Alerts for 'uneven lag' or 'big partition lag' are very noisy, as such situations can occur normally.

If there were a metric for the committed offset per consumer group and partition, we could use it to alert when a partition has lag and its committed offset stays unchanged.
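
To make the alert condition concrete, here is a minimal sketch of the check, assuming a per-partition committed offset and a per-partition lag value are both available for each scrape. The names `PartitionSample` and `is_stalled` are hypothetical and only illustrate the logic; they are not part of KMinion or any Kafka client.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionSample:
    """One scrape of a single (group, topic, partition) triple (hypothetical type)."""
    committed_offset: int  # offset last committed by the consumer group
    lag: int               # partition high water mark minus committed offset


def is_stalled(samples: list[PartitionSample]) -> bool:
    """Return True if the partition had lag in every sample of the window
    while its committed offset never changed."""
    if len(samples) < 2:
        return False  # not enough data to judge progress
    has_lag = all(s.lag > 0 for s in samples)
    offset_unchanged = len({s.committed_offset for s in samples}) == 1
    return has_lag and offset_unchanged
```

A partition that is merely slow still advances its committed offset, so it would not trigger this check, which is what should make it less noisy than a plain lag threshold. An offset reset also counts as a change here, since the check only looks for a value that never moves.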
