Description
A very large number of messages are published to my topic, and many consumers in a consumer group consume them. The consumers are tuned, via the consumer configs, to process messages fast enough. Still, a few consumers somehow get the error below:
[sarama] 2024/12/10 07:45:04 kafka: error while consuming my-topic/12: kafka server: Request exceeded the user-specified time limit in the request
I also noticed that when I receive this error, the CPU usage of my consumer drops and it sits almost idle, despite a large number of messages waiting in the partitions to be consumed. Because of this, consumer lag keeps increasing for the affected partitions.
I also took a CPU profile of the consumer while it was receiving these errors, and it confirms that the consumer's CPU is idle and messages are not being processed.

Above is a 60s CPU profile, during which the CPU was busy for only 4s. That indicates the CPU was mostly idle.
I also took a goroutine blocking profile, and it shows that even ConsumeClaim() is waiting for messages; as a result, my message consumer goroutines are blocked on a channel, waiting to receive a message to process.

Below is the network blocking profile:
Below is the synchronization blocking profile:
Below is the syscall profile:
All of the above profiles roughly indicate that sarama is waiting for messages from the broker.
I looked at the similar issues below:
But none of them has a satisfactory solution, and it seems the root cause has not been identified either.
Versions
Sarama | Kafka | Go |
---|---|---|
v1.43.3 | 3.8.0 | 1.22.8 |
Configuration
config.Consumer.Offsets.AutoCommit.Interval = 10 * time.Second
config.Consumer.Fetch.Default = 1048576 * 2 // 2 MB for faster consuming (Default 1MB)
config.Consumer.MaxProcessingTime = 10 * time.Second // increasing max processing time per message, to prevent frequent partition rebalances
config.Consumer.Group.Session.Timeout = 60 * time.Second // to prevent unnecessary partition rebalances
config.Consumer.Group.Heartbeat.Interval = 12 * time.Second
Logs
[sarama] 2024/12/10 07:45:04 kafka: error while consuming my-topic/7: kafka server: Request exceeded the user-specified time limit in the request
[sarama] 2024/12/10 07:45:04 kafka: error while consuming my-topic/11: kafka server: Request exceeded the user-specified time limit in the request
[sarama] 2024/12/10 07:45:04 kafka: error while consuming my-topic/12: kafka server: Request exceeded the user-specified time limit in the request
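To see which partitions are affected and how often, log lines like the ones above can be tallied per partition. The following is a rough sketch that assumes the log format shown here; the function name `countTimeouts` and the regular expression are illustrative and not part of sarama.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// Error text as it appears in the logs above.
const timedOutMsg = "Request exceeded the user-specified time limit in the request"

// Matches "error while consuming <topic>/<partition>: <message>".
var consumeErrRe = regexp.MustCompile(`error while consuming (\S+)/(\d+): (.*)$`)

// countTimeouts (hypothetical helper) returns, per "topic/partition", how many
// log lines report the request-timeout error.
func countTimeouts(logs string) map[string]int {
	counts := make(map[string]int)
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		m := consumeErrRe.FindStringSubmatch(sc.Text())
		if m == nil || !strings.Contains(m[3], timedOutMsg) {
			continue // not a consume error, or a different error
		}
		counts[m[1]+"/"+m[2]]++
	}
	return counts
}

func main() {
	logs := `[sarama] 2024/12/10 07:45:04 kafka: error while consuming my-topic/7: kafka server: Request exceeded the user-specified time limit in the request
[sarama] 2024/12/10 07:45:04 kafka: error while consuming my-topic/11: kafka server: Request exceeded the user-specified time limit in the request
[sarama] 2024/12/10 07:45:04 kafka: error while consuming my-topic/12: kafka server: Request exceeded the user-specified time limit in the request`

	counts := countTimeouts(logs)

	// Sort keys so the report is stable between runs.
	keys := make([]string, 0, len(counts))
	for k := range counts {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s: %d timeout(s)\n", k, counts[k])
	}
}
```

Comparing these counts with per-partition consumer lag may help confirm whether the lagging partitions are the same ones that report the timeout.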