producing message with error not the leader for partition #3050
-
I'm facing an issue with my application, which uses the Sarama library to send messages to a Kafka cluster. Locally, the application runs smoothly and messages are successfully delivered to the cluster. When running in the cluster, I encounter the following error:
my client config:
sending message:
these are the logs from Sarama:
I'm unsure whether this issue originates from my application code or from a network issue in the cluster (communication between Kafka and the application). The error is a bit ambiguous, because the client did find the leader, and it found it correctly (broker 0, partition 0). Could you please provide some suggestions on how to troubleshoot this error and pinpoint the root cause?
Replies: 3 comments
-
When a Kafka broker restarts, it resigns as the current leader for the partitions for which it is the preferred leader. Kafka automatically detects this and triggers a leader election to choose a new leader from the remaining in-sync replicas. When the original broker comes back, Kafka will fairly quickly hand leadership back to it in order to keep the preferred leaders. This can show up as a transient error at the producer if it sends a message during this switchover: it will be told it has the wrong broker, and it will refresh metadata and retry. With retries enabled, as you have in your config, this shouldn't even be seen by your application.

In your example snippet you show creating a producer and then using it. You aren't creating a new producer every time you want to send a message, are you? It is better to reuse a producer across requests, as it will maintain a connection to Kafka and stay aware of cluster state, as in the sketch below.
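A minimal sketch of that pattern with Sarama (broker address, topic name, and retry settings here are illustrative, not taken from the original post): build one SyncProducer at startup with retries enabled, then reuse it for every send.

```go
package main

import (
	"log"
	"time"

	"github.com/IBM/sarama"
)

// buildProducer creates a single SyncProducer that should be reused for the
// lifetime of the application rather than recreated per message.
func buildProducer(brokers []string) (sarama.SyncProducer, error) {
	cfg := sarama.NewConfig()
	cfg.Producer.RequiredAcks = sarama.WaitForAll
	cfg.Producer.Return.Successes = true // required by SyncProducer
	cfg.Producer.Retry.Max = 5           // rides out transient leader changes
	cfg.Producer.Retry.Backoff = 250 * time.Millisecond
	return sarama.NewSyncProducer(brokers, cfg)
}

func main() {
	producer, err := buildProducer([]string{"kafka-0.example:9092"})
	if err != nil {
		log.Fatalf("failed to create producer: %v", err)
	}
	defer producer.Close()

	// Reuse the same producer for every send; it keeps broker connections
	// and partition-leader metadata warm and refreshes them on errors.
	for i := 0; i < 3; i++ {
		partition, offset, err := producer.SendMessage(&sarama.ProducerMessage{
			Topic: "example-topic",
			Value: sarama.StringEncoder("hello"),
		})
		if err != nil {
			log.Printf("send failed: %v", err)
			continue
		}
		log.Printf("delivered to partition %d at offset %d", partition, offset)
	}
}
```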
-
In the logs I just showed one of the retries; of course it makes 5 retries and does the same thing each time. I'm also sure there was no leader change or any other change in the cluster.
-
We finally tracked down the problem. It was a DNS configuration issue in our private Kubernetes cluster. The DNS resolver wasn't handling broker redirects for partition leadership properly, causing the "Your metadata is out of date" errors. So, if you see such an error, double-check your network setup or your client's network-related configuration.
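For anyone troubleshooting a similar symptom, one way to see which broker addresses the client actually resolves and dials is to enable Sarama's debug logger and query the partition leader directly. This is only a sketch; the broker address and topic name are placeholders.

```go
package main

import (
	"log"
	"os"

	"github.com/IBM/sarama"
)

func main() {
	// Route Sarama's internal debug output to stdout; the connection lines
	// show which broker hostnames the client resolves and dials after each
	// metadata refresh, making DNS or advertised-listener mismatches visible.
	sarama.Logger = log.New(os.Stdout, "[sarama] ", log.LstdFlags)

	client, err := sarama.NewClient([]string{"kafka-0.example:9092"}, sarama.NewConfig())
	if err != nil {
		log.Fatalf("failed to create client: %v", err)
	}
	defer client.Close()

	// Ask the cluster who leads partition 0 of the topic; the answer is the
	// advertised address that every producer must be able to resolve.
	leader, err := client.Leader("example-topic", 0)
	if err != nil {
		log.Fatalf("failed to look up leader: %v", err)
	}
	log.Printf("leader for example-topic/0 is broker %d at %s", leader.ID(), leader.Addr())
}
```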