Commit 0e53f9e

fix: align kafka consumer fetch ceiling to 64 MiB broker max
librdkafka defaults max.partition.fetch.bytes to 1 MiB and fetch.max.bytes to 50 MiB. Both are below the 64 MiB that a DLQ re-publish can reach. An oversized-but-valid record is therefore accepted and stored by the broker, yet trips MSG_SIZE_TOO_LARGE on consume and wedges the partition in a re-seek loop (observed: DLQ replay group at ~243 MiB in-flight, 0 committed partitions).

Raise both consumer fetch ceilings to 64 MiB (67108864) to match the producer/broker 64 MiB max.message.bytes, and set receive.message.max.bytes to 100 MiB (104857600) so it stays above the new fetch.max.bytes.

Config-only; applies to both the primary and _dlq consumer groups.
1 parent 0b45863 commit 0e53f9e

1 file changed

Lines changed: 8 additions & 0 deletions

src/Pkg/Queue.hs

@@ -470,6 +470,14 @@ kafkaService appLogger appCtx tp role label kafkaTopics batchSize fn = checkpoin
 -- round-trip cost). 250ms stays well under session/poll timeouts.
 <> K.extraProp "fetch.min.bytes" "65536"
 <> K.extraProp "fetch.wait.max.ms" "250"
+-- Consumer fetch ceiling MUST match the producer/broker 64 MiB max.message.bytes.
+-- librdkafka defaults max.partition.fetch.bytes to 1 MiB and fetch.max.bytes to
+-- 50 MiB — both below the 64 MiB a DLQ re-publish can reach — so an oversized-but-
+-- valid record (esp. header-restamped DLQ messages) is accepted and stored by the
+-- broker yet MSG_SIZE_TOO_LARGE wedges the partition on consume, re-seeking forever.
+<> K.extraProp "max.partition.fetch.bytes" "67108864"
+<> K.extraProp "fetch.max.bytes" "67108864"
+<> K.extraProp "receive.message.max.bytes" "104857600"
 <> K.extraProp "partition.assignment.strategy" "cooperative-sticky"
 <> K.extraProp "group.instance.id" clientId
 <> K.logLevel K.KafkaLogInfo
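
The byte values in this change can be sanity-checked with a small standalone sketch. It is not part of src/Pkg/Queue.hs; `mib`, `checkCeilings`, and `protocolOverhead` are hypothetical names introduced here for illustration. The 512-byte headroom reflects my reading of the librdkafka configuration notes for receive.message.max.bytes and should be treated as an assumption, as should the 100000000-byte default used for comparison.

```haskell
module Main where

-- Hypothetical helper: convert mebibytes to bytes.
mib :: Int -> Int
mib n = n * 1024 * 1024

-- Assumption: librdkafka expects receive.message.max.bytes to be at
-- least fetch.max.bytes + 512 to leave room for protocol overhead.
protocolOverhead :: Int
protocolOverhead = 512

-- Returns a description of every sizing rule that is violated.
-- Arguments: largest record the broker accepts, max.partition.fetch.bytes,
-- fetch.max.bytes, receive.message.max.bytes (all in bytes).
checkCeilings :: Int -> Int -> Int -> Int -> [String]
checkCeilings maxRecord partitionFetch fetchMax receiveMax =
     [ "max.partition.fetch.bytes is below the largest record"
     | partitionFetch < maxRecord ]
  ++ [ "fetch.max.bytes is below the largest record"
     | fetchMax < maxRecord ]
  ++ [ "receive.message.max.bytes is below fetch.max.bytes + overhead"
     | receiveMax < fetchMax + protocolOverhead ]

main :: IO ()
main = do
  -- Defaults cited in the commit message (1 MiB, 50 MiB) against a 64 MiB record.
  mapM_ putStrLn (checkCeilings (mib 64) (mib 1) (mib 50) 100000000)
  -- Values set by this commit; an empty list means every rule holds.
  print (checkCeilings (mib 64) 67108864 67108864 104857600)
```

Under these assumptions the defaults report two violations and the values from the diff report none. The check only covers the arithmetic; it says nothing about how the broker or consumer behaves at runtime.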
