Interaction of ack timeout, retry middleware, and poison queue causes failures to write to poison queue. #636

@Zach-Johnson

Description

Steps to reproduce

  • PostgreSQL running with the Watermill SQL publisher/subscriber

Subscriber created with a default config like the following; the ack deadline is left unset, so it defaults to 30s:

	beginner := watermillSQL.BeginnerFromPgx(pool)

	subscriber, err := watermillSQL.NewSubscriber(
		beginner,
		watermillSQL.SubscriberConfig{
			SchemaAdapter:    watermillSQL.DefaultPostgreSQLSchema{},
			OffsetsAdapter:   watermillSQL.DefaultPostgreSQLOffsetsAdapter{},
			InitializeSchema: true,
		},
		NewWatermillLogger(ctx),
	)

Retry middleware configured so that the total time spent retrying exceeds 30s:

		middleware.Retry{
			MaxRetries:      5,
			InitialInterval: time.Second,
			MaxInterval:     time.Second * 30,
			Multiplier:      2,
			Logger:          events.NewWatermillLogger(ctx),
		}.Middleware,
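
To show why this configuration runs past the 30s ack deadline, here is a rough sketch that sums the nominal waits of a capped exponential backoff using the values above. The helper `nominalRetryWait` is hypothetical and not part of Watermill. It assumes one wait before each retry and no randomization, and it ignores the time the handler itself takes, so the real total can only be longer.

```go
package main

import (
	"fmt"
	"time"
)

// nominalRetryWait is a hypothetical helper (not a Watermill API). It sums the
// waits of a capped exponential backoff, assuming one wait before each retry
// and no jitter.
func nominalRetryWait(maxRetries int, initial, maxInterval time.Duration, multiplier float64) time.Duration {
	var total time.Duration
	interval := initial
	for i := 0; i < maxRetries; i++ {
		total += interval
		// Grow the interval for the next retry, capped at maxInterval.
		next := time.Duration(float64(interval) * multiplier)
		if next > maxInterval {
			next = maxInterval
		}
		interval = next
	}
	return total
}

func main() {
	// Values from the Retry config above: waits of 1s + 2s + 4s + 8s + 16s.
	total := nominalRetryWait(5, time.Second, 30*time.Second, 2)
	fmt.Println(total) // 31s
}
```

Even before counting handler execution time, the nominal 31s of waiting is already longer than the 30s default ack deadline.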

Poison queue enabled.

Expected behavior

I would expect failed messages to be published to the poison queue.

Actual behavior

{"level":"error","time":"2025-11-17T10:39:08-07:00","message":"Handler returned error: context deadline exceeded\ncannot publish message to poison queue: could not insert message as row: context deadline exceeded (fields: map[handler_name:notification_handler message_uuid:4afd0039-e529-406b-8e22-3dfc96870e0e])"}

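This appears to happen because the message context carries the default 30s ack deadline: by the time the retries are exhausted, that deadline has already passed, so the write to the poison queue fails immediately with an expired context.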
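
The effect can be reproduced with the standard library alone. The following is a minimal sketch, not Watermill code: `insertRow` and `simulate` are hypothetical stand-ins, and the durations are scaled down from seconds to milliseconds. It only illustrates that any write attempted on a context whose deadline has already passed fails at once.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// insertRow is a hypothetical stand-in for a database write that honours its
// context. It fails immediately if the context is already done.
func insertRow(ctx context.Context) error {
	if err := ctx.Err(); err != nil {
		return fmt.Errorf("could not insert message as row: %w", err)
	}
	return nil
}

// simulate creates a context with the given ack deadline, spends retryTime
// "retrying", and then attempts the write with that same context.
func simulate(ackDeadline, retryTime time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), ackDeadline)
	defer cancel()
	time.Sleep(retryTime) // stands in for the retry middleware's backoff
	return insertRow(ctx)
}

func main() {
	// Retries outlast the deadline: the write fails.
	fmt.Println(simulate(20*time.Millisecond, 40*time.Millisecond))
	// Retries finish well within the deadline: the write succeeds.
	fmt.Println(simulate(200*time.Millisecond, time.Millisecond))
}
```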

Possible solution

I would expect either:

  • Watermill to fail at startup and flag this as a misconfiguration
  • The context to be reset when publishing to the poison queue, perhaps with a separate, configurable timeout for poison queue writes.
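
Both ideas can be sketched with the standard library. This is only an illustration under stated assumptions, not a proposal for Watermill's actual API: `validateRetryBudget`, `detachedContext`, `publishToPoisonQueue`, and `poisonAfterExpiry` are all hypothetical names. The context reset relies on `context.WithoutCancel` (Go 1.21+), which keeps the parent's values but drops its deadline and cancellation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// validateRetryBudget is a hypothetical startup check: it rejects a setup in
// which the worst-case retry time is not below the ack deadline.
func validateRetryBudget(ackDeadline, worstCaseRetry time.Duration) error {
	if worstCaseRetry >= ackDeadline {
		return fmt.Errorf("retries may take %s, which is not below the ack deadline of %s",
			worstCaseRetry, ackDeadline)
	}
	return nil
}

// detachedContext keeps the parent's values but ignores its deadline and
// cancellation, then applies a fresh timeout for the poison queue write.
func detachedContext(parent context.Context, timeout time.Duration) (context.Context, context.CancelFunc) {
	return context.WithTimeout(context.WithoutCancel(parent), timeout)
}

// publishToPoisonQueue is a hypothetical stand-in for the real insert; it
// fails only if its context is already done.
func publishToPoisonQueue(ctx context.Context) error {
	return ctx.Err()
}

// poisonAfterExpiry lets the parent deadline pass, then publishes either with
// the expired parent context or with a detached one.
func poisonAfterExpiry(detach bool) error {
	parent, cancel := context.WithTimeout(context.Background(), time.Millisecond)
	defer cancel()
	<-parent.Done() // the "ack deadline" has now passed

	ctx := parent
	if detach {
		var cancelDetached context.CancelFunc
		ctx, cancelDetached = detachedContext(parent, time.Second)
		defer cancelDetached()
	}
	return publishToPoisonQueue(ctx)
}

func main() {
	fmt.Println(validateRetryBudget(30*time.Second, 31*time.Second))
	fmt.Println(poisonAfterExpiry(false)) // context deadline exceeded
	fmt.Println(poisonAfterExpiry(true))  // <nil>
}
```

A startup check is cheap but can only estimate the retry time, since handler duration is unknown in advance. Detaching the context handles the failure at the point where it occurs, at the cost of letting the poison queue write outlive the ack deadline.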

Labels: bug (Something isn't working)
