Background
database/sql already waits when a single store's local pool is exhausted, but NewPostgresStore has no special handling for the case where Postgres rejects a new server connection with "too many clients already" (SQLSTATE 53300).
This came up while running many parallel Postgres integration tests, where multiple independent PostgresStore pools hit the container-wide max_connections limit. The immediate CI fix is test-capacity related, but it also highlights a production hardening gap for shared-Postgres deployments.
Problem
- NewPostgresStore opens a fresh database/sql pool and calls PingContext
- if Postgres is already at max_connections, store initialization fails immediately
- this is different from normal in-process pool exhaustion, which database/sql already handles by waiting for a free local connection
Proposed scope
- detect Postgres server connection exhaustion during store initialization / connect path
- apply bounded retry with backoff for SQLSTATE 53300
- stop retrying when the caller context is canceled or times out
- keep retries narrow to connect-time failures only
- fail fast for non-retriable errors
Non-goals
- replacing proper pool sizing / capacity planning
- retrying arbitrary query failures
- hiding persistent misconfiguration
Acceptance criteria
- NewPostgresStore retries boundedly on SQLSTATE 53300
- retries honor context cancellation / deadlines
- non-retriable errors still fail immediately
- behavior is covered by tests
- docs clarify the difference between local pool exhaustion and server-side connection exhaustion