wallet/internal/db: handle Postgres server connection exhaustion on store init #1191

@yyforyongyu

Background

database/sql already waits when a single store's local pool is exhausted, but NewPostgresStore has no special handling for the case where Postgres itself rejects a new server connection with the error "too many clients already" (SQLSTATE 53300, too_many_connections).

This came up while running many parallel Postgres integration tests, where multiple independent PostgresStore pools hit the container-wide max_connections limit. The immediate CI fix is test-capacity related, but it also highlights a production hardening gap for shared-Postgres deployments.

Problem

  • NewPostgresStore opens a fresh database/sql pool and calls PingContext
  • if Postgres is already at max_connections, store initialization fails immediately
  • this is different from normal in-process pool exhaustion, which database/sql already handles by waiting for a free local connection
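
To make the distinction concrete, here is a minimal sketch of how the connect path might recognize server-side exhaustion. It assumes the driver's error type exposes a SQLState() string method (pgx's pgconn.PgError does, as do recent lib/pq versions); the names isServerConnExhausted and fakePgError are hypothetical and exist only for illustration, not in the wallet code.

```go
package main

import (
	"errors"
	"fmt"
)

// sqlStateTooManyConnections is the Postgres SQLSTATE for
// too_many_connections.
const sqlStateTooManyConnections = "53300"

// sqlStater is satisfied by driver errors that expose a SQLSTATE code.
// Assumption: the driver in use provides a SQLState() string method.
type sqlStater interface {
	SQLState() string
}

// isServerConnExhausted reports whether err, or any error it wraps,
// carries SQLSTATE 53300. A nil error yields false.
func isServerConnExhausted(err error) bool {
	var se sqlStater
	return errors.As(err, &se) &&
		se.SQLState() == sqlStateTooManyConnections
}

// fakePgError is a stand-in for a driver error, used only in this example.
type fakePgError struct {
	code string
	msg  string
}

func (e *fakePgError) Error() string {
	return e.msg + " (SQLSTATE " + e.code + ")"
}

func (e *fakePgError) SQLState() string { return e.code }

func main() {
	exhausted := fmt.Errorf("ping: %w", &fakePgError{
		code: "53300", msg: "sorry, too many clients already",
	})
	authFailed := fmt.Errorf("ping: %w", &fakePgError{
		code: "28P01", msg: "password authentication failed",
	})

	fmt.Println(isServerConnExhausted(exhausted))  // true
	fmt.Println(isServerConnExhausted(authFailed)) // false
}
```

Matching on the SQLSTATE rather than on the message text keeps the check independent of server locale and wording changes.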

Proposed scope

  • detect Postgres server connection exhaustion during store initialization / connect path
  • apply bounded retry with backoff for SQLSTATE 53300
  • stop retrying when the caller context is canceled or times out
  • keep retries narrow to connect-time failures only
  • fail fast for non-retriable errors
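
The scope above could be sketched roughly as follows. This is an illustration under stated assumptions, not the intended implementation: pingWithRetry, simulate, and fakePgError are hypothetical names, the doubling backoff and the attempt limit are placeholder choices, and the driver error is again assumed to expose SQLState() string.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// sqlStater matches driver errors exposing a SQLSTATE (assumed method).
type sqlStater interface{ SQLState() string }

// isServerConnExhausted reports whether err wraps SQLSTATE 53300.
func isServerConnExhausted(err error) bool {
	var se sqlStater
	return errors.As(err, &se) && se.SQLState() == "53300"
}

// pingWithRetry (hypothetical helper) calls ping at most maxAttempts
// times, doubling the wait after each 53300 failure. Any other error
// is returned immediately, and a done context stops the waiting.
func pingWithRetry(ctx context.Context, ping func(context.Context) error,
	maxAttempts int, baseDelay time.Duration) error {

	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	delay := baseDelay
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err = ping(ctx); err == nil {
			return nil
		}
		if !isServerConnExhausted(err) {
			return err // non-retriable: fail fast
		}
		if attempt == maxAttempts {
			break
		}
		timer := time.NewTimer(delay)
		select {
		case <-ctx.Done():
			timer.Stop()
			return fmt.Errorf("gave up waiting to retry: %w (last error: %v)",
				ctx.Err(), err)
		case <-timer.C:
		}
		delay *= 2
	}
	return fmt.Errorf("still rejected after %d attempts: %w", maxAttempts, err)
}

// fakePgError is a stand-in for a driver error, used only in this example.
type fakePgError struct{ code string }

func (e *fakePgError) Error() string    { return "SQLSTATE " + e.code }
func (e *fakePgError) SQLState() string { return e.code }

// simulate runs pingWithRetry against a fake ping that fails `failures`
// times with the given SQLSTATE before succeeding. It returns how many
// times ping was called and the final error.
func simulate(failures, maxAttempts int, code string) (calls int, err error) {
	ctx, cancel := context.WithTimeout(context.Background(),
		200*time.Millisecond)
	defer cancel()

	ping := func(context.Context) error {
		calls++
		if calls <= failures {
			return &fakePgError{code: code}
		}
		return nil
	}
	err = pingWithRetry(ctx, ping, maxAttempts, time.Millisecond)
	return calls, err
}

func main() {
	fmt.Println(simulate(2, 5, "53300")) // recovers on the third attempt
	fmt.Println(simulate(9, 3, "53300")) // gives up after three attempts
	fmt.Println(simulate(1, 5, "28P01")) // fails fast, no retry

	// Retries outlast the 200ms deadline, so the context ends the loop.
	_, err := simulate(100, 100, "53300")
	fmt.Println(errors.Is(err, context.DeadlineExceeded))
}
```

In a real store, the ping argument would presumably be the pool's PingContext method value (db.PingContext on a *sql.DB), which already has the func(context.Context) error shape. Keeping the retry around that single call is one way to keep it narrow to connect-time failures.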

Non-goals

  • replacing proper pool sizing / capacity planning
  • retrying arbitrary query failures
  • hiding persistent misconfiguration

Acceptance criteria

  • NewPostgresStore performs a bounded number of retries on SQLSTATE 53300
  • retries honor context cancellation / deadlines
  • non-retriable errors still fail immediately
  • behavior is covered by tests
  • docs clarify the difference between local pool exhaustion and server-side connection exhaustion
