|
| 1 | +# PostgreSQL Connection Pool Configuration (HikariCP) |
| 2 | + |
| 3 | +The Broker already pools its PostgreSQL connections with HikariCP, but the pool |
| 4 | +is configured by hand: every tuning value is a hardcoded literal in |
| 5 | +`PostgresConnectivity.createDataSource()`, the pool is unnamed, the |
| 6 | +`DataSource` is never closed, and each repository opens its own |
| 7 | +`Database.connect(...)`. The companion service `sdkman-state` recently |
| 8 | +reworked the same concern as part of a throughput initiative and arrived at a |
| 9 | +more operable standard: pool tuning is externalised to config and environment |
| 10 | +variables with documented defaults, the pool is named, the `DataSource` |
| 11 | +lifecycle is managed, and a single Exposed `Database.connect` pins an explicit |
| 12 | +transaction isolation level. This spec brings the Broker's PostgreSQL pooling |
| 13 | +up to that standard so the two services — twin towers either side of the same |
| 14 | +Postgres schema — behave and are tuned consistently. |
| 15 | + |
| 16 | +The scope here is **connection pooling and Exposed wiring only**. The Broker's |
| 17 | +query logic, the Mongo backend, the `postgres.*` config namespace, and the |
| 18 | +read-only contract with `sdkman-state` are all unchanged. |
| 19 | + |
| 20 | +*Reference: `sdkman-state/specs/concurrency-optimisations.md` (R1, R2, R9 and |
| 21 | +the "Extra Considerations" notes on isolation level) — the upstream spec that |
| 22 | +established this standard.* |
| 23 | + |
| 24 | +## Behaviour |
| 25 | + |
| 26 | +After this change the Broker's pooled `javax.sql.DataSource` is configured |
| 27 | +entirely from `application.conf` with environment-variable overrides and the |
| 28 | +same defaults `sdkman-state` ships. The five tunables — maximum pool size, |
| 29 | +minimum idle, connection timeout, max lifetime, idle timeout — are no longer |
| 30 | +literals in code. The pool carries the name `sdkman-broker-pool` so it is |
| 31 | +identifiable in HikariCP log lines and JMX/metrics alongside |
| 32 | +`sdkman-state-pool`. The `DataSource` is constructed with |
| 33 | +`initializationFailTimeout = -1`, so the application boots even when Postgres |
| 34 | +is temporarily unreachable and surfaces the outage through `/meta/health` |
| 35 | +rather than failing at startup. On shutdown the pool is closed cleanly. Exposed |
| 36 | +is wired through a single `Database.connect` that sets |
| 37 | +`defaultIsolationLevel = TRANSACTION_READ_COMMITTED`, and every repository |
| 38 | +shares that one `Database` rather than connecting independently. |
| 39 | + |
| 40 | +None of this is observable to a download client. The HTTP contract in |
| 41 | +`postgres_version_repository.md` is unchanged; this spec only changes how the |
| 42 | +connection pool is configured and wired. |
| 43 | + |
| 44 | +## Business Rules |
| 45 | + |
| 46 | +1. **Pool tuning is externalised, with `sdkman-state`'s defaults.** The five |
| 47 | + HikariCP tunables are read from configuration, not hardcoded. The Broker |
| 48 | + keeps its existing `postgres.*` namespace (renaming to state's `database.*` |
| 49 | + would break already-deployed environment variables), so the new keys live |
| 50 | + under `postgres.pool.*` with `POSTGRES_POOL_*` overrides. The structure, |
| 51 | + key names, and **default values match `sdkman-state` exactly**: |
| 52 | + |
| 53 | + | Setting | Config key | Env override | Default | Meaning | |
| 54 | + |-----------------------|-------------------------------------|---------------------------------------|-------------|-----------------------------------------------------| |
| 55 | + | `maximumPoolSize` | `postgres.pool.maxSize` | `POSTGRES_POOL_MAX_SIZE` | `20` | Max total connections. | |
| 56 | + | `minimumIdle` | `postgres.pool.minIdle` | `POSTGRES_POOL_MIN_IDLE` | `2` | Warm idle connections kept ready. | |
| 57 | + | `connectionTimeout` | `postgres.pool.connectionTimeoutMs` | `POSTGRES_POOL_CONNECTION_TIMEOUT_MS` | `5000` | Fail fast when the pool is starved (ms). | |
| 58 | + | `maxLifetime` | `postgres.pool.maxLifetimeMs` | `POSTGRES_POOL_MAX_LIFETIME_MS` | `1800000` | Retire a connection after 30 min. | |
| 59 | + | `idleTimeout` | `postgres.pool.idleTimeoutMs` | `POSTGRES_POOL_IDLE_TIMEOUT_MS` | `600000` | Drop idle connections after 10 min. | |
| 60 | + |
| 61 | + This changes two current values: `maximumPoolSize` moves from a hardcoded |
| 62 | + `10` to a default of `20`, and `connectionTimeout` from a hardcoded |
| 63 | + `30000` to a default of `5000`. Both now match state and both are |
| 64 | + overridable per environment. |
| 65 | + |
| 66 | +2. **`AppConfig` exposes the pool tunables.** The `AppConfig` interface gains |
| 67 | + five members and `DefaultAppConfig` reads them with the existing |
| 68 | + `getIntOrDefault` / `getLongOrDefault` helpers so absent keys fall back to |
| 69 | + the defaults in Rule 1. Timeouts are `Long` (milliseconds); sizes are `Int`. |
| 70 | + `PostgresConnectivity` consumes these instead of literals. |
| 71 | + |
| 72 | +3. **The pool is named.** `HikariConfig.poolName = "sdkman-broker-pool"`, |
| 73 | + mirroring state's `sdkman-state-pool`. This makes the two pools |
| 74 | + distinguishable in logs and metrics when both services share infrastructure. |
| 75 | + |
| 76 | +4. **The pool boots even when Postgres is down.** Set |
| 77 | + `initializationFailTimeout = -1` so `HikariDataSource` construction does not |
| 78 | + block on or fail because of an unreachable database. This is a deliberate |
| 79 | + behaviour change: today `PostgresConnectivity.dataSource()` wraps |
| 80 | + construction in `Either.catch` and rethrows `IllegalStateException`, so a |
| 81 | + database that is down at boot kills the process. Under this rule the process |
| 82 | + starts and the outage is reported through the existing |
| 83 | + `PostgresHealthRepository` / `/meta/health` path instead. The `Either.catch` |
| 84 | + wrapper stays (it still guards against genuine misconfiguration such as a |
| 85 | + bad driver class), but connectivity failures no longer abort startup. |
| 86 | + |
| 87 | + > **Why this value.** The choice is driven purely by **parity with |
| 88 | + > `sdkman-state`**, which sets `-1`, not by a resilience requirement — |
| 89 | + > managed Postgres downtime is effectively a non-event in this deployment. |
| 90 | + > Fail-fast-at-boot would be an equally defensible standard; what matters is |
| 91 | + > that both services use the *same* one. If the teams ever prefer fail-fast, |
| 92 | + > change it in both services together, never in just one. |
| 93 | +
|
| 94 | +5. **One Exposed `Database`, with an explicit isolation level.** Replace the |
| 95 | + per-repository `Database.connect(dataSource)` (currently created `by lazy` |
| 96 | + independently in `PostgresVersionRepository` and `PostgresAuditRepository`) |
| 97 | + with a single `Database.connect` performed once at wiring time and shared by |
| 98 | + all repositories: |
| 99 | + |
| 100 | + ```kotlin |
| 101 | + Database.connect( |
| 102 | + datasource = dataSource, |
| 103 | + databaseConfig = DatabaseConfig { defaultIsolationLevel = Connection.TRANSACTION_READ_COMMITTED }, |
| 104 | + ) |
| 105 | + ``` |
| 106 | + |
| 107 | + Pinning `defaultIsolationLevel` matches state and makes the isolation level |
| 108 | + deterministic instead of driver-default; it also stops Exposed probing the |
| 109 | + database for its default isolation level on the first transaction. |
| 110 | + |
| 111 | + > **Note on motivation.** In `sdkman-state` this setting is load-bearing: it |
| 112 | + > runs queries inside `newSuspendedTransaction`, where a first-transaction |
| 113 | + > metadata probe against an unreachable database deadlocks the request |
| 114 | + > coroutine (state R9). The Broker today uses **blocking** `transaction(db)` |
| 115 | + > and its health check uses **raw JDBC** (`dataSource.connection`), so it is |
| 116 | + > not exposed to that coroutine deadlock. Here the setting is adopted for |
| 117 | + > consistency, deterministic isolation, and one fewer startup round-trip — |
| 118 | + > and as cheap insurance should the Broker later move to suspended |
| 119 | + > transactions. |
| 120 | +
|
| 121 | +6. **The pool is closed on shutdown.** The `HikariDataSource` must be closed |
| 122 | + when the application stops. Because the Broker builds the `DataSource` in |
| 123 | + `App.main` (outside the Ktor module, before `embeddedServer(...).start`), |
| 124 | + register a JVM shutdown hook — or subscribe to Ktor's `ApplicationStopped` |
| 125 | + monitor event — to call `dataSource.close()`. State closes its pool via |
| 126 | + `monitor.subscribe(ApplicationStopped) { dataSource.close() }`; the Broker |
| 127 | + should achieve the equivalent given its wiring. |
| 128 | + |
| 129 | +7. **The corrected SSL handling is preserved, not regressed.** The Broker's |
| 130 | + `buildConnectionString()` already appends a lowercase `?sslmode=...` (the |
| 131 | + parameter name the PostgreSQL JDBC driver actually honours) and is |
| 132 | + configurable via `postgres.sslmode` / `POSTGRES_SSLMODE`. This is **more |
| 133 | + correct** than state, which hardcodes `?sslMode=prefer` (camel-cased, and |
| 134 | + therefore likely ignored by pgjdbc). The Broker keeps its own handling |
| 135 | + unchanged; aligning state's URL is a separate, upstream concern (see Out of |
| 136 | + Scope). |
| 137 | + |
| 138 | +## Examples |
| 139 | + |
| 140 | +### `application.conf` — new `postgres.pool` block |
| 141 | + |
| 142 | +The `postgres` block gains a nested `pool` block; the existing connection keys |
| 143 | +are unchanged. |
| 144 | + |
| 145 | +```hocon |
| 146 | +postgres { |
| 147 | + database = "sdkman" |
| 148 | + database = ${?POSTGRES_DATABASE} |
| 149 | + host = "127.0.0.1" |
| 150 | + host = ${?POSTGRES_HOST} |
| 151 | + port = "5432" |
| 152 | + port = ${?POSTGRES_PORT} |
| 153 | + username = "postgres" |
| 154 | + username = ${?POSTGRES_USERNAME} |
| 155 | + password = "postgres" |
| 156 | + password = ${?POSTGRES_PASSWORD} |
| 157 | + sslmode = "disable" |
| 158 | + sslmode = ${?POSTGRES_SSLMODE} |
| 159 | +
|
| 160 | + pool { |
| 161 | + maxSize = 20 |
| 162 | + maxSize = ${?POSTGRES_POOL_MAX_SIZE} |
| 163 | + minIdle = 2 |
| 164 | + minIdle = ${?POSTGRES_POOL_MIN_IDLE} |
| 165 | + connectionTimeoutMs = 5000 |
| 166 | + connectionTimeoutMs = ${?POSTGRES_POOL_CONNECTION_TIMEOUT_MS} |
| 167 | + maxLifetimeMs = 1800000 |
| 168 | + maxLifetimeMs = ${?POSTGRES_POOL_MAX_LIFETIME_MS} |
| 169 | + idleTimeoutMs = 600000 |
| 170 | + idleTimeoutMs = ${?POSTGRES_POOL_IDLE_TIMEOUT_MS} |
| 171 | + } |
| 172 | +} |
| 173 | +``` |
| 174 | + |
| 175 | +### `AppConfig` additions |
| 176 | + |
| 177 | +```kotlin |
| 178 | +interface AppConfig { |
| 179 | + // existing postgres.* members … |
| 180 | + val postgresPoolMaxSize: Int |
| 181 | + val postgresPoolMinIdle: Int |
| 182 | + val postgresPoolConnectionTimeoutMs: Long |
| 183 | + val postgresPoolMaxLifetimeMs: Long |
| 184 | + val postgresPoolIdleTimeoutMs: Long |
| 185 | +} |
| 186 | + |
| 187 | +class DefaultAppConfig : AppConfig { |
| 188 | + // … |
| 189 | + override val postgresPoolMaxSize: Int = config.getIntOrDefault("postgres.pool.maxSize", 20) |
| 190 | + override val postgresPoolMinIdle: Int = config.getIntOrDefault("postgres.pool.minIdle", 2) |
| 191 | + override val postgresPoolConnectionTimeoutMs: Long = |
| 192 | + config.getLongOrDefault("postgres.pool.connectionTimeoutMs", 5_000L) |
| 193 | + override val postgresPoolMaxLifetimeMs: Long = |
| 194 | + config.getLongOrDefault("postgres.pool.maxLifetimeMs", 1_800_000L) |
| 195 | + override val postgresPoolIdleTimeoutMs: Long = |
| 196 | + config.getLongOrDefault("postgres.pool.idleTimeoutMs", 600_000L) |
| 197 | +} |
| 198 | +``` |
| 199 | + |
| 200 | +### `PostgresConnectivity.createDataSource` — before and after |
| 201 | + |
| 202 | +```kotlin |
| 203 | +// before — hardcoded literals, no pool name, fail-fast on a down database |
| 204 | +private fun createDataSource(connectionString: String): DataSource = |
| 205 | + with(HikariConfig()) { |
| 206 | + jdbcUrl = connectionString |
| 207 | + config.postgresUsername.map { username = it } |
| 208 | + config.postgresPassword.map { password = it } |
| 209 | + driverClassName = "org.postgresql.Driver" |
| 210 | + maximumPoolSize = 10 |
| 211 | + minimumIdle = 2 |
| 212 | + connectionTimeout = 30000 |
| 213 | + idleTimeout = 600000 |
| 214 | + maxLifetime = 1800000 |
| 215 | + HikariDataSource(this) |
| 216 | + } |
| 217 | + |
| 218 | +// after — sourced from AppConfig, named, boots even if Postgres is down |
| 219 | +private fun createDataSource(connectionString: String): DataSource = |
| 220 | + with(HikariConfig()) { |
| 221 | + jdbcUrl = connectionString |
| 222 | + config.postgresUsername.map { username = it } |
| 223 | + config.postgresPassword.map { password = it } |
| 224 | + driverClassName = "org.postgresql.Driver" |
| 225 | + maximumPoolSize = config.postgresPoolMaxSize |
| 226 | + minimumIdle = config.postgresPoolMinIdle |
| 227 | + connectionTimeout = config.postgresPoolConnectionTimeoutMs |
| 228 | + maxLifetime = config.postgresPoolMaxLifetimeMs |
| 229 | + idleTimeout = config.postgresPoolIdleTimeoutMs |
| 230 | + poolName = "sdkman-broker-pool" |
| 231 | + initializationFailTimeout = -1 |
| 232 | + HikariDataSource(this) |
| 233 | + } |
| 234 | +``` |
| 235 | + |
| 236 | +### Environment-variable overrides (operations) |
| 237 | + |
| 238 | +| Env variable | Overrides | Default | |
| 239 | +|---------------------------------------|------------------------|-----------| |
| 240 | +| `POSTGRES_POOL_MAX_SIZE` | `maximumPoolSize` | `20` | |
| 241 | +| `POSTGRES_POOL_MIN_IDLE` | `minimumIdle` | `2` | |
| 242 | +| `POSTGRES_POOL_CONNECTION_TIMEOUT_MS` | `connectionTimeout` | `5000` | |
| 243 | +| `POSTGRES_POOL_MAX_LIFETIME_MS` | `maxLifetime` | `1800000` | |
| 244 | +| `POSTGRES_POOL_IDLE_TIMEOUT_MS` | `idleTimeout` | `600000` | |
| 245 | + |
| 246 | +## Out of Scope |
| 247 | + |
| 248 | +- **The `postgres.*` → `database.*` namespace rename.** State uses `database.*`; |
| 249 | + the Broker keeps `postgres.*` to avoid breaking deployed environment |
| 250 | + variables. Only the pool sub-structure, keys, defaults, and behaviour are |
| 251 | + aligned — not the prefix. |
| 252 | +- **Aligning the JDBC driver version.** The Broker is on `postgresql:42.7.7`, |
| 253 | + state on `42.7.2`. Bumping is unrelated to pooling; if the teams want a |
| 254 | + matched driver it should be a deliberate, separate change (state catching up |
| 255 | + is preferable to the Broker downgrading). |
| 256 | +- **Migrating to a Gradle version catalog.** State declares HikariCP via |
| 257 | + `libs.versions.toml`; the Broker declares it inline in `build.gradle.kts`. |
| 258 | + Both already pin HikariCP `5.1.0`, so this is a build-hygiene preference, not |
| 259 | + a pooling concern. |
| 260 | +- **Fixing state's `sslMode=prefer` typo.** That lives in |
| 261 | + `sdkman-state`'s `ConfigExtensions.kt` and is an upstream fix; the Broker's |
| 262 | + correct lowercase `sslmode` handling is preserved as-is (Business Rule 7). |
| 263 | +- **The Mongo backend.** `MongoConnectivity` and the Mongo repositories are |
| 264 | + untouched. |
| 265 | +- **Query logic, the UPSERT, and single-transaction write path** from state's |
| 266 | + spec (R3–R6). The Broker is a read-only consumer of the `versions` table and |
| 267 | + performs no writes to it, so those requirements do not apply. |
| 268 | +- **Production Flyway wiring.** State runs Flyway at startup against its pool; |
| 269 | + the Broker owns no migrations (Flyway is a `testImplementation` only) because |
| 270 | + the schema is owned by `sdkman-state`. No production migration wiring is |
| 271 | + added. |
| 272 | + |
| 273 | +## Acceptance Criteria |
| 274 | + |
| 275 | +- [ ] The five HikariCP tunables are read from `postgres.pool.*` config with |
| 276 | + `POSTGRES_POOL_*` env overrides; no pool tunable remains a literal in |
| 277 | + `PostgresConnectivity`. |
| 278 | +- [ ] Defaults applied when env vars are absent match `sdkman-state`: maxSize |
| 279 | + `20`, minIdle `2`, connectionTimeout `5000`, maxLifetime `1800000`, |
| 280 | + idleTimeout `600000`. |
| 281 | +- [ ] `AppConfig` / `DefaultAppConfig` expose the five pool members and a unit |
| 282 | + test verifies they load from HOCON with defaults applied when env vars |
| 283 | + are unset. |
| 284 | +- [ ] The pool is named `sdkman-broker-pool`. |
| 285 | +- [ ] `initializationFailTimeout = -1` is set, and the application starts |
| 286 | + successfully against an unreachable Postgres, with `/meta/health` |
| 287 | + reporting the database as unhealthy rather than the process aborting at |
| 288 | + boot. |
| 289 | +- [ ] Exposed is wired through a single `Database.connect` with |
| 290 | + `defaultIsolationLevel = TRANSACTION_READ_COMMITTED`, shared by all |
| 291 | + Postgres repositories (no per-repository `Database.connect`). |
| 292 | +- [ ] The `HikariDataSource` is closed on application shutdown. |
| 293 | +- [ ] The existing lowercase `sslmode` connection-string handling is unchanged. |
| 294 | +- [ ] An integration test opens more concurrent queries than a small configured |
| 295 | + `maxSize` and asserts they all complete (queued requests served as |
| 296 | + connections free up), proving the pool is the active path. |
| 297 | +- [ ] Existing download and health acceptance/integration specs pass unchanged. |
| 298 | +- [ ] All quality gates pass (`./gradlew clean check` — compile, detekt, |
| 299 | + ktlint, tests). |
0 commit comments