feat(api): add Managed Postgres HTTP client#544
Conversation
c3c2cc9 to
16cf999
Compare
theory
left a comment
There was a problem hiding this comment.
Looks like it will work! Apologies for the panoply of comments; I assume you're new to Go, so made some suggestions, but also asked a bunch of questions where I found things confusing. I hope it's mostly helpful and not too annoying 😂
| t[k] = redactJSONValue(child) | ||
| } | ||
| return t | ||
| case []interface{}: |
There was a problem hiding this comment.
Prefer the newer and nicer-to-read any (unless the rest of the code base is littered with []inteface{}s.
| ) | ||
|
|
||
| func TestRedactSensitiveBody(t *testing.T) { | ||
| cases := []struct { |
There was a problem hiding this comment.
No tests for slices that I can see. Also worthwhile throwing in bools, numbers and nulls to make sure they're all passed through properly.
There was a problem hiding this comment.
Added bool/number/null pass-through cases. Slices were already covered in array of objects: each element's sensitive key redacted exercises arrays of objects, and the top-level ["a","b","c"] case covers arrays of scalars. Null on a sensitive key is in non-string sensitive value still redacted.
| } | ||
|
|
||
| for _, tc := range cases { | ||
| t.Run(tc.name, func(t *testing.T) { |
There was a problem hiding this comment.
Consider using t.Parallel() in all tests so accelerate testing.
There was a problem hiding this comment.
The repo doesn't use t.Parallel() as a convention and the api suite runs in under 2s so skipping for now
| respBody, err := c.doRequest(ctx, req) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| resp := ResponseWithResult[PostgresConfig]{} | ||
| if err := json.Unmarshal(respBody, &resp); err != nil { | ||
| return nil, err | ||
| } | ||
| return &resp.Result, nil |
There was a problem hiding this comment.
This pattern repeats a lot. Consider replacing it with a function
There was a problem hiding this comment.
Existing code has the same repeated pattern. will do a follow up to refactor to reduce repated code.
|
It’s right there in [go.mod](https://github.com/ClickHouse/terraform-provider-clickhouse/blob/7c3b3800d22cfe67f7f59bb8b35eb611d700d6a6/go.mod#L14)
|
| // Postgres instance state values. Mirrors ManagedPostgresInstanceStatuses | ||
| // in packages/cp-common/src/protocol/postgres/ManagedPostgres.ts:59-68. | ||
| const ( | ||
| PostgresStateCreating = "creating" |
There was a problem hiding this comment.
I think we should document which of these states are on primary only and which are on standby only
There was a problem hiding this comment.
Checked both ManagedPostgres.ts and the OpenAPI doc. It's a flat enum with a generic "Current state of the service" description. The primary-vs-standby breakdown isn't documented at the source, it's Ubicloud behavior. Rather not commit inferences as Go comments that could drift wrong. If we can get authoritative docs upstream, I'll backfill the annotations here.
Previously, TF_LOG=DEBUG logged the raw body of every HTTP request and response sent through doRequest. Bodies for password rotation, service create, and any future Postgres password / config flow contain plaintext secrets (password, newPasswordHash, newDoubleSha1Hash, tokenSecret), which then landed verbatim in the structured log output. Add a redactSensitiveBody walker that JSON-decodes the body, replaces values for a small set of sensitive keys with "REDACTED", and stamps the entire subtree under secrets/credentials containers. Wrap it in formatLogBody so the four log-call sites in doRequest collapse to two single-line tflog.SetField calls. Malformed JSON yields a placeholder rather than the raw bytes so the failure path can't leak either.
Hand-written API client for the ClickHouse Cloud Managed Postgres
endpoints under /v1/organizations/{orgId}/postgres. Follows the
existing service.go / clickpipe.go pattern: no codegen, ClientImpl
methods on a hand-written interface, minimock-generated mock.
Models (postgres_models.go) mirror the wire shapes in
control-plane/apps/openapi/src/protocol/v1/ManagedPostgresV1.ts.
storageSize is deliberately omitted (DEPRECATED server-side). State
constants match ManagedPostgresInstanceStatuses verbatim. PgConfigMap
has a custom UnmarshalJSON that accepts both string and numeric values
per the server's `[key: string]: string | number` shape.
Client methods (postgres.go):
- GET / LIST / CREATE / UPDATE / DELETE
- WaitForPostgresState (mirrors WaitForServiceState's 5s constant
backoff) and WaitForPostgresLeaveAndReturn (new variant for
post-PATCH transitions that may not have started yet)
- SetPostgresPassword
- GetPostgresConfig + ReplacePostgresConfig (POST, full replacement,
matches Phase 0 curl-verified empty-map behavior) + UpdatePostgresConfig
(PATCH, exposed for completeness)
- RestorePostgres + CreatePostgresReadReplica
- PostgresStateCommandSend (client-only; resource excludes operational
commands in v1)
- GetPostgresCaCertificates (raw PEM, bypasses JSON envelope decoding)
DeletePostgres retries 409 conflicts for up to 15 min but fails fast
when the response indicates a dependent replica exists.
Tests (postgres_test.go + postgres_models_test.go): 42 cases covering
happy paths, 404 -> IsNotFound, 409 retry, 409-with-dependent fail
fast, rate-limit honoring, state transitions including unknown values,
password rotation idempotency, empty-map config POST, raw PEM round
trip, and the full state-command table.
interface.go extends Client with the 15 new method signatures;
client_mock.go is regenerated via `make mock`.
Tighten the dependent-replica heuristic and document known limitations flagged by code review. Operational fixes (User-Agent on raw endpoint, minimum observation window for transition detection, anchoring the heuristic to a real server response) belong in the phases that have real consumers and real wire data; FIXMEs in each location point to the owning phase. - errIndicatesDependentReplica: AND instead of OR, removed redundant "depend" arm. Was matching `"replica"` alone (e.g. "replication slot exists") and would have failed fast on transient conflicts. New negative test asserts a 409 containing "replica" but not "depend" is retried. - WaitForPostgresLeaveAndReturn: doc note requiring callers to verify the mutating call returned 2xx before invoking. FIXME points Phase 2 at the residual race (server hasn't started transitioning when leave-detection times out). - GetPostgresCaCertificates: doc note enumerating the User-Agent / 429 / 5xx / tflog gaps vs other methods. FIXME points Phase 5 at the doRawRequest refactor when the data source lands.
… clearing
Three issues surfaced in code review, all in scope for this PR
because the affected types and behaviors live here.
- common.go: add `connectionString` (and snake_case variant) to
sensitiveBodyKeys. The Postgres create response embeds the generated
password directly in the URI alongside the redacted password field,
so TF_LOG=DEBUG was logging the credential through connectionString
even with the field-level redaction in place. New test exercises
the realistic create-response shape and asserts the password value
never appears in formatted output via any path.
- postgres.go: WaitForPostgresLeaveAndReturn was swallowing real
GetPostgres errors (404, 401/403, exhausted 5xx, context cancel)
as "no-op success" because the phase-1 wait loop only checked
whether the state had observed-ly left terminalState, not whether
the polling itself failed. Introduce errPostgresStateUnchanged
as a sentinel; only sentinel-caused budget exhaustion returns nil,
every other error propagates. Two new tests: 404 throughout phase
1 must return an IsNotFound-compatible error; the observed-stable
case (server returns terminal cleanly) is still a no-op success.
- postgres_models.go: PostgresUpdate.Tags changes from []Tag to
*[]Tag so callers can distinguish "leave existing tags alone"
(nil) from "clear all tags" (&[]Tag{}). With omitempty on a plain
slice, both intents marshal to the same `{}` body and an existing
tagged Postgres service can't be cleared via PATCH. Two new tests
cover the empty-list and populated-list shapes.
…as timeout Mirror of the fix already applied to WaitForPostgresLeaveAndReturn. Before: a 404 (or auth failure, exhausted 5xx, context cancel) during the polling loop would exhaust the retry budget and the wrapper would unconditionally rewrite the final error to "postgres ... did not reach the expected state in the allocated time (last seen state: )" — losing the real error and confusing callers who needed IsNotFound or auth-error semantics. Use a sentinel errPostgresStateNotYetTarget for the "polled OK, stateChecker false" case. Only that sentinel triggers the last-seen-state timeout rewrite; real errors from GetPostgres propagate verbatim. New test TestWaitForPostgresState_PropagatesGetErrors covers the 404-throughout scenario and asserts IsNotFound matches the returned error and the error message does not contain "did not reach the expected state". Existing tests (TimesOutWithLastSeenState, UnknownStateDoesNotCrash, TransitionsToRunning) continue to pass — the timeout-rewrite path is unchanged when polling succeeded and the state simply didn't match.
YAGNI cleanup. Both removals were "shipped for completeness" with no
in-tree consumer planned in any later phase.
- UpdatePostgresConfig (PATCH-merge variant): the Phase 3 resource
uses ReplacePostgresConfig (POST, full replacement) because the
PATCH-merge semantics don't fit Terraform's declarative model.
Nothing else calls the method. configMutate helper inlined into
ReplacePostgresConfig.
- PostgresStateCommandSend + PostgresStateCommandRequest +
PostgresCommand{Restart,Promote,Switchover} constants: the v1 plan
explicitly excludes operational commands from the Terraform surface
and the only "future tooling" justification is speculative. Easy
to re-add (~15 lines) if a real consumer ever materializes.
Also trimmed verbose comments throughout to focus on intent/WHY rather
than restating WHAT. FIXMEs and load-bearing caller-precondition notes
are preserved verbatim.
Mock regenerated; 53 tests still pass; golangci-lint clean.
Single rollup commit for inline review feedback from @theory. Style and idiom: - interface{} → any in common.go (matches dominant repo style) - Drop hand-rolled jsonStringContains; use strings.Contains - "iff" → "if" in two doc comments - Drop nonNegU64; inline uint64() with //nolint:gosec to match the convention used in service.go:125 - Trim verbose doc comments; rewrite phase-tagged FIXMEs to be self-contained - "fail-fasts" → "fails fast" grammar fix Type cleanup: - Postgres.Username, Postgres.Password, PostgresPassword.Password flipped from *string to plain string. Matches ClickPipe.Password and ServicePasswordUpdateResult.Password. The pointer was solving a problem that doesn't exist — the server validator rejects empty passwords, so omitempty on string gives the two meaningful states (absent → server generates, set → server adopts) without ambiguity. - CreatePostgres return type updated from *string to string for the password component. Defensive copy removed (Go strings are immutable, so the copy was no-op). Error wrapping: - Marshal/unmarshal errors wrapped with fmt.Errorf("…: %w", err) at 10 sites to match the clickpipe.go pattern. doRequest errors stay raw because IsNotFound / IsConflict / is5xx use strings.HasPrefix. Test improvements: - Switch PgConfigMap.UnmarshalJSON to json.Decoder.UseNumber() + type switch instead of two-pass try-unmarshal-with-fallback - Add bool/number/null pass-through cases to TestRedactSensitiveBody - Add reject tests for null, object, array values on PgConfigMap - Rename misleading test names (RoundTrips → Unmarshals, ClearsTags → MarshalsAsEmptyArray, OmitsAbsentableFields → OmitsNilPointerFields) - Remove tests redundant with type definitions (DoesNotIncludeNameField, NilTagsOmitsField) - Add t.Fatalf error checks on the four request-body decode sites in tests where ignored errors would mask the real assertion failure - Add doc comment explaining why CreatePostgres returns empty password when the user supplied their own - Soften docstring claim about GET endpoint not echoing password — per swagger, it's optional in the response client_mock.go regenerated via make mock for the CreatePostgres return-type change.
Rollup commit for @heavycrystal's inline comments. Style and naming: - Tighten the connectionString sensitive-key comment to a single line - Rename postgresDeleteRetryBudgetSecs (int) to postgresDeleteRetryAttempts (uint64). Eliminates the seconds/interval arithmetic and the matching //nolint:gosec at the DeletePostgres call site - Rename deletePostgresWithBudget to deletePostgresWithInterval so all three test seams share the *WithInterval naming and (interval, maxRetries) signature - Rename WaitForPostgresLeaveAndReturn to WaitForPostgresStateTransitionAndReturn (and the matching test seam + 3 tests) for clearer intent Correctness: - PgConfigMap.UnmarshalJSON now builds into a local map and assigns to the receiver only after the full pass succeeds. A mid-loop type mismatch no longer leaves the receiver in a half-populated state - Remove dead branch in waitForPostgresStateTransitionAndReturnWithInterval: with maxRetries >= 2, halfBudget = maxRetries / 2 is always >= 1, so the if halfBudget < 1 guard is unreachable - formatLogBody now takes a ctx and tflog.Warn-s if json.Indent fails on already-redacted output, instead of silently returning the placeholder. The branch is still unreachable by construction, but if anything ever breaks that invariant, an operator running TF_LOG=DEBUG will see the warning Test changes: - Remove TestPostgresState_ConstantsMatchWireValues — it only compared constants to literals I wrote myself; would not catch a server-side rename - Convert all `var calls int32` + `atomic.AddInt32(&calls, 1)` to `var calls atomic.Int32` + `calls.Add(1)` / `calls.Load()` so `calls++` is a compile error Mock regenerated for the WaitForPostgres*StateTransitionAndReturn rename.
dc6f705 to
9795ee1
Compare
Summary
/v1/organizations/{orgId}/postgres. Mirrors the existingservice.go/clickpipe.gopattern: models inpostgres_models.go, methods on*ClientImplinpostgres.go, interface extension ininterface.go, regeneratedclient_mock.go.WaitForPostgresStatemirrorsWaitForServiceState;WaitForPostgresLeaveAndReturnis a new variant for post-PATCH transitions that may not have started yet).DeletePostgresretries 409 conflicts for ~15 min but fails fast when the response indicates a dependent replica.PgConfigMaphas a customUnmarshalJSONthat accepts both string and numeric values, matching the server's[key: string]: string \| numbershape.common.go:doRequestfirst: sensitive body fields (password,newPassword,newPasswordHash,newDoubleSha1Hash,password_wo,tokenSecret, plussecrets/credentialscontainers) are stamped to"REDACTED"before going totflog. Fixes a pre-existing leak whereTF_LOG=DEBUGduring aclickhouse_servicepassword rotation logged the plaintext hash.Test plan
go test ./pkg/internal/api/...— 62 new cases (20 redaction + 42 Postgres) greengo test ./...— whole module greengo vet ./...cleangolangci-lint run ./...(pinnedv1.64.8from Makefile) cleanmake mockregeneratedclient_mock.gocleanly; all 15 new methods presentmake fmt && make docs && make build) green on both commits