Skip to content

fix(ci): resolve main Build & Test failures - pgx CVE, errcheck wave, config test regressions#1343

Merged
cristim merged 5 commits into
mainfrom
fix/ci-main-red
Jul 3, 2026
Merged

fix(ci): resolve main Build & Test failures - pgx CVE, errcheck wave, config test regressions#1343
cristim merged 5 commits into
mainfrom
fix/ci-main-red

Conversation

@cristim

@cristim cristim commented Jul 3, 2026

Copy link
Copy Markdown
Member

Closes #1339

"CI - Build & Test" has been red on main since 2026-06-11. This PR resolves the three failure groups in four commits:

Commits

  1. fix(deps): bump github.com/jackc/pgx/v5 to v5.9.2 for GO-2026-5004 - fixes the govulncheck finding (SQL injection via dollar-quoted string constants in the simple protocol).
  2. fix(lint): address all errcheck findings across cmd/ and internal/ - fixes all 107 errcheck findings (check-blank + check-type-assertions) across 29 files: deferred rollbacks now log, interactive CLI reads propagate errors, buildResponse drops its always-nil error return, best-effort operations log at warn/debug, 67 single-value type assertions in mocks/test helpers switched to the two-value form with panic-on-mismatch. errcheck is now at 0.
  3. fix(config): fail loud on missing execution and seed FK accounts in tests - fixes the three failing integration tests (root causes below).
  4. fix(api): filter ErrTxClosed rollback noise and harden CAS probe error handling - review follow-ups: swallow pgx.ErrTxClosed in the commitmentopts.Save deferred rollback, propagate hard DB errors from the TransitionExecutionStatus CAS probe instead of mapping them to ErrNotFound (with a pgxmock regression test verified failing pre-fix), document the GetExecutionByID contract on the interface, and tolerate io.EOF-with-data in the configure-azure/gcp interactive reads so piped input keeps working.

Root causes of the three failing integration tests

  • TestPostgresStore_PurchaseExecutions/Get_execution_by_ID_-_not_found: genuine silent-failure regression. PostgresStore.GetExecutionByID returned (nil, nil) for a missing row, so the not-found case produced no error and every caller had to nil-check (several forgot). It now returns an error wrapping config.ErrNotFound; a nil error guarantees a non-nil execution. All callers (purchase handlers, revoke, scoping, approvals/messages, store-internal CAS probes) switched to errors.Is(err, config.ErrNotFound), and the pre-existing pgxmock unit test TestGetExecutionByID_NotFound already asserted this error contract.
  • TestPostgresStore_UpsertRecommendations_AccountScopedEviction and ..._AmbientAndRegisteredCoexist: missing test pre-condition. recommendations.cloud_account_id carries an FK to cloud_accounts(id) (migration 000030), but the tests inserted rows for account UUIDs never created in cloud_accounts, failing with an FK violation. Production upserts only carry registered account IDs, so the tests now seed the accounts first via a seedRecommendationCloudAccount helper.

Notes

Verification

  • go build ./... exit 0; go vet ./... exit 0
  • golangci-lint run --enable-only errcheck ./... -> 0 issues, exit 0
  • go test ./... -> 5481 passed in 38 packages, exit 0
  • go test -tags=integration ./internal/config/ (dockerized Postgres testcontainers) -> 679 passed, exit 0; the three previously failing tests re-verified uncached against the committed code

Summary by CodeRabbit

  • Bug Fixes
    • Improved reliability of interactive setup prompts by surfacing input/reading errors immediately and ensuring follow-up confirmations behave consistently.
    • API now returns consistently valid JSON for nil-body responses and logs response-body write issues instead of failing silently.
    • “Execution not found” situations now produce clearer 404-style outcomes, with better error detection during async refresh and recovery flows.
    • Strengthened safeguards to reduce unexpected crashes from unexpected runtime values.
  • Chores
    • Updated a core PostgreSQL driver dependency.

cristim added 4 commits July 3, 2026 15:08
Fixes govulncheck finding GO-2026-5004 (SQL injection via
dollar-quoted string constants in the simple protocol) reported by CI
on main since 2026-06-11.
golangci-lint v2.10.1 errcheck with check-blank:true and
check-type-assertions:true flagged unchecked error returns and
single-value type assertions in 29 files.

Changes by category:
- Defer rollback errors: log instead of blank-assign (cleanup-lambda,
  rekey, commitmentopts/store_postgres)
- Interactive CLI ReadString errors: return error to caller
  (configure_azure, configure_gcp)
- buildResponse: drop unused error return; callers use single-return
  form; update two test callers accordingly (handler, handler_test,
  handler_coverage_test)
- Intentionally best-effort operations: capture error and log at warn/
  debug (handler.go resolveSourceIdentity STS, dashboard resolveAWS*,
  recommendations_refresh ClearCollectionStarted x2, registrations
  requireAdmin, ri_utilization_cache singleflight, service_mfa
  UpdateUser, scheduler errgroup.Wait x2 and GetRecommendations,
  execution SavePurchaseExecution, credentials resolver LoadRaw,
  server/http w.Write)
- singleflight.Do: use two-value if-init form so error is captured
  (service_apikeys, ri_utilization_cache)
- Type assertions: two-value form with panic on mismatch in mocks
  (stores, secretsmanager, ses, sns - 57 sites) and test_helpers
  (auth - 10 sites)
- Type assertion: validation.go gcp payload type field
- Connection pool: ok-check on sync.Map type assertion (connection.go)
- DevKey: panic on invalid compile-time hex constant (cipher.go)
- migrate.go: capture version check error before log
- testhelpers: log container Terminate error in cleanup path
- handler_accounts: add logging import, fail-closed on HasCredential
  error
…ests

Fixes the three integration tests failing on main since 2026-06-11
(issue #1339):

- TestPostgresStore_PurchaseExecutions/Get_execution_by_ID_-_not_found
  expected an error but GetExecutionByID returned (nil, nil) for a
  missing row. That silent-nil contract violates the fail-loud policy
  and forced every caller to nil-check. GetExecutionByID now returns
  an error wrapping config.ErrNotFound; a nil error guarantees a
  non-nil execution. All callers (handler_purchases, revoke, scoping,
  purchase approvals/messages, store-internal CAS probes) switched to
  errors.Is(err, config.ErrNotFound) and their dead nil-checks were
  removed. Mock fixtures that simulated not-found via (nil, nil) now
  return the sentinel error. The pgxmock unit test
  TestGetExecutionByID_NotFound already asserted the error contract.

- TestPostgresStore_UpsertRecommendations_AccountScopedEviction and
  TestPostgresStore_UpsertRecommendations_AmbientAndRegisteredCoexist
  failed with an FK violation: recommendations.cloud_account_id
  REFERENCES cloud_accounts(id) (migration 000030) but the tests never
  created the referenced accounts. Production upserts only carry
  registered account IDs, so the tests now seed the cloud accounts
  first via a new seedRecommendationCloudAccount helper.
…r handling

Review follow-ups on the errcheck/test-fix stack:

- commitmentopts.Save: the deferred rollback logged WARN on every
  successful Save because pgx Tx.Rollback after Commit returns
  pgx.ErrTxClosed. Swallow that single sentinel (same filter as
  config.WithTx) so only real rollback failures are logged.
- TransitionExecutionStatus CAS probe: a hard DB error from the
  zero-row follow-up probe was mapped to ErrNotFound, making a DB
  outage read as a benign race-loss to the purchase reaper. Only
  errors.Is(existErr, ErrNotFound) maps to the race-loss sentinel;
  other probe errors propagate. New pgxmock regression test asserts a
  hard probe error is not ErrNotFound (verified failing pre-fix).
- interfaces.go: document the GetExecutionByID contract on the
  interface method (wraps ErrNotFound; never returns (nil, nil)).
- configure-azure/configure-gcp: reader.ReadString('\n') returns
  (data, io.EOF) for a final unterminated line, so the new error
  returns discarded valid piped input (printf "r" | cudly
  configure-azure). New readTrimmedLine helper tolerates io.EOF when
  data was read; all interactive reads go through it.
@cristim cristim added triaged Item has been triaged priority/p1 Next up; this sprint severity/high Significant harm urgency/this-sprint Within the current sprint impact/internal Team-internal only effort/m Days type/bug Defect labels Jul 3, 2026
@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

No new commits to review since the last review.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9ccb66ab-676c-446d-b45e-b941be170baf

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Updates the pgx dependency, changes execution lookups to return wrapped ErrNotFound, adjusts callers and tests to the new not-found contract, and adds explicit handling for previously ignored errors across CLI, API, auth, database, scheduler, and mock code.

Changes

CI Red Main Fixes: CVE, errcheck, not-found regression

Layer / File(s) Summary
pgx dependency bump
go.mod
Updates github.com/jackc/pgx/v5 from v5.8.0 to v5.9.2.
GetExecutionByID not-found contract
internal/config/interfaces.go, internal/config/store_postgres.go, internal/config/store_postgres_db_test.go, internal/config/store_postgres_pgxmock_test.go
GetExecutionByID now returns a wrapped ErrNotFound instead of (nil, nil); the status-transition and cancel paths rely on that contract, and tests assert the new behavior.
Caller updates for ErrNotFound
internal/api/handler_purchases*.go, internal/api/scoping.go, internal/purchase/approvals*.go, internal/purchase/messages*.go
Handlers and purchase flows switch from nil-execution checks to errors.Is(err, config.ErrNotFound) and keep the same 404 or wrapped-error outcomes.
Recommendation FK test seeding
internal/config/store_postgres_recommendations_test.go
Adds seeded cloud_accounts rows before recommendation upserts that reference cloud_account_id.
CLI input and rollback handling
cmd/cleanup-lambda/main.go, cmd/rekey/main.go, cmd/configure_azure.go, cmd/configure_gcp.go
Rollback failures and interactive input reads are now checked and logged or returned, with shared trimmed-line helpers for prompts.
buildResponse refactor
internal/api/handler.go, internal/api/handler_test.go, internal/api/handler_coverage_test.go
buildResponse becomes single-return and always produces a response; nil-body success now serializes as {}.
API handler error handling
internal/api/handler_accounts.go, handler_dashboard.go, handler_recommendations_refresh.go, handler_registrations.go, ri_utilization_cache.go, validation.go
Previously ignored errors from config loads, credential checks, marshal, rollback, admin lookup, and singleflight are now handled explicitly.
Auth, credentials, database, server, scheduler error handling
internal/auth/*, internal/credentials/*, internal/commitmentopts/store_postgres.go, internal/database/*, internal/server/http.go, internal/scheduler/scheduler.go
Unchecked errors in last-used updates, MFA cleanup, rollback handling, WIF loading, advisory locks, migration logging, test cleanup, errgroup waits, and HTTP writes are now logged or propagated.
Mock hardening
internal/auth/test_helpers.go, internal/mocks/*
Mock return values are now type-checked before use, with descriptive panics on mismatches.

Estimated code review effort: 4 (Complex) | ~75 minutes

Possibly related issues

Possibly related PRs

  • LeanerCloud/CUDly#260: Shares the async recommendation refresh flow changed in internal/api/handler_recommendations_refresh.go.
  • LeanerCloud/CUDly#635: Shares the GetExecutionByID/TransitionExecutionStatus contract area in internal/config/store_postgres.go.
  • LeanerCloud/CUDly#804: Shares the revoke-purchase handling around GetExecutionByID and purchase-history fallback.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the CI/build-test fixes, pgx upgrade, errcheck cleanup, and config test regressions.
Linked Issues check ✅ Passed The changes address the linked goals: upgrade pgx, resolve errcheck issues, and fix the failing internal/config tests.
Out of Scope Changes check ✅ Passed No clear out-of-scope changes stand out; the edits all support the CI, lint, or config test fixes described in the issue.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/ci-main-red

Comment @coderabbitai help to get the list of available commands.

@cristim

cristim commented Jul 3, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
cmd/configure_azure.go (1)

411-422: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Thread the existing reader into executeExplicitCommand.
A fresh bufio.Reader on os.Stdin can skip bytes already buffered by the caller’s reader, so the retry prompt can lose piped input after earlier prompts consume part of the stream. Pass the existing reader through instead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/configure_azure.go` around lines 411 - 422, The retry prompt in
executeExplicitCommand is creating a new bufio.NewReader on os.Stdin instead of
reusing the caller’s reader, which can drop already-buffered input. Update
executeExplicitCommand to accept and use the existing reader passed in from the
surrounding flow, and thread that reader through the prompt logic so
readTrimmedLine consumes from the same stream consistently.
cmd/configure_gcp.go (1)

417-428: 🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Thread the existing reader through executeGCPCommand. Creating a fresh bufio.NewReader(os.Stdin) here can miss buffered piped input and skip the follow-up y/N response; pass the caller’s reader down instead.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/configure_gcp.go` around lines 417 - 428, Thread the existing input
reader through executeGCPCommand instead of creating a new
bufio.NewReader(os.Stdin) for the y/N prompt. Update the caller and the
executeGCPCommand flow so the same reader used by readTrimmedLine is reused for
the follow-up confirmation, preserving any buffered piped input. Use the
existing executeGCPCommand and readTrimmedLine symbols to locate the prompt
handling and replace the local reader creation with the passed-in reader.
internal/credentials/resolver.go (1)

408-417: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Treat LoadRaw errors as failures, not absence.

CredentialStore.LoadRaw returns nil, nil only when no credential exists. Logging the error and continuing can switch a workload-identity-federation account into the federated path when the stored WIF config couldn’t be read, silently changing auth behavior on a real store failure.

🐛 Proposed fix
 	var raw []byte
 	if store != nil {
 		var loadErr error
 		raw, loadErr = store.LoadRaw(ctx, account.ID, CredTypeGCPWIFConfig)
 		if loadErr != nil {
-			log.Printf("credentials: LoadRaw for account %s: %v (treating as absent)", account.ID, loadErr)
+			return nil, fmt.Errorf("credentials: check stored gcp credential for account %s: %w", account.ID, loadErr)
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/credentials/resolver.go` around lines 408 - 417, The credential
lookup in resolver.Load should not treat CredentialStore.LoadRaw failures as
“absent” WIF config. Update the code around the raw/loadErr handling so that
only a nil result means no stored CredTypeGCPWIFConfig, and any non-nil loadErr
is returned or otherwise surfaced as a real failure instead of falling back to
the federated path. Keep the account.ID context in the error path and preserve
the existing absence behavior only when LoadRaw returns nil, nil.
🧹 Nitpick comments (4)
internal/mocks/stores.go (1)

5-5: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoff

Consider a generic helper to eliminate the repeated cast-and-panic boilerplate.

The same 5-line pattern (args.Get(0).(T)ok check → formatted panic) is repeated across ~40 methods in this file, plus similarly in internal/auth/test_helpers.go, internal/mocks/secretsmanager.go, internal/mocks/ses.go, and internal/mocks/sns.go. A small generic helper would remove the duplication and guarantee message-format consistency (e.g., stores.go/secretsmanager.go/ses.go/sns.go use "mock: expected %s, got %T" while test_helpers.go uses "MockStore.Method: expected %s, got %T").

♻️ Example generic helper (requires Go 1.18+ generics support)
func mustCast[T any](v any) T {
	t, ok := v.(T)
	if !ok {
		panic(fmt.Sprintf("mock: expected %T, got %T", *new(T), v))
	}
	return t
}

Usage: v := mustCast[*config.GlobalConfig](args.Get(0))

This is a broad, cross-file mechanical refactor spanning 5 files and ~50+ call sites, so treat as optional/deferred rather than blocking.

Also applies to: 67-71, 86-90, 105-109, 133-137, 176-180, 199-203, 240-244, 253-257, 266-270, 299-303, 318-322, 331-335, 344-348, 357-361, 370-374, 411-415, 428-468, 488-504, 518-522, 531-535, 562-566, 575-579, 600-604, 631-635, 656-660, 695-699, 708-712, 721-725, 779-783, 797-801, 834-838, 885-889, 919-923, 950-964, 997-1001, 1012-1016, 1035-1039, 1060-1088, 1150-1154, 1169-1173, 1201-1205, 1214-1218, 1227-1231, 1240-1244, 1258-1262

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/mocks/stores.go` at line 5, Repeated cast-and-panic boilerplate
appears across the mock helpers in stores.go and the related test/mock files, so
extract it into a small generic helper to make the pattern consistent and reduce
duplication. Add a reusable helper (for example in the mocks package) that does
the type assertion and formatted panic once, then replace the repeated
args.Get(0).(T)/ok-check/panic blocks in stores.go, test_helpers.go,
secretsmanager.go, ses.go, and sns.go with calls to that helper, preserving each
file’s existing panic message style via the helper’s format string or wrapper.
internal/api/handler.go (2)

426-436: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Non-nil err path silently swallows the original error.

The marshal-failure branch below logs via logging.Errorf before returning 500, but this err != nil branch does not log the incoming error at all — it's discarded entirely. Since every current production caller passes nil (making this branch test-only for now), the gap isn't user-visible yet, but if a future caller starts passing a real error here, the root cause will be lost with no observability trail.

♻️ Proposed fix to log the swallowed error
 func (h *Handler) buildResponse(statusCode int, headers map[string]string, body any, err error) *events.LambdaFunctionURLResponse {
 	if err != nil {
+		logging.Errorf("buildResponse called with error: %v", err)
 		return &events.LambdaFunctionURLResponse{
 			StatusCode: 500,
 			Headers:    headers,
 			Body:       `{"error": "internal server error"}`,
 		}
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/handler.go` around lines 426 - 436, The non-nil err path in
Handler.buildResponse currently returns a 500 without recording the incoming
error, so the original failure is lost. Update the buildResponse method to log
the provided err before returning the internal server error response, using the
same logging style already used in the marshal-failure branch, so future callers
get an observability trail for this path.

371-380: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value

Drop the always-nil error return from HandleRequest

executeRequest folds every route failure into a response, so HandleRequest never returns a non-nil error. The checks in internal/server/http.go and internal/server/lambda.go are dead code; either simplify the signature in a follow-up or document the invariant.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/handler.go` around lines 371 - 380, `executeRequest` always
converts routing failures into a response and returns a nil error, so the error
path from `HandleRequest` is unreachable. Update the handler flow around
`executeRequest` and `HandleRequest` to reflect that invariant by either
removing the always-nil error from the return contract or clearly documenting
that it never returns an error. Then simplify the callers in
`internal/server/http.go` and `internal/server/lambda.go` so they no longer
branch on a non-existent error.
internal/api/handler_purchases.go (1)

241-247: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Correct ErrNotFound migration; consider extracting a shared helper.

All six call sites correctly replace the execution == nil check with errors.Is(err, config.ErrNotFound) before falling through to the generic err != nil wrap, consistent with the new fail-loud GetExecutionByID contract. Logic is correct in each case.

The same three-line pattern (GetExecutionByIDerrors.Is(..., ErrNotFound)err != nil wrap) is now duplicated six times in this file. Consider extracting a small helper to reduce repetition and centralize the not-found message:

♻️ Suggested helper
+func (h *Handler) getExecutionOr404(ctx context.Context, execID string) (*config.PurchaseExecution, error) {
+	execution, err := h.config.GetExecutionByID(ctx, execID)
+	if errors.Is(err, config.ErrNotFound) {
+		return nil, NewClientError(404, "execution not found")
+	}
+	if err != nil {
+		return nil, fmt.Errorf("failed to get execution: %w", err)
+	}
+	return execution, nil
+}

Callers with a bespoke 404 message (e.g. cancelOrRecoverExecution's formatted message, or authorizeExecutionManagement's errNotFound sentinel) can keep their own inline handling.

Also applies to: 383-389, 433-440, 832-838, 1152-1158, 1457-1466

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@internal/api/handler_purchases.go` around lines 241 - 247, The
GetExecutionByID error handling pattern is duplicated across multiple call
sites, so extract a small shared helper to centralize the common errors.Is(err,
config.ErrNotFound) check and generic wrap. Update the repeated logic in the
execution-loading paths in handler_purchases.go to call the helper, while
preserving any bespoke 404 behavior in places like cancelOrRecoverExecution and
authorizeExecutionManagement. Use the existing GetExecutionByID,
config.ErrNotFound, and errNotFound symbols to keep the refactor aligned with
the current contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@cmd/configure_azure.go`:
- Around line 411-422: The retry prompt in executeExplicitCommand is creating a
new bufio.NewReader on os.Stdin instead of reusing the caller’s reader, which
can drop already-buffered input. Update executeExplicitCommand to accept and use
the existing reader passed in from the surrounding flow, and thread that reader
through the prompt logic so readTrimmedLine consumes from the same stream
consistently.

In `@cmd/configure_gcp.go`:
- Around line 417-428: Thread the existing input reader through
executeGCPCommand instead of creating a new bufio.NewReader(os.Stdin) for the
y/N prompt. Update the caller and the executeGCPCommand flow so the same reader
used by readTrimmedLine is reused for the follow-up confirmation, preserving any
buffered piped input. Use the existing executeGCPCommand and readTrimmedLine
symbols to locate the prompt handling and replace the local reader creation with
the passed-in reader.

In `@internal/credentials/resolver.go`:
- Around line 408-417: The credential lookup in resolver.Load should not treat
CredentialStore.LoadRaw failures as “absent” WIF config. Update the code around
the raw/loadErr handling so that only a nil result means no stored
CredTypeGCPWIFConfig, and any non-nil loadErr is returned or otherwise surfaced
as a real failure instead of falling back to the federated path. Keep the
account.ID context in the error path and preserve the existing absence behavior
only when LoadRaw returns nil, nil.

---

Nitpick comments:
In `@internal/api/handler_purchases.go`:
- Around line 241-247: The GetExecutionByID error handling pattern is duplicated
across multiple call sites, so extract a small shared helper to centralize the
common errors.Is(err, config.ErrNotFound) check and generic wrap. Update the
repeated logic in the execution-loading paths in handler_purchases.go to call
the helper, while preserving any bespoke 404 behavior in places like
cancelOrRecoverExecution and authorizeExecutionManagement. Use the existing
GetExecutionByID, config.ErrNotFound, and errNotFound symbols to keep the
refactor aligned with the current contract.

In `@internal/api/handler.go`:
- Around line 426-436: The non-nil err path in Handler.buildResponse currently
returns a 500 without recording the incoming error, so the original failure is
lost. Update the buildResponse method to log the provided err before returning
the internal server error response, using the same logging style already used in
the marshal-failure branch, so future callers get an observability trail for
this path.
- Around line 371-380: `executeRequest` always converts routing failures into a
response and returns a nil error, so the error path from `HandleRequest` is
unreachable. Update the handler flow around `executeRequest` and `HandleRequest`
to reflect that invariant by either removing the always-nil error from the
return contract or clearly documenting that it never returns an error. Then
simplify the callers in `internal/server/http.go` and
`internal/server/lambda.go` so they no longer branch on a non-existent error.

In `@internal/mocks/stores.go`:
- Line 5: Repeated cast-and-panic boilerplate appears across the mock helpers in
stores.go and the related test/mock files, so extract it into a small generic
helper to make the pattern consistent and reduce duplication. Add a reusable
helper (for example in the mocks package) that does the type assertion and
formatted panic once, then replace the repeated args.Get(0).(T)/ok-check/panic
blocks in stores.go, test_helpers.go, secretsmanager.go, ses.go, and sns.go with
calls to that helper, preserving each file’s existing panic message style via
the helper’s format string or wrapper.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 78dbcb9b-5e86-4ff6-9d9b-210c14000a7e

📥 Commits

Reviewing files that changed from the base of the PR and between 5f712a0 and 3435ebc.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (44)
  • cmd/cleanup-lambda/main.go
  • cmd/configure_azure.go
  • cmd/configure_gcp.go
  • cmd/rekey/main.go
  • go.mod
  • internal/api/handler.go
  • internal/api/handler_accounts.go
  • internal/api/handler_coverage_test.go
  • internal/api/handler_dashboard.go
  • internal/api/handler_purchases.go
  • internal/api/handler_purchases_revoke.go
  • internal/api/handler_purchases_revoke_test.go
  • internal/api/handler_purchases_test.go
  • internal/api/handler_recommendations_refresh.go
  • internal/api/handler_registrations.go
  • internal/api/handler_test.go
  • internal/api/ri_utilization_cache.go
  • internal/api/scoping.go
  • internal/api/validation.go
  • internal/auth/service_apikeys.go
  • internal/auth/service_mfa.go
  • internal/auth/test_helpers.go
  • internal/commitmentopts/store_postgres.go
  • internal/config/interfaces.go
  • internal/config/store_postgres.go
  • internal/config/store_postgres_db_test.go
  • internal/config/store_postgres_pgxmock_test.go
  • internal/config/store_postgres_recommendations_test.go
  • internal/credentials/cipher.go
  • internal/credentials/resolver.go
  • internal/database/connection.go
  • internal/database/postgres/migrations/migrate.go
  • internal/database/postgres/testhelpers/postgres.go
  • internal/mocks/secretsmanager.go
  • internal/mocks/ses.go
  • internal/mocks/sns.go
  • internal/mocks/stores.go
  • internal/purchase/approvals.go
  • internal/purchase/approvals_test.go
  • internal/purchase/execution.go
  • internal/purchase/messages.go
  • internal/purchase/messages_test.go
  • internal/scheduler/scheduler.go
  • internal/server/http.go

…tial load errors

CodeRabbit review follow-ups on PR #1343:

- configure-azure/configure-gcp: executeExplicitCommand and
  executeGCPCommand created a fresh bufio.NewReader(os.Stdin) for the
  "Continue anyway?" retry prompt, which can drop input already
  buffered by the caller's reader and break piped input after earlier
  prompts. The caller's reader is now threaded through so all prompts
  consume one consistent buffered stream.
- credentials resolver: a LoadRaw failure while peeking at the stored
  WIF config was logged and treated as absence, silently flipping a
  stored-JSON workload-identity-federation account onto the federated
  path on a store outage (auth behavior change on failure). LoadRaw
  returns (nil, nil) only for genuine absence, so store errors now
  propagate. New regression test asserts the error surfaces instead of
  falling through (verified failing pre-fix).
@cristim

cristim commented Jul 3, 2026

Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@cristim cristim merged commit a2d8e11 into main Jul 3, 2026
13 of 16 checks passed
cristim added a commit that referenced this pull request Jul 3, 2026
Four compatibility fixes required after rebasing onto main which included
PRs #1343 (errcheck), #1346 (ladder-sp-coverage), and #1340 (ladder-allocate):

- cmd/cleanup-lambda: remove unused `errors` and `pgx` imports left over
  from conflict resolution (main's rollback pattern doesn't use them)
- internal/mocks/stores.go: update ListStoredRecommendations signature to
  pointer (*RecommendationFilter) matching the interface change in #1343
- internal/scheduler/scheduler.go: pass &params (pointer) to
  GetRecommendations after PR #1346 changed the function signature
- providers/aws/service_client_test.go: add GetSavingsPlansCoverage and
  GetSavingsPlansUtilization stub methods to mockCostExplorerClient after
  PR #1346 extended the CostExplorerAPI interface
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

effort/m Days impact/internal Team-internal only priority/p1 Next up; this sprint severity/high Significant harm triaged Item has been triaged type/bug Defect urgency/this-sprint Within the current sprint

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix(ci): main Build & Test red since 2026-06-11 - pgx CVE GO-2026-5004, errcheck wave, 3 failing internal/config tests

1 participant