Skip to content

fix: resolve AWS profile SSO interference and --identity flag regression#2087

Merged
aknysh merged 6 commits intomainfrom
aknysh/atmos-auth-fixes-4
Feb 19, 2026
Merged

fix: resolve AWS profile SSO interference and --identity flag regression#2087
aknysh merged 6 commits intomainfrom
aknysh/atmos-auth-fixes-4

Conversation

@aknysh
Copy link
Copy Markdown
Member

@aknysh aknysh commented Feb 18, 2026

what

  • Fix AWS [default] profile interfering with SSO authentication by isolating AWS SDK config loading from ~/.aws/config
  • Fix --identity flag not accepting space-separated values (--identity my-identity was broken, only --identity=my-identity worked)
  • Normalize --identity <value> to --identity=<value> before Cobra/pflag parsing to support standard CLI conventions
  • Add comprehensive test coverage for both fixes

why

  • Users reported that having a [default] profile with SSO settings in ~/.aws/config caused misleading SSO authentication failures — the AWS SDK v2 loaded ~/.aws/config via a hardcoded $HOME path even after env var sanitization
  • The original isolation used config.WithSharedConfigProfile("") which is a no-op in the AWS SDK v2 (empty string → found=false → falls back to "default" profile). Replaced with config.WithSharedConfigFiles([]string{}) which provides a non-nil empty slice that the SDK respects
  • Users reported that --identity my-identity (space-separated) caused "Too many command line arguments" errors from Terraform/OpenTofu, forcing them to use --identity=my-identity or ATMOS_IDENTITY=my-identity as workarounds
  • The --identity flag was registered with NoOptDefVal to support interactive selection mode (--identity with no value), which caused pflag to treat it like a boolean flag and not consume the next argument as its value

references

  • Fix documentation: docs/fixes/2026-02-17-aws-default-profile-sso-interference.md
  • Fix documentation: docs/fixes/2026-02-17-identity-flag-space-separated-value.md

Fix two user-reported issues with Atmos Auth:

1. AWS_PROFILE / default profile SSO interference: Isolate AWS SDK config
   loading to prevent ~/.aws/config profiles from interfering with SSO
   authentication. Add WarnIfAWSProfileSet() to alert users when
   AWS_PROFILE is set.

2. --identity flag space-separated value: Normalize `--identity value` to
   `--identity=value` before Cobra/pflag parsing. The NoOptDefVal setting
   caused pflag to treat the flag like a boolean, leaving the value as an
   orphaned positional argument passed to Terraform.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aknysh aknysh added the patch A minor, backward compatible change label Feb 18, 2026
@aknysh aknysh requested a review from a team as a code owner February 18, 2026 05:25
@aknysh aknysh added the patch A minor, backward compatible change label Feb 18, 2026
@github-actions github-actions bot added the size/l Large size PR label Feb 18, 2026
@aknysh aknysh self-assigned this Feb 18, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Feb 18, 2026

Dependency Review

✅ No vulnerabilities or license issues found.

Scanned Files

None

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

Walkthrough

Adds per-command normalization of NoOptDefVal flags by consulting a command-specific FlagRegistry (which can rewrite os.Args before Cobra parsing), wires Terraform's registry, and strengthens AWS SSO environment isolation with deferred env restoration and a WarnIfAWSProfileSet helper; includes tests and documentation updates.

Changes

Cohort / File(s) Summary
Flag normalization & registry
cmd/root.go, cmd/internal/registry.go, cmd/terraform/terraform.go, cmd/internal/registry_test.go
Adds per-command FlagRegistry storage and Register/Get APIs, invokes registry.PreprocessNoOptDefValArgs during preprocessCompatibilityFlags to normalize NoOptDefVal-style flags (may replace os.Args / RootCmd.SetArgs), wires Terraform to register its registry, and adds unit tests for registry behavior and concurrency.
AWS env isolation & warnings
pkg/auth/cloud/aws/env.go, pkg/auth/cloud/aws/env_test.go, pkg/auth/providers/aws/sso.go
Refactors isolation helpers to use defer-based restoration, switches to using empty SharedConfigFiles/SharedCredentialsFiles to prevent shared-file loading, introduces exported WarnIfAWSProfileSet(), invokes it before SSO config load, enhances SSO error hints, and adds extensive tests covering isolation, restoration, panic cases, and warnings.
SSO provider tests & messaging
pkg/auth/providers/aws/sso_test.go
Adds tests for SSO provider realm handling and non-interactive authentication error messaging; updates assertions to reflect added hints about AWS_PROFILE/default profiles.
Documentation / Fix notes
docs/fixes/2026-02-17-aws-default-profile-sso-interference.md, docs/fixes/2026-02-17-identity-flag-space-separated-value.md
Adds detailed fix documentation describing root causes, implemented fixes, and associated tests for AWS default-profile SSO interference and NoOptDefVal flag normalization via per-command registry.

Sequence Diagram(s)

mermaid
sequenceDiagram
actor User
participant CLI as Atmos CLI
participant Registry as FlagRegistry (cmd/internal)
participant Parser as Cobra/pflag
participant Command as Subcommand (e.g., terraform)

User->>CLI: invoke command with NoOptDefVal flag (e.g., --identity value)
CLI->>Registry: lookup FlagRegistry for target command
CLI->>Registry: PreprocessNoOptDefValArgs(os.Args)
Registry-->>CLI: normalized args (possibly modified)
CLI->>CLI: set os.Args / RootCmd.SetArgs(normalized)
CLI->>Parser: hand off args for parsing
Parser->>Command: parsed flags and args delivered

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested reviewers

  • osterman
  • milldr
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 27.08% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title directly summarizes both main fixes: AWS profile SSO interference and --identity flag regression, aligning with the changeset's core objectives.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch aknysh/atmos-auth-fixes-4

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (6)
cmd/root.go (2)

1547-1576: Three identical SetArgs guards — extract to a helper or single pre-check.

The pattern if len(osArgs) != len(os.Args[1:]) { RootCmd.SetArgs(osArgs) } appears in three early-return branches. A single resolved variable at the top keeps intent clear and reduces repetition.

♻️ Suggested refactor
+    normalized := len(osArgs) != len(os.Args[1:])
+    if normalized && targetCmd == nil {
+        RootCmd.SetArgs(osArgs)
+    }
 
     targetCmd, _, _ := RootCmd.Find(osArgs)
     if targetCmd == nil {
-        // Even if we can't find the command, apply normalization.
-        if len(osArgs) != len(os.Args[1:]) {
-            RootCmd.SetArgs(osArgs)
-        }
         return
     }
     ...
     if cmdName == "" {
-        // Apply normalization even if we can't determine the command.
-        if len(osArgs) != len(os.Args[1:]) {
-            RootCmd.SetArgs(osArgs)
-        }
         return
     }
     ...
     if len(compatFlags) == 0 {
-        // No compat flags, but still apply normalization if args changed.
-        if len(osArgs) != len(os.Args[1:]) {
-            RootCmd.SetArgs(osArgs)
-        }
         return
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/root.go` around lines 1547 - 1576, There are three identical guards
calling RootCmd.SetArgs(osArgs) when len(osArgs) != len(os.Args[1:]); refactor
by computing a single boolean (e.g., argsChanged := len(osArgs) !=
len(os.Args[1:])) once at the start and replace the repeated blocks with either
a single pre-check that calls RootCmd.SetArgs(osArgs) before early returns or a
helper function (e.g., applyNormalizedArgsIfChanged(argsChanged, osArgs)) that
you call in each early-return branch; update references in this function around
targetCmd, cmdName, and the compatFlags check
(internal.GetCompatFlagsForCommand) to use the new argsChanged/helper instead of
duplicating the length check.

1485-1487: optionalValueFlags needs a maintenance pointer to its flag registration site.

The map is a manually-maintained parallel to flags registered with NoOptDefVal. There's no compile-time link between this map and the flag definitions (e.g., cmd/terraform/flags.go). A future contributor adding a flag with NoOptDefVal might not know to update this map too, silently breaking space-separated syntax for that flag.

✨ Suggested addition
 // This map is used by normalizeOptionalValueFlags to convert "--flag value" to "--flag=value"
 // before Cobra parses, ensuring both "--flag=value" and "--flag value" work correctly.
+//
+// MAINTENANCE NOTE: Whenever a new flag is registered with NoOptDefVal (see cmd/auth.go and
+// cmd/terraform/flags.go), add its "--flag-name" entry here to preserve space-separated syntax.
 var optionalValueFlags = map[string]bool{
     "--identity": true,
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/root.go` around lines 1485 - 1487, optionalValueFlags is a
manually-maintained mirror of flags registered with NoOptDefVal and lacks a
maintenance pointer; add a concise comment immediately above the
optionalValueFlags declaration that tells maintainers to update this map
whenever they register a flag with NoOptDefVal (mention the exact search term
"NoOptDefVal" and the specific flag name "--identity") and point them to the
flag registration locations by advising to search the repo for NoOptDefVal usage
so future contributors know where to update this map; keep the comment short and
actionable (e.g., "If you add a flag using NoOptDefVal, update
optionalValueFlags accordingly — search for 'NoOptDefVal' to find registrations
such as the --identity flag").
pkg/auth/cloud/aws/env_test.go (2)

301-313: Stale comment and test name doesn't match what the test actually does.

Line 308–309 says "We can't import config in the test directly since we're in the same package." — but config is imported at line 9 of this file and used in other tests. The test is also named TestLoadIsolatedAWSConfig_WithRegionOption yet calls LoadIsolatedAWSConfig(ctx) with zero options, never exercising any region option. A dedicated test for this path already exists as TestLoadIsolatedAWSConfig_RespectsRegionOption (line 421).

♻️ Proposed fix — rename the test to match its actual intent
-func TestLoadIsolatedAWSConfig_WithRegionOption(t *testing.T) {
-	// Verify that explicit options are still respected.
-	t.Setenv("AWS_PROFILE", "some-profile")
-	t.Setenv("AWS_ACCESS_KEY_ID", "AKIA_SHOULD_BE_IGNORED")
-
-	ctx := context.Background()
-
-	// We can't import config in the test directly since we're in the same package.
-	// But we can verify the function works with the region option.
-	cfg, err := LoadIsolatedAWSConfig(ctx)
-	require.NoError(t, err)
-	assert.NotNil(t, cfg)
-}
+func TestLoadIsolatedAWSConfig_SucceedsWithExternalVarsSet(t *testing.T) {
+	// Verify that LoadIsolatedAWSConfig succeeds even when AWS_PROFILE and
+	// credential env vars are set in the environment (they are cleared internally).
+	t.Setenv("AWS_PROFILE", "some-profile")
+	t.Setenv("AWS_ACCESS_KEY_ID", "AKIA_SHOULD_BE_IGNORED")
+
+	ctx := context.Background()
+	cfg, err := LoadIsolatedAWSConfig(ctx)
+	require.NoError(t, err)
+	assert.NotNil(t, cfg)
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env_test.go` around lines 301 - 313, The test named
TestLoadIsolatedAWSConfig_WithRegionOption is misnamed and contains a stale
comment about importing config; rename the test to reflect its actual behavior
(e.g., TestLoadIsolatedAWSConfig_DefaultBehavior or
TestLoadIsolatedAWSConfig_NoOptions) and remove or update the stale comment ("We
can't import config...") so it no longer references importing config; ensure the
test continues to call LoadIsolatedAWSConfig(ctx) with no options and keep the
existing assertion checks, and note that region-specific behavior is already
covered by TestLoadIsolatedAWSConfig_RespectsRegionOption.

229-254: os.Unsetenv without cleanup leaks state to subsequent tests.

Line 233 uses os.Unsetenv("AWS_ACCESS_KEY_ID") directly. If the real process environment (or a prior test) had this variable set, calling os.Unsetenv here without registering a cleanup leaves it unset for all subsequent tests in the run. The same concern applies to lines 340–342 and lines 503–505.

The preferred pattern is to register explicit cleanup so the state is restored regardless of test outcome:

♻️ Preferred pattern
+// If the var was set before this test, restore it afterward.
+if val, ok := os.LookupEnv("AWS_ACCESS_KEY_ID"); ok {
+    t.Cleanup(func() { os.Setenv("AWS_ACCESS_KEY_ID", val) })
+}
 os.Unsetenv("AWS_ACCESS_KEY_ID")

Apply the same pattern at lines 340–342 and the loop at lines 503–505.

As per learnings: "tests should prefer t.Setenv for environment variable setup/teardown instead of os.Setenv/Unsetenv to ensure test-scoped isolation."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env_test.go` around lines 229 - 254, Replace direct calls
to os.Unsetenv("AWS_ACCESS_KEY_ID") with a test-scoped cleanup that restores the
original environment state: capture the prior value using os.LookupEnv, register
t.Cleanup to either os.Setenv(key, val) if it existed or os.Unsetenv(key) if it
did not, then call os.Unsetenv for the test; apply this change in
TestWithIsolatedAWSEnv_RestoresOnlySetVars (replace the os.Unsetenv at the start
of that test) and make the same replacement for the other occurrences referenced
(the block at lines ~340–342 and the loop at ~503–505) so each unset is properly
restored after the test.
pkg/auth/cloud/aws/env.go (2)

119-125: Unreachable if err != nil block — dead code.

isolateErr is the direct return value of WithIsolatedAWSEnv(func() error { ...; return err }), so isolateErr == err always. If err != nil, then isolateErr != nil too, and the function already returns on line 120–121. The second check on lines 123–125 is never reached.

♻️ Simplify to a single error check
 	isolateErr := WithIsolatedAWSEnv(func() error {
 		cfg, err = config.LoadDefaultConfig(ctx, isolatedOptFns...)
 		return err
 	})

 	if isolateErr != nil {
 		return aws.Config{}, fmt.Errorf("%w: %w", errUtils.ErrLoadAWSConfig, isolateErr)
 	}

-	if err != nil {
-		return aws.Config{}, fmt.Errorf("%w: %w", errUtils.ErrLoadAWSConfig, err)
-	}
-
 	return cfg, nil
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env.go` around lines 119 - 125, The duplicate error check
is unreachable because isolateErr is the return value of WithIsolatedAWSEnv and
always equals err; remove the second `if err != nil` block and only check/return
the wrapped error using `isolateErr` (or directly return the error from
`WithIsolatedAWSEnv`) in the function that calls `WithIsolatedAWSEnv` in
pkg/auth/cloud/aws/env.go so you don't have dead code; update the return to wrap
`isolateErr` with `errUtils.ErrLoadAWSConfig` (refer to `WithIsolatedAWSEnv`,
`isolateErr`, and `err` to locate the logic).

54-86: Panics inside the closure will leave the environment permanently dirty.

WithIsolatedAWSEnv restores env vars sequentially after fn() returns, but skips restoration if fn() panics. This leaves the global process environment in an inconsistent state for any subsequent operations. The test TestWithIsolatedAWSEnv_RestoresAfterPanic explicitly documents this known gap rather than fixing it.

♻️ Proposed fix using defer
 func WithIsolatedAWSEnv(fn func() error) error {
 	originalValues := make(map[string]string)
 	ignoredVars := make([]string, 0)
 	for _, key := range problematicAWSEnvVars {
 		if value, exists := os.LookupEnv(key); exists {
 			originalValues[key] = value
 			ignoredVars = append(ignoredVars, key)
 		}
 	}

 	if len(ignoredVars) > 0 {
 		log.Debug("Ignoring external AWS environment variables during authentication", "variables", ignoredVars)
 	}

 	for _, key := range problematicAWSEnvVars {
 		os.Unsetenv(key)
 	}

+	// Restore original values even if fn panics.
+	defer func() {
+		for _, key := range problematicAWSEnvVars {
+			if value, exists := originalValues[key]; exists {
+				os.Setenv(key, value)
+			}
+		}
+	}()
+
-	err := fn()
-
-	// Restore original values.
-	for _, key := range problematicAWSEnvVars {
-		if value, exists := originalValues[key]; exists {
-			os.Setenv(key, value)
-		}
-	}
-
-	return err
+	return fn()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env.go` around lines 54 - 86, WithIsolatedAWSEnv can leave
the process env dirty if fn panics because restoration happens after fn returns;
change the function to install a deferred restoration immediately after
capturing originalValues/ignoredVars so restoration runs on both normal return
and panic. Specifically, after building originalValues (and possibly logging
ignoredVars), add a defer that iterates problematicAWSEnvVars and: if a key
exists in originalValues call os.Setenv(key, value) else call os.Unsetenv(key);
then proceed to clear the problematicAWSEnvVars and invoke fn(); keep using the
same symbols (WithIsolatedAWSEnv, problematicAWSEnvVars, originalValues,
ignoredVars) so the env is always restored even when fn panics.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/fixes/2026-02-17-aws-default-profile-sso-interference.md`:
- Around line 32-41: The two fenced code blocks producing markdownlint MD040
need language tags: change each opening fence from ``` to ```text (the first
block containing "Error: failed to load AWS config" and the second containing
"Failed to load AWS configuration for SSO authentication in region
'us-west-2'"); also preserve the existing closing fences and keep the suggested
trailing two spaces after "## Explanation" to maintain the intended line break
formatting.

In `@docs/fixes/2026-02-17-identity-flag-space-separated-value.md`:
- Around line 101-108: Update the table to reflect the actual test structure by
using the real test function and subtest names from cmd/root_test.go: replace
the listed test names (e.g., TestNormalizeOptionalValueFlags_IdentityWithValue,
TestNormalizeOptionalValueFlags_IdentityWithEquals, etc.) with the top-level
test function name TestNormalizeOptionalValueFlags and the subtest names used
inside it (e.g., "identity with space-separated value", "identity with equals",
"identity without value", "identity followed by flag", "identity after end of
options", "no optional-value flags") so readers can match the table entries to
the actual tests and use go test -run with the correct names.

In `@pkg/auth/cloud/aws/env.go`:
- Around line 285-307: The exported function WarnIfAWSProfileSet is missing the
perf span; add a deferred perf.Track call at the top of the function: insert
defer perf.Track(nil, "pkg.auth.cloud.aws.WarnIfAWSProfileSet")() as the first
statement and ensure a blank line follows it, mirroring the pattern used in
LoadAtmosManagedAWSConfig and PrepareEnvironment so the function opens and
tracks a perf span for telemetry.

In `@pkg/auth/providers/aws/sso_test.go`:
- Around line 908-929: The test
TestSSOProvider_Authenticate_ErrorContainsProfileHint never reaches the AWS
config-loading path because isInteractive() (which calls
term.IsTTYSupportForStderr()) is false in CI; update the test to force the
interactive gate and provoke a config-load failure: stub or monkeypatch
isInteractive()/term.IsTTYSupportForStderr to return true (or otherwise bypass
the interactive check), inject or mock the SSO provider's config-loading path
(e.g., make LoadIsolatedAWSConfig or the sso client return a deterministic
error) so Authenticate actually exercises the error construction in sso.go, and
assert the specific profile-hint substring is present; alternatively, replace
the integration-style call with a focused unit test that calls the
error-building function from sso.go directly and verifies the hint text.

---

Nitpick comments:
In `@cmd/root.go`:
- Around line 1547-1576: There are three identical guards calling
RootCmd.SetArgs(osArgs) when len(osArgs) != len(os.Args[1:]); refactor by
computing a single boolean (e.g., argsChanged := len(osArgs) !=
len(os.Args[1:])) once at the start and replace the repeated blocks with either
a single pre-check that calls RootCmd.SetArgs(osArgs) before early returns or a
helper function (e.g., applyNormalizedArgsIfChanged(argsChanged, osArgs)) that
you call in each early-return branch; update references in this function around
targetCmd, cmdName, and the compatFlags check
(internal.GetCompatFlagsForCommand) to use the new argsChanged/helper instead of
duplicating the length check.
- Around line 1485-1487: optionalValueFlags is a manually-maintained mirror of
flags registered with NoOptDefVal and lacks a maintenance pointer; add a concise
comment immediately above the optionalValueFlags declaration that tells
maintainers to update this map whenever they register a flag with NoOptDefVal
(mention the exact search term "NoOptDefVal" and the specific flag name
"--identity") and point them to the flag registration locations by advising to
search the repo for NoOptDefVal usage so future contributors know where to
update this map; keep the comment short and actionable (e.g., "If you add a flag
using NoOptDefVal, update optionalValueFlags accordingly — search for
'NoOptDefVal' to find registrations such as the --identity flag").

In `@pkg/auth/cloud/aws/env_test.go`:
- Around line 301-313: The test named TestLoadIsolatedAWSConfig_WithRegionOption
is misnamed and contains a stale comment about importing config; rename the test
to reflect its actual behavior (e.g., TestLoadIsolatedAWSConfig_DefaultBehavior
or TestLoadIsolatedAWSConfig_NoOptions) and remove or update the stale comment
("We can't import config...") so it no longer references importing config;
ensure the test continues to call LoadIsolatedAWSConfig(ctx) with no options and
keep the existing assertion checks, and note that region-specific behavior is
already covered by TestLoadIsolatedAWSConfig_RespectsRegionOption.
- Around line 229-254: Replace direct calls to os.Unsetenv("AWS_ACCESS_KEY_ID")
with a test-scoped cleanup that restores the original environment state: capture
the prior value using os.LookupEnv, register t.Cleanup to either os.Setenv(key,
val) if it existed or os.Unsetenv(key) if it did not, then call os.Unsetenv for
the test; apply this change in TestWithIsolatedAWSEnv_RestoresOnlySetVars
(replace the os.Unsetenv at the start of that test) and make the same
replacement for the other occurrences referenced (the block at lines ~340–342
and the loop at ~503–505) so each unset is properly restored after the test.

In `@pkg/auth/cloud/aws/env.go`:
- Around line 119-125: The duplicate error check is unreachable because
isolateErr is the return value of WithIsolatedAWSEnv and always equals err;
remove the second `if err != nil` block and only check/return the wrapped error
using `isolateErr` (or directly return the error from `WithIsolatedAWSEnv`) in
the function that calls `WithIsolatedAWSEnv` in pkg/auth/cloud/aws/env.go so you
don't have dead code; update the return to wrap `isolateErr` with
`errUtils.ErrLoadAWSConfig` (refer to `WithIsolatedAWSEnv`, `isolateErr`, and
`err` to locate the logic).
- Around line 54-86: WithIsolatedAWSEnv can leave the process env dirty if fn
panics because restoration happens after fn returns; change the function to
install a deferred restoration immediately after capturing
originalValues/ignoredVars so restoration runs on both normal return and panic.
Specifically, after building originalValues (and possibly logging ignoredVars),
add a defer that iterates problematicAWSEnvVars and: if a key exists in
originalValues call os.Setenv(key, value) else call os.Unsetenv(key); then
proceed to clear the problematicAWSEnvVars and invoke fn(); keep using the same
symbols (WithIsolatedAWSEnv, problematicAWSEnvVars, originalValues, ignoredVars)
so the env is always restored even when fn panics.

- Replace hardcoded normalizeOptionalValueFlags with registry-based
  FlagRegistry.PreprocessNoOptDefValArgs (reuses existing infrastructure)
- Add RegisterCommandFlagRegistry/GetCommandFlagRegistry to command registry
- Use defer for env var restoration in WithIsolatedAWSEnv (panic-safe)
- Remove unreachable error check in LoadIsolatedAWSConfig
- Add missing perf.Track to WithIsolatedAWSEnv, LoadIsolatedAWSConfig,
  WarnIfAWSProfileSet
- Fix test isolation: use t.Setenv before os.Unsetenv, clear AWS_REGION
  in CI-sensitive test
- Remove misleading test that couldn't reach the intended code path
- Add language specifier to fenced code block in fix doc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@aknysh aknysh requested a review from osterman February 18, 2026 06:30
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/auth/cloud/aws/env.go (1)

160-182: LoadAtmosManagedAWSConfig still uses non-deferred restoration — inconsistency with the new pattern.

WithIsolatedAWSEnv was upgraded to defer-based restoration for panic safety, but LoadAtmosManagedAWSConfig still restores credential vars inline after config.LoadDefaultConfig. If LoadDefaultConfig panics the credential vars (AWS_ACCESS_KEY_ID, etc.) won't be restored. This is a pre-existing gap but worth aligning in a follow-up.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env.go` around lines 160 - 182, LoadAtmosManagedAWSConfig
currently clears AWS credential env vars into originalValues and restores them
after calling config.LoadDefaultConfig, but that inline restore is not
panic-safe; change the restoration to a deferred function immediately after
saving originalValues (like WithIsolatedAWSEnv) so the environment is always
restored even if config.LoadDefaultConfig panics or returns an error, and ensure
the deferred restore also logs restored variables (use
clearedVars/originalValues and reference LoadAtmosManagedAWSConfig,
originalValues, clearedVars, and config.LoadDefaultConfig to locate the code).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/fixes/2026-02-17-identity-flag-space-separated-value.md`:
- Line 29: Two fenced code blocks in the proposed fix are missing language
specifiers (MD040); update the two triple-backtick fences that enclose the " $
atmos terraform plan tfstate-backend -s core-usw2-root --identity my-identity"
example and the "# Workaround 1: Use = syntax" example by adding appropriate
language identifiers (e.g., "console", "bash" or "shell") immediately after the
opening ``` so the blocks become fenced as ```bash or ```shell (or ```text for
plain output) to satisfy MD040.

In `@pkg/auth/cloud/aws/env_test.go`:
- Around line 497-508: Remove the dead local variable profileDuringLoad and its
noop assignment from TestLoadAtmosManagedAWSConfig_DoesNotClearProfileVar; leave
the t.Setenv call, the call to LoadAtmosManagedAWSConfig(ctx,
config.WithRegion("us-east-1")), and the final assert that checks
os.Getenv("AWS_PROFILE") so the test still verifies the environment is preserved
but without the unused profileDuringLoad declaration or the `_ =
profileDuringLoad` line.

---

Nitpick comments:
In `@pkg/auth/cloud/aws/env.go`:
- Around line 160-182: LoadAtmosManagedAWSConfig currently clears AWS credential
env vars into originalValues and restores them after calling
config.LoadDefaultConfig, but that inline restore is not panic-safe; change the
restoration to a deferred function immediately after saving originalValues (like
WithIsolatedAWSEnv) so the environment is always restored even if
config.LoadDefaultConfig panics or returns an error, and ensure the deferred
restore also logs restored variables (use clearedVars/originalValues and
reference LoadAtmosManagedAWSConfig, originalValues, clearedVars, and
config.LoadDefaultConfig to locate the code).

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 18, 2026

Codecov Report

❌ Patch coverage is 84.37500% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.23%. Comparing base (1204712) to head (81a31bb).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
cmd/root.go 37.50% 4 Missing and 1 partial ⚠️
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2087      +/-   ##
==========================================
+ Coverage   76.18%   76.23%   +0.05%     
==========================================
  Files         829      829              
  Lines       78534    78555      +21     
==========================================
+ Hits        59829    59885      +56     
+ Misses      14973    14941      -32     
+ Partials     3732     3729       -3     
Flag Coverage Δ
unittests 76.23% <84.37%> (+0.05%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
cmd/internal/registry.go 88.98% <100.00%> (+1.02%) ⬆️
cmd/terraform/terraform.go 29.34% <100.00%> (+0.77%) ⬆️
pkg/auth/cloud/aws/env.go 100.00% <100.00%> (+37.03%) ⬆️
pkg/auth/providers/aws/sso.go 51.55% <ø> (ø)
cmd/root.go 67.83% <37.50%> (-0.31%) ⬇️

... and 4 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

aknysh and others added 2 commits February 18, 2026 09:42
- Use defer-based env restoration in LoadAtmosManagedAWSConfig (panic-safe)
- Remove dead profileDuringLoad variable in test
- Add language specifiers to fenced code blocks in fix doc
- Add tests for RegisterCommandFlagRegistry/GetCommandFlagRegistry
  (registration, not-found, overwrite, multi-provider, concurrent)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
pkg/auth/cloud/aws/env_test.go (3)

394-415: Contract tests for the env-var lists are a nice safety net.

They guard against accidental removal of entries from problematicAWSEnvVars and environmentVarsToClear. The nested-loop lookup could be replaced with slices.Contains for readability, but that's purely cosmetic.

Also applies to: 546-567

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env_test.go` around lines 394 - 415, Replace the
nested-loop membership checks with slices.Contains to improve readability: in
TestProblematicAWSEnvVars_ContainsExpectedVars, use
slices.Contains(problematicAWSEnvVars, expected) instead of the manual loop;
likewise update the analogous test that asserts entries in
environmentVarsToClear (the test around lines 546-567) to use
slices.Contains(environmentVarsToClear, expected). Ensure you import
golang.org/x/exp/slices or the standard slices package (Go 1.21+) as
appropriate.

144-158: Non-mutation assertion is trivially true.

Line 156 compares value (from the range) against tt.inputEnv[key] — same map, same key. It always passes regardless of whether the input was mutated. TestPrepareEnvironment_DoesNotMutateInput below already covers this properly by saving values beforehand, so this isn't a gap, just dead weight.

Consider either removing this block or snapshotting the original map before calling PrepareEnvironment so the comparison is meaningful.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env_test.go` around lines 144 - 158, The non-mutation
check inside the test is meaningless because it compares the ranged value
against the same map entry (value vs tt.inputEnv[key]); either remove this dead
block or make it a real check by snapshotting tt.inputEnv before calling
PrepareEnvironment: create a shallow copy (e.g., orig :=
make(map[string]string); copy keys/values from tt.inputEnv), then call
PrepareEnvironment and assert that for keys not in environmentVarsToClear the
values remain equal to orig[key]; reference PrepareEnvironment and the test that
currently contains the loop when making the change.

334-372: WarnIfAWSProfileSet tests only assert no-panic.

Since log output isn't captured, these tests verify the function doesn't blow up but can't confirm the correct warning was emitted. That's acceptable for now. If you ever want stronger guarantees, injecting a logger (or using a test hook) would let you assert on the actual messages.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env_test.go` around lines 334 - 372, Tests for
WarnIfAWSProfileSet currently only assert no-panic and don't verify emitted log
messages; to allow asserting the warning, refactor WarnIfAWSProfileSet to accept
an injectable logger (or use a package-level logger variable that tests can
replace) and update the TestWarnIfAWSProfileSet_* tests to provide a test
logger/hook that captures output and asserts expected messages (e.g., that a
warning is logged when AWS_PROFILE is set and that debug messages are logged
when AWS_CONFIG_FILE or AWS_SHARED_CREDENTIALS_FILE are present); keep backward
compatibility by providing a wrapper WarnIfAWSProfileSet() that calls the new
function with the default logger so existing callers/tests that only check
no-panic still work.
pkg/auth/cloud/aws/env.go (1)

146-193: os.Setenv / os.Unsetenv errors silently dropped.

Lines 167 and 174 discard errors from os.Unsetenv and os.Setenv. These almost never fail (only on invalid key names, which are constants here), so this is low-risk. Still, if you ever want belt-and-suspenders, a log.Debug on failure in the defer would surface surprises without changing behavior.

Not blocking — just flagging for awareness.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env.go` around lines 146 - 193, The defer that restores
environment variables in LoadAtmosManagedAWSConfig currently ignores errors from
os.Setenv (and the earlier os.Unsetenv calls are also unchecked); update both
places to capture and log any returned error (e.g., after each os.Unsetenv and
inside the defer for os.Setenv use a short if err != nil {
log.Debug("...setenv/unsetenv failed", "key", key, "error", err) } ) so failures
are surfaced without changing behavior, referencing the
clearedVars/originalValues logic and the os.Unsetenv/os.Setenv calls.
cmd/internal/registry_test.go (1)

590-600: Optional: rename local registry to avoid shadowing the package-level registry variable.

The local variable registry (a *flags.FlagRegistry) shadows the package-level registry (the commandRegistry). It compiles fine, but it's easy to mistake one for the other during maintenance.

✏️ Suggested rename
-	registry := flags.NewFlagRegistry()
-	registry.Register(&flags.StringFlag{
+	flagReg := flags.NewFlagRegistry()
+	flagReg.Register(&flags.StringFlag{
 		Name:        "identity",
 		NoOptDefVal: "__SELECT__",
 	})

-	RegisterCommandFlagRegistry("terraform", registry)
+	RegisterCommandFlagRegistry("terraform", flagReg)

 	got := GetCommandFlagRegistry("terraform")
 	require.NotNil(t, got, "registered flag registry should be retrievable")
-	assert.Same(t, registry, got, "should return the exact same registry instance")
+	assert.Same(t, flagReg, got, "should return the exact same registry instance")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/internal/registry_test.go` around lines 590 - 600, The local variable
named `registry` in the test should be renamed to avoid shadowing the
package-level `registry` (the command registry); update the variable returned by
`flags.NewFlagRegistry()` and all uses in this snippet (the
`RegisterCommandFlagRegistry("terraform", ...)`, `got :=
GetCommandFlagRegistry("terraform")` and the assertions) to a non-shadowing name
such as `flagRegistry` or `localRegistry` so the test clearly distinguishes the
local `*flags.FlagRegistry` from the package-level `registry`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/fixes/2026-02-17-identity-flag-space-separated-value.md`:
- Around line 66-67: The doc comment references a brittle line number for
processArgsAndFlags (the manual identity extraction for `--identity`) which will
drift; update the doc to remove the numeric line reference and instead reference
the function name `processArgsAndFlags` (or add a stable comment anchor like `//
ANCHOR: identity-extraction` adjacent to the manual parsing block) and note that
Cobra/pflag already consumes `--identity`, so point readers to that function or
anchor rather than a line number.
- Around line 114-125: Update the documentation table so the test case names
exactly match the real subtest names in the code: use the full subtest names
from TestFlagRegistry_PreprocessNoOptDefValArgs (e.g., "identity flag with space
syntax → equals syntax", "identity flag with equals syntax", "identity at end of
args", "identity flag followed by another flag", "empty registry", "multiple
NoOptDefVal flags") and remove or relocate the "double dash prefix" row to
reference TestFlagRegistry_PreprocessNoOptDefValArgs_EdgeCases (or label it
exactly as the edge-case subtest name), ensuring the doc entries match the
actual subtest strings so `go test -run` calls will find them.

---

Duplicate comments:
In `@pkg/auth/cloud/aws/env.go`:
- Around line 285-309: The WarnIfAWSProfileSet function correctly checks
AWS_PROFILE, AWS_CONFIG_FILE, and AWS_SHARED_CREDENTIALS_FILE and uses
appropriate log levels; no code changes are needed—leave the function
(WarnIfAWSProfileSet) as-is and keep the perf.Track defer in place.

---

Nitpick comments:
In `@cmd/internal/registry_test.go`:
- Around line 590-600: The local variable named `registry` in the test should be
renamed to avoid shadowing the package-level `registry` (the command registry);
update the variable returned by `flags.NewFlagRegistry()` and all uses in this
snippet (the `RegisterCommandFlagRegistry("terraform", ...)`, `got :=
GetCommandFlagRegistry("terraform")` and the assertions) to a non-shadowing name
such as `flagRegistry` or `localRegistry` so the test clearly distinguishes the
local `*flags.FlagRegistry` from the package-level `registry`.

In `@pkg/auth/cloud/aws/env_test.go`:
- Around line 394-415: Replace the nested-loop membership checks with
slices.Contains to improve readability: in
TestProblematicAWSEnvVars_ContainsExpectedVars, use
slices.Contains(problematicAWSEnvVars, expected) instead of the manual loop;
likewise update the analogous test that asserts entries in
environmentVarsToClear (the test around lines 546-567) to use
slices.Contains(environmentVarsToClear, expected). Ensure you import
golang.org/x/exp/slices or the standard slices package (Go 1.21+) as
appropriate.
- Around line 144-158: The non-mutation check inside the test is meaningless
because it compares the ranged value against the same map entry (value vs
tt.inputEnv[key]); either remove this dead block or make it a real check by
snapshotting tt.inputEnv before calling PrepareEnvironment: create a shallow
copy (e.g., orig := make(map[string]string); copy keys/values from tt.inputEnv),
then call PrepareEnvironment and assert that for keys not in
environmentVarsToClear the values remain equal to orig[key]; reference
PrepareEnvironment and the test that currently contains the loop when making the
change.
- Around line 334-372: Tests for WarnIfAWSProfileSet currently only assert
no-panic and don't verify emitted log messages; to allow asserting the warning,
refactor WarnIfAWSProfileSet to accept an injectable logger (or use a
package-level logger variable that tests can replace) and update the
TestWarnIfAWSProfileSet_* tests to provide a test logger/hook that captures
output and asserts expected messages (e.g., that a warning is logged when
AWS_PROFILE is set and that debug messages are logged when AWS_CONFIG_FILE or
AWS_SHARED_CREDENTIALS_FILE are present); keep backward compatibility by
providing a wrapper WarnIfAWSProfileSet() that calls the new function with the
default logger so existing callers/tests that only check no-panic still work.

In `@pkg/auth/cloud/aws/env.go`:
- Around line 146-193: The defer that restores environment variables in
LoadAtmosManagedAWSConfig currently ignores errors from os.Setenv (and the
earlier os.Unsetenv calls are also unchecked); update both places to capture and
log any returned error (e.g., after each os.Unsetenv and inside the defer for
os.Setenv use a short if err != nil { log.Debug("...setenv/unsetenv failed",
"key", key, "error", err) } ) so failures are surfaced without changing
behavior, referencing the clearedVars/originalValues logic and the
os.Unsetenv/os.Setenv calls.

- Remove tautological non-mutation assertion in TestPrepareEnvironment
- Rename shadowed registry variable to flagReg in test
- Remove brittle line number reference in fix doc
- Fix inaccurate test names in fix doc, reference actual test functions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
coderabbitai[bot]
coderabbitai bot previously approved these changes Feb 18, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
pkg/auth/cloud/aws/env_test.go (2)

401-412: Prefer assert.ErrorIs over string-matching the error message.

err.Error() string checks couple the test to error message wording. The error is wrapped with %w so assert.ErrorIs, which is a wrapper for errors.Is, walks the full error chain and is a cleaner match against the sentinel.

♻️ Proposed refactor
+import errUtils "github.com/cloudposse/atmos/errors"

 _, err := LoadIsolatedAWSConfig(ctx, invalidOpt)
-assert.Error(t, err, "LoadIsolatedAWSConfig should return error from invalid option")
-assert.Contains(t, err.Error(), "failed to load AWS config")
+assert.ErrorIs(t, err, errUtils.ErrLoadAWSConfig, "LoadIsolatedAWSConfig should wrap ErrLoadAWSConfig")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env_test.go` around lines 401 - 412, Update the test
TestLoadIsolatedAWSConfig_ErrorFromInvalidOption to use assert.ErrorIs rather
than string-matching the error text: after calling LoadIsolatedAWSConfig(ctx,
invalidOpt) keep the assert.Error check, then replace the
assert.Contains(err.Error(), "failed to load AWS config") assertion with
assert.ErrorIs(t, err, assert.AnError) to verify the wrapped sentinel error is
present (this uses errors.Is to walk the chain); reference the test function
name and LoadIsolatedAWSConfig when making the change.

378-399: Verbose "contains" loops can use assert.Contains directly.

assert.Contains supports slice containment, making the nested loops unnecessary.

♻️ Proposed refactor (applies to both tests)
-	for _, expected := range expectedVars {
-		found := false
-		for _, actual := range problematicAWSEnvVars {
-			if actual == expected {
-				found = true
-				break
-			}
-		}
-		assert.True(t, found, "problematicAWSEnvVars should contain %s", expected)
-	}
+	for _, expected := range expectedVars {
+		assert.Contains(t, problematicAWSEnvVars, expected, "problematicAWSEnvVars should contain %s", expected)
+	}

Apply the same pattern to TestEnvironmentVarsToClear_ContainsExpectedVars replacing environmentVarsToClear.

Also applies to: 530-552

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/auth/cloud/aws/env_test.go` around lines 378 - 399, Replace the manual
nested "contains" loops in TestProblematicAWSEnvVars_ContainsExpectedVars by
calling assert.Contains for each expected entry against the
problematicAWSEnvVars slice; specifically, in
TestProblematicAWSEnvVars_ContainsExpectedVars iterate expectedVars and use
assert.Contains(t, problematicAWSEnvVars, expected, ...) instead of the inner
loop, and apply the same refactor to
TestEnvironmentVarsToClear_ContainsExpectedVars by asserting each expected item
is in environmentVarsToClear.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@pkg/auth/cloud/aws/env_test.go`:
- Around line 444-455: In
TestLoadAtmosManagedAWSConfig_ReturnsErrorOnInvalidOption, replace the
string-based assertions with a sentinel-based check using assert.ErrorIs so the
test verifies the original error returned by the invalid option; specifically,
keep the invalidOpt that returns assert.AnError, call
LoadAtmosManagedAWSConfig(ctx, invalidOpt), then use assert.ErrorIs(t, err,
assert.AnError) (and remove the assert.Contains check). This mirrors the pattern
used in TestLoadIsolatedAWSConfig_ErrorFromInvalidOption and ensures the test
asserts the wrapped error correctly.

---

Nitpick comments:
In `@pkg/auth/cloud/aws/env_test.go`:
- Around line 401-412: Update the test
TestLoadIsolatedAWSConfig_ErrorFromInvalidOption to use assert.ErrorIs rather
than string-matching the error text: after calling LoadIsolatedAWSConfig(ctx,
invalidOpt) keep the assert.Error check, then replace the
assert.Contains(err.Error(), "failed to load AWS config") assertion with
assert.ErrorIs(t, err, assert.AnError) to verify the wrapped sentinel error is
present (this uses errors.Is to walk the chain); reference the test function
name and LoadIsolatedAWSConfig when making the change.
- Around line 378-399: Replace the manual nested "contains" loops in
TestProblematicAWSEnvVars_ContainsExpectedVars by calling assert.Contains for
each expected entry against the problematicAWSEnvVars slice; specifically, in
TestProblematicAWSEnvVars_ContainsExpectedVars iterate expectedVars and use
assert.Contains(t, problematicAWSEnvVars, expected, ...) instead of the inner
loop, and apply the same refactor to
TestEnvironmentVarsToClear_ContainsExpectedVars by asserting each expected item
is in environmentVarsToClear.

- Remove misleading error hints about AWS_PROFILE and ~/.aws/config from
  SSO error builder since isolation is now complete
- Remove LogExternalAWSConfig function and tests (unnecessary with full
  isolation via WithSharedConfigFiles)
- Rewrite fix doc with detailed AWS SDK v2 trace showing the three-step
  leak mechanism (WithSharedConfigProfile("") no-op, nil SharedConfigFiles
  fallback, hardcoded $HOME/.aws/config loading)
- Update auth-realm-architecture PRD: Proposed → Implemented, document
  opt-in design with empty default requiring explicit realm configuration
- Update auth-user-credential-realm-isolation PRD status to Implemented
- Add realm architecture references to aws-auth-file-isolation PRD
- Add sso.go to fix doc files changed table

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions github-actions bot added size/xl Extra large size PR and removed size/l Large size PR labels Feb 19, 2026
@mergify
Copy link
Copy Markdown

mergify bot commented Feb 19, 2026

Warning

This PR exceeds the recommended limit of 1,000 lines.

Large PRs are difficult to review and may be rejected due to their size.

Please verify that this PR does not address multiple issues.
Consider refactoring it into smaller, more focused PRs to facilitate a smoother review process.

@aknysh aknysh requested a review from osterman February 19, 2026 06:54
@aknysh aknysh merged commit 501dadc into main Feb 19, 2026
57 checks passed
@aknysh aknysh deleted the aknysh/atmos-auth-fixes-4 branch February 19, 2026 21:21
@github-actions
Copy link
Copy Markdown

These changes were released in v1.207.0.

@github-actions
Copy link
Copy Markdown

These changes were released in v1.208.0-rc.0.

@github-actions
Copy link
Copy Markdown

These changes were released in v1.208.0-test.15.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

patch A minor, backward compatible change size/xl Extra large size PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants