diff --git a/docs/fixes/terraform-output-issues.md b/docs/fixes/terraform-output-issues.md new file mode 100644 index 0000000000..c3a8341552 --- /dev/null +++ b/docs/fixes/terraform-output-issues.md @@ -0,0 +1,297 @@ +# Fixing User-Reported Issues + +This document tracks analysis and fixes for user-reported issues in Atmos. + +## Issue #1030: Missing Component Results in Silent Failure + +**GitHub Issue:** https://github.com/cloudposse/atmos/issues/1030 + +**Reporter:** joshAtRula + +**Status:** Complete + +### Problem Statement + +When using template functions like `atmos.Component()` or `!terraform.output` to reference outputs from other components, if the **referenced component is removed from the Atmos configuration** (not just the terraform state), Atmos exits with code 1 but produces **confusing or no error message**, even with `ATMOS_LOGS_LEVEL=Trace/Debug` enabled. + +### Important Distinction + +There are **two different scenarios** that must be handled differently: + +1. **Component exists in config, but output doesn't exist** (e.g., terraform output not defined) + - Should return `nil` (backward compatible) + - This is the existing behavior and is correct + +2. **Component removed from Atmos configuration** (referenced component not in stack manifest) + - Should return a **clear error** explaining the component is not found + - This was broken: error was swallowed due to component type fallback logic + +### Reproduction Steps + +1. Configure Component A to pull outputs from Component B using template functions +2. Verify normal operation (plan/apply works correctly) +3. Remove or comment out Component B references in the manifest (leaving state/resources intact) +4. Attempt to run plan/apply/output on Component A +5. Observe silent/confusing failure with exit code 1 + +### Root Cause Analysis + +The actual root cause was in **component type detection fallback logic**, not in the YAML functions themselves: + +#### Error Chain Contamination + +When `!terraform.output` or `atmos.Component()` fails because a referenced component is not found: + +1. `DescribeComponent` returns `ErrInvalidComponent` error +2. This error was wrapped with `%w`, preserving the error chain +3. In `detectComponentType()`, when checking if the main component exists: + ```go + if !errors.Is(err, errUtils.ErrInvalidComponent) { + return result, err + } + // Try Helmfile... then Packer... + ``` +4. Since the error from YAML function processing **also chained to `ErrInvalidComponent`**, the fallback was triggered +5. The system then tried to find the main component as Helmfile, then Packer +6. The **final error** was about the main component not being found as Helmfile/Packer, **not** about the missing referenced component + +#### The Chain Problem + +``` +Original error: ErrInvalidComponent: Could not find component-B in stack... + ↓ (wrapped with %w) +Executor error: failed to describe component-B: + ↓ (wrapped with %w) +YAML function error: failed to get terraform output...: + ↓ +detectComponentType sees ErrInvalidComponent → triggers fallback → confusing error +``` + +### Solution Design + +#### Breaking the Error Chain + +The fix is to **break the `ErrInvalidComponent` chain** when wrapping errors from `DescribeComponent` in YAML functions and template functions. This ensures that: + +1. Errors about the main component not being found → triggers type fallback (correct) +2. Errors about referenced components not being found → returned immediately with clear message (fixed) + +#### Affected Functions + +The fix applies to all three functions that can reference other components: + +| Function | Location | Status | +|----------|----------|--------| +| `!terraform.output` | `pkg/terraform/output/executor.go` | ✅ Fixed - added `wrapDescribeError()` | +| `!terraform.state` | `internal/exec/terraform_state_utils.go` | ✅ Already correct - uses `%v` instead of `%w` | +| `atmos.Component()` | `internal/exec/template_funcs_component.go` | ✅ Fixed - added `wrapComponentFuncError()` | + +#### Implementation Details + +**File: `pkg/terraform/output/executor.go`** (for `!terraform.output`) + +Added `wrapDescribeError()` helper that breaks the `ErrInvalidComponent` chain: + +```go +func wrapDescribeError(component, stack string, err error) error { + if errors.Is(err, errUtils.ErrInvalidComponent) { + // Break the ErrInvalidComponent chain by using ErrDescribeComponent as the base. + // This ensures that errors from YAML function processing don't trigger + // fallback to try other component types. + return fmt.Errorf("%w: component '%s' in stack '%s': %s", + errUtils.ErrDescribeComponent, component, stack, err.Error()) + } + // For other errors, preserve the full chain. + return fmt.Errorf("failed to describe component %s in stack %s: %w", component, stack, err) +} +``` + +**File: `internal/exec/template_funcs_component.go`** (for `atmos.Component()`) + +Added `wrapComponentFuncError()` helper with the same pattern for `atmos.Component()`. + +**File: `internal/exec/terraform_state_utils.go`** (for `!terraform.state`) + +Already correctly breaks the chain using `%v` instead of `%w` for the underlying error (no changes needed): + +```go +er := fmt.Errorf("%w `%s` in stack `%s`\nin YAML function: `%s`\n%v", + errUtils.ErrDescribeComponent, component, stack, yamlFunc, err) +// ^^-- uses %v, not %w +``` + +The key difference is `%v` formats the error as a string, breaking the error chain, while `%w` would preserve the chain and allow `errors.Is()` to match the wrapped error. + +**File: `internal/exec/yaml_func_utils.go`** + +Added debug logging (was already in place from earlier fixes): + +```go +case string: + result, err := processCustomTagsWithContext(...) + if err != nil { + log.Debug("Error processing YAML function", + "value", v, + "stack", currentStack, + "error", err.Error(), + ) + firstErr = err + return v + } + return result +``` + +### Files Modified + +| File | Change | +|------|--------| +| `pkg/terraform/output/executor.go` | Added `wrapDescribeError()` helper, updated `GetOutput` and `fetchAndCacheOutputs` | +| `internal/exec/template_funcs_component.go` | Added `wrapComponentFuncError()` helper, updated `componentFunc` | +| `pkg/terraform/output/executor_test.go` | Added `TestWrapDescribeError_BreaksErrInvalidComponentChain` | +| `internal/exec/template_funcs_component_test.go` | Added `TestWrapComponentFuncError_BreaksErrInvalidComponentChain` | + +### Testing + +1. **Unit tests:** Added tests verifying `ErrInvalidComponent` chain is broken +2. **Error chain verification:** Tests use `assert.NotErrorIs(t, result, errUtils.ErrInvalidComponent)` +3. **Backward compatibility:** + - Missing outputs still return `nil` (backward compatible) + - Other errors (network, auth) preserve full error chain + +### Key Insight + +The issue was NOT about silent nil returns or error swallowing in YAML functions. Those behaviors are **correct**: +- Missing output with no default → return `nil` (backward compatible) +- Errors properly propagate up + +The issue was that the **component type fallback logic** was incorrectly triggered when a **referenced** component was not found, because the error chain included `ErrInvalidComponent`. + +--- + +## Issue #1921: Panic in !terraform.output with Authentication + +**GitHub Issue:** https://github.com/cloudposse/atmos/issues/1921 + +**Reporter:** leoagueci + +**Status:** Complete + +### Problem Statement + +When using `!terraform.output` YAML function with authentication configured (e.g., AWS SSO provider), Atmos panics with the error: + +``` +panic: authContextWrapper.GetChain should not be called +``` + +This occurs when processing nested component references that require authentication context propagation. + +### Reproduction Steps + +1. Configure Atmos authentication (e.g., AWS SSO provider in `atmos.yaml`) +2. Create a component that uses `!terraform.output` to reference another component +3. Run `atmos terraform plan` on the component +4. Observe the panic exception + +### Root Cause Analysis + +The `authContextWrapper` is a minimal `AuthManager` implementation used to propagate authentication context through nested component processing. It's designed to only provide `GetStackInfo()` for passing `AuthContext` to `ExecuteDescribeComponent`. + +When a nested component has its own auth configuration with a `default: true` identity, the `resolveAuthManagerForNestedComponent()` function tries to inherit the identity from the parent: + +```go +// In createComponentAuthManager(): +if parentAuthManager != nil { + chain := parentAuthManager.GetChain() // ← PANIC here + if len(chain) > 0 { + identityName = chain[len(chain)-1] + } +} +``` + +The `authContextWrapper.GetChain()` method was implemented as a panic stub because it was originally assumed this method would never be called on the wrapper. + +### Call Chain Leading to Panic + +``` +!terraform.output processing + ↓ +componentFunc() creates authContextWrapper from AuthContext + ↓ +ExecuteDescribeComponent(authManager: wrapper) + ↓ +Component has auth config with default identity + ↓ +resolveAuthManagerForNestedComponent(parentAuthManager: wrapper) + ↓ +createComponentAuthManager(parentAuthManager: wrapper) + ↓ +parentAuthManager.GetChain() → PANIC! +``` + +### Solution Design + +Changed `GetChain()` in `authContextWrapper` to return an empty slice instead of panicking: + +```go +func (a *authContextWrapper) GetChain() []string { + defer perf.Track(nil, "exec.authContextWrapper.GetChain")() + + // Return empty slice instead of panicking. + // This wrapper doesn't track the authentication chain; it only propagates auth context. + // When used in resolveAuthManagerForNestedComponent, an empty chain means + // no inherited identity, so the component will use its own defaults. + return []string{} +} +``` + +An empty chain means no inherited identity from the wrapper, so the nested component will use its own default identity configuration. This is the correct behavior. + +### Files Modified + +| File | Change | +|------|--------| +| `internal/exec/terraform_output_utils.go` | Changed `GetChain()` to return empty slice instead of panic | +| `internal/exec/terraform_output_utils_test.go` | Updated test to expect empty slice; added regression test | + +### Testing + +1. **Regression test:** Added `TestAuthContextWrapper_GetChain_NoLongerPanics` to verify no panic +2. **Behavior verification:** Updated `TestAuthContextWrapper_PanicMethods` to expect empty slice for `GetChain()` +3. **Unit test:** Verifies `GetChain()` returns empty slice + +### Key Insight + +The `authContextWrapper` was originally designed with panic stubs for all methods except `GetStackInfo()`. However, the auth system evolved to call `GetChain()` on any `AuthManager` passed to nested component resolution. The fix ensures backward compatibility while supporting the auth chain inheritance logic. + +--- + +## Template for Additional Issues + +### Issue #XXXX: [Title] + +**GitHub Issue:** https://github.com/cloudposse/atmos/issues/XXXX + +**Reporter:** [username] + +**Status:** [Not Started | In Progress | Complete] + +#### Problem Statement + +[Description of the issue] + +#### Root Cause Analysis + +[Technical analysis of what's causing the issue] + +#### Solution Design + +[Proposed fix approach] + +#### Files to Modify + +[List of files and changes] + +#### Testing + +[How the fix will be tested] diff --git a/errors/error_funcs.go b/errors/error_funcs.go index 6c7282a877..e80d3e9e0f 100644 --- a/errors/error_funcs.go +++ b/errors/error_funcs.go @@ -363,3 +363,23 @@ func CheckErrorPrintAndExit(err error, title string, suggestion string) { func Exit(exitCode int) { OsExit(exitCode) } + +// WrapComponentDescribeError wraps an error from DescribeComponent, breaking the +// ErrInvalidComponent chain to prevent triggering component type fallback in detectComponentType. +// +// This is critical for proper error propagation when a referenced component is not found. +// Without breaking the chain, detectComponentType incorrectly tries Helmfile/Packer fallbacks, +// producing confusing "component not found as Helmfile/Packer" errors instead of clear +// "referenced component missing" errors. +// +// The context parameter describes where the error occurred (e.g., "atmos.Component", "!terraform.output"). +func WrapComponentDescribeError(component, stack string, err error, context string) error { + if goerrors.Is(err, ErrInvalidComponent) { + // Break the ErrInvalidComponent chain by using ErrDescribeComponent as the base. + // The original error message is preserved for debugging via %s (not %w). + return fmt.Errorf("%w: %s(%s, %s): %s", + ErrDescribeComponent, context, component, stack, err.Error()) + } + // For other errors, preserve the full chain. + return fmt.Errorf("%s(%s, %s) failed: %w", context, component, stack, err) +} diff --git a/internal/exec/template_env_function_test.go b/internal/exec/template_env_function_test.go new file mode 100644 index 0000000000..03612b0242 --- /dev/null +++ b/internal/exec/template_env_function_test.go @@ -0,0 +1,155 @@ +package exec + +import ( + "testing" + + "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + + "github.com/cloudposse/atmos/pkg/schema" +) + +// TestProcessTmplWithDatasources_EnvFunction tests that the Sprig `env` template function +// correctly retrieves environment variables when they are set. +// +// The `env` function is provided by Sprig (github.com/Masterminds/sprig/v3) and simply +// calls os.Getenv() to retrieve environment variable values. +// +// This test verifies that templates like: +// +// role_arn: 'arn:aws:iam::{{ env "ACCOUNT_ID" }}:role/{{ .vars.namespace }}-{{ env "ROLE_SUFFIX" }}' +// +// Work correctly when the environment variables are set. +func TestProcessTmplWithDatasources_EnvFunction(t *testing.T) { + // Set up test environment variable + t.Setenv("TEST_ACCOUNT_ID", "123456789012") + t.Setenv("TEST_ROLE_SUFFIX", "terraform-backend") + + atmosConfig := &schema.AtmosConfiguration{ + Templates: schema.Templates{ + Settings: schema.TemplatesSettings{ + Enabled: true, + Sprig: schema.TemplatesSettingsSprig{ + Enabled: true, + }, + Gomplate: schema.TemplatesSettingsGomplate{ + Enabled: true, + }, + }, + }, + } + + configAndStacksInfo := &schema.ConfigAndStacksInfo{} + settingsSection := schema.Settings{} + + // Template that uses env function similar to user's use case + tmplValue := ` +terraform: + backend: + s3: + assume_role: + role_arn: 'arn:aws:iam::{{ env "TEST_ACCOUNT_ID" }}:role/{{ .vars.namespace }}-{{ env "TEST_ROLE_SUFFIX" }}' +` + + tmplData := map[string]any{ + "vars": map[string]any{ + "namespace": "acme", + }, + } + + result, err := ProcessTmplWithDatasources( + atmosConfig, + configAndStacksInfo, + settingsSection, + "test-template", + tmplValue, + tmplData, + true, + ) + + require.NoError(t, err) + + // Check that env values are populated + assert.Contains(t, result, "123456789012", "env TEST_ACCOUNT_ID should be populated") + assert.Contains(t, result, "terraform-backend", "env TEST_ROLE_SUFFIX should be populated") + assert.Contains(t, result, "acme", ".vars.namespace should be populated") + + // The full ARN should be correct + assert.Contains(t, result, "arn:aws:iam::123456789012:role/acme-terraform-backend") + + t.Log("Result:", result) +} + +// TestProcessTmplWithDatasources_EnvFunction_UnsetEnvVar tests the expected behavior when +// environment variables are NOT set. +// +// IMPORTANT: When an environment variable is not set, Sprig's `env` function returns an +// empty string. This is the standard behavior of os.Getenv() and is NOT a bug. +// +// This test documents the behavior reported in GitHub issue where users see templates like: +// +// arn:aws:iam::{{ env "ACCOUNT_ID" }}:role/... +// +// Rendered as: +// +// arn:aws:iam:::role/... +// +// The solution is to ensure the required environment variables are set before running +// atmos commands. The `env` template function reads from the OS environment at the time +// the template is processed. +func TestProcessTmplWithDatasources_EnvFunction_UnsetEnvVar(t *testing.T) { + // Intentionally NOT setting the env vars to test what happens when they're missing. + // This demonstrates the expected behavior - NOT a bug. + + atmosConfig := &schema.AtmosConfiguration{ + Templates: schema.Templates{ + Settings: schema.TemplatesSettings{ + Enabled: true, + Sprig: schema.TemplatesSettingsSprig{ + Enabled: true, + }, + Gomplate: schema.TemplatesSettingsGomplate{ + Enabled: true, + }, + }, + }, + } + + configAndStacksInfo := &schema.ConfigAndStacksInfo{} + settingsSection := schema.Settings{} + + // Template that uses env function - env vars NOT set + tmplValue := ` +terraform: + backend: + s3: + assume_role: + role_arn: 'arn:aws:iam::{{ env "UNSET_ACCOUNT_ID" }}:role/{{ .vars.namespace }}-{{ env "UNSET_ROLE_SUFFIX" }}' +` + + tmplData := map[string]any{ + "vars": map[string]any{ + "namespace": "acme", + }, + } + + result, err := ProcessTmplWithDatasources( + atmosConfig, + configAndStacksInfo, + settingsSection, + "test-template", + tmplValue, + tmplData, + true, + ) + + require.NoError(t, err) + + // When env var is not set, Sprig's env function returns empty string + // This is the EXPECTED behavior - but matches the user's issue! + // The result will be: 'arn:aws:iam:::role/acme-' + t.Log("Result with unset env vars:", result) + + // This is what the user is experiencing: + assert.Contains(t, result, "arn:aws:iam:::role/acme-", "env returns empty string when env var not set") +} diff --git a/internal/exec/template_funcs_component.go b/internal/exec/template_funcs_component.go index c5c90c293e..79bdf0ba85 100644 --- a/internal/exec/template_funcs_component.go +++ b/internal/exec/template_funcs_component.go @@ -6,6 +6,7 @@ import ( "github.com/samber/lo" + errUtils "github.com/cloudposse/atmos/errors" "github.com/cloudposse/atmos/pkg/auth" cfg "github.com/cloudposse/atmos/pkg/config" log "github.com/cloudposse/atmos/pkg/logger" @@ -59,7 +60,7 @@ func componentFunc( AuthManager: authMgr, }) if err != nil { - return nil, err + return nil, errUtils.WrapComponentDescribeError(component, stack, err, "atmos.Component") } // Process Terraform remote state. @@ -81,7 +82,7 @@ func componentFunc( } terraformOutputs, err = tfoutput.ExecuteWithSections(atmosConfig, component, stack, sections, authContext) if err != nil { - return nil, err + return nil, fmt.Errorf("atmos.Component(%s, %s) failed to get terraform outputs: %w", component, stack, err) } } diff --git a/internal/exec/template_funcs_component_test.go b/internal/exec/template_funcs_component_test.go index 02d019982f..3c3f08e4f2 100644 --- a/internal/exec/template_funcs_component_test.go +++ b/internal/exec/template_funcs_component_test.go @@ -1,15 +1,19 @@ package exec import ( + "errors" + "fmt" "os" "os/exec" "path/filepath" "testing" - log "github.com/cloudposse/atmos/pkg/logger" "github.com/stretchr/testify/assert" + "github.com/stretchr/testify/require" + errUtils "github.com/cloudposse/atmos/errors" cfg "github.com/cloudposse/atmos/pkg/config" + log "github.com/cloudposse/atmos/pkg/logger" "github.com/cloudposse/atmos/pkg/schema" u "github.com/cloudposse/atmos/pkg/utils" ) @@ -89,3 +93,67 @@ func TestComponentFunc(t *testing.T) { // Helmfile components don't have `outputs` (terraform output) - this should result in `` assert.Contains(t, y, "baz: ") } + +// TestWrapComponentDescribeError_BreaksErrInvalidComponentChain tests that WrapComponentDescribeError +// correctly breaks the ErrInvalidComponent chain to prevent component type fallback. +// This is a regression test for https://github.com/cloudposse/atmos/issues/1030. +func TestWrapComponentDescribeError_BreaksErrInvalidComponentChain(t *testing.T) { + tests := []struct { + name string + inputErr error + wantErrDescribe bool + wantErrInvalid bool + wantMsgContains string + }{ + { + name: "ErrInvalidComponent chain is broken", + // Use fmt.Errorf with %w for proper error wrapping (causality chain). + inputErr: fmt.Errorf("component not found: %w", errUtils.ErrInvalidComponent), + wantErrDescribe: true, + wantErrInvalid: false, // Chain should be broken + wantMsgContains: "atmos.Component", + }, + { + name: "wrapped ErrInvalidComponent chain is broken", + // Use fmt.Errorf with %w to express "this happened because of that". + inputErr: fmt.Errorf("outer error: %w", errUtils.ErrInvalidComponent), + wantErrDescribe: true, + wantErrInvalid: false, // Chain should be broken + wantMsgContains: "atmos.Component", + }, + { + name: "other errors preserve chain", + inputErr: errors.New("network timeout"), + wantErrDescribe: false, + wantErrInvalid: false, + wantMsgContains: "network timeout", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + result := errUtils.WrapComponentDescribeError("test-comp", "test-stack", tt.inputErr, "atmos.Component") + require.Error(t, result) + + // Check ErrDescribeComponent. + if tt.wantErrDescribe { + assert.ErrorIs(t, result, errUtils.ErrDescribeComponent, + "Expected ErrDescribeComponent in error chain") + } + + // Check ErrInvalidComponent - should NOT be in chain for broken cases. + if tt.wantErrInvalid { + assert.ErrorIs(t, result, errUtils.ErrInvalidComponent, + "Expected ErrInvalidComponent in error chain") + } else { + assert.NotErrorIs(t, result, errUtils.ErrInvalidComponent, + "ErrInvalidComponent should NOT be in error chain (chain should be broken)") + } + + // Check message content. + if tt.wantMsgContains != "" { + assert.Contains(t, result.Error(), tt.wantMsgContains) + } + }) + } +} diff --git a/internal/exec/terraform_output_utils.go b/internal/exec/terraform_output_utils.go index c1d9672bdf..bca58ef5af 100644 --- a/internal/exec/terraform_output_utils.go +++ b/internal/exec/terraform_output_utils.go @@ -75,7 +75,11 @@ func (a *authContextWrapper) Logout(ctx context.Context, identityName string, de func (a *authContextWrapper) GetChain() []string { defer perf.Track(nil, "exec.authContextWrapper.GetChain")() - panic("authContextWrapper.GetChain should not be called") + // Return empty slice instead of panicking. + // This wrapper doesn't track the authentication chain; it only propagates auth context. + // When used in resolveAuthManagerForNestedComponent, an empty chain means + // no inherited identity, so the component will use its own defaults. + return []string{} } func (a *authContextWrapper) ListIdentities() []string { diff --git a/internal/exec/terraform_output_utils_test.go b/internal/exec/terraform_output_utils_test.go index 82c8c5ea70..69a0059e8b 100644 --- a/internal/exec/terraform_output_utils_test.go +++ b/internal/exec/terraform_output_utils_test.go @@ -71,6 +71,30 @@ func TestAuthContextWrapper_AuthenticateProvider(t *testing.T) { assert.ErrorIs(t, err, errUtils.ErrNotImplemented) } +// TestAuthContextWrapper_GetChain_NoLongerPanics is a regression test for Issue #1921. +// When !terraform.output processes a component with auth configured, it creates an authContextWrapper +// to propagate auth context. If the nested component has its own auth config with a default identity, +// resolveAuthManagerForNestedComponent calls GetChain() on the parentAuthManager to inherit the identity. +// Previously, GetChain() panicked, causing the reported bug. Now it returns an empty slice. +func TestAuthContextWrapper_GetChain_NoLongerPanics(t *testing.T) { + authContext := &schema.AuthContext{ + AWS: &schema.AWSAuthContext{ + Profile: "test-profile", + Region: "us-east-1", + }, + } + + wrapper := newAuthContextWrapper(authContext) + require.NotNil(t, wrapper) + + // This should NOT panic (was the bug in #1921). + require.NotPanics(t, func() { + chain := wrapper.GetChain() + // Should return empty slice, indicating no inherited identity chain. + assert.Empty(t, chain) + }) +} + func TestAuthContextWrapper_PanicMethods(t *testing.T) { wrapper := &authContextWrapper{ stackInfo: &schema.ConfigAndStacksInfo{}, @@ -112,10 +136,11 @@ func TestAuthContextWrapper_PanicMethods(t *testing.T) { _ = wrapper.Logout(ctx, "test", false) }) - // Test that GetChain panics. - assert.Panics(t, func() { - _ = wrapper.GetChain() - }) + // Test that GetChain returns empty slice (no panic). + // Fixed in #1921: GetChain is called by resolveAuthManagerForNestedComponent, + // so it must not panic. An empty slice means no inherited identity. + chain := wrapper.GetChain() + assert.Empty(t, chain, "GetChain should return empty slice") // Test that ListIdentities panics. assert.Panics(t, func() { diff --git a/internal/exec/yaml_func_utils.go b/internal/exec/yaml_func_utils.go index 5a13c7673b..bbff90f6d6 100644 --- a/internal/exec/yaml_func_utils.go +++ b/internal/exec/yaml_func_utils.go @@ -4,6 +4,7 @@ import ( "fmt" "strings" + log "github.com/cloudposse/atmos/pkg/logger" "github.com/cloudposse/atmos/pkg/perf" "github.com/cloudposse/atmos/pkg/schema" u "github.com/cloudposse/atmos/pkg/utils" @@ -73,6 +74,11 @@ func processNodesWithContext( case string: result, err := processCustomTagsWithContext(atmosConfig, v, currentStack, skip, resolutionCtx, stackInfo) if err != nil { + log.Debug("Error processing YAML function", + "value", v, + "stack", currentStack, + "error", err.Error(), + ) firstErr = err return v } diff --git a/pkg/terraform/output/executor.go b/pkg/terraform/output/executor.go index d5f29b7c2f..7139568756 100644 --- a/pkg/terraform/output/executor.go +++ b/pkg/terraform/output/executor.go @@ -33,6 +33,11 @@ const ( maxLogValueLen = 100 ) +// wrapDescribeError wraps an error from DescribeComponent using the shared helper. +func wrapDescribeError(component, stack string, err error) error { + return errUtils.WrapComponentDescribeError(component, stack, err, "component") +} + // terraformOutputsCache caches terraform outputs by stack-component slug. var terraformOutputsCache = sync.Map{} @@ -249,7 +254,7 @@ func (e *Executor) GetOutput( }) if err != nil { u.PrintfMessageToTUI(terminal.EscResetLine+"%s %s\n", theme.Styles.XMark, message) - return nil, false, fmt.Errorf("failed to describe component %s in stack %s: %w", component, stack, err) + return nil, false, wrapDescribeError(component, stack, err) } // Check for static remote state backend. @@ -323,7 +328,7 @@ func (e *Executor) fetchAndCacheOutputs( ProcessYamlFunctions: true, }) if err != nil { - return nil, fmt.Errorf("failed to describe component %s in stack %s: %w", component, stack, err) + return nil, wrapDescribeError(component, stack, err) } // Check for static remote state backend. diff --git a/pkg/terraform/output/executor_test.go b/pkg/terraform/output/executor_test.go index f246d3c409..dca6bec003 100644 --- a/pkg/terraform/output/executor_test.go +++ b/pkg/terraform/output/executor_test.go @@ -3,6 +3,7 @@ package output import ( "context" "errors" + "fmt" "testing" "github.com/hashicorp/terraform-exec/tfexec" @@ -657,3 +658,74 @@ func TestGetOutputVariable(t *testing.T) { }) } } + +// TestWrapDescribeError_BreaksErrInvalidComponentChain tests that wrapDescribeError +// correctly breaks the ErrInvalidComponent chain to prevent component type fallback. +// This is a regression test for https://github.com/cloudposse/atmos/issues/1030. +func TestWrapDescribeError_BreaksErrInvalidComponentChain(t *testing.T) { + tests := []struct { + name string + inputErr error + wantErrDescribe bool + wantErrInvalid bool + wantMsgContains string + }{ + { + name: "ErrInvalidComponent chain is broken", + // Use fmt.Errorf with %w for proper error wrapping (causality chain). + inputErr: fmt.Errorf("component not found: %w", errUtils.ErrInvalidComponent), + wantErrDescribe: true, + wantErrInvalid: false, // Chain should be broken + wantMsgContains: "component not found", + }, + { + name: "wrapped ErrInvalidComponent chain is broken", + // Use fmt.Errorf with %w to express "this happened because of that". + inputErr: fmt.Errorf("outer error: %w", errUtils.ErrInvalidComponent), + wantErrDescribe: true, + wantErrInvalid: false, // Chain should be broken + wantMsgContains: "component", + }, + { + name: "other errors preserve chain", + inputErr: errors.New("network timeout"), + wantErrDescribe: false, + wantErrInvalid: false, + wantMsgContains: "network timeout", + }, + { + name: "ErrTerraformStateNotProvisioned preserves chain", + inputErr: errUtils.ErrTerraformStateNotProvisioned, + wantErrDescribe: false, + wantErrInvalid: false, + wantMsgContains: "not provisioned", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + result := wrapDescribeError("test-comp", "test-stack", tt.inputErr) + require.Error(t, result) + + // Check ErrDescribeComponent. + if tt.wantErrDescribe { + assert.ErrorIs(t, result, errUtils.ErrDescribeComponent, + "Expected ErrDescribeComponent in error chain") + } + + // Check ErrInvalidComponent - should NOT be in chain for broken cases. + if tt.wantErrInvalid { + assert.ErrorIs(t, result, errUtils.ErrInvalidComponent, + "Expected ErrInvalidComponent in error chain") + } else { + assert.NotErrorIs(t, result, errUtils.ErrInvalidComponent, + "ErrInvalidComponent should NOT be in error chain (chain should be broken)") + } + + // Check message content. + if tt.wantMsgContains != "" { + assert.Contains(t, result.Error(), tt.wantMsgContains) + } + }) + } +} diff --git a/website/blog/2025-12-15-base-path-behavior-change.md b/website/blog/2025-12-15-base-path-behavior-change.md new file mode 100644 index 0000000000..830b9e78fe --- /dev/null +++ b/website/blog/2025-12-15-base-path-behavior-change.md @@ -0,0 +1,122 @@ +--- +slug: base-path-behavior-change +title: "Breaking Change: Empty base_path No Longer Defaults to Current Directory" +authors: + - aknysh +tags: + - breaking-change +date: 2025-12-15T00:00:00.000Z +release: v1.202.0 +--- + +Starting with Atmos v1.202.0, empty or omitted `base_path` values in `atmos.yaml` now trigger git root discovery instead of defaulting to the current directory. Users with multiple Atmos projects in a single repository, or where the Atmos project root differs from the git root, must explicitly set `base_path: "."`. + + + +## What Changed + +The interpretation of `base_path` in `atmos.yaml` has changed: + +**Before (v1.201.x and earlier):** +```yaml +# Both treated as "current directory" +base_path: "" +base_path: "." +``` + +**After (v1.202.0+):** +```yaml +# Triggers git root discovery with fallback +base_path: "" + +# Explicitly uses config file directory relative to the location of the `atmos.yaml` +base_path: "." +``` + +These are no longer equivalent. An empty `base_path` now means "find the git repository root and use that", while `"."` explicitly means "use the directory where `atmos.yaml` is located". + +## Who Is Affected + +You may be affected if: + +1. **Multiple Atmos projects in one repository** - Each project has its own `atmos.yaml` in a subdirectory +2. **Atmos project root differs from git root** - Your `atmos.yaml` is not at the repository root +3. **Using refarch-scaffold or similar patterns** - Where `atmos.yaml` is in a nested directory + +### Symptoms + +If affected, you'll see errors like: + +```text +The atmos.yaml CLI config file specifies the directory for Atmos stacks +as stacks, but the directory does not exist. +``` + +Or: + +```text +atmos exited with code 1 +``` + +## How to Fix + +### Option 1: Explicitly Set base_path (Recommended) + +Update your `atmos.yaml` to explicitly specify the base path: + +```yaml +# Before (no longer works as expected) +base_path: "" + +# After (explicit current directory) +base_path: "." +``` + +### Option 2: Use Relative Paths + +If your stacks and components are relative to the config file: + +```yaml +base_path: "." + +stacks: + base_path: "stacks" + +components: + terraform: + base_path: "components/terraform" +``` + +### Option 3: Navigate to Project Root + +If you prefer empty `base_path`, ensure you run Atmos from the git repository root where `stacks/` and `components/` directories exist. + +## Path Resolution Semantics + +For reference, here's how different `base_path` values are now interpreted: + +| `base_path` value | Resolves to | Use case | +|-------------------|-------------|----------| +| `""` (empty/unset) | Git repo root, fallback to config dir | Default - single project at repo root | +| `"."` | Directory containing `atmos.yaml` | Explicit config-relative paths | +| `".."` | Parent of config directory | Config in subdirectory | +| `"./foo"` | config-dir/foo | Explicit relative path | +| `"foo"` | git-root/foo with fallback | Simple relative path | +| `"/absolute/path"` | As specified | Absolute path override | + +## Why This Change? + +This change was made to support running `atmos` commands from anywhere within a repository, similar to how `git` commands work. The git root discovery enables: + +- Running `atmos terraform plan vpc -s dev` from any subdirectory +- Consistent behavior regardless of current working directory +- Better alignment with developer workflows + +For users with non-standard project layouts, the explicit `base_path: "."` provides the previous behavior. + +## References + +- [PR #1872: Correct base path resolution semantics](https://github.com/cloudposse/atmos/pull/1872) +- [PR #1868: Fix base path resolution and fallback order](https://github.com/cloudposse/atmos/pull/1868) +- [Issue #1858: Path resolution regression](https://github.com/cloudposse/atmos/issues/1858) +- [CLI Configuration Documentation](https://atmos.tools/cli/configuration)