Describe the Bug
When using Just-In-Time (JIT) source provisioning, the source.ttl cleanup runs concurrently with, or even before, the tofu subprocess instead of being deferred until that subprocess has exited. If the TTL expires at any point while tofu init, tofu plan, or any other tofu command is executing, Atmos deletes the varfiles and backend configuration out from under the running process.
The most reliable way to trigger this is ttl: "0s", which expires immediately and causes a deterministic failure every time. However, any positive TTL short enough to expire before the tofu subprocess finishes (e.g. "30s" on a slow network or large module download) will produce the same failure.
The result is a hard failure from tofu because the generated varfile (and/or backend file) no longer exists on disk:
│ Error: Failed to read variables file
│
│ Given variables file /tmp/atmos-workdir-*/component.tfvars.json does not exist.
Expected Behavior
The TTL cleanup should be scoped to between invocations, not during one. Provisioned files should never be deleted while the subprocess that depends on them is still running. Specifically:
- TTL expiry should only be evaluated before provisioning (stale cache check), not during or after subprocess execution.
- The provisioned workdir should be treated as locked for the duration of the current command: it is held until the subprocess exits and only becomes eligible for TTL-based cleanup on the next invocation.
A source.ttl: "0s" is the degenerate case that makes this deterministic, but the fix must cover all TTL values.
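The ordering described above can be sketched in shell. This is only an illustration of the expected behavior, not how Atmos is implemented: the helper names `is_stale` and `run_with_workdir`, the `.provisioned_at` stamp file, and the use of a TTL in whole seconds (rather than a duration string such as "0s") are all assumptions made for the example.

```shell
# Hypothetical sketch: these helpers are illustrative, not Atmos APIs.

# is_stale PROVISIONED_AT TTL_SECONDS NOW
# Exit status 0 when the cached workdir is at least TTL_SECONDS old.
is_stale() {
  [ $(( $3 - $1 )) -ge "$2" ]
}

# run_with_workdir WORKDIR TTL_SECONDS COMMAND [ARGS...]
run_with_workdir() {
  workdir="$1"; ttl="$2"; shift 2
  stamp="${workdir}/.provisioned_at"   # assumed stamp file recording provisioning time
  now="$(date +%s)"

  # 1) Stale-cache check: the only point where the TTL is evaluated.
  if [ ! -f "${stamp}" ] || is_stale "$(cat "${stamp}")" "${ttl}" "${now}"; then
    rm -rf "${workdir}"
    mkdir -p "${workdir}"
    # Stand-in for the varfile and backend config that would be generated.
    echo '{}' > "${workdir}/component.tfvars.json"
    echo "${now}" > "${stamp}"
  fi

  # 2) Run the command to completion; nothing is removed while it runs.
  ( cd "${workdir}" && "$@" )
}

# Usage: even with a TTL of 0 seconds the varfile exists for the command.
demo_dir="$(mktemp -d)"
run_with_workdir "${demo_dir}/workdir" 0 cat component.tfvars.json   # prints {}
```

With this ordering, a TTL of 0 simply means "re-provision before every command"; it never removes files that the current command still needs.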
Actual Behavior
With ttl: "0s", Atmos generates the varfile and backend configuration, the TTL expires immediately, Atmos deletes the generated files, and tofu then fails:
│ Error: Failed to read variables file
│
│ Given variables file demo-null-label.terraform.tfvars.json does not exist.
Steps to Reproduce
The script below is fully self-contained. It requires only atmos and tofu on PATH and network access to GitHub. Save it as repro.sh and run it.
#!/usr/bin/env bash
# ============================================================
# REPRO: JIT ttl:"0s" deletes varfiles before tofu can read them
# ============================================================
set -euo pipefail
WORKDIR="$(mktemp -d -t atmos-repro-XXXXXX)"
echo "Working in: ${WORKDIR}"
cd "${WORKDIR}"
# --- 1) atmos.yaml ---
cat <<'EOF' > atmos.yaml
base_path: "."
components:
  terraform:
    base_path: "components/terraform"
    command: "tofu"
    workspaces_enabled: true
    apply_auto_approve: false
    deploy_run_init: true
    init_run_reconfigure: true
    auto_generate_backend_file: true
stacks:
  name_template: "{{ .vars.name }}"
  base_path: "stacks"
  included_paths:
    - "**/*"
EOF
# --- 2) Stack with ttl: "0s" on the JIT source ---
mkdir -p stacks
cat <<'EOF' > stacks/demo.yaml
vars:
  name: demo
terraform:
  backend_type: local
components:
  terraform:
    null-label:
      vars:
        # terraform-null-label variables
        namespace: "eg"
        stage: "test"
        name: "demo"
        enabled: true
      source:
        uri: "git::https://github.com/cloudposse/terraform-null-label.git"
        version: "0.25.0"
        ttl: "0s" # <-- triggers the bug: files are wiped before tofu reads them
      provision:
        workdir:
          enabled: true
EOF
echo
echo "== tree =="
find . -maxdepth 4 -type f -print | sed 's|^\./||'
echo
echo "== discovered stacks =="
atmos describe stacks
echo
echo "== describe component =="
atmos describe component null-label -s demo
echo
echo "== init (this is where the failure occurs with ttl:0s) =="
atmos terraform init null-label -s demo
echo
echo "== plan =="
atmos terraform plan null-label -s demo
echo "Done. Workspace preserved at: ${WORKDIR}"
Run:
bash repro.sh 2>&1 | tee repro.log
Screenshots
No response
Environment
Atmos 1.212.0 on darwin/arm64
Additional Context
No response