Ctrl+C leaves orphaned GCS state lock - SIGKILL sent before OpenTofu can release lock #5170

@bradenwright

Describe the bug

When pressing Ctrl+C during a Terraform/OpenTofu operation with a GCS backend, Terragrunt logs "Gracefully shutting down..." but sends SIGKILL to the subprocess within ~3ms. This doesn't give OpenTofu enough time to release the remote state lock, leaving it orphaned in GCS.

The same test with a local backend works correctly - OpenTofu receives the signal and shuts down gracefully.

Steps To Reproduce

main.tf:

terraform {
  required_providers {
    time = { source = "hashicorp/time", version = "~> 0.9" }
  }
}

resource "time_sleep" "wait" {
  create_duration = "10s"
}

terragrunt.hcl:

terraform { source = "./" }

remote_state {
  backend = "gcs"
  config = {
    bucket   = "your-bucket"
    prefix   = "test"
    location = "us-east1"
  }
  generate = { path = "backend.tf", if_exists = "overwrite_terragrunt" }
}

Commands:

terragrunt init
terragrunt apply -auto-approve
# Press Ctrl+C once while "time_sleep.wait: Creating..." is shown
terragrunt apply -auto-approve  # Fails with lock error

Expected behavior

After pressing Ctrl+C once, OpenTofu should have enough time to release the GCS state lock before being terminated. The subsequent terragrunt apply should not fail with a lock error.

Must haves

  • Steps for reproduction provided.

Nice to haves

  • Terminal output

Terminal output:

23:35:48.882 STDOUT tofu: time_sleep.wait: Creating...
23:35:50.423 STDOUT tofu: Interrupt received.
23:35:50.423 INFO   Interrupt signal received. Gracefully shutting down...
23:35:50.423 STDOUT tofu: Please wait for OpenTofu to exit or data loss may occur.
23:35:50.424 STDOUT tofu: Gracefully shutting down...
23:35:50.424 STDOUT tofu: Stopping operation...
23:35:50.427 ERROR  error occurred:
* Failed to execute "tofu apply -auto-approve" in .terragrunt-cache-test/...
  signal: killed

Note: only ~3ms elapse between tofu's "Gracefully shutting down..." (23:35:50.424) and the "signal: killed" error (23:35:50.427).

Subsequent apply fails:

23:35:58.343 STDERR tofu: │ Error: Error acquiring the state lock
23:35:58.344 STDERR tofu: │ Lock Info:
23:35:58.344 STDERR tofu: │   ID:        1764830148129739
23:35:58.344 STDERR tofu: │   Path:      gs://starwatch-terraform-backend/test-signal-handling/test/default.tflock
23:35:58.344 STDERR tofu: │   Operation: OperationTypeApply

Versions

  • Terragrunt version: v0.93.12
  • OpenTofu/Terraform version: OpenTofu v1.10.7
  • Environment details: macOS 15 (darwin/arm64)

Additional context

  • With a local backend, the same test works correctly - OpenTofu gracefully shuts down and no state corruption occurs.
  • The issue appears to be that Terragrunt sends SIGKILL to the subprocess too quickly after receiving SIGINT, not giving OpenTofu time to release the GCS lock.
