Description
Community Note
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request. Searching for pre-existing feature requests helps us consolidate datapoints for identical requirements into a single place, thank you!
- Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.
- If you are interested in working on this issue or have submitted a pull request, please leave a comment.
Overview of the Issue
We are running Atlantis 0.31.0 on AWS Fargate. Our VCS is GitLab (on-premises). We use Terragrunt with Terraform, and in Atlantis we use a custom workflow to enable run-all. The Atlantis configuration is generated automatically via terragrunt-atlantis-config.
When we run atlantis apply via a comment in a PR, all apply pipelines are executed. Once all of those pipelines have reached the "succeeded" state, the main pipeline, atlantis/apply, remains in the running state.
The logs of the container will show something like this:
{"level":"info","ts":"2025-01-19T16:47:25.659Z","caller":"events/instrumented_project_command_runner.go:88","msg":"apply success. output available at: https://mygitserver.example.com/rikstv/sre/rikstv.terraform.infra.atlantistesting/-/merge_requests/15","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
{"level":"info","ts":"2025-01-19T16:47:27.254Z","caller":"vcs/gitlab_client.go:408","msg":"Updating GitLab commit status for 'atlantis/apply' to 'running'","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
{"level":"info","ts":"2025-01-19T16:47:27.343Z","caller":"vcs/gitlab_client.go:433","msg":"Pipeline found for commit 682ac035b55d8193a729b02edef6f8e71c8944ab, setting pipeline ID to 202381","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
{"level":"warn","ts":"2025-01-19T16:47:27.438Z","caller":"events/apply_command_runner.go:223","msg":"unable to update commit status: POST https://mygitserver.example.com/api/v4/projects/rikstv/sre/rikstv.terraform.infra.atlantistesting/statuses/682ac035b55d8193a729b02edef6f8e71c8944ab: 400 {message: Cannot transition status via :run from :running (Reason(s): Status cannot transition via \"run\")}","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*ApplyCommandRunner).updateCommitStatus\n\tgithub.com/runatlantis/atlantis/server/events/apply_command_runner.go:223\ngithub.com/runatlantis/atlantis/server/events.(*ApplyCommandRunner).Run\n\tgithub.com/runatlantis/atlantis/server/events/apply_command_runner.go:181\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:383"}
I did some digging in the Go code, tag release-0.31. The error happens in atlantis/server/events/apply_command_runner.go, lines 214-224, running applyCommandRunner.commitStatusUpdater.UpdateCombinedCount.
Unless I am mistaken, that will call the method UpdateCombinedCount from atlantis/server/events/commit_status_updater.go, lines 69-83. In there, we have this definition:
src := fmt.Sprintf("%s/%s", d.StatusName, cmdName.String())
The string src is passed on to the CommitStatusUpdater's Client.UpdateStatus method, whose result is returned to UpdateCombinedCount.
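As a minimal sketch of how that string ends up as atlantis/apply in the logs: the snippet below reproduces only the fmt.Sprintf call quoted above. The commandName type and the statusSource helper are hypothetical stand-ins, and the value "atlantis" for StatusName is an assumption inferred from the log message, not something I checked in the configuration.

```go
package main

import "fmt"

// commandName is a hypothetical stand-in for the Atlantis command name type;
// all it needs for this sketch is a String method.
type commandName string

func (c commandName) String() string { return string(c) }

// statusSource mirrors the fmt.Sprintf call quoted from commit_status_updater.go.
// statusName plays the role of d.StatusName.
func statusSource(statusName string, cmd commandName) string {
	return fmt.Sprintf("%s/%s", statusName, cmd.String())
}

func main() {
	// "atlantis" is assumed here because the log reports 'atlantis/apply'.
	fmt.Println(statusSource("atlantis", commandName("apply")))
}
```

With those assumed inputs, the result matches the status name that appears in the GitLab client log line.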
There are a number of Clients defined, one for each VCS supported by Atlantis. Since we are using GitLab, I went into atlantis/server/events/vcs/gitlab_client.go. In line 411 we have:
logger.Info("Updating GitLab commit status for '%s' to '%s'", src, gitlabState)
So I expected to find a matching line with severity info in the logs, and it is in fact the second line in the log above. That line puts all the pieces together:
{"level":"info","ts":"2025-01-19T16:47:27.254Z","caller":"vcs/gitlab_client.go:408","msg":"Updating GitLab commit status for 'atlantis/apply' to 'running'","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
So, basically, Atlantis is trying to transition a pipeline whose current status is running to running again, which is wrong and is rejected by GitLab. In fact, the first line in the log I posted reports a success, so Atlantis should transition the atlantis/apply pipeline to success instead, not running:
{"level":"info","ts":"2025-01-19T16:47:25.659Z","caller":"events/instrumented_project_command_runner.go:88","msg":"apply success. output available at: https://mygitserver.example.com/rikstv/sre/rikstv.terraform.infra.atlantistesting/-/merge_requests/15","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
So something is wrong in the Atlantis logic that selects the destination state: from running, it should go to success or failed, definitely not to running again.
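To make the expectation concrete, here is a minimal Go sketch of the behaviour I would expect. It is not Atlantis code: the commitStatus type, the finalStatus and shouldSend functions, and their parameters are all hypothetical names, and I have not verified how Atlantis actually computes the combined status.

```go
package main

import "fmt"

// commitStatus is a hypothetical, simplified set of states for this sketch.
type commitStatus string

const (
	statusRunning commitStatus = "running"
	statusSuccess commitStatus = "success"
	statusFailed  commitStatus = "failed"
)

// finalStatus picks the destination state from per-project counts:
// any error means failed, all projects applied means success,
// anything else is still running.
func finalStatus(numSuccess, numErrored, numTotal int) commitStatus {
	switch {
	case numErrored > 0:
		return statusFailed
	case numSuccess == numTotal:
		return statusSuccess
	default:
		return statusRunning
	}
}

// shouldSend reports whether an update is worth sending at all.
// GitLab rejected the running -> running transition in the log above,
// so this sketch skips that case.
func shouldSend(current, next commitStatus) bool {
	return !(current == statusRunning && next == statusRunning)
}

func main() {
	next := finalStatus(3, 0, 3) // three projects, all applied
	fmt.Println(next, shouldSend(statusRunning, next))
}
```

In our case every apply pipeline succeeded, so under these assumptions the destination state would be success and the update would be sent.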
I don't know where that logic is and what should be changed, you guys know better.
Reproduction Steps
- as a simple test, have a terraform stack that just emits outputs, without creating any infrastructure
- have a few terragrunt units, each one with its own state, that use the code mentioned above
In our case, the stack would be located at the path apps/sre/__selftest__, the terraform code would be in the _stack subdirectory, and the units would be in environment-related directories (e.g. dev, prod, ...), each one with a terragrunt.hcl file that refers to the _stack directory. E.g.:
# terragrunt.hcl
include "root" {
  path = find_in_parent_folders()
}
terraform {
  source = "..//_stack"
}

# _stack/main.tf
output "foo" {
  value = "foo"
}
output "bar" {
  value = "bar"
}
You will find our custom workflow in the "Environment details" section. Keep in mind that the workflow is tightly bound to the repository's directory structure: if you use a different directory scheme than the one suggested here, you'll have to update the workflow, too.
Logs
{"level":"info","ts":"2025-01-19T16:47:25.659Z","caller":"events/instrumented_project_command_runner.go:88","msg":"apply success. output available at: https://mygitserver.example.com/rikstv/sre/rikstv.terraform.infra.atlantistesting/-/merge_requests/15","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
{"level":"info","ts":"2025-01-19T16:47:27.254Z","caller":"vcs/gitlab_client.go:408","msg":"Updating GitLab commit status for 'atlantis/apply' to 'running'","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
{"level":"info","ts":"2025-01-19T16:47:27.343Z","caller":"vcs/gitlab_client.go:433","msg":"Pipeline found for commit 682ac035b55d8193a729b02edef6f8e71c8944ab, setting pipeline ID to 202381","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"}}
{"level":"warn","ts":"2025-01-19T16:47:27.438Z","caller":"events/apply_command_runner.go:223","msg":"unable to update commit status: POST https://mygitserver.example.com/api/v4/projects/rikstv/sre/rikstv.terraform.infra.atlantistesting/statuses/682ac035b55d8193a729b02edef6f8e71c8944ab: 400 {message: Cannot transition status via :run from :running (Reason(s): Status cannot transition via \"run\")}","json":{"repo":"rikstv/sre/rikstv.terraform.infra.atlantistesting","pull":"15"},"stacktrace":"github.com/runatlantis/atlantis/server/events.(*ApplyCommandRunner).updateCommitStatus\n\tgithub.com/runatlantis/atlantis/server/events/apply_command_runner.go:223\ngithub.com/runatlantis/atlantis/server/events.(*ApplyCommandRunner).Run\n\tgithub.com/runatlantis/atlantis/server/events/apply_command_runner.go:181\ngithub.com/runatlantis/atlantis/server/events.(*DefaultCommandRunner).RunCommentCommand\n\tgithub.com/runatlantis/atlantis/server/events/command_runner.go:383"}
Environment details
- Atlantis version: 0.31.0
- Deployment method: Atlantis on AWS Fargate Terraform Module; uses an EFS filesystem for persistent storage
- Have you tried to reproduce this issue on the latest version: no, as it appears that there are bugs in 0.32 that would impact us
- Atlantis flags: N/A
Atlantis server-side config file (snippet, as the file is autogenerated through terragrunt-atlantis-config and includes one project per stack in the repo):
automerge: false
parallel_apply: true
parallel_plan: false
projects:
  - autoplan:
      enabled: true
      when_modified:
        - '*.hcl'
        - '*.tf*'
        - '**/*.hcl'
        - '**/*.tf*'
        - ../../../terragrunt.hcl
    dir: apps/sre/__selftest__
    name: apps_sre___selftest__
    workspace: apps_sre___selftest__
Repo atlantis.yaml file:
repos:
  - id: git.rikstv.no/rikstv/sre/rikstv.terraform.infra
    workflow: terragrunt
    apply_requirements: [mergeable, approved]
    pre_workflow_hooks:
      - run: |
          terragrunt-atlantis-config generate --project-hcl-files context.hcl --output atlantis.yaml --autoplan --create-project-name --create-workspace --parallel="false"
          yq -i '.parallel_apply = true' atlantis.yaml
          yq eval 'del(.projects[] | select(.name | contains("atlantis")))' atlantis.yaml -i
  # This should be identical to the real one (rikstv.terraform.infra), but is used for testing atlantis
  # The webhook in the 'atlantistesting' repo is not configured to trigger atlantis in prod
  - id: git.rikstv.no/rikstv/sre/rikstv.terraform.infra.atlantistesting
    workflow: terragrunt
    apply_requirements: [mergeable, approved]
    pre_workflow_hooks:
      - run: |
          terragrunt-atlantis-config generate --project-hcl-files context.hcl --output atlantis.yaml --autoplan --create-project-name --create-workspace --parallel="false"
          yq -i '.parallel_apply = true' atlantis.yaml
          yq eval 'del(.projects[] | select(.name | contains("atlantis")))' atlantis.yaml -i
# Workflow adapted from:
# https://www.runatlantis.io/docs/custom-workflows.html#terragrunt
workflows:
  terragrunt:
    plan:
      steps:
        - env:
            # Reduce Terraform suggestion output
            name: TF_IN_AUTOMATION
            value: "true"
        - env:
            name: TERRAGRUNT_NON_INTERACTIVE
            value: "true"
        - run:
            # Allow for targetted plans/applies as not supported for Terraform wrappers by default
            command: terragrunt run-all plan -input=false $(printf '%s' $COMMENT_ARGS | sed 's/,/ /g' | tr -d '\\') -no-color -out ../../../../atlantis.tfplan
            output: hide
        - run: |
            terragrunt run-all show --terragrunt-parallelism=1 ../../../../atlantis.tfplan
    apply:
      steps:
        - env:
            # Reduce Terraform suggestion output
            name: TF_IN_AUTOMATION
            value: "true"
        - env:
            name: TERRAGRUNT_NON_INTERACTIVE
            value: "true"
        - run: terragrunt run-all apply -input=false ../../../../atlantis.tfplan
    import:
      steps:
        - env:
            name: TF_VAR_author
            command: 'git show -s --format="%ae" $HEAD_COMMIT'
        # Allow for imports as not supported for Terraform wrappers by default
        - run: terragrunt run-all import -input=false $(printf '%s' $COMMENT_ARGS | sed 's/,/ /' | tr -d '\\')
    state_rm:
      steps:
        - run: terragrunt run-all state rm $(printf '%s' $COMMENT_ARGS | sed 's/,/ /' | tr -d '\\')
Additional Context
- Thread in Slack