run `terraform plan` when setting `--dry-run` flag on `experimental-deploy` goal #20488

agoblet · 2024-02-05T13:08:24Z

Changed the following things to achieve this

Add a --dry-run flag to the experimental-deploy goal to handle dry runs such as terraform plan
Run plan rather than apply when setting the --dry-run flag while deploying a terraform_deployment
Changed Helm deployments to use the new --dry-run flag instead of a passthrough arg for dry-running

Tested

the terraform change via a unit test and also tried it out manually in my own project.
the helm change via a unit test.

Closes #18490

src/python/pants/core/goals/deploy.py

tgolsson · 2024-02-06T16:08:41Z

Hey,

Thanks for the PR! I'm not too familiar with the terraform backend, so I'll hold off reviewing it for another maintainer/contributor who's more well-versed in that.

With that said, I'd like to leave a few thoughts as someone "in the vincinity" -- I build a k8s plugin. There's a whole set of verbs that make sense in the infrastructure domain that don't necessarily map well to Pants actions. This was raised in the original topic as well. In a fully pants-managed world, what does the user-story look like there? --dry-run is benign because that's likely fairly mappable, but even there a k8s user might ask "on server or client"?

Going further, what about terraform show? terraform destroy? Are those "actions" on deploy? Other goals? It doesn't matter if we get it wrong at this stage since it's an experiental goal -- but I read your comment about how helm works, and figured I'd add more thoughts. I'm not sure if experimental-deploy can run on a glob/multiple targets, but what if you have heterogenous targets? That probably means we have to do flags, not passthrough args.

And taking that one step further... deploy --dry-run is sensible. deploy --action=destroy? deploy --action=scale, but what if terraform, etc.

What I do right now without adopting experimental-deploy yet, is target generation. So if I have a k8s_object(name="k8s"), I generate k8s#get, k8s#apply, and so on. These are all run targets however, so single invocation. To work around that I do a k8s_objects wrapper that handles multiplexing. Terrible hacks all the way down.

agoblet · 2024-02-06T16:49:40Z

Good points @tgolsson! The infra and Pants domains do not map really well indeed. I went with the --dry-run idea because that was proposed in the ticket. There's various edge cases popping up now though now I am working on it and read your comments.

What to do with other operations such as destroy is still an open question I guess. For the way we use terraform, plan and apply are the only operations we really use in an automated way. Destroy is more of a one-off thing that's only necessary every now and then. Could be supported eventually but might not be necessary yet in this experimental stage.

Let's see what other maintainers have to say about it.

agoblet · 2024-02-06T16:59:11Z

I'm not sure if experimental-deploy can run on a glob/multiple targets, but what if you have heterogenous targets?

you can run it for multiple targets I think, pants experimental-deploy :: works. In that case, adding —helm-args might not work indeed when terraform targets are deployed as well. Not sure though, as I’m still a Pants noob 😁

agoblet · 2024-02-06T17:09:29Z

What I do right now without adopting experimental-deploy yet, is target generation. So if I have a k8s_object(name="k8s"), I generate k8s#get, k8s#apply, and so on. These are all run targets however, so single invocation. To work around that I do a k8s_objects wrapper that handles multiplexing. Terrible hacks all the way down.

I have to wrap my head around what that means. We are not yet using Pants. I am investigating how our codebase with lots of terraform, lots of python, and some docker can be managed using Pants. terraform plan is an important part of that puzzle and I thought I could easily implement this —dry-run feature to support that. But maybe using run targets works as well without requiring new features in Pants

lilatomic · 2024-02-06T18:22:17Z

Thanks for the contribution! I had a quick look and it looks good. I'll be able to do a full review tomorrow.
I agree with both of you that there are currently 2 "big picture" questions:

what to do with backends that don't support dry-run
what to do with all the extra "infra" verbs

I think it makes sense to add --dry-run at the goal level. It's common enough in infra that I'd want it standardised. I think we could add a field on DeployFieldSet which language-specific backends could set for whether they support --dry-run. We could then skip those in the run_deploy goal rule.

As for all the extra verbs, I think that at some point it's reasonable to eject from Pants with export and just use the tool. I think as a starting point it's good to never lose functionality by adopting Pants. But also, often these advanced actions don't really map to anything in the Pants domain or across tools, so we don't really add anything by adding them to Pants. For illustration, run just accepts an argument list, since Pants doesn't really have anything to add to that; even the run goal for Docker just has --docker-run-args.

I'm not sure how much discussion has to happen before we could merge this. That is, I'm hoping that we don't have to hash out exactly what's common enough to standardise and what should be left to backends, as I think --dry-run is uncontroversial for that. Implementing a per-backend opt-in/out with DeployFieldSet.supports_dry_run should be simple enough (could even be a fast-follow MR).

agoblet · 2024-02-06T19:54:56Z

Thanks for the feedback @lilatomic !

I think it makes sense to add --dry-run at the goal level.

Yes I think so too. In that case, I’ll extend the PR tomorrow to make sure Helm also uses this flag rather than the passthrough arguments.

As for all the extra verbs, I think that at some point it's reasonable to eject from Pants with export and just use the tool.

This will be something for another PR / issue then.

I'm hoping that we don't have to hash out exactly what's common enough to standardise and what should be left to backends

Not sure how the decision making works here. For me the current idea of the dry-run flag is sufficient for our terraform needs right now.

Implementing a per-backend opt-in/out with DeployFieldSet.supports_dry_run should be simple enough (could even be a fast-follow MR).

From a quick search I could only find helm and terraform as targets for deploy, but please correct me if I’m wrong. If I’m right, then we can omit this field for now, since both helm and terraform support dry running.

agoblet · 2024-02-07T08:44:00Z

@lilatomic based on our discussion yesterday I changed helm to use the new goal-level flag as well.

lilatomic

Thank you for the MR! It looks good to me. I appreciate the thorough tests.
I'll see about getting the CI running and a review from a Maintainer.

lilatomic · 2024-02-08T06:06:53Z

src/python/pants/backend/helm/subsystems/helm.py

@@ -165,7 +164,7 @@ class HelmSubsystem(TemplatedExternalTool):
    )

    args = ArgsListOption(
-        example="--dry-run",
+        example="--force",


I think we should also mention the correct way to pass --dry-run, something like:

use the --experimental-deploy-dry-run flag to pass --dry-run

Good catch. I added it in another place, but I think that was an error message. Added it to this description as well.

Also added information on dry-running into the helm and terraform docs.

agoblet · 2024-02-13T10:59:14Z

ping @alonsodomin @lilatomic

alonsodomin · 2024-02-13T13:44:22Z

Sorry for the delay chiming on here. The addition of the --dry-run flag to deploy-like goal makes sense to me, as @lilatomic said, it's common enough in the deployment domains and it translates well into the different tools (helm has a flag, terraform can use the plan command and ansible has a check mode that does the same).

The implementation here looks good to me but leaves me wondering if it's enough. Meaning that when we normally want to run in a dry run mode we expect to get some output from the tool apart from just an OK or KO return code. As is right now I believe that output will be omitted (current implementation of the goal only prints if the action succeeded or not).

I think we should make the experimental-deploy goal print to stdout the output we receive from the given tool, at least in the case in which the execution has failed. A similar implementation to what the test goal does is what I'm thinking right now.

On the wider topic of the existence of the experimental-deploy goal, I agree that there are many kind of operations common in the infrastructure domain that do not map well to Pants. We initially added this goal though as we thought that there are some benefits here in which Pants can help a lot by leveraging the rule graph, i.e.: we can trace all the artifacts that depend on a given deployment and publish them just to make sure that they are available to the infrastructure when we run the deployment. I also see the action of publishing an artifact into a repository as an infrastructure-related action is so ubiquitous on build tools that it raises less eyebrows.

At the moment it's true that experimental-deploy can not handle multiple targets. I did start some work in the past to try to amend that in #19260 and the result is that it is possible although the implementation is really ugly. The main idea would be to use Pants' engine to traverse the dependency graph and perform the right action at each node (similar way as how dependencies are resolved). The limitation here is that, since deploying and publishing are side-effecting actions (and therefore an InteractiveProcess) we can only run them at a @goal_rule and not at a normal @rule, which leaves you with either having to build and traverse the dependency graph yourself or to hack it unwrapping the Process inside the InteractiveProcess and run it as it was not side-effecting (or both!). Either one very ugly options.

That PR is currently parked until I find the time to come up with a better idea (or we totally give up on experimental-deploy). If anyone has any suggestions I'd be happy to receive some input.

PS: Sorry for the wall of text.

agoblet · 2024-02-13T14:46:02Z

Thanks for the review @alonsodomin . If I understand you correctly the only thing we have to do for this PR is the printing of the output. I'm not sure what exactly is missing here, because for terraform_deployment I can already see the output. I did not test this for Helm.

Can you elaborate what's still needed? Do we perhaps need to disable something on the Terraform backend and move that to the Deploy goal?

❯ PANTS_SOURCE=../pants pants experimental-deploy --dry-run ::
Pantsd has been turned off via Env.
15:41:39.32 [INFO] Deploying targets...
module.lambda_function.data.aws_region.current: Reading...
module.lambda_function.data.aws_caller_identity.current: Reading...
module.lambda_function.data.aws_region.current: Read complete after 0s [id=eu-west-1]
module.lambda_function.data.aws_iam_policy_document.assume_role[0]: Reading...
module.some_cool_module.data.aws_iam_policy_document.test_assume_role: Reading...
module.lambda_function.data.aws_partition.current: Reading...
module.lambda_function.data.aws_iam_policy_document.assume_role[0]: Read complete after 0s [id=2690255455]
module.some_cool_module.data.aws_iam_policy_document.test_assume_role: Read complete after 0s [id=1229436035]
module.lambda_function.data.aws_partition.current: Read complete after 0s [id=aws]
module.lambda_function.data.aws_caller_identity.current: Read complete after 0s [id=740837425924]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create
 <= read (data resources)

Terraform will perform the following actions:

  # module.lambda_function.data.aws_iam_policy_document.logs[0] will be read during apply
  # (config refers to values not yet known)
 <= data "aws_iam_policy_document" "logs" {
      + id   = (known after apply)
      + json = (known after apply)

      + statement {
          + actions   = [
              + "logs:CreateLogGroup",
              + "logs:CreateLogStream",
              + "logs:PutLogEvents",
            ]
          + effect    = "Allow"
          + resources = (known after apply)
        }
    }

  # module.lambda_function.aws_cloudwatch_log_group.lambda[0] will be created
  + resource "aws_cloudwatch_log_group" "lambda" {
      + arn               = (known after apply)
      + id                = (known after apply)
      + log_group_class   = (known after apply)
      + name              = "/aws/lambda/my-lambda1"
      + name_prefix       = (known after apply)
      + retention_in_days = 0
      + skip_destroy      = false
      + tags_all          = (known after apply)
    }

  # module.lambda_function.aws_iam_policy.logs[0] will be created
  + resource "aws_iam_policy" "logs" {
      + arn         = (known after apply)
      + id          = (known after apply)
      + name        = "my-lambda1-logs"
      + name_prefix = (known after apply)
      + path        = "/"
      + policy      = (known after apply)
      + policy_id   = (known after apply)
      + tags_all    = (known after apply)
    }

  # module.lambda_function.aws_iam_role.lambda[0] will be created
  + resource "aws_iam_role" "lambda" {
      + arn                   = (known after apply)
      + assume_role_policy    = jsonencode(
            {
              + Statement = [
                  + {
                      + Action    = "sts:AssumeRole"
                      + Effect    = "Allow"
                      + Principal = {
                          + Service = "lambda.amazonaws.com"
                        }
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + create_date           = (known after apply)
      + force_detach_policies = true
      + id                    = (known after apply)
      + managed_policy_arns   = (known after apply)
      + max_session_duration  = 3600
      + name                  = "my-lambda1"
      + name_prefix           = (known after apply)
      + path                  = "/"
      + tags_all              = (known after apply)
      + unique_id             = (known after apply)
    }

  # module.lambda_function.aws_iam_role_policy_attachment.logs[0] will be created
  + resource "aws_iam_role_policy_attachment" "logs" {
      + id         = (known after apply)
      + policy_arn = (known after apply)
      + role       = "my-lambda1"
    }

  # module.lambda_function.aws_lambda_function.this[0] will be created
  + resource "aws_lambda_function" "this" {
      + architectures                  = (known after apply)
      + arn                            = (known after apply)
      + description                    = "My awesome lambda function"
      + filename                       = "../../../dist/src.python.some_lambda/lambda.zip"
      + function_name                  = "my-lambda1"
      + handler                        = "lambda_function.handler"
      + id                             = (known after apply)
      + invoke_arn                     = (known after apply)
      + last_modified                  = (known after apply)
      + memory_size                    = 128
      + package_type                   = "Zip"
      + publish                        = false
      + qualified_arn                  = (known after apply)
      + qualified_invoke_arn           = (known after apply)
      + reserved_concurrent_executions = -1
      + role                           = (known after apply)
      + runtime                        = "python3.9"
      + signing_job_arn                = (known after apply)
      + signing_profile_version_arn    = (known after apply)
      + skip_destroy                   = false
      + source_code_hash               = (known after apply)
      + source_code_size               = (known after apply)
      + tags_all                       = (known after apply)
      + timeout                        = 3
      + version                        = (known after apply)

      + ephemeral_storage {
          + size = 512
        }

      + logging_config {
          + log_format = "Text"
          + log_group  = (known after apply)
        }
    }

  # module.some_cool_module.aws_iam_role.test will be created
  + resource "aws_iam_role" "test" {
      + arn                   = (known after apply)
      + assume_role_policy    = jsonencode(
            {
              + Statement = [
                  + {
                      + Action    = "sts:AssumeRole"
                      + Effect    = "Allow"
                      + Principal = {
                          + Service = "codebuild.amazonaws.com"
                        }
                    },
                ]
              + Version   = "2012-10-17"
            }
        )
      + create_date           = (known after apply)
      + force_detach_policies = false
      + id                    = (known after apply)
      + managed_policy_arns   = (known after apply)
      + max_session_duration  = 3600
      + name                  = "test-pants-role"
      + name_prefix           = (known after apply)
      + path                  = "/"
      + tags_all              = (known after apply)
      + unique_id             = (known after apply)
    }

Plan: 6 to add, 0 to change, 0 to destroy.

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't guarantee to take exactly these actions if you run "terraform apply" now.

✓ src/terraform/some_cool_root_module:some_cool_root_module_deployment deployed

alonsodomin · 2024-02-13T15:09:03Z

agh, my bad, was checking the implementation of the deploy goal in core and assumed the output was not being emitted. It's ok as it is.

agoblet · 2024-02-16T09:26:16Z

@alonsodomin ping. The CI passes. Do you need me to merge main into this? can you get PR merged?

alonsodomin · 2024-02-16T10:24:47Z

just approved and merged, many thanks!

lilatomic · 2024-02-19T00:23:03Z

Thanks again for the contribution!

agoblet added 2 commits February 5, 2024 08:54

run terraform plan when setting dry-run flag

b0f2006

add unit test

d03c6a8

agoblet marked this pull request as ready for review February 5, 2024 16:45

agoblet commented Feb 6, 2024

View reviewed changes

src/python/pants/core/goals/deploy.py Show resolved Hide resolved

Merge branch 'main' into feature/18490-terraform-plan

4a805e4

change helm dry-run to depend on goal flag

311d03f

agoblet added 2 commits February 7, 2024 09:45

Merge branch 'main' into feature/18490-terraform-plan

bb20edb

Merge branch 'main' into feature/18490-terraform-plan

0f6b863

lilatomic approved these changes Feb 8, 2024

View reviewed changes

lilatomic requested a review from alonsodomin February 8, 2024 06:13

huonw added the category:new feature label Feb 8, 2024

agoblet and others added 10 commits February 8, 2024 10:25

Merge branch 'main' into feature/18490-terraform-plan

9817ef9

fix mypy issues in tests

478b8c6

add documentation about dry running helm

5656e49

update helm and terraform docs

c2036a6

fix flake8

b0d9563

Merge branch 'main' into feature/18490-terraform-plan

3333b1a

Merge branch 'main' into feature/18490-terraform-plan

6f051ad

Merge branch 'main' into feature/18490-terraform-plan

dbc91d2

Merge branch 'main' into feature/18490-terraform-plan

9c35c4a

Merge branch 'main' into feature/18490-terraform-plan

841a71f

alonsodomin approved these changes Feb 16, 2024

View reviewed changes

alonsodomin merged commit 7c1e0b7 into pantsbuild:main Feb 16, 2024

agoblet deleted the feature/18490-terraform-plan branch February 16, 2024 11:56

purajit mentioned this pull request Jun 26, 2024

Bump default terraform version to 1.9.0, add all known terraform versions inbetween #21110

Merged

Uh oh!

run terraform plan when setting --dry-run flag on experimental-deploy goal #20488

run terraform plan when setting --dry-run flag on experimental-deploy goal #20488

Uh oh!

Conversation

agoblet commented Feb 5, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

tgolsson commented Feb 6, 2024

Uh oh!

agoblet commented Feb 6, 2024

Uh oh!

agoblet commented Feb 6, 2024

Uh oh!

agoblet commented Feb 6, 2024

Uh oh!

lilatomic commented Feb 6, 2024

Uh oh!

agoblet commented Feb 6, 2024

Uh oh!

agoblet commented Feb 7, 2024

Uh oh!

lilatomic left a comment

Choose a reason for hiding this comment

Uh oh!

lilatomic Feb 8, 2024

Choose a reason for hiding this comment

Uh oh!

agoblet Feb 8, 2024

Choose a reason for hiding this comment

Uh oh!

agoblet Feb 8, 2024

Choose a reason for hiding this comment

Uh oh!

agoblet commented Feb 13, 2024

Uh oh!

alonsodomin commented Feb 13, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

agoblet commented Feb 13, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

alonsodomin commented Feb 13, 2024

Uh oh!

agoblet commented Feb 16, 2024

Uh oh!

alonsodomin commented Feb 16, 2024

Uh oh!

lilatomic commented Feb 19, 2024

Uh oh!

Uh oh!

run `terraform plan` when setting `--dry-run` flag on `experimental-deploy` goal #20488

run `terraform plan` when setting `--dry-run` flag on `experimental-deploy` goal #20488

agoblet commented Feb 5, 2024 •

edited

Loading

alonsodomin commented Feb 13, 2024 •

edited

Loading

agoblet commented Feb 13, 2024 •

edited

Loading