Stacks #3313

@yhakbar

Summary

To reduce code repetition and make it easier to manage Terragrunt codebases, this proposal introduces a layer of abstraction above terragrunt.hcl files called Stacks. Stacks are defined using files named terragrunt.stack.hcl.

Users will interact with Stacks using commands prefixed with terragrunt stack, which will allow them to create, manage, and destroy Stacks.

Motivation

Many Terragrunt users experience repetition across the terragrunt.hcl files in their repositories.

One reason for this might be that, while Terragrunt configurations provide an abstraction for DRY (Don't Repeat Yourself) OpenTofu/Terraform modules, the ability to abstract the Terragrunt configuration itself is somewhat limited.

Users typically use a collection of terragrunt.hcl files, each of which manages an OpenTofu/Terraform module for a single state file. Repeatedly provisioning the same module across multiple environments, or multiple times within the same environment, currently requires replicating the same terragrunt.hcl file for each instantiation of that module.

Users have experienced complications synchronizing updates across multiple terragrunt.hcl files, and have expressed a desire for a more streamlined way to do so.

In addition, Terragrunt code re-use has been largely limited to Terragrunt configurations found on local filesystems. Expanding tooling so that Terragrunt configurations can be shared across repositories would be beneficial, both for the scalability of Terragrunt codebases, and to expand the ways in which Gruntwork customers can leverage configurations maintained by Gruntwork.

Proposal

Introduce a new terragrunt.stack.hcl configuration file that can be used by Terragrunt to manage a Stack.

terragrunt.stack.hcl

The terragrunt.stack.hcl file will have configurations that entirely focus on generating a stack of terragrunt.hcl files. These terragrunt.hcl files will use the same syntax as current Terragrunt configurations, and use existing tooling to integrate into the stack.

An example terragrunt.stack.hcl file might look like this:

locals {
    version = "v0.0.1"
    environment = "dev"
}
 
unit "service" {
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/service?ref=${local.version}"
    path = "service" # default would be github.com/gruntwork-io/terragrunt-stacks/stacks/mock/service
}
 
unit "db" {
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/db?ref=${local.version}"
    path = "db" # default would be github.com/gruntwork-io/terragrunt-stacks/stacks/mock/db
}
 
unit "api" {
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/api?ref=${local.version}"
    path = "api" # default would be github.com/gruntwork-io/terragrunt-stacks/stacks/mock/api
}

In this example, the terragrunt.stack.hcl file defines three Units: service, db, and api. Each unit block points to a directory containing a terragrunt.hcl file, which Terragrunt fetches with go-getter from either a local path or a remote source.

Quick Detour on "Units"

The term "Unit" is language that we haven't standardized externally, but is something that we've been using internally at Gruntwork. It's a way to refer to a single instantiation of an OpenTofu/Terraform module, and we believe the best way to do that is with a terragrunt.hcl file. Whenever you see reference to "Unit", you can mentally replace that with a terragrunt.hcl file. It's a unit of infrastructure, with its own state, potentially integrated into a larger system.

We have yet to standardize this term throughout Terragrunt tooling and documentation, but we believe it's a useful concept to introduce in this proposal.

If you have feedback on this terminology, please share it!

terragrunt.stack.hcl Configuration Continued

Those unit configuration blocks are used to instantiate Terragrunt Units. A unit block accepts two attributes, one required and one optional:

  1. The source attribute (Required): The way in which Terragrunt is going to fetch the relevant directory containing the terragrunt.hcl file.
  2. The path attribute (Optional): The path to the directory where the unit is going to be generated. If not provided, the default path determined by the source will be used. More on this will be discussed later.

The locals block is one that most Terragrunt users are familiar with. It's a way to define reusable variables throughout a Terragrunt stack.

terragrunt stack Commands

In tandem with introducing a new configuration file, Terragrunt will also have a new set of commands that will allow users to interact with Stacks. These commands will be prefixed with terragrunt stack.

  • terragrunt stack generate: This command will generate the stack of Units, using the configurations in the terragrunt.stack.hcl file.

    What this will do is create a .terragrunt-stack directory next to the terragrunt.stack.hcl file, and populate it with content from the Units defined in the terragrunt.stack.hcl file.

    The paths to the units in the .terragrunt-stack directory will be determined by the path attribute in the unit configuration blocks. If the path attribute is not provided, a default path will be determined based on parsing the source attribute.

    e.g.

    # terragrunt.stack.hcl
    unit "service" {
        source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/service"
        path   = "service"
    }
    
    unit "db" {
        source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/db"
        path   = "db"
    }

    Would generate the file structure:

      .terragrunt-stack
      ├── service
      │   └── terragrunt.hcl
      └── db
          └── terragrunt.hcl
    
  • terragrunt stack run *: Similar to the run-all command, the stack run command runs a command across all Units in a Stack. A significant difference from run-all is that stack run executes those commands within the context of the .terragrunt-stack directory, rather than across every Unit recursively discovered in the current directory. The suffix * stands for the command that is forwarded to the underlying wrapped binary that Terragrunt is orchestrating (OpenTofu/Terraform), just as with run-all.

    e.g.

    terragrunt stack run plan

    Would run terraform plan in each of the Units in the .terragrunt-stack directory. If the .terragrunt-stack directory does not exist, the stack run command will generate it first.

    To ensure that users have full control over this process, the stack run command will have a --terragrunt-generate-stack=false flag that will prevent the .terragrunt-stack directory from being generated. The verbosity of this flag is not ideal, but it is in line with the verbosity of other Terragrunt flags. This is something to revisit in the future.

  • terragrunt stack output: In order to be able to interact with the Units within a stack outside of it, the stack output command will be introduced. This command will take the outputs of the Units in the Stack and stitch them together into a single output. This will allow users to interact with the stack as a single unit, rather than having to interact with each Unit individually.

    e.g.

    $ terragrunt stack output
    service.output1 = "output1"
    service.output2 = "output2"
    db.output1 = "output1"
    db.output2 = "output2"

    This will allow users to access the outputs of the Units in the Stack, without having to navigate to each Unit individually.

How terragrunt.hcl Files Are Impacted

One of the main goals of this proposal is to make it so that users can take the exact same terragrunt.hcl files they are using today, and use them as part of a Stack. To that end, users should not expect to need any special syntax in terragrunt.hcl files used in a Stack.

Units are already frequently written with relative paths for their dependency blocks to reference each other.

e.g.

dependency "db" {
  config_path = "../db"
}

A unit with that dependency block would expect to find a folder named db sibling to it in the directory structure. Stacks take advantage of that, allowing them to be generated dynamically using the path attribute in the unit configuration blocks, and the relative paths in the dependency blocks will work within the context of the .terragrunt-stack directory.
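As a rough sketch of what such a Unit might contain, the following terragrunt.hcl for the service Unit consumes an output of its db sibling. The module source and the endpoint output name are assumptions made for illustration; they are not part of this proposal.

```hcl
# .terragrunt-stack/service/terragrunt.hcl (hypothetical contents)

terraform {
  # Hypothetical module source; any OpenTofu/Terraform module would do.
  source = "github.com/example-org/example-modules//service?ref=v0.0.1"
}

dependency "db" {
  # Resolves to .terragrunt-stack/db once the Stack is generated,
  # because both Units are written to sibling directories.
  config_path = "../db"
}

inputs = {
  # `endpoint` is an assumed output name of the db Unit.
  db_endpoint = dependency.db.outputs.endpoint
}
```

Nothing in this file is Stack-specific: the same configuration works whether the service and db directories are committed side by side or generated into .terragrunt-stack.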

In addition, users frequently use the path from a terragrunt.hcl file at the root of the repository or the .git directory to determine where state files are stored for individual units.

In the context of a Stack, the path simply includes the .terragrunt-stack directory with no changes to how the path is currently calculated.

e.g.

Given the following file structure:

/path/to/dir/service/terragrunt.hcl
/path/to/dir/db/terragrunt.hcl

Replacing the contents of dir with:

/path/to/dir/terragrunt.stack.hcl

Will result in the following file structure once a terragrunt stack command is run:

/path/to/dir/terragrunt.stack.hcl
/path/to/dir/.terragrunt-stack/service/terragrunt.hcl
/path/to/dir/.terragrunt-stack/db/terragrunt.hcl

The service and db units will be generated in the .terragrunt-stack directory, and the dependency block in the service unit will be able to reference the db unit using the same relative path ../db.
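To illustrate how the state location picks up the .terragrunt-stack segment, here is a minimal sketch of a root configuration that derives the state key from each Unit's relative path. It assumes a root terragrunt.hcl at /path/to/terragrunt.hcl that every Unit includes via find_in_parent_folders(); the backend, bucket, and region values are placeholders.

```hcl
# /path/to/terragrunt.hcl (hypothetical root configuration)

remote_state {
  backend = "s3"

  config = {
    bucket = "example-state-bucket" # placeholder
    region = "us-east-1"            # placeholder

    # path_relative_to_include() is the path from this file to the
    # including Unit. For the service Unit this would be:
    #   before Stacks: dir/service
    #   in a Stack:    dir/.terragrunt-stack/service
    key = "${path_relative_to_include()}/tofu.tfstate"
  }
}
```

Because the key changes when a Unit moves under .terragrunt-stack, an existing Unit's state would need to be moved to the new key before the generated Unit can manage the same resources.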

The implication for existing terragrunt.hcl files is that they cannot necessarily be easily refactored into Stacks with the initial release due to the need to move state, but it should be trivial to generate new instances of the same Units in a new Stack.

In the future, additional tooling can be explored to help users migrate to Stacks from existing Terragrunt configurations.

How Stacks Use Shared Configurations

A common pattern seen with modern Terragrunt configurations is that they frequently rely on shared configurations via the include configuration block. Users may be familiar with canonical _envcommon directories, that are designed for this in the Gruntwork library. This can be a useful pattern, and one that doesn't have to be abandoned when adopting Stacks.

One benefit of this design, however, is that all Units have a natural alternate location to store shared configurations that they rely on: the terragrunt.stack.hcl file. Units can leverage existing functions like read_terragrunt_config to read configurations from the terragrunt.stack.hcl file.

e.g.

locals {
  stack_config = read_terragrunt_config(find_in_parent_folders("terragrunt.stack.hcl"))

  environment = local.stack_config.locals.environment
}

There may be benefits to introducing new functionality that makes it easier to share configurations across Units in a Stack in the future, but the initial release will not need to include anything besides what Terragrunt can do today.
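Building on the snippet above, a Unit could pass the shared value on to its module as an input. This is only a sketch: it assumes read_terragrunt_config can parse the stack file as this proposal describes, and the input names are hypothetical.

```hcl
# terragrunt.hcl of a Unit inside a Stack (hypothetical contents)

locals {
  # Walk up from the generated Unit directory to the stack file.
  stack_config = read_terragrunt_config(find_in_parent_folders("terragrunt.stack.hcl"))

  environment = local.stack_config.locals.environment
}

inputs = {
  # Hypothetical module inputs derived from the Stack's locals.
  environment = local.environment
  name        = "service-${local.environment}"
}
```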

Nesting Stacks

To mitigate the risk of Stacks becoming too large, or repeated, Stacks are designed to be nestable.

e.g.

locals {
    version = "v0.0.1"
    environment = "dev"
}
 
stack "services" {
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/services?ref=${local.version}"
    path = "services"
}
 
unit "db" {
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/db?ref=${local.version}"
    path = "db"
}

In this example, the services stack will be generated at .terragrunt-stack/services, and the db unit will be generated at .terragrunt-stack/db. Once the services stack is generated, Terragrunt will recursively generate a stack using the contents of the .terragrunt-stack/services/terragrunt.stack.hcl file until it fully generates the stack.

Any terragrunt stack run * commands will run on the top-level stack, picking up all the nested stacks as part of the process.
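For illustration, the terragrunt.stack.hcl fetched for the nested services stack might itself look like the following. The file contents are hypothetical and reuse the mock sources from the earlier examples; the proposal does not specify where Units of a nested stack end up on disk, so no resulting layout is shown.

```hcl
# Hypothetical contents of the terragrunt.stack.hcl found at
# stacks/mock/services in the source repository.

unit "service" {
  source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/service?ref=v0.0.1"
  path   = "service"
}

unit "api" {
  source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/api?ref=v0.0.1"
  path   = "api"
}
```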

Technical Details

To support the introduction of Stacks, the following have to be achieved:

  • A new configuration file will be accepted by Terragrunt: terragrunt.stack.hcl

    The terragrunt.stack.hcl will follow the spec outlined in this proposal. It will support locals and unit blocks.

  • A new command will be introduced: terragrunt stack

    The terragrunt stack command will follow the spec outlined in this proposal. It will support the following subcommands:

    • terragrunt stack generate
    • terragrunt stack output
    • terragrunt stack run *

Considerations:

  • Users will have to .gitignore a new directory: .terragrunt-stack (though they could technically take a vendored approach and commit it).
  • Terragrunt will have a new instance where it will use go-getter to fetch Units for a Stack.
  • Users can write terragrunt.hcl configurations that are invalid in potentially non-obvious ways (they may use paths in their terragrunt.stack.hcl file that don't align with the config_path value in dependency blocks of terragrunt.hcl files).
  • Nested Stacks can result in significantly complicated dependency graphs. It may be hard to reason about a Stack with a large number of nested children.

Press Release

Introducing Terragrunt Stacks!

Stacks are a way to drastically reduce the repetition in Terragrunt codebases by leveraging a new configuration file: terragrunt.stack.hcl.

With the introduction of Stacks, users can now consolidate large numbers of terragrunt.hcl files into a single terragrunt.stack.hcl file.

Stacks are a powerful new feature, and are the largest change to how users write Terragrunt configurations to date.

To get started, try out the new terragrunt stack command, which allows you to create, manage, and destroy Stacks:

mkdir my-stack
cd my-stack
# Quote the heredoc delimiter so the shell does not try to expand ${local.version}.
cat > terragrunt.stack.hcl <<'EOF'
locals {
    version = "v0.0.1"
    environment = "dev"
}

unit "service" {
    # Source is an intentionally broken URL for the press release.
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/service?ref=${local.version}"
    path   = "service" # default would be gruntwork-io/terragrunt-stacks/stacks/mock/service
}

unit "db" {
    # Source is an intentionally broken URL for the press release.
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/db?ref=${local.version}"
    path   = "db" # default would be gruntwork-io/terragrunt-stacks/stacks/mock/db
}

unit "api" {
    # Source is an intentionally broken URL for the press release.
    source = "github.com/gruntwork-io/terragrunt-stacks//stacks/mock/api?ref=${local.version}"
    path   = "api" # default would be gruntwork-io/terragrunt-stacks/stacks/mock/api
}

labels = {
    environment = local.environment
}
EOF

terragrunt stack run plan
terragrunt stack run apply

Drawbacks

Potentially Too Much Abstraction

The largest potential drawback to introducing Stacks is that it is yet another layer of abstraction to how users manage their infrastructure.

Terragrunt is already a fairly complex tool, and adding Stacks on top of it may make it more difficult for users to understand how their infrastructure is being managed.

The ways in which this design attempts to mitigate this drawback include:

  1. Stacks are Optional: Users can continue to use Terragrunt as they always have, and only introduce Stacks where they need to scale their Terragrunt codebase.

  2. Stacks are Explicit: Stacks are defined in a separate file, and are not a hidden feature in Terragrunt configurations. This makes it clear when a user is working with a Stack, and when they are not.

  3. Stacks are Simple: The design of Stacks is intentionally simple, with only a few added configurations and commands introduced in this initial proposal.

  4. Stacks are Familiar: All of the work Stacks do to interact with infrastructure is mediated by terragrunt.hcl files. Users can run terragrunt stack generate, and see a .terragrunt-stack directory that operates exactly like a current Terragrunt codebase without Stacks.

    This behavior falls in line with how the .terragrunt-cache directory was designed, allowing users to run tofu/terraform commands within the directory to achieve the same end result, dropping down a layer of abstraction.

Performance

Users leveraging remote Units as part of their stacks will deal with the performance penalty of fetching those Units from a remote source before running any infrastructure updates.

It's probably not a huge penalty to deal with, but users can always vendor their .terragrunt-stack directories and remove the performance penalty entirely.

Alternatives

_envcommon

The alternative that most Terragrunt users use today is to leverage a directory of shared configurations located in a directory named something like _envcommon.

This directory usually contains a collection of files that use Terragrunt HCL configurations. These files are then included in multiple other terragrunt.hcl files via include configuration blocks using the path attribute.
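A minimal sketch of that pattern is shown below, assuming a repository layout with a root terragrunt.hcl, an _envcommon/service.hcl file next to it, and one directory per environment. The file names and the instance_count input are hypothetical.

```hcl
# dev/service/terragrunt.hcl (hypothetical layout)

include "root" {
  # Root configuration, e.g. remote state and provider settings.
  path = find_in_parent_folders()
}

include "envcommon" {
  # Settings shared by every instance of the service Unit.
  path = "${dirname(find_in_parent_folders())}/_envcommon/service.hcl"
}

inputs = {
  # Only environment-specific overrides live here.
  instance_count = 1
}
```

Every environment carries its own copy of this small file, which is the repetition that Stacks aim to remove.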

This approach is effective at reducing repetition in Terragrunt codebases, and has some advantages over the proposed solution:

  1. The number of committed terragrunt.hcl files directly relates to the number of Units in the codebase. This can make it easier to initiate individual state updates, as there is always a single terragrunt.hcl file that can be run.
  2. Units can be very easily edited within a directory of Units directly.

However, this approach also has some drawbacks:

  1. Synchronizing updates across many terragrunt.hcl files can be difficult, as there is no built-in way to ensure that all terragrunt.hcl files referencing the same _envcommon file are updated.
  2. The _envcommon directory is not independently versioned, and changes to the _envcommon directory can result in updates with large blast radii.

Larger OpenTofu/Terraform Modules

Another alternative is to put more logic into OpenTofu/Terraform modules themselves, and use a single terragrunt.hcl file to manage the larger module.

This approach is also effective at reducing repetition in Terragrunt codebases, and allows users to put more of the logic for managing infrastructure in .tf files if they would prefer that.

The drawbacks to this approach are largely the reason that using Terragrunt is advantageous:

  1. Managing more infrastructure in a single state file increases the blast radius of a single change.
  2. Functionality like run_cmd, before_hook, after_hook, error_hook can't be used to perform additional logic that is not supported by OpenTofu/Terraform.
  3. Separation of concerns is more difficult to achieve, as the logic for configuring disparate reusable infrastructure is all in one terragrunt.hcl file.

Migration Strategy

Users that aren't currently using stacks will have to do some work in order to migrate their existing Terragrunt codebases to use Stacks if they want to take advantage of them.

Creating terragrunt.stack.hcl Files

Taking a collection of terragrunt.hcl files and consolidating them into a single terragrunt.stack.hcl file is the first step in migrating to Stacks.

Users will want to consider where they want their terragrunt.hcl files to live (either in the same repo, as part of a monorepo, or in a different, dedicated repository).

Then, they'll want to decide which Units they want to consolidate into a Stack, and write terragrunt.stack.hcl files to reference those Units.

Migrating State

Users will need to consider how they want to migrate their state files to work with Stacks.

For a gradual adoption of Stacks, users should prioritize using Stacks for net new infrastructure, then consider migrating existing infrastructure to Stacks.

Considerations to take into account when migrating state files include, but are not limited to:

  1. The frequency with which the infrastructure is updated: Users may prioritize migrating state for infrastructure that is updated less frequently to avoid accidentally encountering errors during the migration process.
  2. The blast radius of the infrastructure: Users may prioritize migrating state for infrastructure with a smaller blast radius to reduce the cost of accidental errors during the migration process.
  3. The value of migrating the Units to Stacks: Users may prioritize migrating state for Units that are more frequently repeated in the codebase to reduce the amount of code that needs to be managed as a consequence.

⚠️ Before migrating state, some basic precautions are advised. Users should always back up their state files before migrating them, and have a tested disaster recovery plan if accidental updates to infrastructure occur.

To migrate state files, users will want to follow these steps:

# 1. Pull down the state file from the remote state store
cd /path/to/terragrunt/unit
terragrunt state pull > /tmp/tf.tfstate
# 2. Ensure the stack is generated
cd /path/to/terragrunt/stack
terragrunt stack generate
# 3. Push the state to the new location as part of the Stack
cd /path/to/terragrunt/stack/.terragrunt-stack/path/to/unit
terragrunt state push /tmp/tf.tfstate

Unresolved Questions

How does the community feel about introducing Stacks as a feature in Terragrunt?

This will be a significant change to what users see in Terragrunt codebases, and will require that they be comfortable with the new abstraction.

Are there alternate abstractions Gruntwork should prefer to this?

What is the minimum required feature set of Stacks to make them useful?

There is a lot more planned for Stacks than what is presented in this proposal. One goal here is to present the minimum feature set that will make Stacks useful to users, and to receive feedback from the community.

Is there anything missing from this proposal that immediately jumps to mind as a requirement to make Stacks useful?

How does the community feel about the design of Stacks?

Does this seem like a natural abstraction that fits well within the existing Terragrunt ecosystem? Are there any changes that should be made to make the design work better?

How does the community feel about the terminology used here?

Do the terms "Stack" and "Unit" make sense in the context of Terragrunt? Are there any other terms that might be more appropriate?

How does this proposal fit into the lifecycle of a Terragrunt Unit?

Careful consideration goes into making sure that Terragrunt has good tooling, so that configuration can be introduced into codebases in a sensible and convenient manner and is easy to create, update, manage, use, and remove.

Stacks are viewed as a natural extension of this lifecycle, where Units can be refactored into Stacks when they need to be reused repeatedly.

Does this proposal fit well into that lifecycle?

Labels: accepted (Accepted RFC), feature-complete (All the features desired for this functionality is delivered), generally-available (Breaking changes are minimized. Usage is enabled by default.), rfc (Request For Comments)