(CDK Garbage Collection): stack-scoped garbage collection #32799

Open
@kaizencc

Describe the feature

We recently launched CDK Garbage Collection in CDK v2.165.0. This version of garbage collection is scoped to an individual environment (account + region) due to legacy constraints of the CDK Assets mechanism. With a modernized CDK Assets mechanism, we can scope CDK Garbage Collection to each individual stack, which will fit the mental model of CDK customers better. Additionally, it will fix a theoretical race condition that exists in CDK Garbage Collection today. See: https://github.com/aws/aws-cdk/tree/main/packages/aws-cdk#theoretical-race-condition-with-review_in_progress-stacks

Use Case

Customers who want to garbage collect only the assets managed by their CDK app, disregarding other stacks in the same account/region.

Proposed Solution


Background:

Garbage Collection was completed 10/25/2024 with the following design:

[Image: design diagram]

The main requirement for that design was that garbage collection would fit with the existing asset mechanism so that customers would be able to retroactively clean up their bootstrapped resources. While the initial Garbage Collection achieves exactly that, it comes with the following caveats:

  • Garbage Collection must be done per-environment, not per-stack or per-app. This is because all stacks in an environment share the same bootstrapped S3 bucket / ECR repository and the assets in there are virtually indistinguishable from each other.
  • Garbage Collection has a subtle race condition when dealing with REVIEW_IN_PROGRESS stacks. This again is a result of it being a per-environment operation, so it has to deal with stacks getting created in parallel.

Goal:

A better version of Garbage Collection would be one that can operate on a per-stack basis. This would have the benefit of being a much more contained scope for a delete operation.

Design:

We cannot achieve this with the current version of the asset mechanism because all assets are named via their content-based hash. This means that different stacks can share the same asset in the same environment. One stack not using a particular asset is not enough to say that the asset is isolated because other stacks could be referencing the same one.
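To illustrate why content-hash naming forces an environment-wide view, here is a minimal sketch. It is not the actual `cdk gc` implementation; the `StackReferences` shape, the function name, and the sample hashes are all hypothetical. It only shows that an asset becomes collectible when no stack in the environment references it.

```typescript
// Hypothetical model of one environment: stack name -> content hashes its template references.
type StackReferences = Record<string, string[]>;

// With content-hash naming, an asset is only safe to delete when no stack
// in the whole environment references it.
function collectibleAssets(bucketHashes: string[], stacks: StackReferences): string[] {
  const referenced: Record<string, boolean> = {};
  for (const name of Object.keys(stacks)) {
    for (const hash of stacks[name]) {
      referenced[hash] = true;
    }
  }
  return bucketHashes.filter((hash) => !referenced[hash]);
}

// Sample data (made up): both stacks share the asset "abc123".
const stacks: StackReferences = {
  StackA: ["abc123"],
  StackB: ["abc123", "def456"],
};
const result = collectibleAssets(["abc123", "def456", "old789"], stacks);
```

Even if StackA stopped using `abc123`, the asset would stay referenced through StackB, so every stack in the environment has to be inspected before anything is deleted.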

A new asset upload mechanism would need to ensure each asset is uploaded with an identifier that ties it to the stack. That can look something like this:

/assets/MyStack/.zip
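As a rough sketch of what a stack-scoped layout would buy us, the code below assumes keys of the form `assets/<stack identifier>/<file name>`. The file names, the function names, and the way referenced keys are passed in are all placeholders of this sketch, not a proposed API. The point is that garbage collection for one stack only has to look under that stack's prefix.

```typescript
// Hypothetical key layout: assets/<stackId>/<fileName>.
function stackPrefix(stackId: string): string {
  return "assets/" + stackId + "/";
}

// Per-stack GC only inspects keys under the stack's own prefix and keeps
// whatever that one stack still references.
function collectibleForStack(allKeys: string[], stackId: string, referencedKeys: string[]): string[] {
  const prefix = stackPrefix(stackId);
  return allKeys.filter(
    (key) => key.indexOf(prefix) === 0 && referencedKeys.indexOf(key) === -1
  );
}

// Sample data (made up file names).
const keys = [
  "assets/MyStack/v1.zip",
  "assets/MyStack/v2.zip",
  "assets/OtherStack/v1.zip",
];
const stale = collectibleForStack(keys, "MyStack", ["assets/MyStack/v2.zip"]);
```

Assets under `assets/OtherStack/` are never considered, which is the "contained scope for a delete operation" described in the goal.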

The complexity here would be that a) stacks can be renamed at deploy time, and b) nested stacks would need to be handled correctly.

For a), we would need to make sure that the stack identifier is unique to the stack and traceable back to the stack even if the stack name changes. For this, we can likely reuse template metadata to trace the uploaded name back to the actual stack it represents.
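One way point a) could work, sketched under heavy assumptions: the `DeployedStack` shape and the `AssetScopeId` metadata key below are invented for illustration and do not exist in CDK today. The idea is that the identifier used in the upload path is recorded in the template metadata, so it can be resolved to the owning stack whatever the stack is named at deploy time.

```typescript
// Hypothetical shape of a deployed stack: its current name plus template metadata.
interface DeployedStack {
  stackName: string;
  metadata: Record<string, string>;
}

// Hypothetical metadata key recording the identifier used in the upload path.
const ASSET_SCOPE_KEY = "AssetScopeId";

// Resolve an upload-path identifier back to the stack that owns it,
// regardless of what the stack is named at deploy time.
function findOwner(scopeId: string, deployed: DeployedStack[]): DeployedStack | undefined {
  for (const stack of deployed) {
    if (stack.metadata[ASSET_SCOPE_KEY] === scopeId) {
      return stack;
    }
  }
  return undefined;
}

// Sample data (made up): the stack was synthesized as "MyStack" but deployed under another name.
const deployed: DeployedStack[] = [
  { stackName: "RenamedAtDeploy", metadata: { AssetScopeId: "MyStack" } },
  { stackName: "Unrelated", metadata: {} },
];
const owner = findOwner("MyStack", deployed);
```

An identifier with no owner (the `undefined` case) would indicate that the stack was deleted, so everything under its prefix could be a candidate for collection.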

For b), TBD

Migrating from old to new:

Customers migrating from the old asset mechanism would see all their assets re-uploaded the first time they deploy, but there should be no problem beyond that.

Why should we do this?

This will result in a cleaner experience overall for both cdk gc and assets. In the past, CDK has treated the asset mechanism as an implementation detail, but in practice customers are confused or concerned that assets are not separated out per-stack. This will align better with our customers’ understanding that CDK stacks are independent of each other. For cdk gc, the operation would take a trivial amount of time.

Why should we not do this?

We already have a system that improves on our bootstrap system, called the App Staging Synthesizer. The idea is to bootstrap resources per-stack to separate out bootstrap entirely. We could instead invest more in migrating customers to that system, which negates the need for garbage collection entirely.


Other Information

This may eventually become an RFC when we decide to pick this up. For now, if this is something you are interested in, please 👍 this issue.

Acknowledgements

  • I may be able to implement this feature request
  • This feature might incur a breaking change

CDK version used

2.165.0

Environment details (OS name and version, etc.)

Mac

Metadata

Assignees

No one assigned

    Labels

    effort/medium (Medium work item, several days of effort)
    feature-request (A feature should be added or improved.)
    p2
    package/tools (Related to AWS CDK Tools or CLI)