fix: workspace directories are being deleted every hour by maratkomarov · Pull Request #123 · upbound/provider-opentofu

maratkomarov · 2026-02-12T22:45:45Z

Description of your changes

This started when we observed very long reconciliation loops for our OpenTofu workspaces. We have approximately 200 resources, and it takes about 1 hour to reconcile them all. The resources rarely change, so an observation should run tofu plan, see no changes, and finish. Instead, the provider runs tofu init on almost every observation because it thinks the workspace directory checksum has changed.

On closer examination, I found that the workspace directories periodically disappear. We mount a persistent volume at /tofu to preserve content if the provider pod restarts, and the provider pod has multi-hour uptime, yet the workspace directories consistently disappear roughly every hour.

I dug into the source code and found that the provider creates two workspace controllers: cluster and namespaced. Each controller starts its own garbage collector. Both collectors run the same function; the only difference is the namespaced flag (true or false), which determines which resource type the collect() function lists: clusterv1beta1.Workspace or namespacedv1beta1.Workspace. The function then lists the workspace directories and deletes those that are not associated with an existing workspace.
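To make the collection step concrete, here is a minimal sketch of a directory garbage collector of the kind described above. It is not the provider's code: the collect signature, the known set (standing in for the result of listing one Workspace kind), and the directory names are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// collect removes every subdirectory of root whose name is not in known.
// known is a hypothetical stand-in for the names derived from listing one
// Workspace kind; the provider's real naming scheme is not shown here.
func collect(root string, known map[string]bool) ([]string, error) {
	entries, err := os.ReadDir(root)
	if err != nil {
		return nil, err
	}
	var removed []string
	for _, e := range entries {
		if !e.IsDir() || known[e.Name()] {
			continue // not a workspace directory, or still in use
		}
		if err := os.RemoveAll(filepath.Join(root, e.Name())); err != nil {
			return removed, err
		}
		removed = append(removed, e.Name())
	}
	return removed, nil
}

// runDemo creates a temporary root with two live workspace directories and
// one orphan, runs a single collection pass, and reports what was removed.
func runDemo() ([]string, error) {
	root, err := os.MkdirTemp("", "tofu-gc-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)

	for _, name := range []string{"ws-a", "ws-b", "orphan"} {
		if err := os.Mkdir(filepath.Join(root, name), 0o700); err != nil {
			return nil, err
		}
	}
	return collect(root, map[string]bool{"ws-a": true, "ws-b": true})
}

func main() {
	removed, err := runDemo()
	if err != nil {
		panic(err)
	}
	fmt.Println("removed:", removed) // prints "removed: [orphan]"
}
```

The important property is that collect treats any directory it cannot match as garbage, so it is only safe when the known set covers every owner of the directories under root.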

The problem is that both cluster and namespaced controllers store their workspaces in the same place: /tofu.

As a result, the namespaced garbage collector deletes directories owned by cluster workspaces, and vice versa.
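The interaction can be reproduced with a small in-memory model. The sketch below is only an illustration under assumed names (sweep, simulateSharedRoot, and the directory names are all hypothetical): two collectors each know only their own workspaces but sweep the same set of directories.

```go
package main

import (
	"fmt"
	"sort"
)

// sweep models one garbage-collection pass: it drops every directory name
// the calling collector does not recognise and returns the dropped names.
func sweep(dirs, known map[string]bool) []string {
	var removed []string
	for name := range dirs {
		if !known[name] {
			removed = append(removed, name)
		}
	}
	for _, name := range removed {
		delete(dirs, name)
	}
	sort.Strings(removed)
	return removed
}

// simulateSharedRoot runs the cluster collector and then the namespaced
// collector against one shared set of directories.
func simulateSharedRoot() (byCluster, byNamespaced []string) {
	shared := map[string]bool{"cluster-ws-1": true, "namespaced-ws-1": true}
	clusterKnown := map[string]bool{"cluster-ws-1": true}
	namespacedKnown := map[string]bool{"namespaced-ws-1": true}

	byCluster = sweep(shared, clusterKnown)
	byNamespaced = sweep(shared, namespacedKnown)
	return byCluster, byNamespaced
}

func main() {
	byCluster, byNamespaced := simulateSharedRoot()
	fmt.Println("cluster collector removed:", byCluster)
	fmt.Println("namespaced collector removed:", byNamespaced)
}
```

In this model each collector removes exactly the directory that belongs to the other scope, which matches the symptom of workspaces being re-initialized after every collection cycle.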

The fix is to give each resource type a separate folder:

$XP_TF_DIR/cluster - cluster workspaces
$XP_TF_DIR/namespaced - namespaced workspaces
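A minimal sketch of the layout proposed in this PR follows. The helper names scopeRoot and workspaceDir, the /tofu base path used in main, and the id argument are assumptions for illustration, not the provider's actual API.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// scopeRoot returns the directory owned by one controller and its collector.
// The "cluster" and "namespaced" folder names follow the PR description.
func scopeRoot(base string, namespaced bool) string {
	if namespaced {
		return filepath.Join(base, "namespaced")
	}
	return filepath.Join(base, "cluster")
}

// workspaceDir returns the directory for a single workspace. The id
// parameter is a hypothetical per-workspace directory name.
func workspaceDir(base string, namespaced bool, id string) string {
	return filepath.Join(scopeRoot(base, namespaced), id)
}

func main() {
	// The same id lands in a different tree depending on the scope, so each
	// collector only ever lists directories created by its own controller.
	fmt.Println(workspaceDir("/tofu", false, "1234"))
	fmt.Println(workspaceDir("/tofu", true, "1234"))
}
```

With disjoint roots, each collector can keep its "delete anything I do not recognise" logic unchanged. The trade-off is that the on-disk layout changes for existing installations.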

I have:

Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Passed the unit test suite.

Built a provider and validated that it works in our test environment as expected.

Upbound-CLA · 2026-02-12T22:45:52Z

All committers have signed the CLA.

erhancagirici · 2026-02-24T11:22:33Z

@maratkomarov many thanks for the analysis and the PR! A similar issue was also discovered in provider-terraform. The solution in this PR was also considered. While it is valid and resolves the issue, we wanted to avoid a potential breaking change to the directory structure, in case consumers rely on it externally.

#124 keeps the directory structure as-is and has been merged, so I am closing this in favor of it. Again, thanks for the PR!

give cluster and namespaced resources their own folders

716d45c

maratkomarov requested review from erhancagirici, sergenyalcin, turkenf and ulucinar as code owners February 12, 2026 22:45

erhancagirici closed this Feb 24, 2026

maratkomarov wants to merge 1 commit into upbound:main from maratkomarov:fix-gc-race
