We need a way to delete resources when terraform gets stuck #179

@HartS

I've been noticing terraform getting stuck trying to interact with ECP a lot lately.

For example, I ran `make clean` to destroy a cluster that had failed to deploy completely. `terraform destroy` started running and eventually stalled (presumably due to a network/VPN hiccup). Four hours later nothing had progressed, the process couldn't be terminated without `kill -9`, and terraform left the resources in a 'locked' state.
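As a stopgap for the four-hour hang, the destroy step could be run under a deadline. This is only a minimal sketch, not anything catapult does today: `run_with_deadline` and `destroy_cluster` are hypothetical names, the one-hour limit is arbitrary, and it is POSIX-only. It escalates from SIGTERM to SIGKILL (the `kill -9` step) on the whole process group. It does not release the state lock or clean up anything in ECP; it only stops the run from blocking forever.

```python
import os
import signal
import subprocess


def run_with_deadline(cmd, timeout_s, grace_s=10.0, cwd=None):
    """Run cmd; return its exit code, or None if it had to be killed."""
    # start_new_session puts the child in its own process group, so any
    # helper processes it spawned receive the signals as well.
    proc = subprocess.Popen(cmd, cwd=cwd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        pass
    # Ask politely first, then fall back to SIGKILL (kill -9).
    for sig, wait_s in ((signal.SIGTERM, grace_s), (signal.SIGKILL, None)):
        try:
            os.killpg(proc.pid, sig)
        except ProcessLookupError:
            break  # the group is already gone
        try:
            proc.wait(timeout=wait_s)
            break
        except subprocess.TimeoutExpired:
            continue
    proc.wait()  # reap the child; returns immediately if already reaped
    return None


def destroy_cluster(build_dir, timeout_s=3600):
    """Hypothetical wrapper: give `terraform destroy` one hour at most."""
    cmd = ["terraform", "destroy", "-auto-approve", "-lock-timeout=60s"]
    return run_with_deadline(cmd, timeout_s, cwd=build_dir)
```

A `None` return means the run was killed, at which point a separate cleanup path still has to take over.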

I don't know of a way to recover from this and have already wasted a ton of time trying; terraform does have a `force-unlock` subcommand, but running it yields `Local state cannot be unlocked by another process`. When this happened previously, I manually deleted the lockfile, but that didn't allow `terraform destroy` to run again either.

I ended up having to delete the build directory manually and spend about 2 hours drilling into resources in the OpenStack console to ensure everything was cleaned up. We should either determine a graceful way to recover from this kind of scenario (which I've now hit again) that allows terraform to clean things up, or provide another subcommand in catapult that cleans resources from ECP using the OpenStack CLIs instead of terraform.
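To make the second option concrete, here is a rough sketch of what a CLI-based cleanup could look like. It rests on assumptions: that every resource created for a cluster has a name starting with a known prefix, that the `openstack` client is on the PATH with credentials already loaded, and that the deletion order below is sufficient. `RESOURCE_TYPES`, `find_by_prefix` and `clean` are hypothetical names, and routers with attached subnet interfaces are not handled.

```python
import json
import subprocess

# Assumed deletion order: dependents first. Routers usually need their subnet
# interfaces removed before `router delete` succeeds; that is not handled here.
RESOURCE_TYPES = ["server", "volume", "router", "network", "security group"]


def openstack(*args):
    """Run the openstack CLI and return its stdout."""
    result = subprocess.run(
        ["openstack", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def find_by_prefix(resource_type, prefix, run=openstack):
    """Return (id, name) pairs whose name starts with prefix."""
    rows = json.loads(run(*resource_type.split(), "list", "-f", "json"))
    return [
        (row["ID"], row["Name"])
        for row in rows
        if (row.get("Name") or "").startswith(prefix)
    ]


def clean(prefix, dry_run=True, run=openstack):
    """Delete matching resources; with dry_run=True only report them."""
    matched = []
    for resource_type in RESOURCE_TYPES:
        for res_id, name in find_by_prefix(resource_type, prefix, run):
            action = "would delete" if dry_run else "deleting"
            print(f"{action} {resource_type} {name} ({res_id})")
            if not dry_run:
                run(*resource_type.split(), "delete", res_id)
            matched.append((resource_type, res_id))
    return matched
```

Defaulting to `dry_run=True` is deliberate: matching by name prefix can catch someone else's resources in a shared project, so the list should be reviewed before anything is deleted.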
