This repository was archived by the owner on Sep 14, 2021. It is now read-only.

Clean up OpenStack VMs that fail to start properly #54

@SolomonShorser-OICR

Description

Sometimes OpenStack VMs go into an "ERROR" state immediately after they are created. I've seen this happen when the resources for the VM flavour are available to the OpenStack tenant, so the VM is created, but those resources are not all available on a single physical compute node. OpenStack then fails immediately and leaves the new VM in an ERROR state. For example, the tenant might have lots of memory available, but spread out in small portions across many nodes. If the flavour requires more memory than is free on any single node, but less than the sum of free memory across all nodes, this can happen.
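
To make the memory condition concrete, here is a minimal sketch in Python. The function name and the numbers are made up for illustration; nothing here comes from Youxia or from the OpenStack API, and a real scheduler considers more than memory.

```python
def fits_in_total_but_not_on_any_node(required_mb, free_mb_per_node):
    """Return True when the request is covered by the sum of free memory
    across all nodes, but no single node has enough free memory for it."""
    largest_free = max(free_mb_per_node, default=0)
    total_free = sum(free_mb_per_node)
    return largest_free < required_mb <= total_free


# Hypothetical tenant: four compute nodes with 4 GB free each.
free_mb = [4096, 4096, 4096, 4096]

# An 8 GB flavour fits in the 16 GB total but on no single node,
# which is the situation described above.
print(fits_in_total_but_not_on_any_node(8192, free_mb))
```

If one node had 8 GB free, the same request would fit on that node and the function would return False.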

The problem is that the failed VM still holds some of the resources it requested, so they remain allocated to it. Youxia is not able to remove these VMs properly, so they accumulate, doing nothing while still consuming resources. In theory, Youxia could keep launching VMs that can't run until all resources are locked up by idle/ERROR VMs.

Youxia needs a way to detect VMs that are in this state and remove them before attempting to create more VMs.
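
One possible shape for that check, as a rough sketch only. It is written in Python for brevity and is not Youxia code: the `Server` record, `remove_errored_servers`, and the `delete` callback are all hypothetical stand-ins for whatever the cloud client actually provides. The only assumption taken from the description above is that failed VMs report the status "ERROR".

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Server:
    # Hypothetical minimal view of a VM record; a real client exposes more fields.
    id: str
    status: str


def remove_errored_servers(servers: Iterable[Server],
                           delete: Callable[[str], None]) -> List[str]:
    """Delete every server whose status is ERROR and return the deleted ids.

    `delete` is a caller-supplied callback (an assumption of this sketch),
    so the function does not depend on any particular OpenStack client.
    """
    removed = []
    for server in servers:
        if server.status.upper() == "ERROR":
            delete(server.id)
            removed.append(server.id)
    return removed


# Usage with an in-memory stand-in for the cloud: run this before launching more VMs.
fleet = [Server("vm-1", "ACTIVE"), Server("vm-2", "ERROR"), Server("vm-3", "ERROR")]
deleted = []
remove_errored_servers(fleet, deleted.append)
print(deleted)
```

Taking the delete operation as a callback keeps the detection logic testable without a live tenant; whether deleting an ERROR VM actually frees its resources would still need to be confirmed against the real deployment.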
