Enhancement: Check cloud-init status #955

@lentzi90

Description

We recently had a problem in the CI where the OS we use for the Nodes had an update that made one of our preKubeadmCommands fail. This in turn caused some network issues, which was what we detected first. It took quite some time to figure out the root cause because no one suspected an error in the cloud-init commands. The Machines were provisioned, the cluster worked as expected in many ways, and all Nodes were healthy.

My suggestion is that we add a step to the integration tests (or even to the controller if possible) to detect errors in cloud-init and report them in a more obvious way. In the CI we should be able to simply run cloud-init status and error out if it reports status: error.
To be clear, this check should be done on the workload cluster Nodes.
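
A minimal sketch of what such a CI check could look like, assuming the test runner can reach each workload cluster Node over SSH. The helper names, the default `ubuntu` user, and the SSH access itself are assumptions for illustration and are not part of the existing test suite. The sketch reads the plain-text `status:` line printed by `cloud-init status --long` rather than relying on the exit code.

```python
import subprocess


def parse_cloud_init_status(output: str) -> str:
    """Return the value of the 'status:' line from `cloud-init status` output."""
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        # Match the key exactly so lines such as 'extended_status:' are ignored.
        if sep and key.strip() == "status":
            return value.strip()
    raise ValueError("no 'status:' line found in cloud-init output")


def check_node(host: str, user: str = "ubuntu") -> None:
    """Raise if cloud-init reports an error on the given Node.

    Hypothetical helper: assumes SSH access from the CI runner to the Node.
    """
    result = subprocess.run(
        ["ssh", f"{user}@{host}", "cloud-init", "status", "--long"],
        capture_output=True,
        text=True,
        check=False,  # cloud-init exits non-zero on error; inspect the output instead
    )
    status = parse_cloud_init_status(result.stdout)
    if status == "error":
        # Include the full output so the failing command is visible in the CI log.
        raise RuntimeError(f"cloud-init failed on {host}:\n{result.stdout}")
```

Calling `check_node` for every Node address after the workload cluster comes up would make a failed preKubeadmCommand fail the test run directly, instead of surfacing later as an unrelated symptom.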

Metadata

Assignees

No one assigned

    Labels

    help wanted: Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines.
    kind/feature: Categorizes issue or PR as related to a new feature.
    lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
    triage/accepted: Indicates an issue is ready to be actively worked on.

    Type

    No type

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests