Reboots done via juju ssh are causing hook errors intermittently #921

@dshcherb

Description

https://bugs.launchpad.net/juju/+bug/1989629 - the issue description itself
https://bugs.launchpad.net/juju/+bug/1989629/comments/4 - root cause analysis

  • a test (like enable_dpdk) sets up an environment and reboots a machine via juju ssh to apply some changes;
  • asynchronously, a juju agent decides to execute a hook such as update-status before systemd stops the agent;
  • the host comes back up and starts a unit agent, which reports an error about the update-status execution.
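
To make the timing in the three steps above concrete, here is a toy model of the race. It is only an illustration: the `Timeline` fields and the outcome names are made up for this sketch, and it assumes that a hook still running when the agent is stopped is what later surfaces as the hook error (see the root cause analysis linked above for what actually happens).

```python
from dataclasses import dataclass


@dataclass
class Timeline:
    """Hypothetical timestamps (seconds) for one reboot; not Juju data."""
    hook_start: float     # when the agent starts running update-status
    hook_duration: float  # how long the hook would need to finish
    agent_stop: float     # when systemd stops the agent during the reboot


def hook_outcome(t: Timeline) -> str:
    """Classify a hook run relative to the moment the agent is stopped."""
    if t.hook_start >= t.agent_stop:
        # The agent was already stopped, so the hook never ran.
        return "not-started"
    if t.hook_start + t.hook_duration <= t.agent_stop:
        # The hook finished before the agent went away.
        return "completed"
    # The hook was still running when the agent was stopped (assumed
    # to be the case that shows up as an error after the host is back).
    return "interrupted"
```

Only the "interrupted" window produces the failure in this model, which would fit the intermittent behaviour: whether a run hits it depends on when update-status happens to fire relative to the reboot.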

This was bugging us in https://review.opendev.org/c/x/charm-ovn-chassis/+/856548/ due to its intermittent nature.

The codepath that leads to this:

self.enable_hugepages_vfio_on_hvs_in_vms(4)

zaza.utilities.machine_os.enable_hugepages(

https://github.com/openstack-charmers/zaza/blob/8f9f9c79b246ef09a632d40323c975f002fcd4cf/zaza/utilities/machine_os.py#L231

https://github.com/openstack-charmers/zaza/blob/8f9f9c79b246ef09a632d40323c975f002fcd4cf/zaza/utilities/machine_os.py#L203

https://github.com/openstack-charmers/zaza/blob/8f9f9c79b246ef09a632d40323c975f002fcd4cf/zaza/utilities/generic.py#L493-L504
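
For readers who do not want to follow the links, this is a minimal sketch of what a reboot helper driven through juju ssh might look like. It is not a copy of the linked zaza code: the function name `reboot_via_juju_ssh` and the `runner` parameter are hypothetical and only there to keep the example self-contained and testable without a Juju model.

```python
import subprocess


def reboot_via_juju_ssh(unit_name, runner=subprocess.check_call):
    """Reboot the machine hosting `unit_name` by running `sudo reboot` over juju ssh.

    `runner` is an illustrative hook for tests; by default the command
    is executed with subprocess.check_call.
    """
    cmd = ["juju", "ssh", unit_name, "sudo", "reboot"]
    try:
        runner(cmd)
    except subprocess.CalledProcessError:
        # The SSH session is normally torn down by the reboot itself,
        # so a non-zero exit status is expected here and is ignored.
        pass
    return cmd
```

Nothing in a helper of this shape coordinates with the unit agent on the target machine, so a hook can start between the command being issued and systemd stopping the agent.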
