libvirt_vm: expand reboot methods and improve state detection#4312
libvirt_vm: expand reboot methods and improve state detection#4312YongxueHong merged 1 commit intoavocado-framework:masterfrom
Conversation
WalkthroughAdds threading and partial imports and extends VM.reboot in virttest/libvirt_vm.py to support two reboot methods: a shell-based flow that logs in (session or serial), issues a reboot, detects guest shutdown via serial console patterns, and cleans up sessions; and a libvirt_reset flow that performs a Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
cebe1c2 to
582c364
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
virttest/libvirt_vm.py (1)
18-42: Add blank line between third-party and local imports to fix isort check.The isort pre-commit hook requires a blank line between the
avocado.utilsimports and thevirttestimports. Add a blank line after line 25 (from avocado.utils import crypto, process) before thevirttestimport block begins at line 26.The local import of
libvirtat line 452 insidemake_create_command()is intentional for lazy loading and not a concern.
🧹 Nitpick comments (4)
virttest/libvirt_vm.py (4)
2757-2767: Catch specific exceptions instead of broadException.Catching
Exceptionhides the specific failure modes and makes debugging harder. Consider catchingaexpectspecific exceptions (e.g.,aexpect.ExpectTimeoutError,aexpect.ExpectProcessTerminatedError).Also, the
read_until_any_line_matchescall blocks fortimeoutseconds if no pattern is found. Since this function is wrapped inutils_misc.wait_for(..., timeout=timeout), the effective behavior is a single blocking attempt rather than polling. Consider using a shorter internal timeout or restructuring for true polling.♻️ Suggested improvement
try: if console.read_until_any_line_matches( - reboot_patterns, timeout=timeout + reboot_patterns, timeout=min(timeout, 30) # Use shorter interval for polling ): LOG.debug("Reboot pattern detected in serial console") return True - LOG.debug(f"No reboot patterns detected within timeout {timeout} sec") + LOG.debug("No reboot patterns detected in this interval") return False - except Exception as e: - LOG.warning(f"Unexpected error during serial console reboot check: {e}") + except (aexpect.ExpectTimeoutError, aexpect.ExpectProcessTerminatedError) as e: + LOG.warning("Serial console reboot check error: %s", e) return False
2774-2785: Catch specific exceptions and consider timeout handling.Similar to the previous helper, this catches a broad
Exception. Thevirsh.eventcommand withevent_timeoutwill block for the full duration if no event occurs, which again creates inefficient nesting with the outerwait_for.For
virsh.event, consider catchingprocess.CmdErroror checkingresult.exit_statusmore explicitly.♻️ Suggested improvement
def _check_system_event_down(timeout): """Check if system is down via libvirt events.""" try: result = virsh.event( - domain=self.name, event="reboot", event_timeout=timeout + domain=self.name, event="reboot", event_timeout=min(timeout, 30) ) libvirt.check_exit_status(result) LOG.debug("Detected libvirt reboot event") return True - except Exception as e: - LOG.debug(f"Libvirt reboot event check failed: {e}") + except (process.CmdError, exceptions.TestFail) as e: + LOG.debug("Libvirt reboot event check failed: %s", e) return False
2787-2793: Thread result is not checked; consider adding error propagation.The reset command runs in a daemon thread but its result is never checked. If
virsh.resetfails, the code will rely on the event timeout to detect the failure, which doesn't provide clear error information.Consider either:
- Running
virsh.resetsynchronously (it should be quick), or- Storing the thread result and checking it, or
- At minimum, logging within the thread target if it fails.
♻️ Suggested improvement for synchronous execution
def _execute_system_reset(): """Execute system reset via virsh.""" - reset_thread = threading.Thread( - target=virsh.reset, args=(self.name,), name=f"reset-{self.name}" - ) - reset_thread.daemon = True - reset_thread.start() + result = virsh.reset(self.name) + if result.exit_status: + LOG.warning("virsh reset failed: %s", result.stderr_text)
2823-2831: Redundantwait_forwrapper around blocking check functions.The
_check_go_downpartial functions already block fortimeoutseconds internally (viaread_until_any_line_matchesorvirsh.event). Wrapping them inutils_misc.wait_for(..., timeout=timeout)creates redundancy since the inner call consumes the entire timeout beforewait_forcan retry.The current code effectively runs the check once and returns. If this single-attempt behavior is intentional, consider removing
wait_forfor clarity:♻️ Simplified approach
error_context.context("waiting for guest to go down", LOG.info) try: _reboot() - if not utils_misc.wait_for(_check_go_down, timeout=timeout): + if not _check_go_down(): raise virt_vm.VMRebootError("Guest refuses to go down") finally: if session: session.close() if serial_console: serial_console.close()Alternatively, if polling behavior is desired, the inner check functions should use a shorter timeout (e.g., 5-10 seconds) to allow multiple attempts within the overall timeout window.
582c364 to
388c86a
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In `@virttest/libvirt_vm.py`:
- Around line 18-31: The import block in virttest/libvirt_vm.py is misordered
and failing isort; reorder imports into standard library (threading, time,
functools.partial), third-party (aexpect, avocado.core.exceptions,
avocado.utils.crypto/process), and local/package imports (virttest.* and
virttest.utils_test.libvirt), or simply run isort on the file to automatically
fix ordering; ensure grouped imports (e.g., from aexpect import remote stays
with aexpect) and that aliases like from functools import partial remain in the
stdlib section before committing.
- Around line 2787-2803: When session is a remote.RemoteSession, wrap the call
to wait_for_serial_login in a try/except that catches the appropriate
login-related exception (e.g., LoginError or VMError used by this module) and,
on failure, fall back to a non-serial check function instead of unconditionally
requiring a serial console; specifically update the block that currently does
serial_console = self.wait_for_serial_login(timeout=timeout) to try obtaining
serial_console and on exception set serial_console = None and switch
_check_go_down to use an alternative like _check_system_event_down (or another
existing non-serial verifier) while keeping _reboot =
partial(_execute_shell_reboot, session) unchanged so non-serial setups can
reboot gracefully.
- Around line 2762-2779: The calls in _check_system_event_down and
_execute_system_reset must pass the VM's connection URI and use the correct
event type: update virsh.event(..., event="reboot", ...) to include
uri=self.connect_uri and replace the "reboot" event with the appropriate event
type for a reset (e.g., a lifecycle/state transition such as "lifecycle" or the
libvirt-specific event name your libvirt emits on reset); likewise, when
spawning the reset thread in _execute_system_reset, call virsh.reset with
uri=self.connect_uri so the reset runs against the correct connection; adjust
the event string in _check_system_event_down to the verified reset-related event
name for your libvirt version.
388c86a to
cf3f64f
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@virttest/libvirt_vm.py`:
- Around line 2744-2766: The helper _check_serial_console_down currently passes
the full timeout into console.read_until_any_line_matches, then the caller wraps
_check_serial_console_down with utils_misc.wait_for, causing nested long waits;
fix by avoiding nested full timeouts: either remove the outer
utils_misc.wait_for call that wraps _check_serial_console_down and call it once
(so read_until_any_line_matches controls the timeout), or change
_check_serial_console_down to use a short per-call poll timeout (e.g., 1s) when
calling console.read_until_any_line_matches and let utils_misc.wait_for manage
the overall timeout; update only the caller or the _check_serial_console_down
signature and its read_until_any_line_matches timeout argument accordingly.
cf3f64f to
fafd0d7
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@virttest/libvirt_vm.py`:
- Around line 2756-2766: Replace the broad except Exception block around the
call to read_until_any_line_matches (used with reboot_patterns and timeout) with
explicit exception handlers for the documented aexpect exceptions: catch
aexpect.ExpectProcessTerminatedError and aexpect.ExpectError (which already
covers ExpectTimeoutError) and handle each the same way you currently do (log
the debug message and return False); ensure the aexpect module is
imported/available in virttest/libvirt_vm.py and update the except clause(s)
around the read_until_any_line_matches call accordingly.
|
Hi @dzhengfy @nanli1 @yanglei-rh |
fafd0d7 to
98577e2
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@virttest/libvirt_vm.py`:
- Around line 2734-2735: The docstring for the reboot method documents the
reboot option as "system_reset" but the implementation checks for
"libvirt_reset", causing callers using the documented name to raise
VMRebootError; update the implementation in the reboot function (the parameter
named method and the branch that currently looks for "libvirt_reset") to check
for "system_reset" instead (or alternatively update the docstring to
"libvirt_reset") and make sure any other occurrences of the reboot keyword in
the file (and tests) are adjusted so the docstring, the branch check, and
VMRebootError message use the same name consistently.
- Around line 2815-2824: The libvirt_reset path races because _reboot() (which
calls _execute_libvirt_reset -> virsh.reset) runs before _check_go_down() (which
calls _check_system_event_reboot -> virsh.event) subscribes; fix by starting the
event listener first and only then triggering the reset: spawn the
event-monitoring call (_check_system_event_reboot via _check_go_down) in a
background thread or ensure it registers synchronously before invoking
_execute_libvirt_reset, then wait on the monitor (use the existing
utils_misc.wait_for or a threading.Event) to observe the reboot; update the
libvirt_reset branch (where _reboot and _check_go_down are set) to reverse the
order so the listener is active before calling _reboot().
The current reboot verification logic relies on session responsiveness, which can be unreliable. To ensure a more robust verification, this update introduces hardware-level reset capabilities and transitions to event-driven monitoring. Key Enhancements: 1. Introduced system_reset Method: Added support for the virsh reset command. This provides a "hard" reboot option by mimicking a physical power button reset, bypassing the guest OS shutdown signals when necessary. 2. Heuristic Reboot Detection: Rather than polling for an active session, the system now monitors the serial console output for specific boot patterns. 3. Libvirt Event Integration: Leveraged libvirt lifecycle events to detect state transitions in real-time, providing a more accurate confirmation that the VM has successfully cycled. Signed-off-by: Yongxue Hong <yhong@redhat.com>
98577e2 to
1e87c50
Compare
|
Hi @dzhengfy @qiankehan |
Hi @qiankehan |
qiankehan
left a comment
There was a problem hiding this comment.
Sorry for late reply. LGTM
The current reboot verification logic relies on session responsiveness, which can be unreliable. To ensure a more robust verification, this update introduces hardware-level reset capabilities and transitions to event-driven monitoring.
Key Enhancements:
ID: LIBVIRTAT-22245
Summary by CodeRabbit
New Features
Improvements