ISSUE TYPE
- Bug Report
COMPONENT NAME
HA, KVM
CLOUDSTACK VERSION
4.20
CONFIGURATION
Zone type: Advanced Network
Primary Storage: SharedMountPoint
OS / ENVIRONMENT
Hosts OS: Ubuntu 22.04 (HPE ProLiant BL460c Gen10)
Management Server OS: Ubuntu 22.04
Out-of-band management driver: IPMI
SUMMARY
Hello, I configured out-of-band management on my hosts, but their HA status keeps alternating between Suspect and Degraded. I have already verified that IPMI communication works, and the servers are powered on and operational.
STEPS TO REPRODUCE
Configure the KVM hosts
Configure the HA provider with KVMHAProvider
Configure out-of-band management with the IPMI driver
Enable HA and check the HA state
EXPECTED RESULTS
Hosts report the HA state Available
ACTUAL RESULTS
Management Server logs:
@MSLOG@:2025-01-07 00:29:25,698 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-21:[]) HA state post-transition:: new state=[Suspect], old state=[Checking], for resource id=[3], status=[true], ha config state=[Suspect].
@MSLOG@:2025-01-07 00:29:25,707 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-21:[]) Transitioned host HA state from:Checking to:Suspect due to event:TooFewActivityCheckSamples for the host id:3
@MSLOG@:2025-01-07 00:29:41,622 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-2:[ctx-28440d8d]) HA state post-transition:: new state=[Checking], old state=[Suspect], for resource id=[2], status=[true], ha config state=[Checking].
@MSLOG@:2025-01-07 00:29:41,629 DEBUG [o.a.c.h.HAManagerImpl] (BackgroundTaskPollManager-2:[ctx-28440d8d]) Transitioned host HA state from:Suspect to:Checking due to event:PerformActivityCheck for the host id:2
2025-01-07 15:44:06,928 DEBUG [o.a.c.u.p.ProcessRunner] (pool-2-thread-11:[]) Process standard output for command [/usr/bin/ipmitool -I lanplus -R 1 -v -H 10.16.20.21 -p 623 -U cloudstack -P ***** chassis power status]: [Chassis Power is on
].
2025-01-07 15:44:06,928 DEBUG [o.a.c.u.p.ProcessRunner] (pool-2-thread-11:[]) Process standard error output command [/usr/bin/ipmitool -I lanplus -R 1 -v -H 10.16.20.21 -p 623 -U cloudstack -P ***** chassis power status]: [Running Get PICMG Properties my_addr 0x20, transit 0, target 0x20
Error response 0xc1 from Get PICMG Properities
Running Get VSO Capabilities my_addr 0x20, transit 0, target 0x20
Invalid completion code received: Invalid command
Discovered IPMB address 0x0
].
2025-01-07 15:44:06,929 DEBUG [o.a.c.o.d.i.IpmitoolOutOfBandManagementDriver] (pool-2-thread-11:[]) The command [/usr/bin/ipmitool -I lanplus -R 1 -v -H 10.16.20.21 -p 623 -U cloudstack -P PASSWORD chassis power status] was successful and got the result [Chassis Power is on].
KVM host logs:
2025-01-07 15:49:52,534 DEBUG [kvm.resource.KVMHAChecker] (pool-1067-thread-1:[]) (logid:) Checking heart beat with KVMHAChecker for host IP [IP_SERVER] in pools []
2025-01-07 15:49:52,534 WARN [kvm.resource.KVMHAChecker] (pool-1067-thread-1:[]) (logid:) All checks with KVMHAChecker for host IP [IP_SERVER] in pools [] considered it as dead. It may cause a shutdown of the host.
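To make the flapping easier to see, here is a minimal sketch that counts HA state transitions per host from management server log lines. It assumes the "Transitioned host HA state" line format shown above; the names `TRANSITION_RE` and `summarize_transitions` are hypothetical helpers, not part of CloudStack, and the sample lines are copied from the log excerpt in this report.

```python
import re
from collections import Counter

# Assumed format, taken from the HAManagerImpl DEBUG lines quoted in this report.
TRANSITION_RE = re.compile(
    r"Transitioned host HA state from:(?P<old>\w+) to:(?P<new>\w+) "
    r"due to event:(?P<event>\w+) for the host id:(?P<host>\d+)"
)


def summarize_transitions(lines):
    """Count (host id, old state, new state, event) tuples found in log lines."""
    counts = Counter()
    for line in lines:
        match = TRANSITION_RE.search(line)
        if match:
            key = (int(match["host"]), match["old"], match["new"], match["event"])
            counts[key] += 1
    return counts


# Sample input: two transition lines copied from the management server log above.
sample = [
    "2025-01-07 00:29:25,707 DEBUG [o.a.c.h.HAManagerImpl] (pool-4-thread-21:[]) "
    "Transitioned host HA state from:Checking to:Suspect due to "
    "event:TooFewActivityCheckSamples for the host id:3",
    "2025-01-07 00:29:41,629 DEBUG [o.a.c.h.HAManagerImpl] "
    "(BackgroundTaskPollManager-2:[ctx-28440d8d]) "
    "Transitioned host HA state from:Suspect to:Checking due to "
    "event:PerformActivityCheck for the host id:2",
]

for (host, old, new, event), n in sorted(summarize_transitions(sample).items()):
    print(f"host {host}: {old} -> {new} on {event} (x{n})")
```

Feeding the full log through the same function (for example, by iterating over an open file) would show which events drive each host between Checking and Suspect.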