OCPBUGS-78589: baremetal: add serial console logging for bootstrap VM#10400
OCPBUGS-78589: baremetal: add serial console logging for bootstrap VM#10400elfosardo wants to merge 1 commit intoopenshift:mainfrom
Conversation
|
@elfosardo: This pull request references Jira Issue OCPBUGS-78589, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (2)
WalkthroughAdd a per-domain serial device with logging to the bootstrap domain; pass architecture into live-ISO selection and conditionally append an architecture-mapped Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/infrastructure/baremetal/bootstrap_test.go (1)
25-28: Add arch-specific assertions for serial target compatibility.These assertions cover log wiring well, but they don’t guard against invalid serial target/model on non-x86 arches. Consider extending
TestConfigureDomainArch(or adding a focused test) to validate serial target settings per arch.As per coding guidelines, "-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/infrastructure/baremetal/bootstrap_test.go` around lines 25 - 28, The current assertions in TestConfigureDomainArch only validate serial log wiring (dom.Devices.Serials and dom.Devices.Serials[0].Log) but miss validating serial target/model per architecture; update TestConfigureDomainArch (or add a focused test) to assert that dom.Devices.Serials[0].Target.Name and dom.Devices.Serials[0].Target.Model (or the equivalent fields on the Serial struct) are set to the expected values for each supported arch (e.g., on x86 expect target "isa-serial" or model "chardev"/whatever the project standard is, and on non-x86 assert the compatible target/model or absence thereof), iterating or branching by the test's arch input to fail fast on incompatible combos and keeping the existing log assertions (Serials, Log, Log.File, Log.Append).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/infrastructure/baremetal/bootstrap.go`:
- Around line 80-85: The serial target is being hardcoded to "isa-serial" which
breaks non-x86 bootstrap domains; update newDomain() (or the constructor that
builds the libvirtxml.DomainSerialTarget) to accept the guest architecture
(e.g., arch string) and conditionally set DomainSerialTarget.Type and
DomainSerialTargetModel.Name based on that architecture (or simply omit Type and
Model to let libvirt auto-select), ensuring you adjust all callers to pass the
architecture and change the DomainSerialTarget instantiation instead of always
using "isa-serial".
---
Nitpick comments:
In `@pkg/infrastructure/baremetal/bootstrap_test.go`:
- Around line 25-28: The current assertions in TestConfigureDomainArch only
validate serial log wiring (dom.Devices.Serials and dom.Devices.Serials[0].Log)
but miss validating serial target/model per architecture; update
TestConfigureDomainArch (or add a focused test) to assert that
dom.Devices.Serials[0].Target.Name and dom.Devices.Serials[0].Target.Model (or
the equivalent fields on the Serial struct) are set to the expected values for
each supported arch (e.g., on x86 expect target "isa-serial" or model
"chardev"/whatever the project standard is, and on non-x86 assert the compatible
target/model or absence thereof), iterating or branching by the test's arch
input to fail fast on incompatible combos and keeping the existing log
assertions (Serials, Log, Log.File, Log.Append).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 59d69587-6b6e-4c00-b0f5-f1d17a074932
📒 Files selected for processing (2)
pkg/infrastructure/baremetal/bootstrap.gopkg/infrastructure/baremetal/bootstrap_test.go
|
/retest |
zaneb
left a comment
There was a problem hiding this comment.
The serial log seems to only contain logs from the bootloader. I think you also need to add console=ttyS0 to the kernel command line at line 261.
46881fd to
68d4847
Compare
|
@elfosardo: This pull request references Jira Issue OCPBUGS-78589, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/infrastructure/baremetal/bootstrap.go`:
- Around line 258-262: The kernel console argument is hardcoded to
"console=ttyS0"; update the kargs construction in bootstrap.go to choose the
console string based on capabilities.Host.CPU.Arch (used elsewhere in this
file). Map architectures to devices: "x86_64" -> "console=ttyS0", "aarch64" ->
"console=ttyAMA0", "s390x" -> "console=ttysclp0", "ppc64le" -> "console=hvc0",
and fallback to "console=ttyS0" if unknown; append that chosen console to the
existing kargs (and keep the existing fips handling). Ensure you reference
capabilities.Host.CPU.Arch and the kargs variable when making the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b6f4e595-aa88-4704-959d-b494f79d6f05
📒 Files selected for processing (2)
pkg/infrastructure/baremetal/bootstrap.gopkg/infrastructure/baremetal/bootstrap_test.go
68d4847 to
d0e3df7
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
pkg/infrastructure/baremetal/bootstrap.go (1)
259-264: Consolidate arch mappings to avoid future drift.
consoleDeviceandserialTargetTypeduplicate the same architecture keyspace in different functions. A shared config map (console + serial type together) would reduce maintenance risk when adding/changing architectures.♻️ Suggested refactor sketch
+type archDeviceConfig struct { + ConsoleDev string + SerialType string +} + +var archDeviceConfigs = map[string]archDeviceConfig{ + "x86_64": {ConsoleDev: "ttyS0", SerialType: "isa-serial"}, + "aarch64": {ConsoleDev: "ttyAMA0", SerialType: "system-serial"}, + "s390x": {ConsoleDev: "ttysclp0", SerialType: "sclp-serial"}, + "ppc64le": {ConsoleDev: "hvc0", SerialType: "spapr-vio-serial"}, +}Use
archDeviceConfigs[arch].ConsoleDevincreateLiveVolumeandarchDeviceConfigs[arch].SerialTypeinconfigureDomainArch.As per coding guidelines,
**: -Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.Also applies to: 384-389
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/infrastructure/baremetal/bootstrap.go` around lines 259 - 264, The arch-specific console and serial mappings are duplicated (consoleDevice and serialTargetType); create a single shared map named archDeviceConfigs mapping architecture string to a small struct with fields ConsoleDev and SerialType, then update createLiveVolume to use archDeviceConfigs[arch].ConsoleDev and update configureDomainArch to use archDeviceConfigs[arch].SerialType, removing the duplicated maps and adding safe lookup/default behavior for unknown arch values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/infrastructure/baremetal/bootstrap.go`:
- Around line 259-264: The arch-specific console and serial mappings are
duplicated (consoleDevice and serialTargetType); create a single shared map
named archDeviceConfigs mapping architecture string to a small struct with
fields ConsoleDev and SerialType, then update createLiveVolume to use
archDeviceConfigs[arch].ConsoleDev and update configureDomainArch to use
archDeviceConfigs[arch].SerialType, removing the duplicated maps and adding safe
lookup/default behavior for unknown arch values.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 010bb5d5-49ec-4649-b558-60127db4cda3
📒 Files selected for processing (2)
pkg/infrastructure/baremetal/bootstrap.gopkg/infrastructure/baremetal/bootstrap_test.go
|
/retest |
3 similar comments
|
/retest |
|
/retest |
|
/retest |
|
@zaneb when you have a moment please check this again |
zaneb
left a comment
There was a problem hiding this comment.
So now we're capturing all of the startup messages and the login prompt.
If that's enough then this looks good, but if we want the whole journal then we apparently need to either set systemd.journald.forward_to_console on the kernel command line (although this can apparently cause performance issues, according to the man page for journald.conf) or set up a systemd service that runs journalctl -f and redirects to the serial port.
| } | ||
| if t, ok := serialTargetType[arch]; ok && len(dom.Devices.Serials) > 0 { | ||
| dom.Devices.Serials[0].Target.Type = t | ||
| dom.Devices.Serials[0].Target.Model.Name = t |
There was a problem hiding this comment.
From what I can gather in the libvirt docs you can just leave these blank and it will do the Right Thing for the platform.
There was a problem hiding this comment.
yep, I was being explicit here but we can leave it to automation
@zaneb I was thinking that maybe we can consider that for a follow up, what we have now should be good for a first pass |
The bootstrap VM created by the installer does not have a serial console log file configured, making it impossible to diagnose boot failures when the VM is unreachable via SSH. Master/worker VMs already have this via dev-scripts. Add a serial device with a log file at /var/log/libvirt/qemu/<name>-serial0.log, which is automatically collected by the existing CI gather step. The serial target type is left unset so that libvirt auto-selects the appropriate device for each architecture. Append a console= kernel argument using the architecture-appropriate device (ttyS0 for x86_64, ttyAMA0 for aarch64, ttysclp0 for s390x, hvc0 for ppc64le) so that kernel and userspace output is directed to the serial console.
d0e3df7 to
a9540c0
Compare
|
You can add a separate system ignition file like the one on line 233, so it's not too bad in terms of complexity to add a service. But this looks like a win for now. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: zaneb The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/retest |
|
@elfosardo: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
The bootstrap VM created by the installer does not have a serial console
log file configured, making it impossible to diagnose boot failures when
the VM is unreachable via SSH. Master/worker VMs already have this via
dev-scripts.
Add a serial device with a log file at
/var/log/libvirt/qemu/-serial0.log, which is automatically
collected by the existing CI gather step. The serial target type is left
unset so that libvirt auto-selects the appropriate device for each
architecture. Append a console= kernel argument using the
architecture-appropriate device (ttyS0 for x86_64, ttyAMA0 for aarch64,
ttysclp0 for s390x, hvc0 for ppc64le) so that kernel and userspace
output is directed to the serial console.