OCPEDGE-2217: Add TNF ABI workflow to existing assisted installer flow #8457

fonta-rh · 2025-11-27T14:32:19Z

Summary

Add support for Two-Node Fencing (TNF) in the Agent-Based Installer workflow.

Background

Two-Node Fencing (TNF) enables highly available OpenShift clusters with only 2 control plane nodes by using BMC-based fencing to ensure safe failover. The fencing credentials (BMC address, username, password) are provided in the controlPlane block of install-config.yaml and must be applied to hosts during installation.

Changes

This PR adds the ABI client-side support for TNF:

getHostConfigDir() - Reads the HOST_CONFIG_DIR env var, falling back to the default path. This is an admittedly wonky workaround that avoids both modifying existing functions and hardcoding the path.
loadFencingCredentials() - Parses fencing-credentials.yaml generated by the installer
applyHostConfigByHostname() - Applies fencing credentials to hosts by matching hostname
Updated ApplyHostConfigs() to load and apply fencing credentials during host configuration

The fencing credentials file is read once and credentials are applied to hosts via the existing V2UpdateHost API. The existing handleFencing() in builder.go then reads these credentials from hosts when generating the final install-config.

Related PR

Works in tandem with an installer PR that generates the fencing-credentials.yaml file from install-config.yaml and adds client-side validations.

List all the issues related to this PR

New Feature
https://issues.redhat.com/browse/OCPEDGE-2217

What environments does this code impact?

Automation (CI, tools, etc)
Operator Managed Deployments

How was this code tested?

assisted-test-infra environment
dev-scripts environment - Tested using dev-scripts to install a working 4.21 cluster with a clusterbot build that includes this PR and the related installer PR
Reviewer's test appreciated
Waiting for CI to do a full test run
Manual
No tests needed

Checklist

Title and description added to both, commit and PR.
Relevant issues have been associated (see [CONTRIBUTING] guide)
This change does not require a documentation update (docstring, docs, README, etc)
Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

Are the title and description (in both PR and commit) meaningful and clear?
Is there a bug required (and linked) for this change?
Should this PR be backported?

Add getHostConfigDir() and loadFencingCredentials() to support reading fencing credentials from fencing-credentials.yaml. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Add applyHostConfigByHostname() and update ApplyHostConfigs() to apply fencing credentials to hosts by matching hostname. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Add tests for loadFencingCredentials() and applyHostConfigByHostname() covering valid files, missing files, validation errors, and API calls. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

openshift-ci-robot · 2025-11-27T14:32:23Z

openshift-ci · 2025-11-27T14:32:26Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai · 2025-11-27T14:32:26Z

Walkthrough

Adds loading and parsing of fencing-credentials.yaml from HOST_CONFIG_DIR and applies fencing credentials to hosts by hostname during host configuration, with validation, error propagation, and tests covering parsing and API update scenarios.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Host configuration logic `cmd/agentbasedinstaller/host_config.go` | Adds getHostConfigDir() to read HOST_CONFIG_DIR (default /etc/assisted/hostconfig). Implements loadFencingCredentials() to read and strictly parse `fencing-credentials.yaml` into a map keyed by hostname. Integrates loading into ApplyHostConfigs and adds applyHostConfigByHostname() to match inventory.Hostname and call V2UpdateHost to set FencingCredentials, wrapping API errors as UpdateFailure and logging outcomes. Maintains existing MAC-based config flow. |
| Unit tests and test helpers `cmd/agentbasedinstaller/host_config_test.go` | New tests creating temporary `fencing-credentials.yaml`, validating parsing (required fields, optional certificateVerification, case-insensitivity, strict unknown-field rejection, invalid YAML), round-trip compatibility, and applyHostConfigByHostname() behavior across scenarios (nil map, missing inventory, empty hostname, no match, match with successful V2UpdateHost, and simulated V2UpdateHost failure). Adds mockHostConfigTransport and helper functions for test scaffolding. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Pay attention to YAML strict parsing and error messages in loadFencingCredentials().
Verify applyHostConfigByHostname() correctly handles missing inventory/hostname and wraps API errors as UpdateFailure.
Review integration in ApplyHostConfigs to ensure order and error propagation are correct.
Inspect mockHostConfigTransport behavior to confirm tests accurately simulate V2UpdateHost success/failure.
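
The skip conditions and hostname match listed in the walkthrough can be pictured roughly as follows. This is a sketch under stated assumptions: `creds`, `inventory`, and `matchCreds` are hypothetical stand-ins, not the types or functions in the PR, and the real code calls V2UpdateHost where this sketch simply returns the match.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// creds is a hypothetical stand-in for models.FencingCredentialsParams.
type creds struct {
	Address, Username, Password string
}

// inventory models only the one field this sketch needs.
type inventory struct {
	Hostname string `json:"hostname"`
}

// matchCreds looks up credentials by the hostname reported in the host's
// inventory JSON. It returns ok=false (and no error) for the skip cases:
// empty map, missing inventory, empty hostname, or no matching entry.
func matchCreds(inventoryJSON string, byHost map[string]*creds) (*creds, bool, error) {
	if len(byHost) == 0 || inventoryJSON == "" {
		return nil, false, nil
	}
	var inv inventory
	if err := json.Unmarshal([]byte(inventoryJSON), &inv); err != nil {
		return nil, false, fmt.Errorf("failed to unmarshal inventory: %w", err)
	}
	if inv.Hostname == "" {
		return nil, false, nil
	}
	c, ok := byHost[inv.Hostname]
	return c, ok, nil
}

func main() {
	byHost := map[string]*creds{
		"master-0": {Address: "redfish+https://example.com", Username: "admin", Password: "password"},
	}
	for _, inv := range []string{`{"hostname":"master-0"}`, `{"hostname":"worker-0"}`, ``, `{invalid json`} {
		c, ok, err := matchCreds(inv, byHost)
		fmt.Println(c != nil, ok, err)
	}
}
```

Separating "skip" from "error" this way keeps malformed inventory visible to the caller while hosts without credentials pass through silently.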


openshift-ci-robot · 2025-11-27T14:33:11Z

openshift-ci · 2025-11-27T14:33:16Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: fonta-rh
Once this PR has been reviewed and has the lgtm label, please assign bfournie for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

cmd/agentbasedinstaller/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci-robot · 2025-11-27T14:37:29Z

openshift-ci-robot · 2025-12-01T17:39:00Z

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

cmd/agentbasedinstaller/host_config.go (1)

75-99: Consider detecting duplicate hostnames in fencing credentials.

If the YAML file contains multiple entries with the same hostname, the latter entry silently overwrites the former. This could mask configuration errors. Consider adding a check:
 	credentialsMap := make(map[string]*models.FencingCredentialsParams)

 	for i, cred := range fcFile.Credentials {
 		if cred.Hostname == "" {
 			return nil, fmt.Errorf("fencing credential at index %d has empty hostname", i)
 		}

+		if _, exists := credentialsMap[cred.Hostname]; exists {
+			return nil, fmt.Errorf("duplicate fencing credential for hostname: %s", cred.Hostname)
+		}
+
 		if cred.Address == nil {
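
For readers who want to see the suggested check in a complete loop, the following is a self-contained sketch. The `entry` type and `buildCredentialsMap` name are hypothetical, and treating address, username, and password as the required fields is an assumption based on the PR description; only the two error messages are copied from the snippet above.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for one item in fencing-credentials.yaml.
type entry struct {
	Hostname string
	Address  *string
	Username *string
	Password *string
}

// buildCredentialsMap validates entries and indexes them by hostname,
// rejecting duplicates instead of letting a later entry overwrite an earlier one.
func buildCredentialsMap(entries []entry) (map[string]entry, error) {
	out := make(map[string]entry, len(entries))
	for i, e := range entries {
		if e.Hostname == "" {
			return nil, fmt.Errorf("fencing credential at index %d has empty hostname", i)
		}
		if _, exists := out[e.Hostname]; exists {
			return nil, fmt.Errorf("duplicate fencing credential for hostname: %s", e.Hostname)
		}
		// Assumption: these three fields are the required ones.
		if e.Address == nil || e.Username == nil || e.Password == nil {
			return nil, fmt.Errorf("fencing credential for %s is missing a required field", e.Hostname)
		}
		out[e.Hostname] = e
	}
	return out, nil
}

func strPtr(s string) *string { return &s }

func main() {
	entries := []entry{
		{Hostname: "master-0", Address: strPtr("redfish+https://example.com"), Username: strPtr("admin"), Password: strPtr("password")},
		{Hostname: "master-0", Address: strPtr("redfish+https://example.com"), Username: strPtr("admin2"), Password: strPtr("password")},
	}
	_, err := buildCredentialsMap(entries)
	fmt.Println(err)
}
```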

cmd/agentbasedinstaller/host_config_test.go (1)

235-395: Good test coverage for hostname-based configuration.

Tests cover key scenarios including skip conditions and success/failure paths. Consider adding a test case for malformed inventory JSON to verify error handling:

Context("when host has malformed inventory JSON", func() {
    It("should return error", func() {
        testLogger, _ := test.NewNullLogger()
        host := &models.Host{
            ID:         &testHostID,
            InfraEnvID: testInfraEnvID,
            Inventory:  `{invalid json`,
        }

        fencingCreds := map[string]*models.FencingCredentialsParams{
            "master-0": {
                Address:  strPtr("redfish+https://example.com"),
                Username: strPtr("admin"),
                Password: strPtr("password"),
            },
        }

        err := applyHostConfigByHostname(ctx, testLogger, bmInventory, host, fencingCreds)
        Expect(err).To(HaveOccurred())
        Expect(err.Error()).To(ContainSubstring("failed to unmarshal"))
    })
})

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 2e6e2ee and b3ff118.

📒 Files selected for processing (2)

cmd/agentbasedinstaller/host_config.go (3 hunks)
cmd/agentbasedinstaller/host_config_test.go (1 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

cmd/agentbasedinstaller/host_config_test.go
cmd/agentbasedinstaller/host_config.go

🧬 Code graph analysis (2)

cmd/agentbasedinstaller/host_config_test.go (3)

internal/installcfg/installcfg.go (1)

CertificateVerification (235-235)

models/fencing_credentials_params.go (1)

FencingCredentialsParams (21-38)

models/host_update_params.go (1)

HostUpdateParams (22-51)

cmd/agentbasedinstaller/host_config.go (4)

models/fencing_credentials_params.go (1)

FencingCredentialsParams (21-38)

internal/common/test_configuration.go (1)

Address (615-618)

internal/installcfg/installcfg.go (1)

CertificateVerification (235-235)

models/host_update_params.go (1)

HostUpdateParams (22-51)

🔇 Additional comments (7)

cmd/agentbasedinstaller/host_config.go (4)

34-42: LGTM!

Simple and correct implementation for environment variable lookup with a sensible default.

111-138: LGTM!

Good design to load fencing credentials once before the loop, and the error handling follows the established pattern consistently.

266-276: LGTM!

Error handling correctly wraps API errors into UpdateFailure with the necessary context. The config field is nil here, but UpdateFailure.Hostname() correctly falls back to inventory.Hostname which is always populated at this point.

246-278: Consider checking if fencing credentials need updating before calling API.

Unlike applyHostConfig() which checks if configuration changes are needed before updating, this function always calls V2UpdateHost when matching credentials are found. This could result in unnecessary API calls on repeated executions.

Consider comparing existing host fencing credentials (if available in the model) before making the update call, similar to how applyRole and applyRootDeviceHints check existing state.
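
One possible shape for such a pre-check is sketched below. It assumes the host model exposes the currently stored credentials, which the comment above leaves open; `fencingCreds` and `needsUpdate` are hypothetical names, not existing code.

```go
package main

import "fmt"

// fencingCreds is a hypothetical, comparable stand-in for the credentials
// stored on a host. All fields are plain strings so == compares by value.
type fencingCreds struct {
	Address, Username, Password string
}

// needsUpdate reports whether an update call is worth making: there must be
// something to apply, and it must differ from what the host already has.
func needsUpdate(current, desired *fencingCreds) bool {
	if desired == nil {
		return false
	}
	if current == nil {
		return true
	}
	return *current != *desired
}

func main() {
	desired := &fencingCreds{Address: "redfish+https://example.com", Username: "admin", Password: "password"}
	same := *desired
	fmt.Println(needsUpdate(nil, desired))
	fmt.Println(needsUpdate(&same, desired))
}
```

If the API does not return stored secrets such as the password, a comparison like this is not possible and calling the update unconditionally may be the simpler choice.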

cmd/agentbasedinstaller/host_config_test.go (3)

1-17: LGTM!

Standard and appropriate imports for Ginkgo/Gomega-based unit tests.

19-233: LGTM!

Comprehensive test coverage for loadFencingCredentials including:

Missing file handling

Valid credentials parsing with optional fields

Validation of required fields

Invalid YAML and strict parsing behavior

Installer compatibility verification

402-432: LGTM!

Clean and focused mock transport implementation that properly tracks API call invocations and supports error injection for failure testing.

codecov · 2025-12-01T17:56:49Z

Codecov Report

❌ Patch coverage is 75.28090% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 43.51%. Comparing base (e64cd91) to head (3ceafa5).
⚠️ Report is 21 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| cmd/agentbasedinstaller/host_config.go | 75.28% | 20 Missing and 2 partials ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #8457      +/-   ##
==========================================
+ Coverage   43.33%   43.51%   +0.17%     
==========================================
  Files         405      411       +6     
  Lines       70793    71140     +347     
==========================================
+ Hits        30681    30959     +278     
- Misses      37368    37425      +57     
- Partials     2744     2756      +12

| Files with missing lines | Coverage Δ |
| --- | --- |
| cmd/agentbasedinstaller/host_config.go | `21.89% <75.28%> (+21.89%)` ⬆️ |

... and 19 files with indirect coverage changes

Add validation to return an error if the fencing credentials file contains multiple entries with the same hostname, preventing silent overwrites that could mask configuration errors. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

Add test case to verify applyHostConfigByHostname returns an error when host inventory contains invalid JSON. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

cmd/agentbasedinstaller/host_config_test.go (2)
19-233: Comprehensive test coverage for fencing credentials parsing.

The test suite thoroughly covers valid inputs, validation errors, strict parsing, and edge cases. The tests properly set up and clean up temporary directories, and validation scenarios are well-structured.

Consider adding a test case for duplicate hostnames in the credentials YAML to verify the behavior is as expected (currently the map would silently overwrite the first entry with the second).
It("should handle duplicate hostnames by using the last occurrence", func() {
	content := `credentials:
- hostname: master-0
  address: redfish+https://192.168.111.1:8000/redfish/v1/Systems/abc
  username: admin
  password: password123
- hostname: master-0
  address: redfish+https://192.168.111.1:8000/redfish/v1/Systems/xyz
  username: admin2
  password: password456
`
	err := os.WriteFile(filepath.Join(tempDir, "fencing-credentials.yaml"), []byte(content), 0600)
	Expect(err).NotTo(HaveOccurred())

	creds, err := loadFencingCredentials(tempDir)
	Expect(err).NotTo(HaveOccurred())
	Expect(creds).To(HaveLen(1))
	// Verify which entry wins (last one in this case)
	Expect(*creds["master-0"].Username).To(Equal("admin2"))
})
425-455: Mock transport implementation is appropriate for testing.

The mock correctly implements the runtime transport interface, captures API call parameters for verification, and supports error injection. The type switch properly handles V2UpdateHostParams and returns appropriate responses.

Minor: Line 454 could be simplified from fmt.Errorf("%s", errMsg) to just errors.New(errMsg):
 func (m *mockHostConfigTransport) SetUpdateError(errMsg string) {
-	m.updateError = fmt.Errorf("%s", errMsg)
+	m.updateError = errors.New(errMsg)
 }
But this is purely a style preference in test code.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between b3ff118 and 3ceafa5.

📒 Files selected for processing (2)

cmd/agentbasedinstaller/host_config.go (3 hunks)
cmd/agentbasedinstaller/host_config_test.go (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

cmd/agentbasedinstaller/host_config.go

🧰 Additional context used

📓 Path-based instructions (1)

**

⚙️ CodeRabbit configuration file

-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.

Files:

cmd/agentbasedinstaller/host_config_test.go

🔇 Additional comments (1)

cmd/agentbasedinstaller/host_config_test.go (1)

235-418: Excellent test coverage for applyHostConfigByHostname.

The test suite comprehensively covers all code paths including early-return scenarios, the happy path with proper API parameter verification, and error cases. The mock transport correctly captures API interactions and supports error injection for testing failure scenarios.

openshift-ci · 2025-12-02T21:24:38Z

@fonta-rh: all tests passed!


fonta-rh and others added 3 commits November 27, 2025 15:24

OCPEDGE-2217: Add fencing credentials loading for ABI

4ab7a14

Add getHostConfigDir() and loadFencingCredentials() to support reading fencing credentials from fencing-credentials.yaml. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

OCPEDGE-2217: Apply fencing credentials to hosts in ABI

aa9c9cf

Add applyHostConfigByHostname() and update ApplyHostConfigs() to apply fencing credentials to hosts by matching hostname. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

openshift-ci-robot added the jira/valid-reference label Nov 27, 2025

openshift-ci bot added the do-not-merge/work-in-progress label Nov 27, 2025

openshift-ci bot added the size/XL label Nov 27, 2025

fonta-rh mentioned this pull request Dec 1, 2025

OCPEDGE-1517: add-tnf-agent-based-installer openshift/installer#9946

Open

fonta-rh marked this pull request as ready for review December 1, 2025 17:38

openshift-ci bot removed the do-not-merge/work-in-progress label Dec 1, 2025

openshift-ci bot requested review from andfasano and zaneb December 1, 2025 17:39

coderabbitai bot reviewed Dec 1, 2025

fonta-rh and others added 2 commits December 2, 2025 19:15

OCPEDGE-2217: Add test for malformed inventory JSON handling

3ceafa5

Add test case to verify applyHostConfigByHostname returns an error when host inventory contains invalid JSON. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>

coderabbitai bot reviewed Dec 2, 2025
