MGMT-21201: Enable dual-stack clusters #1659

Merged
openshift-merge-bot[bot] merged 1 commit into openshift-kni:main from danmanor:enable-dual-stack on Aug 17, 2025

danmanor commented Jul 22, 2025

Enable dual-stack clusters

Depends on rh-ecosystem-edge/recert#390

Background / Context

The lifecycle-agent is a core component responsible for managing Image-Based Upgrade (IBU) and Image-Based Install (IBI) operations in Single Node OpenShift (SNO) clusters. It handles critical cluster lifecycle operations including:

  • Network Configuration Management: Configuring node IPs, machine networks, cluster networks, and service networks during cluster transitions
  • Recertification (Recert): Re-signing certificates with updated cluster information (IPs, hostnames, etc.) when transforming seed images into target clusters
  • Post-Pivot Operations: Managing network setup, DNS configuration, and kubelet configuration after cluster pivot operations
  • Seed Cluster Information: Capturing and transforming network details from seed clusters for target cluster deployment

Currently, the lifecycle-agent's network handling is designed around single-stack networking, where a cluster operates on either IPv4 or IPv6, but not both simultaneously. All network-related data structures use single string fields (NodeIP, MachineNetwork), and the logic assumes a single IP address per network interface.

Key components involved:

  • api/seedreconfig: Defines the SeedReconfiguration API for cluster transformation parameters
  • utils/client_helper.go: Extracts cluster network information from Kubernetes API
  • lca-cli/postpivot: Handles post-upgrade network configuration and kubelet setup
  • internal/recert: Manages certificate re-signing with updated network information
  • lca-cli/seedclusterinfo: Captures seed cluster network details for replication

Issue / Requirement / Reason for change

MGMT-21201: The lifecycle-agent needs to support dual-stack networking configurations where OpenShift clusters operate with both IPv4 and IPv6 addresses simultaneously on the same interfaces.

Current Limitations:

  1. Single IP Assumption: All network fields (NodeIP, MachineNetwork) are single strings, preventing multiple IP support
  2. Limited Network Discovery: Cluster info extraction only captures the first internal IP address
  3. Inadequate Recert Logic: Certificate re-signing only handles single IP changes
  4. Kubelet Configuration: Node IP hint generation assumes single network per stack
  5. Missing Test Coverage: No validation for dual-stack scenarios

Requirements:

  • Support IPv4 + IPv6 dual-stack clusters in IBU/IBI operations
  • Maintain 100% backward compatibility with existing single-stack configurations
  • Handle multiple machine networks per IP family
  • Update recert logic to process multiple IP addresses in certificate SANs
  • Ensure proper kubelet configuration for dual-stack node IPs

Changes Made

API Extensions

  • Added NodeIPs []string to SeedReconfiguration and SeedClusterInfo for multiple node IPs
  • Added MachineNetworks []string to support multiple machine network CIDRs
  • Preserved legacy fields (NodeIP, MachineNetwork) for backward compatibility with precedence rules
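
To illustrate the precedence rule between the new and legacy fields, here is a minimal sketch. The struct is a trimmed stand-in for the real SeedReconfiguration type, and preferMulti, effectiveNodeIPs, and effectiveMachineNetworks are hypothetical names chosen for this example, not the accessors added by the PR.

```go
package main

import "fmt"

// SeedReconfiguration is a trimmed stand-in for the real API type; only the
// four network fields named in this PR are shown.
type SeedReconfiguration struct {
	NodeIP          string   // legacy single-value field
	NodeIPs         []string // new field, typically one entry per IP family
	MachineNetwork  string   // legacy single-value field
	MachineNetworks []string // new field, one CIDR per machine network
}

// preferMulti is a hypothetical helper: the slice wins when it is populated,
// otherwise the legacy value (if any) is wrapped in a one-element slice.
func preferMulti(multi []string, legacy string) []string {
	if len(multi) > 0 {
		return multi
	}
	if legacy != "" {
		return []string{legacy}
	}
	return nil
}

func effectiveNodeIPs(c SeedReconfiguration) []string {
	return preferMulti(c.NodeIPs, c.NodeIP)
}

func effectiveMachineNetworks(c SeedReconfiguration) []string {
	return preferMulti(c.MachineNetworks, c.MachineNetwork)
}

func main() {
	legacy := SeedReconfiguration{NodeIP: "192.0.2.10", MachineNetwork: "192.0.2.0/24"}
	dual := SeedReconfiguration{
		NodeIP:          "192.0.2.10",
		NodeIPs:         []string{"192.0.2.10", "2001:db8::10"},
		MachineNetwork:  "192.0.2.0/24",
		MachineNetworks: []string{"192.0.2.0/24", "2001:db8::/64"},
	}
	fmt.Println(effectiveNodeIPs(legacy), effectiveMachineNetworks(legacy))
	fmt.Println(effectiveNodeIPs(dual), effectiveMachineNetworks(dual))
}
```

With this shape, callers only ever deal with slices, and an old single-stack configuration behaves like a one-element list.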

Network Configuration Updates

  • Enhanced GetClusterInfo() to discover all internal node IPs via getNodeInternalIPs()
  • Added getMachineNetworks() to extract all machine networks from install config
  • Implemented backward compatibility by populating legacy fields with first array element
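
The discovery step above can be sketched roughly as follows. To keep the example self-contained, nodeAddress is a local stand-in for the Kubernetes node address type (a Type/Address pair, where internal addresses carry the type "InternalIP"); internalIPs, clusterInfo, and newClusterInfo are hypothetical names, not the actual getNodeInternalIPs() or GetClusterInfo() implementations.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeAddress is a local stand-in for a Kubernetes node status address.
type nodeAddress struct {
	Type    string
	Address string
}

const internalIPType = "InternalIP"

// internalIPs collects every InternalIP address, preserving the order in
// which the node reports them.
func internalIPs(addrs []nodeAddress) []string {
	var ips []string
	for _, a := range addrs {
		if a.Type == internalIPType {
			ips = append(ips, a.Address)
		}
	}
	return ips
}

// clusterInfo is a trimmed, illustrative structure with both field styles.
type clusterInfo struct {
	NodeIP  string   // legacy field, mirrors NodeIPs[0]
	NodeIPs []string // all internal IPs
}

// newClusterInfo fills the new slice field and keeps the legacy field in
// sync by copying the first element into it.
func newClusterInfo(addrs []nodeAddress) (clusterInfo, error) {
	ips := internalIPs(addrs)
	if len(ips) == 0 {
		return clusterInfo{}, errors.New("node has no InternalIP address")
	}
	return clusterInfo{NodeIP: ips[0], NodeIPs: ips}, nil
}

func main() {
	info, err := newClusterInfo([]nodeAddress{
		{Type: "Hostname", Address: "sno.example.com"},
		{Type: internalIPType, Address: "192.0.2.10"},
		{Type: internalIPType, Address: "2001:db8::10"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(info.NodeIP, info.NodeIPs)
}
```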

Post-Pivot Improvements

  • Updated setNodeIpHint() to generate space-separated IP hints: KUBELET_NODEIP_HINT=<ip1> <ip2>
  • Refactored setNodeIPIfNotProvided() to parse kubelet config from /etc/systemd/system/kubelet.service.d/20-nodenet.conf
  • Added parseKubeletNodeIPs() function to extract both KUBELET_NODE_IP and KUBELET_NODE_IPS environment variables
  • Enhanced file validation to check for both existence AND valid content before triggering nodeip-configuration service
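
As a rough sketch of the hint format and the drop-in parsing described above: the example assumes the drop-in carries the variables in systemd Environment= assignments, which may not match the real file exactly. nodeIPHint, parseNodeIPs, and hasUsableNodeIP are hypothetical names for illustration, not the functions in lca-cli/postpivot.

```go
package main

import (
	"fmt"
	"net/netip"
	"regexp"
	"strings"
)

// nodeIPVar matches KUBELET_NODE_IP=<value> and KUBELET_NODE_IPS=<value>,
// stopping the value at a quote or whitespace.
var nodeIPVar = regexp.MustCompile(`\b(KUBELET_NODE_IPS?)=([^"\s]*)`)

// nodeIPHint renders the space-separated hint line for one or more IPs.
func nodeIPHint(ips []string) string {
	return "KUBELET_NODEIP_HINT=" + strings.Join(ips, " ")
}

// parseNodeIPs extracts both variables from the drop-in content. A missing
// variable yields an empty string or a nil slice.
func parseNodeIPs(conf string) (nodeIP string, nodeIPs []string) {
	for _, m := range nodeIPVar.FindAllStringSubmatch(conf, -1) {
		switch m[1] {
		case "KUBELET_NODE_IP":
			nodeIP = m[2]
		case "KUBELET_NODE_IPS":
			if m[2] != "" {
				nodeIPs = strings.Split(m[2], ",")
			}
		}
	}
	return nodeIP, nodeIPs
}

// hasUsableNodeIP reports whether the content holds a parseable node IP, so
// an existing but empty or corrupt file is not treated as valid.
func hasUsableNodeIP(conf string) bool {
	ip, _ := parseNodeIPs(conf)
	_, err := netip.ParseAddr(ip)
	return err == nil
}

func main() {
	conf := "[Service]\n" +
		`Environment="KUBELET_NODE_IP=192.0.2.10" "KUBELET_NODE_IPS=192.0.2.10,2001:db8::10"` + "\n"
	ip, ips := parseNodeIPs(conf)
	fmt.Println(ip, ips, hasUsableNodeIP(conf))
	fmt.Println(nodeIPHint(ips))
}
```

Checking content as well as existence matters because an empty drop-in would otherwise suppress the nodeip-configuration run while giving kubelet nothing to use.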

Recertification Logic

  • Implemented slices.Equal() comparison for clean IP change detection
  • Updated config IP format to comma-separated list: config.IP = "ip1,ip2" for multiple addresses
  • Enhanced certificate SAN rules to include all old and new IP addresses in replacement rules
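
A minimal sketch of the change detection and the comma-separated value, assuming the seed and target IPs are already available as string slices. nodeIPsChanged and recertIPValue are hypothetical names, and the SAN replacement rules are left out because their exact format is not described here.

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// nodeIPsChanged reports whether recert needs to handle an IP change.
// slices.Equal is order-sensitive, so swapping the primary IP family
// counts as a change.
func nodeIPsChanged(seed, target []string) bool {
	return !slices.Equal(seed, target)
}

// recertIPValue renders the node IPs as a single comma-separated string,
// the format described above for config.IP.
func recertIPValue(ips []string) string {
	return strings.Join(ips, ",")
}

func main() {
	seed := []string{"192.0.2.10", "2001:db8::10"}
	target := []string{"198.51.100.20", "2001:db8:1::20"}
	if nodeIPsChanged(seed, target) {
		fmt.Println("config.IP =", recertIPValue(target))
	}
}
```

A single-stack cluster yields a one-element slice, so the joined value is the plain IP with no comma.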

Test Scenarios Covered:

  • ✅ Single-stack IPv4 and IPv6 configurations
  • ✅ Dual-stack (IPv4 + IPv6) with both primary orders
  • ✅ Multiple networks per IP family
  • ✅ Legacy → new field migration scenarios
  • ✅ Error handling and edge cases
  • ✅ Backward compatibility validation

Backward Compatibility

100% backward compatibility maintained:

  • Existing single-stack clusters continue to work without modification
  • Legacy NodeIP and MachineNetwork fields preserved and populated
  • New fields (NodeIPs, MachineNetworks) take precedence when populated
  • Automatic migration from single to multiple field formats
  • All existing APIs and configuration files remain functional

Testing

Unit Tests Added

  • API compatibility tests: Verify backward compatibility with old configurations
  • Network parsing tests: Validate IPv4, IPv6, and dual-stack parsing
  • Kubelet config tests: Ensure proper environment variable generation
  • Recert integration tests: Test certificate handling with multiple IPs

Integration Scenarios

  • Single-stack clusters: IPv4-only and IPv6-only configurations
  • Dual-stack clusters: IPv4+IPv6 with different primary IP families
  • Migration scenarios: Existing single-stack → dual-stack upgrades
  • Edge cases: Empty configurations, invalid IPs, network parsing errors
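
The scenarios above differ mainly in which IP families are present and which one comes first. As an illustration only, a check like the following could classify a node IP list and reject the edge cases; classifyStack and its result strings are hypothetical and not part of the PR.

```go
package main

import (
	"errors"
	"fmt"
	"net/netip"
)

// classifyStack validates every IP and reports whether the list is
// single-stack or dual-stack. The first entry decides the primary family.
func classifyStack(ips []string) (string, error) {
	if len(ips) == 0 {
		return "", errors.New("no node IPs provided")
	}
	var sawV4, sawV6 bool
	primary := ""
	for i, s := range ips {
		addr, err := netip.ParseAddr(s)
		if err != nil {
			return "", fmt.Errorf("invalid node IP %q: %w", s, err)
		}
		family := "ipv6"
		// Unmap treats IPv4-mapped IPv6 addresses as IPv4.
		if addr.Unmap().Is4() {
			family = "ipv4"
			sawV4 = true
		} else {
			sawV6 = true
		}
		if i == 0 {
			primary = family
		}
	}
	if sawV4 && sawV6 {
		return "dual-stack/" + primary + "-primary", nil
	}
	return "single-stack/" + primary, nil
}

func main() {
	for _, ips := range [][]string{
		{"192.0.2.10"},
		{"2001:db8::10", "192.0.2.10"},
		{"not-an-ip"},
	} {
		stack, err := classifyStack(ips)
		fmt.Println(ips, stack, err)
	}
}
```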

Checklist

This is a personal checklist that should be applicable to most PRs. It's good
to go over it in order to make sure you haven't missed anything. If you feel
like some of these points are not relevant to your PR, feel free to keep them
unchecked and if you want also explain why you think they're inapplicable.

  • I also copied this entire text into my commit message, and not just the GitHub PR description (git config commit.template .github/pull_request_template.md)
  • I performed a rough self-review of my changes
  • I explained non-trivial motivation for my code using code-comments
  • I made sure my code passes linting, tests, and builds correctly
  • I have run the code and made sure it works as intended, and doesn't introduce any obvious regressions
  • I have not committed any irrelevant changes (if you did, please point them out and why, ideally separate them into a different PR)
  • I added tests (or decided that tests aren't really necessary)
  • I deleted this checklist and all the "<!---" comments (like this one) from the commit message and the PR description, leaving only my own text

openshift-ci-robot commented Jul 22, 2025

@danmanor: This pull request references MGMT-21201 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.20.0" version, but no target version was set.

In response to this:

Background / Context

The lifecycle-agent is a core component responsible for managing Image-Based Upgrade (IBU) and Image-Based Install (IBI) operations in Single Node OpenShift (SNO) clusters. It handles critical cluster lifecycle operations including:

  • Network Configuration Management: Configuring node IPs, machine networks, cluster networks, and service networks during cluster transitions
  • Recertification (Recert): Re-signing certificates with updated cluster information (IPs, hostnames, etc.) when transforming seed images into target clusters
  • Post-Pivot Operations: Managing network setup, DNS configuration, and kubelet configuration after cluster pivot operations
  • Seed Cluster Information: Capturing and transforming network details from seed clusters for target cluster deployment

Currently, the lifecycle-agent's network handling is designed around single-stack networking, where clusters operate on either IPv4 OR IPv6, but not both simultaneously. All network-related data structures use single string fields (NodeIP, MachineNetwork) and the logic assumes a single IP address per network interface.

Key components involved:

  • api/seedreconfig: Defines the SeedReconfiguration API for cluster transformation parameters
  • utils/client_helper.go: Extracts cluster network information from Kubernetes API
  • lca-cli/postpivot: Handles post-upgrade network configuration and kubelet setup
  • internal/recert: Manages certificate re-signing with updated network information
  • lca-cli/seedclusterinfo: Captures seed cluster network details for replication

Issue / Requirement / Reason for change

MGMT-21201: The lifecycle-agent needs to support dual-stack networking configurations where OpenShift clusters operate with both IPv4 and IPv6 addresses simultaneously on the same interfaces.

Current Limitations:

  1. Single IP Assumption: All network fields (NodeIP, MachineNetwork) are single strings, preventing multiple IP support
  2. Limited Network Discovery: Cluster info extraction only captures the first internal IP address
  3. Inadequate Recert Logic: Certificate re-signing only handles single IP changes
  4. Kubelet Configuration: Node IP hint generation assumes single network per stack
  5. Missing Test Coverage: No validation for dual-stack scenarios

Requirements:

  • Support IPv4 + IPv6 dual-stack clusters in IBU/IBI operations
  • Maintain 100% backward compatibility with existing single-stack configurations
  • Handle multiple machine networks per IP family
  • Update recert logic to process multiple IP addresses in certificate SANs
  • Ensure proper kubelet configuration for dual-stack node IPs

Changes Made

API Extensions

  • Added NodeIPs []string to SeedReconfiguration and SeedClusterInfo for multiple node IPs
  • Added MachineNetworks []string to support multiple machine network CIDRs
  • Preserved legacy fields (NodeIP, MachineNetwork) for backward compatibility with precedence rules

Network Configuration Updates

  • Enhanced GetClusterInfo() to discover all internal node IPs via getNodeInternalIPs()
  • Added getMachineNetworks() to extract all machine networks from install config
  • Implemented backward compatibility by populating legacy fields with first array element

Post-Pivot Improvements

  • Updated setNodeIpHint() to generate space-separated IP hints: KUBELET_NODEIP_HINT=<ip1> <ip2>
  • Refactored setNodeIPIfNotProvided() to parse kubelet config from /etc/systemd/system/kubelet.service.d/20-nodenet.conf
  • Added parseKubeletNodeIPs() function to extract both KUBELET_NODE_IP and KUBELET_NODE_IPS environment variables
  • Enhanced file validation to check for both existence AND valid content before triggering nodeip-configuration service

Recertification Logic

  • Implemented slices.Equal() comparison for clean IP change detection
  • Updated config IP format to comma-separated list: config.IP = "ip1,ip2" for multiple addresses
  • Enhanced certificate SAN rules to include all old and new IP addresses in replacement rules

Helper Functions

Added backward-compatible accessor functions:

  • getMachineNetworksFromSeedReconfig() / getNodeIPsFromSeedReconfig()
  • getPrimaryMachineNetwork() / getPrimaryNodeIP()
  • getAllNodeIPsFromSeedReconfig() / getAllNodeIPsFromSeedClusterInfo()

Comprehensive Test Coverage

Created 995 lines of test coverage across 4 new test files:

  • postpivot_dualstack_test.go: Network hint generation and kubelet parsing (251 lines)
  • recert_dualstack_test.go: IP comparison and certificate logic (309 lines)
  • clusterconfig_dualstack_test.go: Seed configuration creation (205 lines)
  • seedclusterinfo_dualstack_test.go: Cluster info transformation (230 lines)

Test Scenarios Covered:

  • ✅ Single-stack IPv4 and IPv6 configurations
  • ✅ Dual-stack (IPv4 + IPv6) with both primary orders
  • ✅ Multiple networks per IP family
  • ✅ Legacy → new field migration scenarios
  • ✅ Error handling and edge cases
  • ✅ Backward compatibility validation

Backward Compatibility

100% backward compatibility maintained:

  • Existing single-stack clusters continue to work without modification
  • Legacy NodeIP and MachineNetwork fields preserved and populated
  • New fields (NodeIPs, MachineNetworks) take precedence when specified
  • All existing APIs and behavior unchanged for single-stack scenarios

Testing

  • All existing tests pass - no regressions introduced
  • 59 new test cases covering all dual-stack scenarios
  • Production-ready validation for IPv4, IPv6, and dual-stack configurations
  • Edge case coverage including empty configs, invalid data, and service interactions

Files Changed

Core Implementation (8 files):

  • api/seedreconfig/seedreconfig.go: Added dual-stack API fields
  • utils/client_helper.go: Enhanced cluster info discovery
  • lca-cli/postpivot/postpivot.go: Updated network configuration logic
  • internal/recert/recert.go: Improved IP comparison and cert handling
  • lca-cli/seedclusterinfo/seedclusterinfo.go: Extended cluster info structure
  • internal/clusterconfig/clusterconfig.go: Updated seed reconfig creation
  • lca-cli/postpivot/postpivot_test.go: Enhanced existing tests
  • docs/examples.md: Added dual-stack configuration examples

Test Coverage (4 new files):

  • internal/clusterconfig/clusterconfig_dualstack_test.go
  • internal/recert/recert_dualstack_test.go
  • lca-cli/postpivot/postpivot_dualstack_test.go
  • lca-cli/seedclusterinfo/seedclusterinfo_dualstack_test.go

Total Impact: 12 files changed, 1,338 insertions(+), 139 deletions(-)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci bot requested review from leo8a and pixelsoccupied July 22, 2025 14:26
openshift-ci bot added the cluster-config-api-changed (Cluster config API changed. It's used by other projects. Review to ensure your change is nonbreaking) and needs-ok-to-test (Indicates a PR that requires an org member to verify it is safe to test) labels Jul 22, 2025

openshift-ci bot commented Jul 22, 2025

Hi @danmanor. Thanks for your PR.

I'm waiting for an openshift-kni member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

1 similar comment
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Jul 22, 2025

@danmanor: This pull request references MGMT-21201 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.20.0" version, but no target version was set.

Details

In response to this:

Background / Context

The lifecycle-agent is a core component responsible for managing Image-Based Upgrades (IBU) and Image-Based Install (IBI) operations in OpenShift Single Node OpenShift (SNO) clusters. It handles critical cluster lifecycle operations including:

  • Network Configuration Management: Configuring node IPs, machine networks, cluster networks, and service networks during cluster transitions
  • Recertification (Recert): Re-signing certificates with updated cluster information (IPs, hostnames, etc.) when transforming seed images into target clusters
  • Post-Pivot Operations: Managing network setup, DNS configuration, and kubelet configuration after cluster pivot operations
  • Seed Cluster Information: Capturing and transforming network details from seed clusters for target cluster deployment

Currently, the lifecycle-agent's network handling is designed around single-stack networking, where clusters operate on either IPv4 OR IPv6, but not both simultaneously. All network-related data structures use single string fields (NodeIP, MachineNetwork) and the logic assumes a single IP address per network interface.

Key components involved:

  • api/seedreconfig: Defines the SeedReconfiguration API for cluster transformation parameters
  • utils/client_helper.go: Extracts cluster network information from Kubernetes API
  • lca-cli/postpivot: Handles post-upgrade network configuration and kubelet setup
  • internal/recert: Manages certificate re-signing with updated network information
  • lca-cli/seedclusterinfo: Captures seed cluster network details for replication

Issue / Requirement / Reason for change

MGMT-21201: The lifecycle-agent needs to support dual-stack networking configurations where OpenShift clusters operate with both IPv4 and IPv6 addresses simultaneously on the same interfaces.

Current Limitations:

  1. Single IP Assumption: All network fields (NodeIP, MachineNetwork) are single strings, preventing multiple IP support
  2. Limited Network Discovery: Cluster info extraction only captures the first internal IP address
  3. Inadequate Recert Logic: Certificate re-signing only handles single IP changes
  4. Kubelet Configuration: Node IP hint generation assumes a single network per stack
  5. Missing Test Coverage: No validation for dual-stack scenarios

Requirements:

  • Support IPv4 + IPv6 dual-stack clusters in IBU/IBI operations
  • Maintain 100% backward compatibility with existing single-stack configurations
  • Handle multiple machine networks per IP family
  • Update recert logic to process multiple IP addresses in certificate SANs
  • Ensure proper kubelet configuration for dual-stack node IPs

Changes Made

API Extensions

  • Added NodeIPs []string to SeedReconfiguration and SeedClusterInfo for multiple node IPs
  • Added MachineNetworks []string to support multiple machine network CIDRs
  • Preserved legacy fields (NodeIP, MachineNetwork) for backward compatibility with precedence rules

Network Configuration Updates

  • Enhanced GetClusterInfo() to discover all internal node IPs via getNodeInternalIPs()
  • Added getMachineNetworks() to extract all machine networks from install config
  • Implemented backward compatibility by populating legacy fields with the first array element
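
The discovery step described above could look roughly like the following. This is a sketch under stated assumptions, not the code from the PR: NodeAddress is a local stand-in for the Kubernetes node status address shape, and internalIPs and firstOrEmpty are hypothetical helper names.

```go
package main

import "fmt"

// NodeAddress mirrors the shape of a Kubernetes node status address
// (type plus address). It is defined locally so the sketch has no dependencies.
type NodeAddress struct {
	Type    string
	Address string
}

// internalIPs returns every address of type "InternalIP", preserving order.
func internalIPs(addrs []NodeAddress) []string {
	var ips []string
	for _, a := range addrs {
		if a.Type == "InternalIP" {
			ips = append(ips, a.Address)
		}
	}
	return ips
}

// firstOrEmpty returns the first element, which is how a legacy
// single-value field could be populated from the new list.
func firstOrEmpty(values []string) string {
	if len(values) == 0 {
		return ""
	}
	return values[0]
}

func main() {
	addrs := []NodeAddress{
		{Type: "InternalIP", Address: "192.0.2.10"},
		{Type: "Hostname", Address: "sno.example.com"},
		{Type: "InternalIP", Address: "2001:db8::10"},
	}
	ips := internalIPs(addrs)
	fmt.Println("NodeIPs:", ips)
	fmt.Println("legacy NodeIP:", firstOrEmpty(ips))
}
```

Keeping the original order matters here, because the first element is what ends up in the legacy field.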

Post-Pivot Improvements

  • Updated setNodeIpHint() to generate space-separated IP hints: KUBELET_NODEIP_HINT=<ip1> <ip2>
  • Refactored setNodeIPIfNotProvided() to parse kubelet config from /etc/systemd/system/kubelet.service.d/20-nodenet.conf
  • Added parseKubeletNodeIPs() function to extract both KUBELET_NODE_IP and KUBELET_NODE_IPS environment variables
  • Enhanced file validation to check for both existence AND valid content before triggering nodeip-configuration service
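
To make the hint format and the drop-in parsing more concrete, here is a minimal sketch. The function names are hypothetical, and the drop-in layout (quoted KEY=value pairs on an Environment= line, with KUBELET_NODE_IPS comma-separated) is an assumption of this example rather than something the description states.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

// buildNodeIPHint renders the hint in the space-separated form quoted above.
func buildNodeIPHint(ips []string) string {
	return "KUBELET_NODEIP_HINT=" + strings.Join(ips, " ")
}

// parseNodeIPsSketch extracts KUBELET_NODE_IP and KUBELET_NODE_IPS from
// drop-in content. The Environment="KEY=value" layout is assumed.
func parseNodeIPsSketch(content string) (nodeIP string, nodeIPs []string) {
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if !strings.HasPrefix(line, "Environment=") {
			continue
		}
		rest := strings.TrimPrefix(line, "Environment=")
		for _, token := range strings.Fields(rest) {
			key, value, found := strings.Cut(strings.Trim(token, `"`), "=")
			if !found {
				continue
			}
			switch key {
			case "KUBELET_NODE_IP":
				nodeIP = value
			case "KUBELET_NODE_IPS":
				nodeIPs = strings.Split(value, ",")
			}
		}
	}
	return nodeIP, nodeIPs
}

// hasValidNodeIPs reports whether at least one value was found and every
// value parses as an IP address.
func hasValidNodeIPs(nodeIP string, nodeIPs []string) bool {
	all := append([]string{}, nodeIPs...)
	if nodeIP != "" {
		all = append(all, nodeIP)
	}
	if len(all) == 0 {
		return false
	}
	for _, ip := range all {
		if net.ParseIP(ip) == nil {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(buildNodeIPHint([]string{"192.0.2.10", "2001:db8::10"}))

	dropIn := `[Service]
Environment="KUBELET_NODE_IP=192.0.2.10" "KUBELET_NODE_IPS=192.0.2.10,2001:db8::10"
`
	nodeIP, nodeIPs := parseNodeIPsSketch(dropIn)
	fmt.Println(nodeIP, nodeIPs, hasValidNodeIPs(nodeIP, nodeIPs))
}
```

Checking content as well as existence means an empty or malformed drop-in is treated the same as a missing one.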

Recertification Logic

  • Implemented slices.Equal() comparison for clean IP change detection
  • Updated config IP format to comma-separated list: config.IP = "ip1,ip2" for multiple addresses
  • Enhanced certificate SAN rules to include all old and new IP addresses in replacement rules
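
The change detection and the comma-separated value can be sketched as follows. ipsChanged and recertIPValue are hypothetical names used only for illustration; slices.Equal and strings.Join are standard library calls (slices requires Go 1.21 or later).

```go
package main

import (
	"fmt"
	"slices"
	"strings"
)

// ipsChanged reports whether the ordered node IP lists differ.
func ipsChanged(seedIPs, targetIPs []string) bool {
	return !slices.Equal(seedIPs, targetIPs)
}

// recertIPValue joins the IPs into the comma-separated form described above.
func recertIPValue(ips []string) string {
	return strings.Join(ips, ",")
}

func main() {
	seed := []string{"192.0.2.10", "2001:db8::10"}
	target := []string{"192.0.2.20", "2001:db8::20"}

	if ipsChanged(seed, target) {
		fmt.Println("config.IP =", recertIPValue(target))
	}
}
```

Note that slices.Equal is order-sensitive, so the same two addresses listed in a different primary order count as a change.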

Helper Functions

Added backward-compatible accessor functions:

  • getMachineNetworksFromSeedReconfig() / getNodeIPsFromSeedReconfig()
  • getPrimaryMachineNetwork() / getPrimaryNodeIP()
  • getAllNodeIPsFromSeedReconfig() / getAllNodeIPsFromSeedClusterInfo()
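
One plausible shape for such an accessor is sketched below, assuming the rule stated elsewhere in this description that the new list fields take precedence when specified. The type and function names here are hypothetical and do not reproduce the helpers listed above.

```go
package main

import "fmt"

// reconfigSketch is a hypothetical stand-in carrying only the node IP fields.
type reconfigSketch struct {
	NodeIP  string   // legacy field
	NodeIPs []string // new field, takes precedence when non-empty
}

// nodeIPsOf returns the new list when set and otherwise falls back to the
// legacy single value.
func nodeIPsOf(c reconfigSketch) []string {
	if len(c.NodeIPs) > 0 {
		return c.NodeIPs
	}
	if c.NodeIP != "" {
		return []string{c.NodeIP}
	}
	return nil
}

// primaryNodeIPOf returns the first resolved IP, or "" when none is set.
func primaryNodeIPOf(c reconfigSketch) string {
	ips := nodeIPsOf(c)
	if len(ips) == 0 {
		return ""
	}
	return ips[0]
}

func main() {
	legacy := reconfigSketch{NodeIP: "192.0.2.10"}
	dual := reconfigSketch{NodeIP: "192.0.2.10", NodeIPs: []string{"2001:db8::10", "192.0.2.10"}}

	fmt.Println(nodeIPsOf(legacy), primaryNodeIPOf(legacy))
	fmt.Println(nodeIPsOf(dual), primaryNodeIPOf(dual))
}
```

Routing every caller through one accessor keeps the precedence rule in a single place, so existing single-stack configs need no changes.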

Comprehensive Test Coverage

Created 995 lines of test coverage across 4 new test files:

  • postpivot_dualstack_test.go: Network hint generation and kubelet parsing (251 lines)
  • recert_dualstack_test.go: IP comparison and certificate logic (309 lines)
  • clusterconfig_dualstack_test.go: Seed configuration creation (205 lines)
  • seedclusterinfo_dualstack_test.go: Cluster info transformation (230 lines)

Test Scenarios Covered:

  • ✅ Single-stack IPv4 and IPv6 configurations
  • ✅ Dual-stack (IPv4 + IPv6) with both primary orders
  • ✅ Multiple networks per IP family
  • ✅ Legacy → new field migration scenarios
  • ✅ Error handling and edge cases
  • ✅ Backward compatibility validation

Backward Compatibility

100% backward compatibility maintained:

  • Existing single-stack clusters continue to work without modification
  • Legacy NodeIP and MachineNetwork fields preserved and populated
  • New fields (NodeIPs, MachineNetworks) take precedence when specified
  • All existing APIs and behavior unchanged for single-stack scenarios

Testing

  • All existing tests pass - no regressions introduced
  • 59 new test cases covering all dual-stack scenarios
  • Production-ready validation for IPv4, IPv6, and dual-stack configurations
  • Edge case coverage including empty configs, invalid data, and service interactions

Files Changed

Core Implementation (8 files):

  • api/seedreconfig/seedreconfig.go: Added dual-stack API fields
  • utils/client_helper.go: Enhanced cluster info discovery
  • lca-cli/postpivot/postpivot.go: Updated network configuration logic
  • internal/recert/recert.go: Improved IP comparison and cert handling
  • lca-cli/seedclusterinfo/seedclusterinfo.go: Extended cluster info structure
  • internal/clusterconfig/clusterconfig.go: Updated seed reconfig creation
  • lca-cli/postpivot/postpivot_test.go: Enhanced existing tests
  • docs/examples.md: Added dual-stack configuration examples

Test Coverage (4 new files):

  • internal/clusterconfig/clusterconfig_dualstack_test.go
  • internal/recert/recert_dualstack_test.go
  • lca-cli/postpivot/postpivot_dualstack_test.go
  • lca-cli/seedclusterinfo/seedclusterinfo_dualstack_test.go

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot

openshift-ci-robot commented Jul 22, 2025

@danmanor: This pull request references MGMT-21201 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.20.0" version, but no target version was set.

Details

In response to this:

Background / Context

The lifecycle-agent is a core component responsible for managing Image-Based Upgrades (IBU) and Image-Based Install (IBI) operations in Single Node OpenShift (SNO) clusters. It handles critical cluster lifecycle operations including:

  • Network Configuration Management: Configuring node IPs, machine networks, cluster networks, and service networks during cluster transitions
  • Recertification (Recert): Re-signing certificates with updated cluster information (IPs, hostnames, etc.) when transforming seed images into target clusters
  • Post-Pivot Operations: Managing network setup, DNS configuration, and kubelet configuration after cluster pivot operations
  • Seed Cluster Information: Capturing and transforming network details from seed clusters for target cluster deployment

Currently, the lifecycle-agent's network handling is designed around single-stack networking, where clusters operate on either IPv4 OR IPv6, but not both simultaneously. All network-related data structures use single string fields (NodeIP, MachineNetwork) and the logic assumes a single IP address per network interface.

Key components involved:

  • api/seedreconfig: Defines the SeedReconfiguration API for cluster transformation parameters
  • utils/client_helper.go: Extracts cluster network information from the Kubernetes API
  • lca-cli/postpivot: Handles post-upgrade network configuration and kubelet setup
  • internal/recert: Manages certificate re-signing with updated network information
  • lca-cli/seedclusterinfo: Captures seed cluster network details for replication

Issue / Requirement / Reason for change

MGMT-21201: The lifecycle-agent needs to support dual-stack networking configurations where OpenShift clusters operate with both IPv4 and IPv6 addresses simultaneously on the same interfaces.

Current Limitations:

  1. Single IP Assumption: All network fields (NodeIP, MachineNetwork) are single strings, preventing multiple IP support
  2. Limited Network Discovery: Cluster info extraction only captures the first internal IP address
  3. Inadequate Recert Logic: Certificate re-signing only handles single IP changes
  4. Kubelet Configuration: Node IP hint generation assumes a single network per stack
  5. Missing Test Coverage: No validation for dual-stack scenarios

Requirements:

  • Support IPv4 + IPv6 dual-stack clusters in IBU/IBI operations
  • Maintain 100% backward compatibility with existing single-stack configurations
  • Handle multiple machine networks per IP family
  • Update recert logic to process multiple IP addresses in certificate SANs
  • Ensure proper kubelet configuration for dual-stack node IPs

Changes Made

API Extensions

  • Added NodeIPs []string to SeedReconfiguration and SeedClusterInfo for multiple node IPs
  • Added MachineNetworks []string to support multiple machine network CIDRs
  • Preserved legacy fields (NodeIP, MachineNetwork) for backward compatibility with precedence rules

Network Configuration Updates

  • Enhanced GetClusterInfo() to discover all internal node IPs via getNodeInternalIPs()
  • Added getMachineNetworks() to extract all machine networks from install config
  • Implemented backward compatibility by populating legacy fields with the first array element

Post-Pivot Improvements

  • Updated setNodeIpHint() to generate space-separated IP hints: KUBELET_NODEIP_HINT=<ip1> <ip2>
  • Refactored setNodeIPIfNotProvided() to parse kubelet config from /etc/systemd/system/kubelet.service.d/20-nodenet.conf
  • Added parseKubeletNodeIPs() function to extract both KUBELET_NODE_IP and KUBELET_NODE_IPS environment variables
  • Enhanced file validation to check for both existence AND valid content before triggering nodeip-configuration service

Recertification Logic

  • Implemented slices.Equal() comparison for clean IP change detection
  • Updated config IP format to comma-separated list: config.IP = "ip1,ip2" for multiple addresses
  • Enhanced certificate SAN rules to include all old and new IP addresses in replacement rules

Test Scenarios Covered:

  • ✅ Single-stack IPv4 and IPv6 configurations
  • ✅ Dual-stack (IPv4 + IPv6) with both primary orders
  • ✅ Multiple networks per IP family
  • ✅ Legacy → new field migration scenarios
  • ✅ Error handling and edge cases
  • ✅ Backward compatibility validation

Backward Compatibility

100% backward compatibility maintained:

  • Existing single-stack clusters continue to work without modification
  • Legacy NodeIP and MachineNetwork fields preserved and populated
  • New fields (NodeIPs, MachineNetworks) take precedence when specified
  • All existing APIs and behavior unchanged for single-stack scenarios

Testing

  • All existing tests pass - no regressions introduced
  • 59 new test cases covering all dual-stack scenarios
  • Production-ready validation for IPv4, IPv6, and dual-stack configurations
  • Edge case coverage including empty configs, invalid data, and service interactions

Checklist

This is a personal checklist that should be applicable to most PRs. It's good
to go over it in order to make sure you haven't missed anything. If you feel
like some of these points are not relevant to your PR, feel free to keep them
unchecked and if you want also explain why you think they're inapplicable.

  • I also copied this entire text into my commit message, and not just the GitHub PR description (git config commit.template .github/pull_request_template.md)
  • I performed a rough self-review of my changes
  • I explained non-trivial motivation for my code using code-comments
  • I made sure my code passes linting, tests, and builds correctly
  • I have run the code and made sure it works as intended, and doesn't introduce any obvious regressions
  • I have not committed any irrelevant changes (if you did, please point them out and why, ideally separate them into a different PR)
  • I added tests (or decided that tests aren't really necessary)
  • I deleted this checklist and all the "<!---" comments (like this one) from the commit message and the PR description, leaving only my own text

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot

openshift-ci-robot commented Jul 22, 2025

@danmanor: This pull request references MGMT-21201 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.20.0" version, but no target version was set.

Details

In response to this:

Background / Context

The lifecycle-agent is a core component responsible for managing Image-Based Upgrades (IBU) and Image-Based Install (IBI) operations in Single Node OpenShift (SNO) clusters. It handles critical cluster lifecycle operations including:

  • Network Configuration Management: Configuring node IPs, machine networks, cluster networks, and service networks during cluster transitions
  • Recertification (Recert): Re-signing certificates with updated cluster information (IPs, hostnames, etc.) when transforming seed images into target clusters
  • Post-Pivot Operations: Managing network setup, DNS configuration, and kubelet configuration after cluster pivot operations
  • Seed Cluster Information: Capturing and transforming network details from seed clusters for target cluster deployment

Currently, the lifecycle-agent's network handling is designed around single-stack networking, where clusters operate on either IPv4 OR IPv6, but not both simultaneously. All network-related data structures use single string fields (NodeIP, MachineNetwork) and the logic assumes a single IP address per network interface.

Key components involved:

  • api/seedreconfig: Defines the SeedReconfiguration API for cluster transformation parameters
  • utils/client_helper.go: Extracts cluster network information from Kubernetes API
  • lca-cli/postpivot: Handles post-upgrade network configuration and kubelet setup
  • internal/recert: Manages certificate re-signing with updated network information
  • lca-cli/seedclusterinfo: Captures seed cluster network details for replication

Issue / Requirement / Reason for change

MGMT-21201: The lifecycle-agent needs to support dual-stack networking configurations where OpenShift clusters operate with both IPv4 and IPv6 addresses simultaneously on the same interfaces.

Current Limitations:

  1. Single IP Assumption: All network fields (NodeIP, MachineNetwork) are single strings, preventing multiple IP support
  2. Limited Network Discovery: Cluster info extraction only captures the first internal IP address
  3. Inadequate Recert Logic: Certificate re-signing only handles single IP changes
  4. Kubelet Configuration: Node IP hint generation assumes single network per stack
  5. Missing Test Coverage: No validation for dual-stack scenarios

Requirements:

  • Support IPv4 + IPv6 dual-stack clusters in IBU/IBI operations
  • Maintain 100% backward compatibility with existing single-stack configurations
  • Handle multiple machine networks per IP family
  • Update recert logic to process multiple IP addresses in certificate SANs
  • Ensure proper kubelet configuration for dual-stack node IPs

Changes Made

API Extensions

  • Added NodeIPs []string to SeedReconfiguration and SeedClusterInfo for multiple node IPs
  • Added MachineNetworks []string to support multiple machine network CIDRs
  • Preserved legacy fields (NodeIP, MachineNetwork) for backward compatibility; the new list fields take precedence when both are set
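
The precedence between the legacy and new fields could be sketched roughly as follows. This is only an illustration: `seedReconfig`, `preferList`, `effectiveNodeIPs`, and `effectiveMachineNetworks` are hypothetical names, and the struct is a trimmed-down stand-in rather than the real SeedReconfiguration type.

```go
package main

// seedReconfig is a hypothetical, trimmed-down stand-in for the
// SeedReconfiguration type; only the four network fields are modelled.
type seedReconfig struct {
	NodeIP          string   // legacy single-value field
	NodeIPs         []string // new list field
	MachineNetwork  string   // legacy single-value field
	MachineNetworks []string // new list field
}

// preferList returns the new list field when it is populated and
// otherwise falls back to the legacy single value, if any.
func preferList(list []string, legacy string) []string {
	if len(list) > 0 {
		return list
	}
	if legacy != "" {
		return []string{legacy}
	}
	return nil
}

// effectiveNodeIPs resolves the node IPs, with NodeIPs taking precedence.
func effectiveNodeIPs(c seedReconfig) []string {
	return preferList(c.NodeIPs, c.NodeIP)
}

// effectiveMachineNetworks resolves the machine network CIDRs,
// with MachineNetworks taking precedence.
func effectiveMachineNetworks(c seedReconfig) []string {
	return preferList(c.MachineNetworks, c.MachineNetwork)
}
```

A single-stack config that only sets NodeIP resolves to a one-element list, so callers can work with slices in both cases.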

Network Configuration Updates

  • Enhanced GetClusterInfo() to discover all internal node IPs via getNodeInternalIPs()
  • Added getMachineNetworks() to extract all machine networks from install config
  • Implemented backward compatibility by populating legacy fields with first array element
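
As a rough sketch of the discovery and back-fill described above, assuming a simplified address type instead of the Kubernetes client types: `nodeAddress`, `clusterInfo`, `internalIPs`, and `newClusterInfo` are hypothetical names, and only the "InternalIP" address type string is taken from the Kubernetes node API.

```go
package main

// nodeAddress is a hypothetical stand-in for a node status address entry.
type nodeAddress struct {
	Type    string // e.g. "InternalIP", "Hostname"
	Address string
}

// clusterInfo is a hypothetical, trimmed-down cluster info record.
type clusterInfo struct {
	NodeIP  string   // legacy field, kept for single-stack consumers
	NodeIPs []string // all internal IPs, in the order reported by the node
}

// internalIPs collects every address of type "InternalIP", preserving order.
func internalIPs(addrs []nodeAddress) []string {
	var ips []string
	for _, a := range addrs {
		if a.Type == "InternalIP" {
			ips = append(ips, a.Address)
		}
	}
	return ips
}

// newClusterInfo fills the list field and back-fills the legacy field
// with the first element, if there is one.
func newClusterInfo(addrs []nodeAddress) clusterInfo {
	info := clusterInfo{NodeIPs: internalIPs(addrs)}
	if len(info.NodeIPs) > 0 {
		info.NodeIP = info.NodeIPs[0]
	}
	return info
}
```

Preserving the reported order matters here, since the first element is what ends up in the legacy field.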

Post-Pivot Improvements

  • Updated setNodeIpHint() to generate space-separated IP hints: KUBELET_NODEIP_HINT=<ip1> <ip2>
  • Refactored setNodeIPIfNotProvided() to parse kubelet config from /etc/systemd/system/kubelet.service.d/20-nodenet.conf
  • Added parseKubeletNodeIPs() function to extract both KUBELET_NODE_IP and KUBELET_NODE_IPS environment variables
  • Enhanced file validation to check for both existence AND valid content before triggering nodeip-configuration service
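
The parsing and hint generation might look something like the sketch below. It assumes the drop-in file carries the variables as `NAME=value` pairs inside a systemd `Environment=` line, which is an assumption about the file layout rather than something stated in this PR; `parseNodeIPs` and `nodeIPHint` are hypothetical helpers, not the functions added by the change.

```go
package main

import (
	"regexp"
	"strings"
)

// envRe matches KUBELET_NODE_IP=<value> and KUBELET_NODE_IPS=<value>,
// stopping the value at whitespace or a closing double quote.
var envRe = regexp.MustCompile(`\b(KUBELET_NODE_IPS?)=([^"\s]+)`)

// parseNodeIPs extracts node IPs from the drop-in content. It prefers the
// comma-separated KUBELET_NODE_IPS value and falls back to KUBELET_NODE_IP.
// It returns nil when neither variable carries a value.
func parseNodeIPs(content string) []string {
	values := map[string]string{}
	for _, m := range envRe.FindAllStringSubmatch(content, -1) {
		values[m[1]] = m[2]
	}
	if list := values["KUBELET_NODE_IPS"]; list != "" {
		return strings.Split(list, ",")
	}
	if ip := values["KUBELET_NODE_IP"]; ip != "" {
		return []string{ip}
	}
	return nil
}

// nodeIPHint renders the space-separated hint line described above.
func nodeIPHint(ips []string) string {
	return "KUBELET_NODEIP_HINT=" + strings.Join(ips, " ")
}
```

A nil result from `parseNodeIPs` corresponds to the "file exists but has no valid content" case, where the caller would fall back to running the nodeip-configuration service.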

Recertification Logic

  • Implemented slices.Equal() comparison for clean IP change detection
  • Updated config IP format to comma-separated list: config.IP = "ip1,ip2" for multiple addresses
  • Enhanced certificate SAN rules to include all old and new IP addresses in replacement rules
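
A minimal sketch of the change detection and the comma-separated IP value follows. `ipsChanged` and `recertIPValue` are hypothetical helper names, and the SAN replacement rule format is deliberately left out because it is not shown here. `slices.Equal` comes from the Go standard library (Go 1.21 or later).

```go
package main

import (
	"slices"
	"strings"
)

// ipsChanged reports whether the target node IPs differ from the seed's.
// slices.Equal compares element by element, so a change in order
// (for example swapping the primary IP family) also counts as a change.
func ipsChanged(seedIPs, targetIPs []string) bool {
	return !slices.Equal(seedIPs, targetIPs)
}

// recertIPValue joins the node IPs into the comma-separated form
// described above, e.g. "ip1,ip2".
func recertIPValue(ips []string) string {
	return strings.Join(ips, ",")
}
```

For a single-stack cluster `recertIPValue` returns the lone address unchanged, which matches the previous single-IP format.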

Test Scenarios Covered:

  • ✅ Single-stack IPv4 and IPv6 configurations
  • ✅ Dual-stack (IPv4 + IPv6) with both primary orders
  • ✅ Multiple networks per IP family
  • ✅ Legacy → new field migration scenarios
  • ✅ Error handling and edge cases
  • ✅ Backward compatibility validation

Backward Compatibility

100% backward compatibility maintained:

  • Existing single-stack clusters continue to work without modification
  • Legacy NodeIP and MachineNetwork fields preserved and populated
  • New fields (NodeIPs, MachineNetworks) take precedence when specified
  • All existing APIs and behavior unchanged for single-stack scenarios

Testing

  • All existing tests pass - no regressions introduced
  • 59 new test cases covering all dual-stack scenarios
  • Production-ready validation for IPv4, IPv6, and dual-stack configurations
  • Edge case coverage including empty configs, invalid data, and service interactions

Checklist

This is a personal checklist that should be applicable to most PRs. It's good
to go over it in order to make sure you haven't missed anything. If you feel
like some of these points are not relevant to your PR, feel free to keep them
unchecked and if you want also explain why you think they're inapplicable.

  • I also copied this entire text into my commit message, and not just the GitHub PR description (git config commit.template .github/pull_request_template.md)
  • I performed a rough self-review of my changes
  • I explained non-trivial motivation for my code using code-comments
  • I made sure my code passes linting, tests, and builds correctly
  • I have run the code and made sure it works as intended, and doesn't introduce any obvious regressions
  • I have not committed any irrelevant changes (if you did, please point them out and why, ideally separate them into a different PR)
  • I added tests (or decided that tests aren't really necessary)
  • I deleted this checklist and all the "<!---" comments (like this one) from the commit message and the PR description, leaving only my own text

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@danmanor danmanor changed the title from "MGMT-21201: Enable dual-stack clusters" to "WIP: MGMT-21201: Enable dual-stack clusters" Jul 22, 2025
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Jul 22, 2025
@danmanor
Member Author

/cc @eranco74 @omertuc @mresvanis

@openshift-ci openshift-ci bot requested review from eranco74, mresvanis and omertuc July 22, 2025 14:33
@danmanor danmanor force-pushed the enable-dual-stack branch from 843109b to 95444b3 Compare July 22, 2025 14:34
@openshift-ci-robot

openshift-ci-robot commented Jul 22, 2025

@danmanor: This pull request references MGMT-21201 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.20.0" version, but no target version was set.

Details

In response to this:

Other related PRs

recert - rh-ecosystem-edge/recert#390

Checklist

This is a personal checklist that should be applicable to most PRs. It's good
to go over it in order to make sure you haven't missed anything. If you feel
like some of these points are not relevant to your PR, feel free to keep them
unchecked and, if you want, explain why you think they're inapplicable.

  • I also copied this entire text into my commit message, and not just the GitHub PR description (git config commit.template .github/pull_request_template.md)
  • I performed a rough self-review of my changes
  • I explained non-trivial motivation for my code using code-comments
  • I made sure my code passes linting, tests, and builds correctly
  • I have run the code and made sure it works as intended and doesn't introduce any obvious regressions
  • I have not committed any irrelevant changes (if you did, please point them out and why, ideally separate them into a different PR)
  • I added tests (or decided that tests aren't really necessary)
  • I deleted this checklist and all the "<!---" comments (like this one) from the commit message and the PR description, leaving only my own text

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@omertuc
Collaborator

omertuc commented Jul 22, 2025

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 22, 2025
@red-hat-konflux
Contributor

Caution

There are some errors in your PipelineRun template.

PipelineRun Error
lifecycle-agent-digest-mirror-set no kind "ImageDigestMirrorSet" is registered for version "operator.openshift.io/v1" in scheme "k8s.io/client-go/kubernetes/scheme/register.go:83"

@danmanor danmanor force-pushed the enable-dual-stack branch from 95444b3 to 64c57e0 Compare July 22, 2025 16:14
@danmanor
Member Author

/retest

@jc-rh
Member

jc-rh commented Aug 13, 2025

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 13, 2025
@danmanor
Member Author

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 13, 2025
@eranco74
Collaborator

/ok-to-test
/approve

@openshift-ci
Contributor

openshift-ci bot commented Aug 13, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eranco74

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 13, 2025
@danmanor
Member Author

/retest

@danmanor
Member Author

/remove label cluster-config-api-changed

@danmanor
Member Author

/remove-label cluster-config-api-changed

@openshift-ci openshift-ci bot removed the cluster-config-api-changed Cluster config API changed. It's used by other projects. Review to ensure your change is nonbreaking label Aug 14, 2025
@eranco74
Collaborator

/ok-to-test

@danmanor
Member Author

/retest-required

@danmanor
Member Author

/retest

@danmanor
Member Author

/ok-to-test

@eranco74
Collaborator

/ok-to-test

@danmanor
Member Author

/test integration

@eranco74
Collaborator

/ok-to-test

@danmanor
Member Author

/retest

@openshift-ci
Contributor

openshift-ci bot commented Aug 17, 2025

@danmanor: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/security | eb1fbe8 | link | false | /test security |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 85c3954 into openshift-kni:main Aug 17, 2025
9 of 10 checks passed