Skip to content

Use cert-manager to issue certificates for IPSec#5365

Merged
zhangzujian merged 3 commits intokubeovn:masterfrom
cruickshankpg:certmanager-ipsec
Jul 9, 2025
Merged

Use cert-manager to issue certificates for IPSec#5365
zhangzujian merged 3 commits intokubeovn:masterfrom
cruickshankpg:certmanager-ipsec

Conversation

@cruickshankpg
Copy link
Copy Markdown
Contributor

Pull Request

What type of this PR

  • Features

Add support for issuing IPSec tunnel certificates using cert-manager.

When cert-manager certificates are enabled, the controller no longer generates the IPSec CA cert or private key stored in the ovn-ipsec-ca secret. The secret should be populated with the same CA as configured with cert-manager. It still enables IPSec in OVN NB.

When cert-manager certificates are enabled the CNI daemon creates cert-manager CertificateRequest resources instead of CSRs. A cert-manager ClusterIssuer should be configured to approve and sign these CertificateRequests with a matching CA as configured in ovn-ipsec-ca secret. The name of the issuer to use is configurable in the CNI.

The CNI daemon now watches the ovn-ipsec-ca secret for changes allowing for rollout of a new trust bundle. It verifies the currently configured certificate is signed by the new bundle and if not then triggers a new certificate to be issued. The daemon now splits each certificate in the CA bundle into a separate file as strongswan is unable to parse multiple CAs from a single file.

The CNI daemon now requests a new certificate when the current certificate is at least half way to expiry based on the times in the certificate. When generating a new certificate the daemon also generates a new key just in case the previous one was leaked somehow. The certificate lifetime is also now configurable rather than lasting for a year. The CNI no longer restarts the ipsec or ovs-ipsec-monitor services when the certificate changes and just requests ipsec to reread the CA certs if they change.

To allow for the CNI daemon to keep track of the versions of its key, certificate, and CA cert files it now stores them with locally unique names on disk. Keys and certs are suffixed with the timestamp they were generated. CA files are suffixed with the k8s revision number of the ovn-ipsec-ca secret.

The cert manager validation webhook (if used) should be run in the host network to prevent the risk of certificate requests deadlocking in the event of a certificate expiry. The CNI pods and cert manager issuer interact with the API server over the host network to create and approve certificates but the API server calls the webhook of the service network which can be broken in the event of an expired certificate.

A new kind deployment is created using cert-manager issued certificates and a new e2e test is created that uses it. The e2e test runs through rotating the the CA.

Which issue(s) this PR fixes

Fixes #5073

@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. feature New network feature labels Jun 17, 2025
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary of Changes

Hello @cruickshankpg, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the IPSec certificate management within Kube-OVN by integrating with cert-manager. This change automates the issuance, renewal, and rotation of IPSec certificates, moving away from manual or less dynamic methods. It also improves the robustness of the system by ensuring certificates are always valid and trust bundles are up-to-date, with comprehensive testing to validate the new workflow.

Highlights

  • Cert-Manager Integration: Introduced support for using cert-manager to issue and manage IPSec tunnel certificates, providing a more robust and automated certificate lifecycle management solution compared to the previous self-signed approach.
  • Automated Certificate Management: The CNI daemon now automatically requests new certificates using cert-manager's CertificateRequest resources (or Kubernetes CSRs if cert-manager is not enabled). It also proactively renews certificates when they are halfway to expiry and generates a new private key for each renewal.
  • CA Certificate Rotation and Watch: The CNI daemon now watches the ovn-ipsec-ca secret for changes to the CA trust bundle. Upon detecting a change, it verifies if the current certificate is still signed by the new bundle and triggers a new certificate issuance if necessary. The CA bundle is also split into individual files for strongswan compatibility.
  • Improved File Management: Keys, certificates, and CA files are now stored with locally unique names (suffixed with timestamps or Kubernetes resource versions) on disk, allowing for better tracking and cleanup of old files.
  • Deployment and Testing Enhancements: Added new Makefile targets for deploying cert-manager alongside IPSec and introduced a new e2e test specifically designed to validate the cert-manager integration and CA rotation functionality.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a robust mechanism for managing IPSec certificates using cert-manager. The changes are well-structured, particularly the logic in pkg/daemon/ipsec.go for handling certificate lifecycle, CA rotation, and interaction with cert-manager. The new e2e test for CA rotation is a valuable addition. I've identified a few areas for potential minor improvements, mainly around error handling robustness in the Makefile and logging consistency. The core logic for cert-manager integration appears sound.

Makefile Outdated
Comment on lines +988 to +1001
$(eval CA_KEY = $(shell mktemp))
$(shell openssl genrsa -out $(CA_KEY) 2048)
$(eval CA_CERT = $(shell openssl req -x509 -new -nodes \
-key "$(CA_KEY)" \
-days 3650 \
-subj "/C=US/ST=California/L=Palo Alto/O=Open vSwitch/OU=Open vSwitch Certification Authority/CN=OVS CA" | \
base64 -w 0 -))

$(eval CA_KEY64 = $(shell base64 -w 0 "$(CA_KEY)"))

sed -e 's/KUBE_OVN_CA_KEY/$(CA_KEY64)/g' \
-e 's/KUBE_OVN_CA_CERT/$(CA_CERT)/g' yamls/ipsec-certs.yaml | \
kubectl apply -f -
rm $(CA_KEY)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

If any command between the creation of $(CA_KEY) and its removal fails, the temporary file might be left behind. Consider using a trap to ensure cleanup on exit, e.g., trap 'rm -f $(CA_KEY)' EXIT after CA_KEY is defined.

"-addext", "subjectAltName = DNS:"+cn,
"-subj", fmt.Sprintf("/C=CN/O=kubeovn/OU=kind/CN=%s", cn),
"-key", ipsecPrivKeyPath,
"-subj", "/C=CN/O=kubeovn/OU=kind/CN="+cn,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The subject for the CSR includes OU=kind. If this IPSec functionality is intended for general production use beyond 'kind' clusters, this Organizational Unit (OU) might be too specific. Consider making the OU more generic or configurable.

Suggested change
"-subj", "/C=CN/O=kubeovn/OU=kind/CN="+cn,
"-subj", "/C=CN/O=kubeovn/OU=kube-ovn/CN="+cn,

Comment on lines +267 to +279
if err != nil {
if k8serrors.IsAlreadyExists(err) {
klog.Infof("CR %s already exists: %v", newCR.Name, err)
err := c.config.CertManagerClient.CertmanagerV1().CertificateRequests("kube-system").Delete(context.Background(), newCR.Name, metav1.DeleteOptions{})
if err != nil {
klog.Errorf("failed to delete cr: %s; %v", newCR.Name, err)
}
}

return nil, fmt.Errorf("creating certificate request; %w", err)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

When a CertificateRequest already exists, the current logic is to delete it and then return the IsAlreadyExists error, relying on the caller to retry the Create operation. Consider a more direct approach: after deleting the existing CR, immediately attempt to re-create it within this function call.

if k8serrors.IsAlreadyExists(err) {
    klog.Infof("CR %s already exists, attempting to delete and recreate", newCR.Name)
    deleteErr := c.config.CertManagerClient.CertmanagerV1().CertificateRequests("kube-system").Delete(context.Background(), newCR.Name, metav1.DeleteOptions{})
    if deleteErr != nil {
        klog.Errorf("failed to delete existing CR %s: %v", newCR.Name, deleteErr)
        return nil, fmt.Errorf("failed to delete existing certificate request %s: %w", newCR.Name, deleteErr)
    }
    // Retry creation after deletion
    _, err = c.config.CertManagerClient.CertmanagerV1().CertificateRequests("kube-system").Create(ctx, newCR, metav1.CreateOptions{})
    if err != nil {
        return nil, fmt.Errorf("failed to recreate certificate request %s after deletion: %w", newCR.Name, err)
    }
    // Proceed with polling as before
} else if err != nil {
    return nil, fmt.Errorf("creating certificate request; %w", err)

}

if len(certificates) == 0 {
return nil, fmt.Errorf("no valid certificates found in bundle")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Ensure that linkCACertToIPSecDir correctly handles the error from splitPEMCertificates and avoids further operations if no valid CAs were processed.

@cruickshankpg cruickshankpg force-pushed the certmanager-ipsec branch 2 times, most recently from f0d79eb to a8c235e Compare June 17, 2025 15:26
@cruickshankpg
Copy link
Copy Markdown
Contributor Author

Sorry this is a bit of a giant PR, let me know if there's anything I can do to help get it reviewed

@cruickshankpg cruickshankpg force-pushed the certmanager-ipsec branch 2 times, most recently from c315d01 to e7f4af9 Compare June 17, 2025 19:20
- "list"
- "watch"
- "delete"
- apiGroups:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was this used for beforehand? Have we potentially regressed some old function?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've replaced this very broad secrets permission to one specifically for the ovn-ipsec-ca secret in the kube-system namepsace

ENABLE_COMPACT=${ENABLE_COMPACT:-false}
SECURE_SERVING=${SECURE_SERVING:-false}
ENABLE_OVN_IPSEC=${ENABLE_OVN_IPSEC:-false}
CERT_MANAGER_IPSEC_CERT=${CERT_MANAGER_IPSEC_CERT:-false}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if this was called ENABLE_CERT_MANAGER_IPSEC in line with variables above

}

func (c *Controller) enqueueDeleteCA(obj any) {
pn := obj.(*v1.Secret)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: inconsistent with two fns above to split this into two lines

@oilbeater
Copy link
Copy Markdown
Collaborator

The pr conflict with the master, please rebase and resolve the conflict.

@cruickshankpg cruickshankpg force-pushed the certmanager-ipsec branch 2 times, most recently from 1ec1f02 to a5e698e Compare June 23, 2025 08:41
@coveralls
Copy link
Copy Markdown

coveralls commented Jun 23, 2025

Pull Request Test Coverage Report for Build 16143836874

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 449 (0.0%) changed or added relevant lines in 5 files are covered.
  • 9 unchanged lines in 3 files lost coverage.
  • Overall coverage decreased (-0.1%) to 21.389%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/controller/controller.go 0 1 0.0%
pkg/controller/config.go 0 2 0.0%
pkg/daemon/config.go 0 15 0.0%
pkg/daemon/controller.go 0 61 0.0%
pkg/daemon/ipsec.go 0 370 0.0%
Files with Coverage Reduction New Missed Lines %
pkg/daemon/config.go 1 0.0%
pkg/ovs/ovn-nb-bfd.go 2 61.61%
pkg/daemon/ipsec.go 6 0.0%
Totals Coverage Status
Change from base Build 16082071371: -0.1%
Covered Lines: 10511
Relevant Lines: 49143

💛 - Coveralls

@cruickshankpg cruickshankpg force-pushed the certmanager-ipsec branch 2 times, most recently from 81819b8 to d94361e Compare June 23, 2025 11:31
@cruickshankpg
Copy link
Copy Markdown
Contributor Author

@zhangzujian thanks for the review, I've got the e2e test running as part of the workflow. I think the other e2e test failures are unrelated.

When cert-manager certificates are enabled, the controller no longer generates the IPSec CA cert or private key stored in the `ovn-ipsec-ca` secret. The secret should be populated with the same CA as configured with cert-manager. It still enables IPSec in OVN NB.

When cert-manager certificates are enabled the CNI daemon creates cert-manager CertificateRequest resources instead of CSRs. A cert-manager ClusterIssuer should be configured to approve and sign these CertificateRequests with a matching CA as configured in `ovn-ipsec-ca` secret. The name of the issuer to use is configurable in the CNI.

The CNI daemon now watches the `ovn-ipsec-ca` secret for changes allowing for rollout of a new trust bundle. It verifies the currently configured certificate is signed by the new bundle and if not then triggers a new certificate to be issued. The daemon now splits each certificate in the CA bundle into a separate file as strongswan is unable to parse multiple CAs from a single file.

The CNI daemon now requests a new certificate when the current certificate is at least half way to expiry based on the times in the certificate. When generating a new certificate the daemon also generates a new key just in case the previous one was leaked somehow. The certificate lifetime is also now configurable rather than lasting for a year. The CNI no longer restarts the ipsec or ovs-ipsec-monitor services when the certificate changes and just requests ipsec to reread the CA certs if they change.

To allow for the CNI daemon to keep track of the versions of its key, certificate, and CA cert files it now stores them with locally unique names on disk. Keys and certs are suffixed with the timestamp they were generated. CA files are suffixed with the k8s revision number of the `ovn-ipsec-ca` secret.

The cert manager validation webhook (if used) should be run in the host network to prevent the risk of certificate requests deadlocking in the event of a certificate expiry. The CNI pods and cert manager issuer interact with the API server over the host network to create and approve certificates but the API server calls the webhook of the service network which can be broken in the event of an expired certificate.

A new kind deployment is created using cert-manager issued certificates and a new e2e test is created that uses it. The e2e test runs through rotating the the CA.

Signed-off-by: Paul Cruickshank <pcruickshank@evroc.com>
Signed-off-by: Paul Cruickshank <pcruickshank@evroc.com>
Signed-off-by: Paul Cruickshank <pcruickshank@evroc.com>
@zhangzujian zhangzujian merged commit c880c83 into kubeovn:master Jul 9, 2025
72 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature New network feature size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature Request] Allow cert-manager to sign IPSec CSRs

6 participants