Skip to content

Conversation

@gflarity
Copy link
Contributor

@gflarity gflarity commented Jan 17, 2026

What type of PR is this?

/kind feature

What this PR does / why we need it:

Note this PR was originally #233. I have taken it over it order to implement PR feedback and get things moving faster.

This PR adds support for external certificate management for Grove's webhook server. Users can now choose between:

  1. Auto-provisioned certificates (default): Grove's built-in cert-controller automatically generates and manages self-signed certificates - best for development/testing
  2. External certificates: Users can provide their own certificates (e.g., via cert-manager or manually) - best for production environments

Key changes:

  • Added autoProvision configuration option (config.server.webhooks.autoProvision) to toggle between auto-provisioned and external certificates
  • Added secretName configuration to specify the Kubernetes Secret containing webhook certificates
  • Added webhooks.caBundle Helm value to inject a custom CA bundle into webhook configurations
  • Added comprehensive e2e tests covering both certificate management modes and transitions between them

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

  • When autoProvision: true (default), behavior is unchanged - Grove generates and manages certificates automatically
  • When autoProvision: false, the operator expects certificates to exist in the specified Secret before starting
  • The e2e tests verify transitioning from cert-manager managed certificates to autoprovision mode and back
  • Webhook configurations now support an optional caBundle value for external CA certificates

Does this PR introduce a API change?

Added support for external certificate management. Users can now disable auto-provisioned certificates by setting `config.server.webhooks.autoProvision: false` and provide their own certificates (e.g., via cert-manager). A new `webhooks.caBundle` Helm value allows injecting custom CA bundles into webhook configurations.

Additional documentation e.g., enhancement proposals, usage docs, etc.:

Added docs/user-guide/certificate-management.md

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for external certificate management for Grove's webhook server, allowing users to choose between automatic certificate provisioning (default) and external certificate management (e.g., via cert-manager). This provides flexibility for both development/testing environments and production deployments with enterprise PKI requirements.

Changes:

  • Added autoProvision and secretName configuration options to toggle between auto-provisioned and external certificates
  • Added caBundle Helm value to support custom CA certificates in webhook configurations
  • Implemented comprehensive e2e tests for certificate management mode transitions

Reviewed changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
operator/internal/controller/cert/cert.go Modified ManageWebhookCerts to support both auto-provisioned and external certificate modes
operator/e2e/utils/k8s_client.go Exported applyYAMLData function as ApplyYAMLData for use in new tests
operator/e2e/tests/cert_management_test.go Added comprehensive e2e test suite for certificate management round-trip scenarios
operator/e2e/setup/helm.go Added UpgradeHelmChart and UninstallHelmChart functions with timeout support
operator/e2e/setup/grove.go Added UpgradeGrove function for upgrading Grove configuration
operator/e2e/dependencies.yaml Added cert-manager as a test dependency
operator/e2e/dependencies.go Added cert-manager to HelmCharts struct
operator/cmd/main.go Updated main to pass new certificate configuration parameters
operator/charts/values.yaml Added certificate management configuration options and documentation
operator/charts/templates/webhook-server-cert-secret.yaml Made secret creation conditional based on autoProvision setting
operator/charts/templates/pcs-validating-webhook-config.yaml Added support for custom annotations and caBundle
operator/charts/templates/pcs-defaulting-webhook-config.yaml Added support for custom annotations and caBundle
operator/charts/templates/deployment.yaml Made certificate directory and secret name configurable
operator/charts/templates/authorizer-webhook-config.yaml Added support for custom annotations and caBundle
operator/charts/templates/_helpers.tpl Added certificate configuration to config template
operator/api/config/v1alpha1/zz_generated.deepcopy.go Added deep copy support for new AutoProvision field
operator/api/config/v1alpha1/types.go Added SecretName and AutoProvision fields to WebhookServer struct
operator/api/config/v1alpha1/defaults.go Added defaults for new certificate configuration fields
docs/user-guide/certificate-management.md Added comprehensive certificate management documentation
docs/installation.md Added reference to certificate management guide
docs/api-reference/operator-api.md Updated API documentation with new certificate fields
.github/workflows/e2e-test.yaml Added cert_management test to CI workflow

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@Ronkahn21 Ronkahn21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will complete the review later today

kangclzjc and others added 19 commits January 22, 2026 14:01
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: Bashorun97 <bashorun.emma@gmail.com>
Signed-off-by: Bashorun97 <bashorun.emma@gmail.com>
Signed-off-by: Bashorun97 <bashorun.emma@gmail.com>
Signed-off-by: Emmanuel Bashorun <bashorun.emma@gmail.com>
Signed-off-by: Emmanuel Bashorun <bashorun.emma@gmail.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
Signed-off-by: kangclzjc <kangz@nvidia.com>
@gflarity gflarity force-pushed the cert_management_cleanup branch from f5681f1 to df66b3b Compare January 22, 2026 20:36
@gflarity gflarity requested a review from Ronkahn21 January 22, 2026 20:37
Comment on lines +64 to +74
for k, v := range c.Webhooks.Annotations {
annotations[k] = v
}

webhookAnnotations := map[string]interface{}{"annotations": annotations}

return map[string]interface{}{
"installCRDs": c.InstallCRDs,
"config": map[string]interface{}{
"server": map[string]interface{}{
"webhooks": map[string]interface{}{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Manual map nesting like this is pretty brittle—if we rename a field in GroveConfig but forget to update this string key, the Helm chart will break silently.

Can we refactor this to use json.Marshal/Unmarshal or a library like mapstructure?

//
// This approach avoids wasteful rebuilds while staying compatible with the Skaffold installation.
func UpdateGroveConfiguration(ctx context.Context, restConfig *rest.Config, config *GroveConfig, logger *utils.Logger) error {
chartDir, err := getGroveChartDir()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a SOLID perspective the path should be passed to the function

func applyYAMLData(ctx context.Context, yamlData []byte, namespace string, restConfig *rest.Config, logger *Logger) ([]AppliedResource, error) {
// ApplyYAMLData is the common function that applies YAML data to Kubernetes
func ApplyYAMLData(ctx context.Context, yamlData []byte, namespace string, restConfig *rest.Config, logger *Logger) ([]AppliedResource, error) {
dynamicClient, restMapper, err := createKubernetesClients(restConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to have a way we not to create client obj in every function that interact with the cluster, in long session it could take up to a minute

// ManageWebhookCerts manages webhook certificates based on the CertProvisionMode configuration.
// When mode=auto: uses cert-controller for automatic certificate generation and management.
// When mode=manual: waits for externally provided certificates (e.g., from cert-manager, cluster admin).
func ManageWebhookCerts(mgr ctrl.Manager, certDir string, secretName string, authorizerEnabled bool, certProvisionMode configv1alpha1.CertProvisionMode, certsReadyCh chan struct{}) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you please add unit testing for the new functionality. It will make sure it does not break when we might have more the current two modes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants