Skip to content

ROX-33584: Migrate cert monitor to controller-runtime#2624

Open
kovayur wants to merge 2 commits intomainfrom
yury/certmon-controller-runtime
Open

ROX-33584: Migrate cert monitor to controller-runtime#2624
kovayur wants to merge 2 commits intomainfrom
yury/certmon-controller-runtime

Conversation

@kovayur
Copy link
Contributor

@kovayur kovayur commented Mar 12, 2026

Description

  • Migrate cert monitor to controller-runtime
  • Use label selector rhacs.redhat.com/tls to filter only stackrox TLS secrets
  • Increase sync interval from 1 minute to 30 minutes
  • Do not watch namespaces

Checklist (Definition of Done)

  • Unit and integration tests added
  • Added test description under Test manual
  • Documentation added if necessary (i.e. changes to dev setup, test execution, ...)
  • CI and all relevant tests are passing
  • Add the ticket number to the PR title if available, i.e. ROX-12345: ...
  • Discussed security and business related topics privately. Will move any security and business related topics that arise to private communication channel.
  • Add secret to app-interface Vault or Secrets Manager if necessary
  • RDS changes were e2e tested manually
  • Check AWS limits are reasonable for changes provisioning new resources

Test manual

TODO: Add manual testing efforts

# To run tests locally run:
make db/teardown db/setup db/migrate
make ocm/setup
make verify lint binary test test/integration

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 12, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kovayur

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kovayur kovayur requested review from johannes94 and kurlov March 12, 2026 18:25
)

// Config represents the certificate monitor configuration
type Config struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that the Config is still reference by the CertMonitor struct, but is not referenced by any of it's code.

The labelselectors are hardcoded in the constructur for the cache, with no way to influence the cache from an outside package.

We should either remove that config or make the CertMonitor actually use it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I missed that. It's deleted now.

"time"

"github.com/prometheus/client_golang/prometheus/testutil"
"github.com/stackrox/acs-fleet-manager/fleetshard/pkg/fleetshardmetrics"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lot of test code deleted here. I'm not sure what it was testing exactly, didn't look to deep.

The remaining tests seems to verify package internal methods, but not that the overall CertMonitor is working like expected.

I would expect a test with sample secrets with rhacs.redhat.com/tls=true label and without it, starting the CertMonitor, then verify the metrics for matching labels show up. Then deleting a secrets, verifying it disappears from the metrics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the deleted tests were dedicated to the config validation and client side filtering which I removed.
I couldn't find a way to cover server-side filtering with unit tests. Perhaps I could add e2e tests or integration tests with envtest that starts a real etcd + API server, but I believe that would be overkill. Instead, I'd rely on controller-runtime to handle the filtering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll timebox an integration test implementation with envtest and see how it goes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I though it would be a simple fakeclient test at first, but you're right it isn't straight forward.

I'd also be fine if you'd get the cache options in a extra function and tests that as a unit tests.
So in essence we're testing that we're configuring controller-runtime correctly, and leave the rest to controller-runtime.

Something like:

 func certMonitorCacheOptions() ctrlcache.Options {
      syncPeriod := syncPeriod
      return ctrlcache.Options{
          ByObject: map[client.Object]ctrlcache.ByObject{
              &corev1.Secret{}: {
                  Label: labels.SelectorFromSet(labels.Set{
                      tlsSecretLabel: "true",
                  }),
              },
          },
          DefaultLabelSelector: labels.Nothing(),
          SyncPeriod:           &syncPeriod,
      }
  }

  Then test:

  func TestCertMonitorCacheOptions(t *testing.T) {
      opts := certMonitorCacheOptions()

      // Verify label selector matches labeled secrets
      selector := opts.ByObject[&corev1.Secret{}].Label
      labeled := labels.Set{"rhacs.redhat.com/tls": "true"}
      unlabeled := labels.Set{"other": "label"}

      assert.True(t, selector.Matches(labeled))
      assert.False(t, selector.Matches(unlabeled))

      // Verify default selector rejects everything
      assert.False(t, opts.DefaultLabelSelector.Matches(labels.Set{}))
  }

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done as suggested.

@kovayur kovayur force-pushed the yury/certmon-controller-runtime branch from ae82fc1 to 8ff7fd3 Compare March 13, 2026 13:33
@kovayur kovayur force-pushed the yury/certmon-controller-runtime branch from 8ff7fd3 to 3653fdb Compare March 13, 2026 13:58
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 13, 2026

@kovayur: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e e0378ca link true /test e2e

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants