Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 0 additions & 15 deletions config/default/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -41,21 +41,6 @@ patches:
target:
kind: Deployment


replacements:
- source:
kind: Deployment
name: controller-manager
fieldPath: metadata.namespace
targets:
- select:
kind: ServiceMonitor
group: monitoring.coreos.com
version: v1
name: controller-manager-metrics-monitor
fieldPaths:
- spec.namespaceSelector.matchNames.0

# Uncomment the patches line if you enable Metrics and CertManager
# [METRICS-WITH-CERTS] To enable metrics protected with certManager, uncomment the following line.
# This patch will protect the metrics with certManager self-signed certs.
Expand Down
6 changes: 6 additions & 0 deletions config/openshift/kustomization.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@ kind: Kustomization
resources:
- ../default
- cluster-monitoring-view-binding.yaml
- metrics-reader-token.yaml
- prometheus-metrics-auth-binding.yaml

patches:
- path: configmap-patch.yaml
Expand All @@ -22,5 +24,9 @@ patches:
target:
kind: Deployment
name: controller-manager
- path: monitor-auth-patch.yaml
target:
kind: ServiceMonitor
name: controller-manager-metrics-monitor

namespace: workload-variant-autoscaler-system
11 changes: 11 additions & 0 deletions config/openshift/metrics-reader-token.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
# Long-lived SA token for Prometheus to authenticate to the WVA metrics endpoint.
# Required on OpenShift because user-workload-monitoring Prometheus rejects
# bearerTokenFile for security. The ServiceMonitor is patched (via
# monitor-auth-patch.yaml) to reference this Secret instead.
apiVersion: v1
kind: Secret
metadata:
name: workload-variant-autoscaler-metrics-reader-token
annotations:
kubernetes.io/service-account.name: workload-variant-autoscaler-controller-manager
type: kubernetes.io/service-account-token
Comment on lines +1 to +11
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether a ClusterRoleBinding exists that binds metrics-reader to the controller-manager SA
echo "=== Searching for metrics-reader ClusterRoleBinding ==="
rg -n "metrics-reader" --type yaml -C5

echo ""
echo "=== Checking all ClusterRoleBindings for controller-manager SA ==="
rg -n "workload-variant-autoscaler-controller-manager" --type yaml -C5

Repository: opendatahub-io/workload-variant-autoscaler

Length of output: 18253


🏁 Script executed:

#!/bin/bash
# Check what resources are included in the openshift kustomization
echo "=== config/openshift/kustomization.yaml ==="
cat config/openshift/kustomization.yaml

echo ""
echo "=== Check if metrics_reader_role_binding.yaml exists in config/rbac/ ==="
ls -la config/rbac/metrics_reader_role_binding.yaml 2>&1 || echo "NOT FOUND"

echo ""
echo "=== Check what's in config/rbac/kustomization.yaml ==="
grep -n "metrics_reader" config/rbac/kustomization.yaml || echo "No metrics_reader reference"

Repository: opendatahub-io/workload-variant-autoscaler

Length of output: 1171


🏁 Script executed:

#!/bin/bash
# Check config/default/ to see if metrics_reader_role_binding is included there
echo "=== config/default/kustomization.yaml ==="
cat config/default/kustomization.yaml 2>&1 | head -50

echo ""
echo "=== Search for any metrics_reader_role_binding in entire config/ ==="
find config -name "*metrics_reader*binding*" -type f 2>/dev/null || echo "None found"

Repository: opendatahub-io/workload-variant-autoscaler

Length of output: 2191


Add ClusterRoleBinding for metrics-reader role to authorize controller-manager ServiceAccount.

The metrics-reader ClusterRole exists (defined in config/rbac/metrics_reader_role.yaml and included in the kustomize build), but there is no corresponding ClusterRoleBinding that grants this role to the workload-variant-autoscaler-controller-manager ServiceAccount. The token generated in this Secret will authenticate successfully but will receive HTTP 403 when Prometheus attempts to scrape /metrics due to missing authorization.

Add the following ClusterRoleBinding to config/rbac/:

metrics_reader_role_binding.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: workload-variant-autoscaler-metrics-reader-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: workload-variant-autoscaler-metrics-reader
subjects:
- kind: ServiceAccount
  name: workload-variant-autoscaler-controller-manager
  namespace: workload-variant-autoscaler-system

Then reference it in config/rbac/kustomization.yaml.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/openshift/metrics-reader-token.yaml` around lines 1 - 12, Create and
add a ClusterRoleBinding that binds the existing ClusterRole
"workload-variant-autoscaler-metrics-reader" to the ServiceAccount
"workload-variant-autoscaler-controller-manager" (subject namespace:
"workload-variant-autoscaler-system"), using the ClusterRoleBinding name
"workload-variant-autoscaler-metrics-reader-rolebinding"; then include that new
manifest in the RBAC kustomization so the binding is applied alongside the
existing "metrics_reader" ClusterRole and ServiceAccount to allow Prometheus
access to /metrics.

20 changes: 20 additions & 0 deletions config/openshift/monitor-auth-patch.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
# Replace bearerTokenFile with authorization.credentials for OpenShift
# user-workload-monitoring compatibility. The user-workload Prometheus Operator
# rejects bearerTokenFile ("it accesses file system via bearer token file which
# Prometheus specification prohibits").
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
name: controller-manager-metrics-monitor
spec:
endpoints:
- port: https
path: /metrics
interval: 10s
scheme: https
tlsConfig:
insecureSkipVerify: true
Comment on lines +15 to +16
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

insecureSkipVerify: true disables TLS certificate validation (CWE-295).

Not a regression (base manifest has same setting), but this allows MITM attacks on metrics scraping traffic. If the controller-manager serves metrics over a cert signed by an in-cluster CA, consider configuring tlsConfig.ca or tlsConfig.caFile instead.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/openshift/monitor-auth-patch.yaml` around lines 16 - 17, The manifest
sets tlsConfig.insecureSkipVerify: true which disables TLS certificate
validation; replace this by configuring a trusted CA instead—use tlsConfig.ca
(base64 CA data) or tlsConfig.caFile pointing to the cluster CA so metrics
scraping validates the controller-manager certificate; update the tlsConfig
block (the tlsConfig key and its insecureSkipVerify field) to remove
insecureSkipVerify and add the appropriate tlsConfig.ca or tlsConfig.caFile
entry referencing the in-cluster CA.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lacking full context, do we have to do this?

authorization:
credentials:
name: workload-variant-autoscaler-metrics-reader-token
key: token
Comment on lines +17 to +20
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🌐 Web query:

Prometheus Operator ServiceMonitor authorization credentials SecretKeySelector spec

💡 Result:

The Prometheus Operator ServiceMonitor CRD includes an authorization field under spec.endpoints[] for configuring Authorization header credentials. This field uses a credentials object that functions as a SecretKeySelector. Key specification details: - Path: spec.endpoints[].authorization.credentials - Fields: - key (string, required): The key within the Secret containing the credentials. - name (string): Name of the Secret (in the same namespace as the ServiceMonitor). - optional (boolean): Whether the Secret or key must exist. The type field (string) under authorization sets the authentication type (defaults to "Bearer"; "Basic" causes an error). This cannot be used simultaneously with basicAuth or oauth2. The Secret must be accessible by the Prometheus Operator in the ServiceMonitor's namespace. Example YAML structure: spec: endpoints: - authorization: credentials: name: my-secret key: token type: Bearer port: metrics For basic authentication, use basicAuth instead, which has separate username and password SecretKeySelectors: basicAuth: username: name: auth-secret key: username password: name: auth-secret key: password Sources confirm this schema across Prometheus Operator v0.50+, OpenShift docs, and official CRD definitions.

Citations:


🏁 Script executed:

cat -n config/openshift/monitor-auth-patch.yaml

Repository: opendatahub-io/workload-variant-autoscaler

Length of output: 890


🏁 Script executed:

# Search for the Secret resource
rg 'workload-variant-autoscaler-metrics-reader-token' --type yaml

Repository: opendatahub-io/workload-variant-autoscaler

Length of output: 289


🏁 Script executed:

# Check for other insecureSkipVerify instances in ServiceMonitor configs
rg 'insecureSkipVerify' --type yaml -C 2

Repository: opendatahub-io/workload-variant-autoscaler

Length of output: 3740


Remove insecureSkipVerify: true—enables TLS certificate verification bypass (CWE-295).

The authorization.credentials structure is correct per Prometheus Operator spec. However, line 17 disables all TLS certificate validation, allowing man-in-the-middle attacks against the metrics endpoint. Either provide a valid CA certificate via tlsConfig.caFile or use OpenShift's service-to-service certificate validation. Disabling TLS verification is not acceptable for production.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/openshift/monitor-auth-patch.yaml` around lines 18 - 21, Remove the
insecureSkipVerify: true setting under the TLS configuration and instead supply
a trusted CA bundle or OpenShift service cert for TLS verification; update the
resource that configures prometheus scrape authorization (the
authorization.credentials block referencing name:
workload-variant-autoscaler-metrics-reader-token and key: token) to include a
proper tlsConfig.caFile (or equivalent OpenShift service-to-service cert
reference) so certificate validation is enforced rather than bypassed.

14 changes: 14 additions & 0 deletions config/openshift/prometheus-metrics-auth-binding.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# Grant the OpenShift user-workload-monitoring Prometheus SA permission to
# authenticate to the WVA metrics endpoint (tokenreviews + subjectaccessreviews).
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: workload-variant-autoscaler-ocp-prometheus-metrics-auth
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: workload-variant-autoscaler-metrics-auth-role
subjects:
- kind: ServiceAccount
name: prometheus-k8s
namespace: openshift-user-workload-monitoring
Loading