Add PodMonitor resource for Prometheus monitoring#85

Merged
RobuRishabh merged 1 commit into opendatahub-io:main from alimaredia:add-podmonitor-resource
Apr 3, 2026

@alimaredia alimaredia commented Apr 2, 2026

Add opendatahub.io/monitoring: 'true' label for odh/rhoai overlays.

Summary by CodeRabbit

  • New Features

    • Added Prometheus monitoring support for Spark operator metrics collection.
  • Tests

    • Updated end-to-end test setup to properly configure monitoring dependencies before execution.

@alimaredia alimaredia requested a review from a team as a code owner April 2, 2026 20:10
coderabbitai bot commented Apr 2, 2026

Walkthrough

Introduces Prometheus monitoring infrastructure for Spark operator metrics collection across ODH and RHOAI overlays. Adds a new Prometheus kustomization directory with a PodMonitor resource that scrapes Spark operator metrics from the /metrics endpoint on port metrics. Both overlays are updated to include the prometheus resource set and apply a JSON6902 patch labeling the PodMonitor with opendatahub.io/monitoring: "true". The e2e testing Makefile is enhanced to apply the PodMonitor CRD before running tests.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Security & Architecture Issues

CWE-829 (Inclusion of Functionality from Untrusted Control Sphere): The PODMONITOR_CRD_URL in examples/openshift/Makefile fetches from the main branch of prometheus-operator. The main branch is mutable, so the fetched CRD can change between runs without any review. Pin to a specific release tag (e.g., v0.70.0) or commit SHA, and verify integrity via checksum validation before applying the CRD.

Missing RBAC/Authentication Context: The PodMonitor at config/prometheus/podmonitor.yaml does not specify scheme, tlsConfig, or authorization fields. Verify that:

  • The /metrics endpoint on Spark operator is unauthenticated and accessible to Prometheus scrape identity
  • No service account permissions are required to reach the metrics port

Pod Label Assumption Risk: The PodMonitor uses matchLabels with app.kubernetes.io/name: spark-operator and app.kubernetes.io/component: controller. Confirm these labels are guaranteed on all Spark operator controller pods across deployments; reliance on external labeling conventions without validation can lead to silent monitoring gaps.

Indirect Manifest Patching: Using JSON6902 patches in overlays to apply labels post-resource-definition adds operational friction. Consider moving the opendatahub.io/monitoring: "true" label into the base podmonitor.yaml instead of patching across overlays.

🚥 Pre-merge checks | ✅ 2 passed

Check name | Status | Explanation
Description check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled.
Title check | ✅ Passed | The title accurately summarizes the primary change: adding a PodMonitor resource for Prometheus monitoring across the codebase.

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@config/overlays/odh/kustomization.yaml`:
- Around line 22-31: The JSON6902 patch adds /metadata/labels, which replaces
the whole labels map if labels already exist; update the patch for the
PodMonitor named spark-operator-metrics-monitor to either use a strategic-merge
patch (add a patchesStrategicMerge entry that sets
metadata.labels.opendatahub.io/monitoring: "true" for kind PodMonitor,
apiVersion monitoring.coreos.com/v1) or keep JSON6902 but change the path to the
escaped single-key path /metadata/labels/opendatahub.io~1monitoring so only that
label is added instead of replacing the whole labels map.

In `@config/overlays/rhoai/kustomization.yaml`:
- Around line 22-31: The current JSON patch replaces the entire /metadata/labels
object which drops inherited labels; change the patch to a JSON6902-style
operation that adds only the single label key by escaping the slash in the key
(use path /metadata/labels/opendatahub.io~1monitoring) and set its value to
"true", keeping the same target (PodMonitor, name:
spark-operator-metrics-monitor) so the base labels (app.kubernetes.io/name,
app.kubernetes.io/managed-by) remain unchanged.

In `@config/prometheus/podmonitor.yaml`:
- Around line 11-12: Remove the redundant tlsConfig block that sets
insecureSkipVerify when the PodMonitor uses scheme: http: locate the PodMonitor
spec (look for scheme: http) and delete the tlsConfig: insecureSkipVerify: true
stanza (the tlsConfig/insecureSkipVerify setting is ignored for HTTP and only
confuses readers).
- Around line 13-15: The PodMonitor currently contains a hardcoded
namespaceSelector block (namespaceSelector / matchNames -> spark-operator) that
prevents discovery in overlay namespaces; remove the entire namespaceSelector
section from the PodMonitor manifest so it falls back to the default behavior of
selecting pods in the same namespace as the resource (allowing overlays that set
metadata.namespace to work), i.e., delete the namespaceSelector / matchNames
entries in the PodMonitor YAML so pods are discovered in the overlay-configured
namespace.
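
The escaped single-key path mentioned in the overlay comments follows the JSON Pointer escaping rules (RFC 6901): "~" becomes "~0" and "/" becomes "~1". A minimal sketch of that escaping, assuming a POSIX shell with sed; the helper names json_pointer_escape and label_patch_path are hypothetical and not part of this PR:

```shell
#!/bin/sh
# Escape one JSON Pointer reference token (RFC 6901):
# "~" becomes "~0" first, then "/" becomes "~1".
json_pointer_escape() {
    printf '%s' "$1" | sed -e 's/~/~0/g' -e 's|/|~1|g'
}

# Build the single-key label path for a JSON6902 "add" operation.
# (Hypothetical helper, for illustration only.)
label_patch_path() {
    printf '/metadata/labels/%s\n' "$(json_pointer_escape "$1")"
}

label_patch_path 'opendatahub.io/monitoring'
# prints /metadata/labels/opendatahub.io~1monitoring
```

Note that an "add" to a single key only works when /metadata/labels already exists on the target, which is the case here because the base PodMonitor carries labels.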
ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 22d3c37f-6ff6-49a2-a830-4821b8aa71bb

📥 Commits

Reviewing files that changed from the base of the PR and between 28e9e4f and a0a33c6.

📒 Files selected for processing (5)
  • config/default/kustomization.yaml
  • config/overlays/odh/kustomization.yaml
  • config/overlays/rhoai/kustomization.yaml
  • config/prometheus/kustomization.yaml
  • config/prometheus/podmonitor.yaml

@alimaredia alimaredia force-pushed the add-podmonitor-resource branch from a0a33c6 to 9e5610c Compare April 2, 2026 20:28
@alimaredia alimaredia force-pushed the add-podmonitor-resource branch from 9e5610c to 6f03f74 Compare April 2, 2026 22:42
@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@examples/openshift/Makefile`:
- Around line 17-18: The Makefile currently sets CLI via "CLI ?= $(shell command
-v oc 2>/dev/null || command -v kubectl 2>/dev/null)" but doesn't guard when
both are missing; add a new phony target check-cli that verifies $(CLI) is
non-empty (or runs command -v oc/kubectl and exits non‑zero with a clear error
message when neither exists) and then add check-cli as a dependency to any
targets that use $(CLI) (the variable name CLI and the new check-cli target are
the unique symbols to modify) so the build fails fast with a helpful message
instead of crashing later.
- Around line 87-90: The e2e-kustomize-test target currently streams a remote
manifest via "$(CLI) apply -f $(PODMONITOR_CRD_URL)" which is unsafe and also
ignores failures; update the target to (1) use a pinned URL (tag or commit) for
PODMONITOR_CRD_URL or change it to a pinned variable, (2) download the manifest
first (curl/wget to a temp file referenced by PODMONITOR_CRD_URL), verify its
checksum/signature before applying, and (3) enable failure propagation by adding
set -e at the top of the recipe or explicitly check the download and apply exit
codes and exit non‑zero on error so the e2e-kustomize-test target (and CLI apply
invocation) cannot silently continue on failure.
- Line 23: The PODMONITOR_CRD_URL default currently points at the repository's
main branch; change the default value of the PODMONITOR_CRD_URL variable to
reference a specific Prometheus Operator release tag (replace "main" with a
stable release tag like vX.Y.Z) so it pins the CRD to a known immutable commit,
update the Makefile's PODMONITOR_CRD_URL assignment accordingly, and add a short
comment explaining the pin and the chosen tag so future maintainers know why it
is not tracking main.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 11c75127-e364-4237-b6dc-f53b38bda628

📥 Commits

Reviewing files that changed from the base of the PR and between 9e5610c and 6f03f74.

📒 Files selected for processing (6)
  • config/default/kustomization.yaml
  • config/overlays/odh/kustomization.yaml
  • config/overlays/rhoai/kustomization.yaml
  • config/prometheus/kustomization.yaml
  • config/prometheus/podmonitor.yaml
  • examples/openshift/Makefile
✅ Files skipped from review due to trivial changes (3)
  • config/overlays/odh/kustomization.yaml
  • config/prometheus/kustomization.yaml
  • config/prometheus/podmonitor.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • config/default/kustomization.yaml
  • config/overlays/rhoai/kustomization.yaml

Comment on lines +17 to +18
# CLI tool: prefer oc if available, fall back to kubectl
CLI ?= $(shell command -v oc 2>/dev/null || command -v kubectl 2>/dev/null)
⚠️ Potential issue | 🟡 Minor

Missing validation when CLI tool is unavailable

If neither oc nor kubectl is installed, $(CLI) resolves to empty string and line 89 fails with an unhelpful error. Add a guard.

Proposed fix
 # CLI tool: prefer oc if available, fall back to kubectl
 CLI ?= $(shell command -v oc 2>/dev/null || command -v kubectl 2>/dev/null)
+
+.PHONY: check-cli
+check-cli:
+# ifeq expands $(CLI); ifndef would only test the unexpanded, never-empty definition.
+ifeq ($(strip $(CLI)),)
+	$(error Neither oc nor kubectl found in PATH)
+endif

Then add check-cli as a dependency to targets using $(CLI).
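
The same "prefer oc, fall back to kubectl, fail clearly if neither exists" logic can be sketched as a standalone shell function, which may help when testing the guard outside of make. This is an illustration under the assumption of a POSIX shell; detect_cli is a hypothetical name and not part of the Makefile:

```shell
#!/bin/sh
# Print the path of the first available CLI, preferring oc over kubectl.
# Returns 1 with a clear message on stderr when neither is on PATH.
detect_cli() {
    for candidate in oc kubectl; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    echo "ERROR: neither oc nor kubectl found in PATH" >&2
    return 1
}

# Usage: only continue when a CLI was found.
if CLI=$(detect_cli); then
    echo "using $CLI"
fi
```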


# Configuration
KIND_CLUSTER_NAME ?= spark-operator
CLEANUP ?= true
PODMONITOR_CRD_URL ?= https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/jsonnet/prometheus-operator/podmonitors-crd.json
@coderabbitai coderabbitai bot Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What is the latest stable release version of prometheus-operator?

💡 Result:

The latest stable release version of prometheus-operator is v0.90.1.

Citations:


🏁 Script executed:

#!/bin/bash
# Verify the URL pattern works with prometheus-operator releases
# Check if the jsonnet file exists at specific release versions

# Test with the latest version from web search (v0.90.1)
curl -s -o /dev/null -w "v0.90.1 HTTP Status: %{http_code}\n" \
  "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.90.1/jsonnet/prometheus-operator/podmonitors-crd.json"

# Also test with the suggested version from the review (v0.75.0)
curl -s -o /dev/null -w "v0.75.0 HTTP Status: %{http_code}\n" \
  "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.75.0/jsonnet/prometheus-operator/podmonitors-crd.json"

# And verify main branch still works (to establish baseline)
curl -s -o /dev/null -w "main HTTP Status: %{http_code}\n" \
  "https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/jsonnet/prometheus-operator/podmonitors-crd.json"

Repository: opendatahub-io/spark-operator

Length of output: 144


Pin PODMONITOR_CRD_URL to a release tag instead of main branch

Referencing main creates a supply chain risk (CWE-829): any upstream change or repository compromise silently propagates to your build. Pin to a specific release tag.

Suggested fix
-PODMONITOR_CRD_URL ?= https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/jsonnet/prometheus-operator/podmonitors-crd.json
+PODMONITOR_CRD_URL ?= https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.90.1/jsonnet/prometheus-operator/podmonitors-crd.json
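
A pin like this can also be checked mechanically. The sketch below is one possible lint, assuming the URL has the form https://raw.githubusercontent.com/<owner>/<repo>/<ref>/<path> and that the ref contains no slash; the function names raw_url_ref and is_pinned_ref are hypothetical. It treats a vX.Y.Z tag or a 40-character commit SHA as pinned, although only the commit SHA is truly immutable since a tag can be moved upstream:

```shell
#!/bin/sh
# Extract the ref segment from a raw.githubusercontent.com URL of the form
# https://raw.githubusercontent.com/<owner>/<repo>/<ref>/<path>.
raw_url_ref() {
    printf '%s\n' "$1" |
        sed -n 's|^https://raw\.githubusercontent\.com/[^/]*/[^/]*/\([^/]*\)/.*$|\1|p'
}

# Succeed only when the ref looks like a vX.Y.Z tag or a 40-char commit SHA.
is_pinned_ref() {
    ref=$(raw_url_ref "$1")
    printf '%s\n' "$ref" |
        grep -Eq '^(v[0-9]+\.[0-9]+\.[0-9]+|[0-9a-f]{40})$'
}

# Usage: warn when the default URL still tracks a branch.
url='https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/jsonnet/prometheus-operator/podmonitors-crd.json'
if ! is_pinned_ref "$url"; then
    echo "WARNING: URL tracks a branch, not a pinned ref" >&2
fi
```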
good catch. need to add checksum validation too.

@RobuRishabh — Agreed. Here's a complete fix that pins the version and adds SHA256 integrity verification before apply. The maintainer must compute the real digest once and embed it:

-PODMONITOR_CRD_URL ?= https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/main/jsonnet/prometheus-operator/podmonitors-crd.json
+# Pin to a specific release; update both URL and SHA256 together when upgrading.
+PODMONITOR_CRD_URL ?= https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.90.1/jsonnet/prometheus-operator/podmonitors-crd.json
+PODMONITOR_CRD_SHA256 ?= <replace-with-sha256-of-v0.90.1-file>

And in the e2e-kustomize-test target, replace the direct apply -f <url> with a download-verify-apply-cleanup sequence:

 .PHONY: e2e-kustomize-test
 e2e-kustomize-test:
 	@echo "Running Go e2e tests with Kustomize installation..."
-	$(CLI) apply -f $(PODMONITOR_CRD_URL)
+	$(eval CRD_TMP := $(shell mktemp /tmp/podmonitors-crd.XXXXXX.json))
+	curl -fsSL $(PODMONITOR_CRD_URL) -o $(CRD_TMP)
+	echo "$(PODMONITOR_CRD_SHA256)  $(CRD_TMP)" | sha256sum -c -
+	$(CLI) apply -f $(CRD_TMP)
+	rm -f $(CRD_TMP)
 	cd $(REPO_ROOT) && INSTALL_METHOD=kustomize go test ./examples/openshift/tests/e2e/ -v -ginkgo.v -timeout 30m

To compute the digest to embed:

curl -fsSL https://raw.githubusercontent.com/prometheus-operator/prometheus-operator/v0.90.1/jsonnet/prometheus-operator/podmonitors-crd.json | sha256sum

Two additional notes:

  • macOS: sha256sum is part of GNU coreutils; on macOS CI runners, use shasum -a 256 -c instead or guard with a uname check.
  • Cleanup on failure: the rm -f won't run if sha256sum fails and make aborts. If that's a concern, use a .ONESHELL: target or a trap in the recipe to guarantee cleanup.
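
The two notes above (portable hashing and guaranteed cleanup) can be combined outside of make. The following is a sketch only, assuming a POSIX shell with curl and either sha256sum or shasum installed; the function names sha256_of, verify_sha256, and fetch_verify_apply are hypothetical:

```shell
#!/bin/sh
# Print the SHA-256 hex digest of a file, using whichever tool is installed.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    elif command -v shasum >/dev/null 2>&1; then
        shasum -a 256 "$1" | awk '{print $1}'
    else
        echo "ERROR: no SHA-256 tool found" >&2
        return 1
    fi
}

# Return 0 only when the file's digest equals the expected one.
verify_sha256() {
    actual=$(sha256_of "$1") || return 1
    if [ "$actual" != "$2" ]; then
        echo "ERROR: checksum mismatch for $1" >&2
        return 1
    fi
}

# Download to a temp file, verify, apply, and always clean up.
# The body runs in a subshell so the EXIT trap fires when it finishes,
# including when the download or the checksum step fails.
fetch_verify_apply() (
    url=$1; expected=$2; cli=$3
    tmp=$(mktemp) || exit 1
    trap 'rm -f "$tmp"' EXIT
    curl -fsSL "$url" -o "$tmp" || exit 1
    verify_sha256 "$tmp" "$expected" || exit 1
    "$cli" apply -f "$tmp"
)
```

A recipe could then call fetch_verify_apply "$(PODMONITOR_CRD_URL)" "$(PODMONITOR_CRD_SHA256)" "$(CLI)" from a script, which avoids both the sha256sum portability issue and the leftover temp file.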

Comment on lines 87 to 90
e2e-kustomize-test: kind-setup ## Run Go e2e tests using Kustomize manifests for operator installation.
	@echo "Running Go e2e tests with Kustomize installation..."
	$(CLI) apply -f $(PODMONITOR_CRD_URL)
	cd $(REPO_ROOT) && INSTALL_METHOD=kustomize go test ./examples/openshift/tests/e2e/ -v -ginkgo.v -timeout 30m
⚠️ Potential issue | 🟠 Major

Remote URL applied without integrity check

$(CLI) apply -f $(PODMONITOR_CRD_URL) fetches and applies a remote manifest in one step—functionally equivalent to curl-pipe-shell. Combined with the unpinned URL, this is exploitable if the upstream is tampered with (CWE-829).

Mitigations:

  1. Pin the URL to a commit/tag (see prior comment).
  2. Optionally download first and verify checksum before applying.

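Additionally, consider explicit error handling on the apply. By default make already stops a recipe when a line exits non-zero (unless errors are ignored with a - prefix or make -i), so the main benefit of an explicit check is a clear error message instead of a bare exit status.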

Proposed fix with error handling
 e2e-kustomize-test: kind-setup ## Run Go e2e tests using Kustomize manifests for operator installation.
 	@echo "Running Go e2e tests with Kustomize installation..."
-	$(CLI) apply -f $(PODMONITOR_CRD_URL)
+	@if [ -z "$(CLI)" ]; then echo "ERROR: Neither oc nor kubectl found"; exit 1; fi
+	$(CLI) apply -f $(PODMONITOR_CRD_URL) || { echo "ERROR: Failed to apply PodMonitor CRD"; exit 1; }
 	cd $(REPO_ROOT) && INSTALL_METHOD=kustomize go test ./examples/openshift/tests/e2e/ -v -ginkgo.v -timeout 30m
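
The "|| { echo ...; exit 1; }" pattern in the proposed fix can be generalized into a small wrapper that labels each failing step. A minimal sketch, assuming a POSIX shell; run_step is a hypothetical helper name, and the steps below use true and false as stand-ins for the real apply and go test commands:

```shell
#!/bin/sh
# Run a command; on failure print a labelled error and return its exit code.
run_step() {
    label=$1
    shift
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "ERROR: $label failed with exit code $status" >&2
    fi
    return "$status"
}

# Chain steps with && so later steps run only if earlier ones succeed.
if run_step "apply CRD" true &&
   run_step "run e2e tests" false &&
   run_step "report results" true; then
    echo "all steps succeeded"
else
    echo "stopped at the first failing step"
fi
```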
@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
examples/openshift/Makefile (1)

89-90: ⚠️ Potential issue | 🟡 Minor

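Unclear failure when the CRD apply fails

If $(CLI) apply fails (network issue, auth failure, invalid manifest), make stops the recipe with only the command's exit status. If errors are ignored (a - prefix or make -i), execution continues to go test, which will likely fail with confusing errors because the PodMonitor CRD doesn't exist.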

Proposed fix
 e2e-kustomize-test:
 	@echo "Running Go e2e tests with Kustomize installation..."
-	$(CLI) apply -f $(PODMONITOR_CRD_URL)
+	@if [ -z "$(CLI)" ]; then echo "ERROR: Neither oc nor kubectl found in PATH"; exit 1; fi
+	$(CLI) apply -f $(PODMONITOR_CRD_URL) || { echo "ERROR: Failed to apply PodMonitor CRD"; exit 1; }
 	cd $(REPO_ROOT) && INSTALL_METHOD=kustomize go test ./examples/openshift/tests/e2e/ -v -ginkgo.v -timeout 30m
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@examples/openshift/Makefile` around lines 89 - 90, The Makefile runs $(CLI)
apply -f $(PODMONITOR_CRD_URL) but does not propagate failures, so subsequent cd
$(REPO_ROOT) && INSTALL_METHOD=kustomize go test ... will run even if the CRD
apply failed; change the recipe to fail fast by ensuring the apply command's
non-zero exit aborts the recipe (for example, add an explicit check like
appending "|| exit 1" to $(CLI) apply -f $(PODMONITOR_CRD_URL) or chain the
commands with "&&" so the go test only runs when the apply succeeds), keeping
the existing $(CLI) and $(PODMONITOR_CRD_URL) tokens.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 09187bcd-a660-4a19-9125-e2b78e5f230c

📥 Commits

Reviewing files that changed from the base of the PR and between 6f03f74 and a636e4e.

📒 Files selected for processing (1)
  • examples/openshift/Makefile

@alimaredia alimaredia force-pushed the add-podmonitor-resource branch from a636e4e to 5e03427 Compare April 3, 2026 14:50
@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
config/prometheus/podmonitor.yaml (1)

11-14: Narrow the PodMonitor selector to one operator instance.

The selector currently matches any controller pod with the same name + component labels in the namespace. Add app.kubernetes.io/instance to prevent accidental cross-instance scraping.

Suggested change
 spec:
   podMetricsEndpoints:
     - port: metrics
       path: /metrics
       scheme: http
   selector:
     matchLabels:
       app.kubernetes.io/name: spark-operator
+      app.kubernetes.io/instance: spark-operator
       app.kubernetes.io/component: controller

As per the coding guidelines for path pattern **, this follows REVIEW PRIORITIES 2 (architectural issues and anti-patterns) and 3 (bug-prone patterns and error handling gaps).
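
The effect of the extra label follows from matchLabels semantics: a pod is selected only if it carries every listed key/value pair. A minimal sketch of that AND logic in POSIX shell, with labels written as space-separated key=value pairs; matches_selector is a hypothetical helper and the instance value team-b is an invented example of a second operator instance:

```shell
#!/bin/sh
# Succeed when every key=value pair in the selector (arg 1, space-separated)
# also appears in the pod's labels (arg 2, space-separated).
matches_selector() {
    for want in $1; do
        found=no
        for have in $2; do
            [ "$want" = "$have" ] && found=yes
        done
        [ "$found" = yes ] || return 1
    done
    return 0
}

# Usage: a pod from a second instance matches the broad selector
# but is excluded once the instance label is part of the selector.
broad='app.kubernetes.io/name=spark-operator app.kubernetes.io/component=controller'
narrow="$broad app.kubernetes.io/instance=spark-operator"
other_pod="$broad app.kubernetes.io/instance=team-b"

if matches_selector "$broad" "$other_pod"; then
    echo "broad selector scrapes the other instance"
fi
if ! matches_selector "$narrow" "$other_pod"; then
    echo "narrow selector skips the other instance"
fi
```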

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@config/prometheus/podmonitor.yaml` around lines 11 - 14, The PodMonitor's
selector.matchLabels is too broad and should include the instance label to avoid
scraping other operator instances; update the selector.matchLabels block (the
selector and matchLabels entries) to add app.kubernetes.io/instance with the
specific instance value for this operator so the PodMonitor only targets that
single operator instance.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 38f824b2-5a96-4125-ad67-43bbb2323508

📥 Commits

Reviewing files that changed from the base of the PR and between a636e4e and 5e03427.

📒 Files selected for processing (5)
  • config/overlays/odh/kustomization.yaml
  • config/overlays/rhoai/kustomization.yaml
  • config/prometheus/kustomization.yaml
  • config/prometheus/podmonitor.yaml
  • examples/openshift/Makefile
✅ Files skipped from review due to trivial changes (2)
  • config/prometheus/kustomization.yaml
  • config/overlays/rhoai/kustomization.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • examples/openshift/Makefile

Add opendatahub.io/monitoring: 'true' label for odh/rhoai overlays.

Signed-off-by: Ali Maredia <amaredia@redhat.com>
@alimaredia alimaredia force-pushed the add-podmonitor-resource branch from 01e3aae to 23544c9 Compare April 3, 2026 15:56
@RobuRishabh RobuRishabh merged commit d1788d6 into opendatahub-io:main Apr 3, 2026
25 checks passed