Skip to content

Configure OpenTelemetry on existing tests#572

Merged
kpunwatk merged 2 commits intoopendatahub-io:mainfrom
kpunwatk:otel_test
Oct 22, 2025
Merged

Configure OpenTelemetry on existing tests#572
kpunwatk merged 2 commits intoopendatahub-io:mainfrom
kpunwatk:otel_test

Conversation

@kpunwatk
Copy link
Copy Markdown
Contributor

@kpunwatk kpunwatk commented Sep 1, 2025

This PR adds support for configuring OpenTelemetry (OTLP exporter) with Tempo tracing integration in existing guardrails tests. Updated fixtures and test logic to accept otel_exporter settings, allowing metrics and traces endpoints to be dynamically injected.

Description

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Tests
    • Added optional OpenTelemetry support in Guardrails tests to export metrics and traces to Tempo when enabled; includes provisioning and orchestration helpers, endpoint and port‑forward utilities, MinIO-backed storage fixtures, pod‑readiness helpers, and new end-to-end tests validating traces in Tempo.
  • Chores
    • Bumped an internal dependency version.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Sep 1, 2025

📝 Walkthrough

Walkthrough

Adds Tempo and OpenTelemetry test infrastructure, wires optional OTEL exporter into the Guardrails orchestrator fixture, extends guardrails tests to include OTEL/Tempo fixtures, and adds tests that verify traces are ingested into Tempo.

Changes

Cohort / File(s) Summary of Changes
Orchestrator fixture OTEL config
tests/fixtures/guardrails.py
Conditionally injects an otel_exporter configuration into the GuardrailsOrchestrator fixture when otel_exporter_config is present; resolves otelcol_metrics_endpoint and tempo_traces_endpoint, and sets metricsEndpoint, tracesEndpoint, otlpExport, and protocols (gRPC/otlp) on gorch_kwargs before instantiation.
Tempo & OpenTelemetry test fixtures
tests/model_explainability/guardrails/conftest.py
Adds fixtures, helpers, and constants to install Tempo and OpenTelemetry operators, create TempoStack and OpenTelemetryCollector CRs, provision MinIO-backed PVC/Deployment/Service/Secret for Tempo storage, provide port-forwarding and endpoint helpers (otelcol_metrics_endpoint, tempo_traces_endpoint, tempo_traces_service_portforward), and add orchestration utilities (pod waits, timeouts, imports).
Telemetry constants
tests/model_explainability/guardrails/constants.py
Adds constants: OTEL_EXPORTER_PORT: int = 4317, SUPER_SECRET = "supersecret" # pragma: allowlist secret, and TEMPO = "tempo".
Guardrails tests with OTEL/Tempo
tests/model_explainability/guardrails/test_guardrails.py
Extends test imports and parametrization to accept OTEL/Tempo configuration; updates class-level usefixtures and several test signatures to include OTEL/Tempo fixtures; adds test_guardrails_traces_in_tempo to both BuiltIn and HuggingFace detector test classes; passes OTEL/Tempo payloads into relevant tests.
Dependency bump
pyproject.toml
Bumps openshift-python-wrapper dependency from 11.0.94 to 11.0.99.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "Configure OpenTelemetry on existing tests" directly and clearly reflects the primary objective of this pull request, which is to add support for configuring OpenTelemetry (OTLP exporter) with Tempo tracing integration in existing guardrails tests. The title is concise, uses specific terminology (OpenTelemetry), and avoids vague phrasing, making it immediately clear what the changeset accomplishes. All modifications across the multiple affected files—including new fixtures, constants, test parametrization, and dependency updates—align with and support this central objective of enabling OpenTelemetry configuration in the test suite.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Sep 1, 2025

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove it from the PR comment /wip cancel to the PR.
  • To block merging of a PR, add /hold in a comment. To un-block merging of PR comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove, add /lgtm cancel.
    lgtm label removed on each new commit push.
  • To mark PR as verified comment /verified to the PR, to un-verify comment /verified cancel to the PR.
    verified label removed on each new commit push.
  • To Cherry-pick a merged PR /cherry-pick <target_branch_name> to the PR. If <target_branch_name> is valid,
    and the current PR is merged, a cherry-picked PR would be created and linked to the current PR.
  • To build and push image to quay, add /build-push-pr-image in a comment. This would create an image with tag
    pr-<pr_number> to quay repository. This image tag, however would be deleted on PR merge or close action.
Supported labels

{'/lgtm', '/wip', '/verified', '/build-push-pr-image', '/hold', '/cherry-pick'}

@kpunwatk
Copy link
Copy Markdown
Contributor Author

kpunwatk commented Sep 1, 2025

/wip

@kpunwatk kpunwatk marked this pull request as draft September 1, 2025 10:36
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tests/model_explainability/guardrails/conftest.py (2)

69-71: Fix SyntaxError: join split storage_uri string.

Two adjacent string literals break the keyword arg list.

-        storage_uri="oci://quay.io/trustyai_testing/detectors/deberta-v3-base-prompt-injection-v2"
-        "@sha256:8737d6c7c09edf4c16dc87426624fd8ed7d118a12527a36b670be60f089da215",
+        storage_uri="oci://quay.io/trustyai_testing/detectors/deberta-v3-base-prompt-injection-v2@sha256:8737d6c7c09edf4c16dc87426624fd8ed7d118a12527a36b670be60f089da215",

122-124: Fix SyntaxError: join split storage_uri string (second occurrence).

-        storage_uri="oci://quay.io/trustyai_testing/detectors/granite-guardian-hap-38m"
-        "@sha256:9dd129668cce86dac674814c0a965b1526a01de562fd1e9a28d1892429bdad7b",
+        storage_uri="oci://quay.io/trustyai_testing/detectors/granite-guardian-hap-38m@sha256:9dd129668cce86dac674814c0a965b1526a01de562fd1e9a28d1892429bdad7b",
♻️ Duplicate comments (6)
tests/model_explainability/guardrails/conftest.py (5)

155-171: Remove unused fixture parameter (model_namespace).

The parameter isn’t used; simplify signature.

-@pytest.fixture(scope="class")
-def installed_tempo_operator(admin_client: DynamicClient, model_namespace: Namespace) -> Generator[None, Any, None]:
+@pytest.fixture(scope="class")
+def installed_tempo_operator(admin_client: DynamicClient) -> Generator[None, Any, None]:

Note: adjust any direct requests for this fixture accordingly.


214-222: Make ALM example lookup safe (avoid StopIteration).

Provide default to next(...) and handle missing case explicitly.

-    tempo_stack_dict: dict[str, Any] = next(
-        example
-        for example in alm_examples
-        if example["kind"] == "TempoStack" and example["apiVersion"].startswith("tempo.grafana.com/")
-    )
-
-    if not tempo_stack_dict:
+    tempo_stack_dict: dict[str, Any] = next(
+        (e for e in alm_examples
+         if e.get("kind") == "TempoStack" and str(e.get("apiVersion","")).startswith("tempo.grafana.com/")),
+        None,
+    )
+    if tempo_stack_dict is None:
         raise ResourceNotFoundError(f"No TempoStack dict found in alm_examples for CSV {tempo_csv.name}")

320-329: Same StopIteration risk for OpenTelemetryCollector example.

Harden next(...) with a default, then check None.

-    otel_cr_dict: dict[str, Any] = next(
-        example
-        for example in alm_examples
-        if example["kind"] == "OpenTelemetryCollector" and example["apiVersion"] == "opentelemetry.io/v1beta1"
-    )
-
-    if not otel_cr_dict:
+    otel_cr_dict: dict[str, Any] = next(
+        (e for e in alm_examples
+         if e.get("kind") == "OpenTelemetryCollector"
+         and e.get("apiVersion") == "opentelemetry.io/v1beta1"),
+        None,
+    )
+    if otel_cr_dict is None:
         raise ResourceNotFoundError(f"No OpenTelemetryCollector example found in ALM examples for {otel_csv.name}")

433-435: Wait for PVC to be BOUND before yielding (avoid flakiness).

Consumers may start before binding completes.

     with PersistentVolumeClaim(**pvc_kwargs) as pvc:
-        yield pvc
+        pvc.wait_for_status(status=pvc.Status.BOUND, timeout=Timeout.TIMEOUT_2MIN)
+        yield pvc

512-544: Guard against missing OTEL collector service (StopIteration) and avoid hardcoded port duplication.

Handle absence gracefully and derive port from the Service when possible.

-    service = next(
-        Service.get(
-            dyn_client=admin_client,
-            namespace=model_namespace.name,
-            label_selector="app.kubernetes.io/component=opentelemetry-collector",
-        )
-    )
-
-    service_name = service.name
-
-    port = OTEL_EXPORTER_PORT
-    return f"http://{service_name}.{model_namespace.name}.svc.cluster.local:{port}"
+    service = next(
+        Service.get(
+            dyn_client=admin_client,
+            namespace=model_namespace.name,
+            label_selector="app.kubernetes.io/component=opentelemetry-collector",
+        ),
+        None,
+    )
+    if service is None:
+        raise ResourceNotFoundError(
+            f"No OpenTelemetry Collector service found in namespace {model_namespace.name}"
+        )
+    service_name = service.name
+    # Prefer discovered port, fallback to configured constant
+    discovered_port = None
+    try:
+        ports = getattr(service.instance.spec, "ports", []) or []
+        discovered_port = next((p.port for p in ports if getattr(p, "port", None)), None)
+    except Exception:
+        pass
+    port = discovered_port or OTEL_EXPORTER_PORT
+    return f"http://{service_name}.{model_namespace.name}.svc.cluster.local:{port}"
tests/model_explainability/guardrails/test_guardrails.py (1)

388-417: Make trace test reliable: prime traces, add timeouts, and avoid hard‑coded service name.

Generate a request to ensure traces exist; discover a guardrails service via /api/services; add timeouts; drop unused args.

-    def test_guardrails_traces_in_tempo(
-        self,
-        admin_client,
-        model_namespace,
-        orchestrator_config,
-        minio_pod,
-        minio_data_connection,
-        guardrails_orchestrator,
-        guardrails_gateway_config,
-        otel_collector,
-        tempo_stack,
-        tempo_traces_service_portforward,
-    ):
+    def test_guardrails_traces_in_tempo(
+        self,
+        current_client_token,
+        openshift_ca_bundle_file,
+        guardrails_orchestrator_health_route,
+        tempo_traces_service_portforward,
+    ):
@@
-        @retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5)
+        # Prime traces by hitting health endpoint
+        check_guardrails_health_endpoint(
+            host=guardrails_orchestrator_health_route.host,
+            token=current_client_token,
+            ca_bundle_file=openshift_ca_bundle_file,
+        )
+
+        @retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5)
         def check_traces():
-            response = requests.get(f"{tempo_traces_service_portforward}/api/traces?service=fms_guardrails_orchestr8")
-            if response.status_code == http.HTTPStatus.OK:
-                data = response.json()
-                if data.get("data"):  # non-empty list of traces
-                    return data
-            return False
+            # Discover a service that contains "guardrails"
+            svcs = requests.get(
+                f"{tempo_traces_service_portforward}/api/services",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )
+            if svcs.status_code != http.HTTPStatus.OK:
+                return False
+            names = svcs.json().get("data", [])
+            guardrails_svc = next((s for s in names if "guardrails" in s), None)
+            if not guardrails_svc:
+                return False
+            r = requests.get(
+                f"{tempo_traces_service_portforward}/api/traces?service={guardrails_svc}",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )
+            if r.status_code == http.HTTPStatus.OK:
+                body = r.json()
+                if body.get("data"):
+                    return body
+            return False

Based on learnings (Requests best practices: always set timeouts).

🧹 Nitpick comments (3)
tests/model_explainability/guardrails/conftest.py (2)

548-557: Consider deriving Tempo distributor port instead of hardcoding OTEL_EXPORTER_PORT.

Minor, but keeps secrets/endpoints consistent with Service spec. If keeping constant, add a short comment stating why.


566-579: Avoid hardcoded 16686; discover query‑frontend port from Service.

More robust across deployments.

-    service_name = f"tempo-{tempo_stack.metadata.name}-query-frontend"  # tempo-my-tempo-stack-query-frontend"
-    local_port = 16686
+    service_name = f"tempo-{tempo_stack.metadata.name}-query-frontend"
+    # Try to discover the remote service port; default to 16686
+    remote_port = 16686
+    try:
+        svc = next((s for s in Service.get(dyn_client=admin_client, namespace=namespace) if s.name == service_name), None)
+        if svc and getattr(svc.instance.spec, "ports", None):
+            remote_port = next((p.port for p in svc.instance.spec.ports if getattr(p, "port", None)), remote_port)
+    except Exception:
+        LOGGER.debug("Falling back to default query-frontend port 16686")
+    local_port = remote_port
     local_url = f"{LOCAL_HOST_URL}:{local_port}"
@@
-            to_port=local_port,
+            to_port=remote_port,
             waiting=20,
tests/model_explainability/guardrails/test_guardrails.py (1)

334-358: Remove unused fixture parameters in tests to appease linters.

These args are only for ordering; rely on class-level usefixtures instead.

-        orchestrator_config,
-        guardrails_orchestrator,
-        otel_collector,
-        tempo_stack,
+        # rely on class-level usefixtures for setup/ordering

Apply similarly in related methods if not used.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e97f185 and 1b3def6.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml (1 hunks)
  • tests/fixtures/guardrails.py (1 hunks)
  • tests/model_explainability/guardrails/conftest.py (2 hunks)
  • tests/model_explainability/guardrails/constants.py (1 hunks)
  • tests/model_explainability/guardrails/test_guardrails.py (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/fixtures/guardrails.py
  • tests/model_explainability/guardrails/constants.py
  • pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (2)
tests/model_explainability/guardrails/test_guardrails.py (3)
utilities/constants.py (1)
  • Timeout (220-231)
tests/model_explainability/guardrails/conftest.py (3)
  • otel_collector (301-373)
  • tempo_stack (193-258)
  • tempo_traces_service_portforward (560-583)
tests/conftest.py (4)
  • admin_client (73-74)
  • model_namespace (105-125)
  • minio_pod (485-526)
  • minio_data_connection (552-564)
tests/model_explainability/guardrails/conftest.py (6)
tests/model_registry/model_catalog/utils.py (1)
  • ResourceNotFoundError (23-24)
utilities/constants.py (1)
  • Timeout (220-231)
utilities/inference_utils.py (1)
  • create_isvc (552-767)
utilities/operator_utils.py (1)
  • get_cluster_service_version (14-29)
tests/conftest.py (2)
  • admin_client (73-74)
  • model_namespace (105-125)
tests/model_explainability/trustyai_service/trustyai_service_utils.py (1)
  • _get_pods (363-370)
🪛 Ruff (0.14.1)
tests/model_explainability/guardrails/test_guardrails.py

334-334: Unused method argument: otel_collector

(ARG002)


335-335: Unused method argument: tempo_stack

(ARG002)


356-356: Unused method argument: otel_collector

(ARG002)


357-357: Unused method argument: tempo_stack

(ARG002)


376-376: Unused method argument: otel_collector

(ARG002)


377-377: Unused method argument: tempo_stack

(ARG002)


390-390: Unused method argument: admin_client

(ARG002)


391-391: Unused method argument: model_namespace

(ARG002)


392-392: Unused method argument: orchestrator_config

(ARG002)


393-393: Unused method argument: minio_pod

(ARG002)


394-394: Unused method argument: minio_data_connection

(ARG002)


395-395: Unused method argument: guardrails_orchestrator

(ARG002)


396-396: Unused method argument: guardrails_gateway_config

(ARG002)


397-397: Unused method argument: otel_collector

(ARG002)


398-398: Unused method argument: tempo_stack

(ARG002)


408-408: Probable use of requests call without timeout

(S113)

tests/model_explainability/guardrails/conftest.py

155-155: Unused function argument: model_namespace

(ARG001)


194-194: Unused function argument: installed_tempo_operator

(ARG001)


221-221: Avoid specifying long messages outside the exception class

(TRY003)


303-303: Unused function argument: installed_opentelemetry_operator

(ARG001)


306-306: Unused function argument: minio_service_otel

(ARG001)


328-328: Avoid specifying long messages outside the exception class

(TRY003)


361-361: Possible binding to all interfaces

(S104)


438-438: Unused function argument: minio_pvc_otel

(ARG001)


481-481: Unused function argument: minio_deployment_otel

(ARG001)


547-547: Unused function argument: admin_client

(ARG001)


560-560: Unused function argument: admin_client

(ARG001)

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
tests/model_explainability/guardrails/test_guardrails.py (1)

387-413: Make trace test self-sufficient, add timeout, and discover service dynamically.

This test has several issues:

  1. Critical: Missing timeout on requests.get (line 405) is a security concern (S113).
  2. Major: Hardcoded service name fms_guardrails_orchestr8 is brittle and may not match actual service names.
  3. Major: Test doesn't generate traces itself—it relies on previous tests. This makes it non-deterministic and prone to failure when run in isolation.
  4. The unused fixture parameters are intentional for dependency injection, but many could be removed if the test becomes self-sufficient.

Apply this diff to make the test self-sufficient and robust:

     def test_guardrails_traces_in_tempo(
         self,
-        admin_client,
-        model_namespace,
-        orchestrator_config,
-        guardrails_orchestrator,
-        guardrails_gateway_config,
+        current_client_token,
+        openshift_ca_bundle_file,
+        guardrails_orchestrator_health_route,
         otel_collector,
         tempo_stack,
         tempo_traces_service_portforward,
     ):
         """
         Ensure that OpenTelemetry traces from Guardrails Orchestrator are collected in Tempo.
-        Equivalent to clicking 'Find Traces' in the Tempo UI.
         """
+        # Trigger a request to ensure traces are produced
+        check_guardrails_health_endpoint(
+            host=guardrails_orchestrator_health_route.host,
+            token=current_client_token,
+            ca_bundle_file=openshift_ca_bundle_file,
+        )
 
         @retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5)
         def check_traces():
-            response = requests.get(f"{tempo_traces_service_portforward}/api/traces?service=fms_guardrails_orchestr8")
+            # Discover services containing "guardrails"
+            svcs = requests.get(
+                f"{tempo_traces_service_portforward}/api/services",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )
+            if svcs.status_code != http.HTTPStatus.OK:
+                return False
+            svc_names = svcs.json().get("data", [])
+            guardrails_svc = next((s for s in svc_names if "guardrails" in s.lower()), None)
+            if not guardrails_svc:
+                return False
+            
+            response = requests.get(
+                f"{tempo_traces_service_portforward}/api/traces?service={guardrails_svc}",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )
             if response.status_code == http.HTTPStatus.OK:
                 data = response.json()
                 if data.get("data"):  # non-empty list of traces
                     return data
             return False
 
         traces = check_traces()
         assert traces["data"], "No traces found in Tempo for Guardrails Orchestrator"

Based on learnings

tests/model_explainability/guardrails/conftest.py (2)

419-438: Wait for PVC to be BOUND before yielding.

The PVC is yielded immediately without waiting for it to be bound. This can cause the MinIO deployment (which depends on this fixture) to start before the volume is ready, leading to failures.

Apply this diff:

     with PersistentVolumeClaim(**pvc_kwargs) as pvc:
+        pvc.wait_for_status(status=pvc.Status.BOUND, timeout=Timeout.TIMEOUT_2MIN)
         yield pvc

529-547: Guard against missing OpenTelemetry Collector service.

The next() call without a default will raise StopIteration if no service matches the label selector, causing a confusing error.

Apply this diff:

     service = next(
         Service.get(
             dyn_client=admin_client,
             namespace=model_namespace.name,
             label_selector="app.kubernetes.io/component=opentelemetry-collector",
-        )
+        ),
+        None,
     )
+    if service is None:
+        raise ResourceNotFoundError(
+            f"No OpenTelemetry Collector service found in namespace {model_namespace.name}"
+        )
🧹 Nitpick comments (2)
tests/model_explainability/guardrails/conftest.py (2)

155-191: Remove unused model_namespace parameter.

The model_namespace parameter is declared but never used in this fixture. The fixture only needs admin_client to install the operator at the cluster level.

Apply this diff:

 @pytest.fixture(scope="class")
-def installed_tempo_operator(admin_client: DynamicClient, model_namespace: Namespace) -> Generator[None, Any, None]:
+def installed_tempo_operator(admin_client: DynamicClient) -> Generator[None, Any, None]:
     """
     Installs the Tempo operator and waits for its deployment.
     """

549-560: Consider removing unused admin_client parameters.

The admin_client parameter is unused in both tempo_traces_endpoint and tempo_traces_service_portforward fixtures. While not critical, removing them would clean up the code.

Apply this diff:

 @pytest.fixture(scope="class")
-def tempo_traces_endpoint(tempo_stack, model_namespace: Namespace, admin_client: DynamicClient):
+def tempo_traces_endpoint(tempo_stack, model_namespace: Namespace):
     """
     Returns the TempoStack distributor endpoint dynamically using ocp_resources.Service.
 
     """
 @pytest.fixture(scope="class")
-def tempo_traces_service_portforward(admin_client, model_namespace, tempo_stack):
+def tempo_traces_service_portforward(model_namespace, tempo_stack):
     """
     Port-forwards the Tempo Query Frontend service to access traces locally.
     Equivalent CLI:
       oc -n <ns> port-forward svc/tempo-my-tempo-stack-query-frontend 16686:16686
     """

Also applies to: 562-586

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b3def6 and 46709dd.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml (1 hunks)
  • tests/fixtures/guardrails.py (1 hunks)
  • tests/model_explainability/guardrails/conftest.py (2 hunks)
  • tests/model_explainability/guardrails/constants.py (1 hunks)
  • tests/model_explainability/guardrails/test_guardrails.py (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/fixtures/guardrails.py
  • pyproject.toml
  • tests/model_explainability/guardrails/constants.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/model_explainability/guardrails/test_guardrails.py (4)
utilities/constants.py (1)
  • Timeout (220-231)
tests/model_explainability/guardrails/conftest.py (3)
  • otel_collector (304-376)
  • tempo_stack (194-261)
  • tempo_traces_service_portforward (563-586)
tests/conftest.py (2)
  • admin_client (73-74)
  • model_namespace (105-125)
tests/fixtures/guardrails.py (3)
  • orchestrator_config (69-78)
  • guardrails_orchestrator (20-65)
  • guardrails_gateway_config (82-92)
tests/model_explainability/guardrails/conftest.py (5)
utilities/certificates_utils.py (1)
  • create_ca_bundle_file (22-65)
utilities/constants.py (3)
  • KServeDeploymentType (7-10)
  • Timeout (220-231)
  • RuntimeTemplates (69-82)
utilities/operator_utils.py (1)
  • get_cluster_service_version (14-29)
tests/conftest.py (2)
  • admin_client (73-74)
  • model_namespace (105-125)
tests/model_explainability/trustyai_service/trustyai_service_utils.py (1)
  • _get_pods (363-370)
🪛 Ruff (0.14.1)
tests/model_explainability/guardrails/test_guardrails.py

333-333: Unused method argument: otel_collector

(ARG002)


334-334: Unused method argument: tempo_stack

(ARG002)


355-355: Unused method argument: otel_collector

(ARG002)


356-356: Unused method argument: tempo_stack

(ARG002)


375-375: Unused method argument: otel_collector

(ARG002)


376-376: Unused method argument: tempo_stack

(ARG002)


389-389: Unused method argument: admin_client

(ARG002)


390-390: Unused method argument: model_namespace

(ARG002)


391-391: Unused method argument: orchestrator_config

(ARG002)


392-392: Unused method argument: guardrails_orchestrator

(ARG002)


393-393: Unused method argument: guardrails_gateway_config

(ARG002)


394-394: Unused method argument: otel_collector

(ARG002)


395-395: Unused method argument: tempo_stack

(ARG002)


405-405: Probable use of requests call without timeout

(S113)

tests/model_explainability/guardrails/conftest.py

156-156: Unused function argument: model_namespace

(ARG001)


196-196: Unused function argument: installed_tempo_operator

(ARG001)


224-224: Avoid specifying long messages outside the exception class

(TRY003)


306-306: Unused function argument: installed_opentelemetry_operator

(ARG001)


309-309: Unused function argument: minio_service_otel

(ARG001)


331-331: Avoid specifying long messages outside the exception class

(TRY003)


364-364: Possible binding to all interfaces

(S104)


441-441: Unused function argument: minio_pvc_otel

(ARG001)


484-484: Unused function argument: minio_deployment_otel

(ARG001)


550-550: Unused function argument: admin_client

(ARG001)


563-563: Unused function argument: admin_client

(ARG001)

🔇 Additional comments (10)
tests/model_explainability/guardrails/test_guardrails.py (4)

1-2: LGTM! Imports support the new trace validation functionality.

The new imports (http, requests, retry, Timeout) are appropriately added to support the OpenTelemetry trace validation test.

Also applies to: 7-7, 30-30


265-293: LGTM! Gateway and OTEL configuration properly added.

The guardrails_gateway_config_data and otel_exporter_config: True additions enable the OpenTelemetry integration for HuggingFace detector tests.


299-309: LGTM! Fixture dependencies properly declared.

The usefixtures decorator correctly lists all infrastructure fixtures needed for OTEL/Tempo integration.


333-334: LGTM! Fixture parameters ensure proper test dependencies.

The otel_collector and tempo_stack fixtures are correctly added as method parameters to establish dependency ordering. While static analysis flags them as unused, this is the standard pytest pattern for ensuring fixtures are instantiated before test execution.

Also applies to: 355-356, 375-376

tests/model_explainability/guardrails/conftest.py (6)

1-36: LGTM! Imports support OTEL/Tempo infrastructure.

The new imports are appropriate for implementing OpenTelemetry and Tempo integration fixtures.


193-262: LGTM! TempoStack fixture properly orchestrates resource creation.

The fixture correctly:

  • Uses installed_tempo_operator for dependency ordering (flagged by static analysis as unused, but this is the standard pytest pattern)
  • Creates a TempoStack CR from operator ALM examples
  • Configures MinIO backend storage
  • Waits for readiness with appropriate timeout
  • Waits for pods to be ready

264-301: LGTM! OpenTelemetry operator installation follows proper pattern.

The fixture correctly installs the Red Hat OpenTelemetry operator, waits for deployment readiness, and handles cleanup.


303-377: LGTM! OpenTelemetry Collector fixture properly configured.

The fixture correctly:

  • Uses installed_opentelemetry_operator and minio_service_otel for dependency ordering (flagged as unused but necessary for pytest)
  • Extracts and customizes the OpenTelemetryCollector CR from operator ALM examples
  • Configures OTLP exporter to Tempo distributor endpoint
  • Sets up receivers and pipelines
  • Waits for collector pods to be ready

Note: Binding to 0.0.0.0 (line 364) is flagged by static analysis (S104) but is appropriate in this containerized Kubernetes context where the collector needs to accept connections from any pod.


379-417: LGTM! Reusable pod readiness helper.

This helper function addresses past review feedback about code duplication. It properly waits for pods to exist and then for each to reach ready status.


440-481: LGTM! MinIO deployment and service fixtures properly configured.

The fixtures correctly:

  • Use minio_pvc_otel and minio_deployment_otel for dependency ordering (flagged as unused but necessary)
  • Deploy MinIO with appropriate configuration
  • Create a service to expose MinIO
  • Create a secret with connection details

Also applies to: 483-527

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (11)
tests/model_explainability/guardrails/conftest.py (9)

70-72: SyntaxError: storage_uri split across two string literals.

Join into a single string; current code won’t parse.

-        storage_uri="oci://quay.io/trustyai_testing/detectors/deberta-v3-base-prompt-injection-v2"
-        "@sha256:8737d6c7c09edf4c16dc87426624fd8ed7d118a12527a36b670be60f089da215",
+        storage_uri="oci://quay.io/trustyai_testing/detectors/deberta-v3-base-prompt-injection-v2@sha256:8737d6c7c09edf4c16dc87426624fd8ed7d118a12527a36b670be60f089da215",

123-125: SyntaxError: second split storage_uri.

Same issue; merge into one literal.

-        storage_uri="oci://quay.io/trustyai_testing/detectors/granite-guardian-hap-38m"
-        "@sha256:9dd129668cce86dac674814c0a965b1526a01de562fd1e9a28d1892429bdad7b",
+        storage_uri="oci://quay.io/trustyai_testing/detectors/granite-guardian-hap-38m@sha256:9dd129668cce86dac674814c0a965b1526a01de562fd1e9a28d1892429bdad7b",

156-160: Unused fixture parameter model_namespace.

Remove or suppress since it’s not used here.

-@pytest.fixture(scope="class")
-def installed_tempo_operator(admin_client: DynamicClient, model_namespace: Namespace) -> Generator[None, Any, None]:
+@pytest.fixture(scope="class")
+def installed_tempo_operator(admin_client: DynamicClient) -> Generator[None, Any, None]:

If kept for ordering only, append “# noqa: ARG001” to the def line.


338-347: Avoid hard-coded OTLP port; derive from Tempo distributor service.

Compute port from Service spec; fall back to OTEL_EXPORTER_PORT.

-    otel_cr_dict["spec"]["config"] = {
+    # Derive distributor gRPC port from Service (fallback to OTEL_EXPORTER_PORT)
+    distributor_svc = Service(
+        client=admin_client,
+        name=f"tempo-{tempo_stack.name}-distributor",
+        namespace=namespace,
+    )
+    tempo_grpc_port = OTEL_EXPORTER_PORT
+    if distributor_svc.exists:
+        try:
+            ports = getattr(distributor_svc.instance.spec, "ports", []) or []
+            tempo_grpc_port = next(
+                (p.port for p in ports if getattr(p, "name", "") in ("otlp-grpc", "grpc")),
+                tempo_grpc_port,
+            )
+        except Exception:
+            pass
+
+    otel_cr_dict["spec"]["config"] = {
         "exporters": {
             "otlp": {
-                "endpoint": (
-                    f"tempo-{tempo_stack.name}-distributor.{namespace}.svc.cluster.local:{OTEL_EXPORTER_PORT}"
-                ),
+                "endpoint": f"tempo-{tempo_stack.name}-distributor.{namespace}.svc.cluster.local:{tempo_grpc_port}",
                 "tls": {"insecure": True},
             }
         },

549-559: Use gRPC-style endpoint (no scheme) and avoid hard-coded port if possible.

Remove http:// for OTLP gRPC and optionally derive port from Service.

-    service_name = f"tempo-{tempo_stack.name}-distributor"
-
-    port = OTEL_EXPORTER_PORT
-
-    return f"http://{service_name}.{model_namespace.name}.svc.cluster.local:{port}"
+    service_name = f"tempo-{tempo_stack.name}-distributor"
+    port = OTEL_EXPORTER_PORT
+    # Optional: look up service port similar to otel_collector config; fallback to default.
+    return f"{service_name}.{model_namespace.name}.svc.cluster.local:{port}"

217-225: Guard against missing TempoStack ALM example (StopIteration).

Use a default in next(...) then raise a clear error.

-    tempo_stack_dict: dict[str, Any] = next(
-        example
-        for example in alm_examples
-        if example["kind"] == "TempoStack" and example["apiVersion"].startswith("tempo.grafana.com/")
-    )
-    if not tempo_stack_dict:
+    tempo_stack_dict: dict[str, Any] = next(
+        (example for example in alm_examples
+         if example.get("kind") == "TempoStack"
+         and str(example.get("apiVersion", "")).startswith("tempo.grafana.com/")),
+        None,
+    )
+    if tempo_stack_dict is None:
         raise ResourceNotFoundError(f"No TempoStack dict found in alm_examples for CSV {tempo_csv.name}")

323-331: Same StopIteration risk for OpenTelemetryCollector ALM example.

Make next(...) safe with default.

-    otel_cr_dict: dict[str, Any] = next(
-        example
-        for example in alm_examples
-        if example["kind"] == "OpenTelemetryCollector" and example["apiVersion"] == "opentelemetry.io/v1beta1"
-    )
-    if not otel_cr_dict:
+    otel_cr_dict: dict[str, Any] = next(
+        (example for example in alm_examples
+         if example.get("kind") == "OpenTelemetryCollector"
+         and example.get("apiVersion") == "opentelemetry.io/v1beta1"),
+        None,
+    )
+    if otel_cr_dict is None:
         raise ResourceNotFoundError(f"No OpenTelemetryCollector example found in ALM examples for {otel_csv.name}")

419-437: Wait for PVC to be BOUND before yielding (flakiness).

Ensure claim is usable before MinIO starts.

     with PersistentVolumeClaim(**pvc_kwargs) as pvc:
-        yield pvc
+        pvc.wait_for_status(status=pvc.Status.BOUND, timeout=Timeout.TIMEOUT_2MIN)
+        yield pvc

529-547: Guard against missing OTEL collector service and fix gRPC endpoint format.

  • next(...) can raise StopIteration; handle missing service.
  • For gRPC, return host:port (no http://).
-    service = next(
-        Service.get(
-            dyn_client=admin_client,
-            namespace=model_namespace.name,
-            label_selector="app.kubernetes.io/component=opentelemetry-collector",
-        )
-    )
-
-    service_name = service.name
-
-    port = OTEL_EXPORTER_PORT
-    return f"http://{service_name}.{model_namespace.name}.svc.cluster.local:{port}"
+    service = next(
+        Service.get(
+            dyn_client=admin_client,
+            namespace=model_namespace.name,
+            label_selector="app.kubernetes.io/component=opentelemetry-collector",
+        ),
+        None,
+    )
+    if service is None:
+        raise ResourceNotFoundError(
+            f"No OpenTelemetry Collector service found in namespace {model_namespace.name}"
+        )
+    service_name = service.name
+    port = OTEL_EXPORTER_PORT
+    return f"{service_name}.{model_namespace.name}.svc.cluster.local:{port}"
tests/model_explainability/guardrails/test_guardrails.py (2)

234-234: Formatting: add missing space after comma in parametrize string.

-    "model_namespace, orchestrator_config, guardrails_gateway_config,guardrails_orchestrator",
+    "model_namespace, orchestrator_config, guardrails_gateway_config, guardrails_orchestrator",

405-405: Add timeout to requests.get.

Network calls should be bounded to avoid hanging tests.

-            response = requests.get(f"{tempo_traces_service_portforward}/api/traces?service=fms_guardrails_orchestr8")
+            response = requests.get(
+                f"{tempo_traces_service_portforward}/api/traces?service=fms_guardrails_orchestr8",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )
🧹 Nitpick comments (3)
tests/model_explainability/guardrails/conftest.py (2)

362-365: Bind metrics exporter narrowly in tests.

0.0.0.0 is fine for ephemeral test envs; prefer 127.0.0.1 to reduce exposure if reused.

-            "metrics": {"readers": [{"pull": {"exporter": {"prometheus": {"host": "0.0.0.0", "port": 8888}}}}]}
+            "metrics": {"readers": [{"pull": {"exporter": {"prometheus": {"host": "127.0.0.1", "port": 8888}}}}]}

562-585: Port-forward robustness.

Pre-validate target Service and increase wait for slower clusters; use explicit “svc/” to avoid ambiguity.

-    service_name = f"tempo-{tempo_stack.name}-query-frontend"  # tempo-my-tempo-stack-query-frontend"
+    service_name = f"tempo-{tempo_stack.name}-query-frontend"
     local_port = 16686
     local_url = f"{LOCAL_HOST_URL}:{local_port}"
 
     try:
+        svc = Service(client=admin_client, name=service_name, namespace=namespace)
+        if not svc.exists:
+            raise ResourceNotFoundError(f"Service {service_name} not found in {namespace}")
         with portforward.forward(
-            pod_or_service=f"{service_name}",
+            pod_or_service=f"svc/{service_name}",
             namespace=namespace,
             from_port=local_port,
             to_port=local_port,
-            waiting=20,
+            waiting=60,
         ):
             LOGGER.info(f"Tempo traces service port-forward established: {local_url}")
             yield local_url
tests/model_explainability/guardrails/test_guardrails.py (1)

387-414: Make Tempo trace check resilient and bounded.

  • Trigger a request to produce traces.
  • Add timeouts; discover service via /api/services.
     def test_guardrails_traces_in_tempo(
@@
-        @retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5)
+        # Trigger a quick request to ensure traces exist
+        check_guardrails_health_endpoint(
+            host=guardrails_orchestrator_health_route.host,
+            token=current_client_token,
+            ca_bundle_file=openshift_ca_bundle_file,
+        )
+
+        @retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5)
         def check_traces():
-            response = requests.get(f"{tempo_traces_service_portforward}/api/traces?service=fms_guardrails_orchestr8")
+            svcs = requests.get(
+                f"{tempo_traces_service_portforward}/api/services",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )
+            if svcs.status_code != http.HTTPStatus.OK:
+                return False
+            svc_names = svcs.json().get("data", [])
+            guardrails_svc = next((s for s in svc_names if "guardrails" in s), None)
+            if not guardrails_svc:
+                return False
+            response = requests.get(
+                f"{tempo_traces_service_portforward}/api/traces?service={guardrails_svc}",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )
             if response.status_code == http.HTTPStatus.OK:
                 data = response.json()
                 if data.get("data"):  # non-empty list of traces
                     return data
             return False
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46709dd and 280dfd4.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml (1 hunks)
  • tests/fixtures/guardrails.py (1 hunks)
  • tests/model_explainability/guardrails/conftest.py (2 hunks)
  • tests/model_explainability/guardrails/constants.py (1 hunks)
  • tests/model_explainability/guardrails/test_guardrails.py (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/model_explainability/guardrails/constants.py
  • tests/fixtures/guardrails.py
  • pyproject.toml
🧰 Additional context used
🧬 Code graph analysis (2)
tests/model_explainability/guardrails/conftest.py (4)
utilities/certificates_utils.py (1)
  • create_ca_bundle_file (22-65)
utilities/constants.py (2)
  • KServeDeploymentType (7-10)
  • Timeout (220-231)
utilities/inference_utils.py (1)
  • create_isvc (552-767)
utilities/operator_utils.py (1)
  • get_cluster_service_version (14-29)
tests/model_explainability/guardrails/test_guardrails.py (3)
utilities/constants.py (1)
  • Timeout (220-231)
tests/model_explainability/guardrails/conftest.py (3)
  • otel_collector (304-376)
  • tempo_stack (194-261)
  • tempo_traces_service_portforward (563-586)
tests/fixtures/guardrails.py (3)
  • orchestrator_config (69-78)
  • guardrails_orchestrator (20-65)
  • guardrails_gateway_config (82-92)
🪛 Ruff (0.14.1)
tests/model_explainability/guardrails/conftest.py

156-156: Unused function argument: model_namespace

(ARG001)


196-196: Unused function argument: installed_tempo_operator

(ARG001)


224-224: Avoid specifying long messages outside the exception class

(TRY003)


306-306: Unused function argument: installed_opentelemetry_operator

(ARG001)


309-309: Unused function argument: minio_service_otel

(ARG001)


331-331: Avoid specifying long messages outside the exception class

(TRY003)


364-364: Possible binding to all interfaces

(S104)


441-441: Unused function argument: minio_pvc_otel

(ARG001)


484-484: Unused function argument: minio_deployment_otel

(ARG001)


550-550: Unused function argument: admin_client

(ARG001)


563-563: Unused function argument: admin_client

(ARG001)

tests/model_explainability/guardrails/test_guardrails.py

333-333: Unused method argument: otel_collector

(ARG002)


334-334: Unused method argument: tempo_stack

(ARG002)


355-355: Unused method argument: otel_collector

(ARG002)


356-356: Unused method argument: tempo_stack

(ARG002)


375-375: Unused method argument: otel_collector

(ARG002)


376-376: Unused method argument: tempo_stack

(ARG002)


389-389: Unused method argument: admin_client

(ARG002)


390-390: Unused method argument: model_namespace

(ARG002)


391-391: Unused method argument: orchestrator_config

(ARG002)


392-392: Unused method argument: guardrails_orchestrator

(ARG002)


393-393: Unused method argument: guardrails_gateway_config

(ARG002)


394-394: Unused method argument: otel_collector

(ARG002)


395-395: Unused method argument: tempo_stack

(ARG002)


405-405: Probable use of requests call without timeout

(S113)

@kpunwatk
Copy link
Copy Markdown
Contributor Author

/verified

@rhods-ci-bot rhods-ci-bot added the Verified Verified pr in Jenkins label Oct 22, 2025
adolfo-ab
adolfo-ab previously approved these changes Oct 22, 2025
sheltoncyril
sheltoncyril previously approved these changes Oct 22, 2025
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   pyproject.toml
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/model_explainability/guardrails/conftest.py

	modified:   tests/model_explainability/guardrails/conftest.py

	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/model_explainability/guardrails/conftest.py

	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/constants.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/fixtures/guardrails.py

	modified:   tests/fixtures/guardrails.py

	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/constants.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   pyproject.toml
	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/constants.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py
	modified:   uv.lock

	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   pyproject.toml
	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/constants.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py
	modified:   uv.lock

	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   pyproject.toml
	modified:   tests/fixtures/guardrails.py
	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/constants.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py
	modified:   uv.lock

	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/test_guardrails.py

	modified:   tests/model_explainability/guardrails/conftest.py

	modified:   tests/model_explainability/guardrails/conftest.py
	modified:   tests/model_explainability/guardrails/constants.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (5)
tests/model_explainability/guardrails/conftest.py (4)

437-438: PVC should wait for BOUND status before yielding.

The fixture yields the PVC immediately after creation without waiting for it to be bound. This can cause race conditions where the MinIO deployment tries to use the PVC before it's ready.

Apply this diff to wait for the PVC to be bound:

     with PersistentVolumeClaim(**pvc_kwargs) as pvc:
+        pvc.wait_for_status(status=pvc.Status.BOUND, timeout=Timeout.TIMEOUT_2MIN)
         yield pvc

536-542: Guard against missing OTEL collector service (StopIteration risk).

The next() call on the Service iterator will raise StopIteration if no matching service is found, rather than allowing error handling.

Apply this diff to handle the missing service gracefully:

     service = next(
         Service.get(
             dyn_client=admin_client,
             namespace=model_namespace.name,
             label_selector="app.kubernetes.io/component=opentelemetry-collector",
-        )
+        ),
+        None,
     )
+    if service is None:
+        raise ResourceNotFoundError(
+            f"No OpenTelemetry Collector service found in namespace {model_namespace.name} "
+            f"with label app.kubernetes.io/component=opentelemetry-collector"
+        )

325-332: Same StopIteration risk: add default to next() call.

The code uses next() without a default, which will raise StopIteration if no OpenTelemetryCollector example exists. The subsequent if not otel_cr_dict: check will never execute if the example is missing.

Apply this diff to handle the missing example safely:

-    otel_cr_dict: dict[str, Any] = next(
-        example
-        for example in alm_examples
-        if example["kind"] == "OpenTelemetryCollector" and example["apiVersion"] == "opentelemetry.io/v1beta1"
-    )
-
-    if not otel_cr_dict:
+    otel_cr_dict: dict[str, Any] = next(
+        (example
+         for example in alm_examples
+         if example.get("kind") == "OpenTelemetryCollector"
+         and example.get("apiVersion") == "opentelemetry.io/v1beta1"),
+        None,
+    )
+    if otel_cr_dict is None:
         raise ResourceNotFoundError(f"No OpenTelemetryCollector example found in ALM examples for {otel_csv.name}")

219-225: StopIteration risk remains: next() needs a default argument.

Despite past review comments indicating this was addressed in commit 1a80663, the code still uses next() without a default. This will raise StopIteration if no TempoStack example is found, and the subsequent if not tempo_stack_dict: check will never execute.

Apply this diff to handle the missing ALM example gracefully:

-    tempo_stack_dict: dict[str, Any] = next(
-        example
-        for example in alm_examples
-        if example["kind"] == "TempoStack" and example["apiVersion"].startswith("tempo.grafana.com/")
-    )
-    if not tempo_stack_dict:
+    tempo_stack_dict: dict[str, Any] = next(
+        (example
+         for example in alm_examples
+         if example.get("kind") == "TempoStack" and str(example.get("apiVersion", "")).startswith("tempo.grafana.com/")),
+        None,
+    )
+    if tempo_stack_dict is None:
         raise ResourceNotFoundError(f"No TempoStack dict found in alm_examples for CSV {tempo_csv.name}")
tests/model_explainability/guardrails/test_guardrails.py (1)

405-405: Add timeout to requests call to prevent hanging.

The requests.get() call on line 405 does not specify a timeout, which can cause the test to hang indefinitely if the Tempo service is unresponsive.

Apply this diff:

-            response = requests.get(f"{tempo_traces_service_portforward}/api/traces?service=fms_guardrails_orchestr8")
+            response = requests.get(
+                f"{tempo_traces_service_portforward}/api/traces?service=fms_guardrails_orchestr8",
+                timeout=Timeout.TIMEOUT_30SEC,
+            )

Based on static analysis hints.

🧹 Nitpick comments (3)
tests/model_explainability/guardrails/test_guardrails.py (3)

234-234: Fix formatting: missing space after comma.

There's a missing space after the comma between guardrails_gateway_config and guardrails_orchestrator in the parametrize decorator.

Apply this diff:

-    "model_namespace, orchestrator_config, guardrails_gateway_config,guardrails_orchestrator",
+    "model_namespace, orchestrator_config, guardrails_gateway_config, guardrails_orchestrator",

333-334: Remove redundant fixture parameters already declared in usefixtures.

The otel_collector and tempo_stack parameters are added to these test methods but never used. Since these fixtures are already declared in the class-level @pytest.mark.usefixtures decorator (lines 299-309), they will be set up automatically. Adding them as parameters is redundant.

Apply this diff to remove the redundant parameters:

     def test_guardrails_multi_detector_unsuitable_input(
         self,
         current_client_token,
         llm_d_inference_sim_isvc,
         guardrails_orchestrator_route,
         prompt_injection_detector_route,
         hap_detector_route,
         openshift_ca_bundle_file,
         orchestrator_config,
         guardrails_orchestrator,
-        otel_collector,
-        tempo_stack,
     ):

Apply similar changes to test_guardrails_multi_detector_negative_detection (lines 355-356) and test_guardrails_standalone_detector_endpoint (lines 375-376).

Based on static analysis hints.

Also applies to: 355-356, 375-376


387-414: Consider making the trace test more robust and self-sufficient.

The current test has several areas that could be improved:

  1. Self-sufficiency: The test relies on traces being generated by other tests. Consider triggering a health check request at the start to ensure traces are produced.

  2. Service discovery: The service name fms_guardrails_orchestr8 is hardcoded. If the service naming changes, the test will fail. Consider discovering the service name dynamically via /api/services.

  3. Unused parameters: Many fixture parameters (admin_client, model_namespace, orchestrator_config, guardrails_orchestrator, guardrails_gateway_config) are declared but never used. Some may be needed for setup ordering, but others appear redundant since the class already uses @pytest.mark.usefixtures.

Example improvements:

def test_guardrails_traces_in_tempo(
    self,
    current_client_token,
    openshift_ca_bundle_file,
    guardrails_orchestrator_health_route,
    tempo_traces_service_portforward,
):
    """
    Ensure that OpenTelemetry traces from Guardrails Orchestrator are collected in Tempo.
    """
    # Trigger a request to generate traces
    check_guardrails_health_endpoint(
        host=guardrails_orchestrator_health_route.host,
        token=current_client_token,
        ca_bundle_file=openshift_ca_bundle_file,
    )
    
    @retry(wait_timeout=Timeout.TIMEOUT_1MIN, sleep=5)
    def check_traces():
        # Discover service name dynamically
        svcs_response = requests.get(
            f"{tempo_traces_service_portforward}/api/services",
            timeout=Timeout.TIMEOUT_30SEC,
        )
        if svcs_response.status_code != http.HTTPStatus.OK:
            return False
        
        service_names = svcs_response.json().get("data", [])
        guardrails_svc = next((s for s in service_names if "guardrails" in s), None)
        if not guardrails_svc:
            return False
        
        response = requests.get(
            f"{tempo_traces_service_portforward}/api/traces?service={guardrails_svc}",
            timeout=Timeout.TIMEOUT_30SEC,
        )
        if response.status_code == http.HTTPStatus.OK:
            data = response.json()
            if data.get("data"):
                return data
        return False
    
    traces = check_traces()
    assert traces["data"], "No traces found in Tempo for Guardrails Orchestrator"
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 280dfd4 and f77258a.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (5)
  • pyproject.toml (1 hunks)
  • tests/fixtures/guardrails.py (1 hunks)
  • tests/model_explainability/guardrails/conftest.py (2 hunks)
  • tests/model_explainability/guardrails/constants.py (1 hunks)
  • tests/model_explainability/guardrails/test_guardrails.py (8 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • pyproject.toml
  • tests/fixtures/guardrails.py
🧰 Additional context used
🧬 Code graph analysis (2)
tests/model_explainability/guardrails/test_guardrails.py (3)
utilities/constants.py (1)
  • Timeout (220-231)
tests/model_explainability/guardrails/conftest.py (3)
  • otel_collector (305-377)
  • tempo_stack (195-262)
  • tempo_traces_service_portforward (564-587)
tests/fixtures/guardrails.py (3)
  • orchestrator_config (69-78)
  • guardrails_orchestrator (20-65)
  • guardrails_gateway_config (82-92)
tests/model_explainability/guardrails/conftest.py (4)
utilities/constants.py (2)
  • Timeout (220-231)
  • RuntimeTemplates (69-82)
utilities/operator_utils.py (1)
  • get_cluster_service_version (14-29)
tests/conftest.py (2)
  • admin_client (73-74)
  • model_namespace (105-125)
tests/model_explainability/trustyai_service/trustyai_service_utils.py (1)
  • _get_pods (363-370)
🪛 Ruff (0.14.1)
tests/model_explainability/guardrails/test_guardrails.py

333-333: Unused method argument: otel_collector

(ARG002)


334-334: Unused method argument: tempo_stack

(ARG002)


355-355: Unused method argument: otel_collector

(ARG002)


356-356: Unused method argument: tempo_stack

(ARG002)


375-375: Unused method argument: otel_collector

(ARG002)


376-376: Unused method argument: tempo_stack

(ARG002)


389-389: Unused method argument: admin_client

(ARG002)


390-390: Unused method argument: model_namespace

(ARG002)


391-391: Unused method argument: orchestrator_config

(ARG002)


392-392: Unused method argument: guardrails_orchestrator

(ARG002)


393-393: Unused method argument: guardrails_gateway_config

(ARG002)


394-394: Unused method argument: otel_collector

(ARG002)


395-395: Unused method argument: tempo_stack

(ARG002)


405-405: Probable use of requests call without timeout

(S113)

tests/model_explainability/guardrails/conftest.py

157-157: Unused function argument: model_namespace

(ARG001)


197-197: Unused function argument: installed_tempo_operator

(ARG001)


225-225: Avoid specifying long messages outside the exception class

(TRY003)


307-307: Unused function argument: installed_opentelemetry_operator

(ARG001)


310-310: Unused function argument: minio_service_otel

(ARG001)


332-332: Avoid specifying long messages outside the exception class

(TRY003)


365-365: Possible binding to all interfaces

(S104)


442-442: Unused function argument: minio_pvc_otel

(ARG001)


485-485: Unused function argument: minio_deployment_otel

(ARG001)


551-551: Unused function argument: admin_client

(ARG001)


564-564: Unused function argument: admin_client

(ARG001)

tests/model_explainability/guardrails/constants.py

8-8: Possible hardcoded password assigned to: "SUPER_SECRET"

(S105)

🔇 Additional comments (5)
tests/model_explainability/guardrails/constants.py (1)

7-9: LGTM! Test infrastructure constants are well-defined.

The three constants are appropriate for the OTEL/Tempo test infrastructure:

  • OTEL_EXPORTER_PORT uses the standard OTLP gRPC port (4317)
  • SUPER_SECRET is properly marked with pragma: allowlist secret for test credentials
  • TEMPO serves as an identifier for MinIO bucket and service names
tests/model_explainability/guardrails/conftest.py (2)

157-157: Verify if model_namespace parameter is needed for fixture dependency.

The model_namespace parameter is declared but not used in the fixture body. In pytest, sometimes fixture parameters are included only to establish dependency ordering.

Please verify whether model_namespace is required as a dependency. If it's only needed to ensure the namespace exists before installing the operator, the dependency should be explicit. If not needed, consider removing it.

Apply this diff to remove the unused parameter:

-def installed_tempo_operator(admin_client: DynamicClient, model_namespace: Namespace) -> Generator[None, Any, None]:
+def installed_tempo_operator(admin_client: DynamicClient) -> Generator[None, Any, None]:

Based on static analysis hints.


352-353: Binding to all interfaces (0.0.0.0) is appropriate for container contexts.

The static analysis tool flagged binding to 0.0.0.0 as a potential security issue (S104). However, in containerized environments like Kubernetes, binding receivers and metrics endpoints to 0.0.0.0 is the expected pattern to allow the service to be accessible within the cluster network.

Also applies to: 365-365

tests/model_explainability/guardrails/test_guardrails.py (2)

265-293: LGTM! Gateway and OTEL configuration additions are well-structured.

The addition of guardrails_gateway_config_data and otel_exporter_config properly extends test coverage to include gateway and OpenTelemetry integration for HuggingFace detector tests.


299-309: LGTM! Fixture dependencies properly declared for OTEL/Tempo infrastructure.

The @pytest.mark.usefixtures decorator correctly declares the dependencies on MinIO, Tempo, and OpenTelemetry infrastructure fixtures needed for the test class.

Copy link
Copy Markdown
Contributor

@sheltoncyril sheltoncyril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@github-actions
Copy link
Copy Markdown

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants