feat: add tests for guardrails orchestrator autoconfig #611

Merged
dbasunag merged 2 commits into opendatahub-io:main from adolfo-ab:guardrails-autoconfig
Sep 16, 2025
Conversation

@adolfo-ab
Contributor

@adolfo-ab adolfo-ab commented Sep 15, 2025

Adds new tests for the GuardrailsOrchestrator autoconfig feature, which radically simplifies the manual configuration needed for the GuardrailsOrchestrator.

Description

  • Adds tests for the GuardrailsOrchestrator autoconfig, with and without gateway
  • Refactors various utils surrounding the orchestrator tests (dataclasses, constants...)
  • Updates the openshift-python-wrapper version to get updated GuardrailsOrchestrator CR

How Has This Been Tested?

Running all the relevant tests on PSI

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Chores

    • Bumped OpenShift wrapper dependency to >=11.0.94 for improved compatibility.
  • Tests

    • Expanded Guardrails test coverage: added AutoConfig scenarios (with/without gateway), health/info endpoint checks, and richer detector configurations.
    • Streamlined fixtures and test labels to support dynamic configuration flags and automatic detector setup; updated detection prompts and example data for realism.

@coderabbitai
Contributor

coderabbitai bot commented Sep 15, 2025

📝 Walkthrough

Bumped the openshift-python-wrapper dependency and refactored test code: changed the guardrails orchestrator fixture API and parameter handling; updated several guardrails-related test constants, prompts, and helpers; added AutoConfig/gateway tests; and adjusted a llama_stack test to pass an orchestrator_config flag.

Changes

| Cohort / File(s) | Summary of changes |
| --- | --- |
| Dependency version bump: pyproject.toml | Updated openshift-python-wrapper dependency constraint from >=11.0.92 to >=11.0.94. |
| Guardrails orchestrator fixture refactor: tests/fixtures/guardrails.py | Removed orchestrator_config parameter from fixture signature; use request.getfixturevalue to obtain orchestrator_config when requested by test params. Added support for auto_config, adjusted guardrails gateway config handling, moved built-in detector handling to a dedicated block, set explicit "DEBUG" log level, and yield the deployed GuardrailsOrchestrator. |
| Llama stack test param update: tests/llama_stack/safety/test_trustyai_fms_provider.py | Extended parametrized dict to include "orchestrator_config": True alongside existing flags. |
| Guardrails model_explainability constants & tests: tests/model_explainability/guardrails/constants.py, tests/model_explainability/guardrails/conftest.py, tests/model_explainability/guardrails/test_guardrails.py | Refactored GuardrailsDetectionPrompt fields (prompt/detectioncontent, detection_name, detection_text); added AUTOCONFIG_DETECTOR_LABEL, new prompt constants (PII_*, PROMPT_INJECTION_*, HAP_*), updated EXAMPLE_EMAIL_ADDRESS; removed GUARDRAILS_MULTI_DETECTOR_INPUT_PROMPTS. Added label usage in ISVC fixtures, new helper functions (create_detector_config, health check helpers), updated tests to use new prompts, and added AutoConfig and gateway-related test classes. |
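
The fixture change summarized above (dropping orchestrator_config from the signature and resolving it only when a test asks for it) could look roughly like the sketch below. This is not the PR's code: `resolve_optional_fixture` and `fake_getfixturevalue` are hypothetical names used for illustration, and the only real pytest API assumed is `request.getfixturevalue(name)`, whose role is played here by a plain callable so the example runs without pytest.

```python
from typing import Any, Callable


def resolve_optional_fixture(
    params: dict[str, Any],
    name: str,
    get_fixture_value: Callable[[str], Any],
) -> Any:
    """Return the named fixture only when the test's params opt in to it.

    `get_fixture_value` stands in for pytest's `request.getfixturevalue`,
    so the dependency is created lazily instead of being listed in the
    fixture signature. Returns None when the flag is absent or falsy.
    (Hypothetical helper, not taken from the PR.)
    """
    if params.get(name):
        return get_fixture_value(name)
    return None


# Stand-in for pytest's fixture lookup, used only to show the call shape.
created: list[str] = []


def fake_getfixturevalue(name: str) -> str:
    created.append(name)
    return f"<{name} object>"


with_config = resolve_optional_fixture(
    {"orchestrator_config": True, "auto_config": False},
    "orchestrator_config",
    fake_getfixturevalue,
)
without_config = resolve_optional_fixture(
    {"auto_config": True}, "orchestrator_config", fake_getfixturevalue
)
```

Inside a real fixture the call would presumably be `resolve_optional_fixture(request.param, "orchestrator_config", request.getfixturevalue)`, so AutoConfig tests that omit the flag never trigger creation of the config fixture.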

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The title succinctly and accurately describes the primary change (adding tests for the Guardrails Orchestrator autoconfig feature) and matches the PR objectives and diffs, which add autoconfig-related tests and refactors. It is concise, specific, and free of noisy details. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |

@github-actions

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, add /wip in a comment. To remove the WIP mark, comment /wip cancel on the PR.
  • To block merging of a PR, add /hold in a comment. To unblock merging, comment /hold cancel.
  • To mark a PR as approved, add /lgtm in a comment. To remove the approval, add /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified on the PR. To un-verify, comment /verified cancel on the PR.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name> on the PR. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR is created and linked to the current PR.
  • To build and push an image to quay, add /build-push-pr-image in a comment. This creates an image with tag
    pr-<pr_number> in the quay repository. This image tag, however, is deleted when the PR is merged or closed.
Supported labels

{'/build-push-pr-image', '/cherry-pick', '/verified', '/lgtm', '/wip', '/hold'}

@adolfo-ab adolfo-ab force-pushed the guardrails-autoconfig branch 2 times, most recently from cfa556d to aa2f3c8 on September 16, 2025 13:36
@github-actions github-actions bot added size/xl and removed size/l labels on Sep 16, 2025
@adolfo-ab adolfo-ab force-pushed the guardrails-autoconfig branch 3 times, most recently from 1971f9f to b5a134e on September 16, 2025 13:49
@adolfo-ab adolfo-ab marked this pull request as ready for review on September 16, 2025 13:52
@adolfo-ab adolfo-ab requested review from a team and sheltoncyril as code owners on September 16, 2025 13:52
@adolfo-ab adolfo-ab force-pushed the guardrails-autoconfig branch from b5a134e to 443563c on September 16, 2025 13:53
@adolfo-ab
Contributor Author

/verified

@rhods-ci-bot rhods-ci-bot added the Verified label (Verified pr in Jenkins) on Sep 16, 2025
@adolfo-ab adolfo-ab changed the title from "wip: add tests for guardrails orchestrator autoconfig" to "feat: add tests for guardrails orchestrator autoconfig" on Sep 16, 2025
@adolfo-ab adolfo-ab force-pushed the guardrails-autoconfig branch from aa89e97 to cfc260e on September 16, 2025 14:50
@rhods-ci-bot rhods-ci-bot removed the Verified and commented-by-dbasunag labels on Sep 16, 2025
@adolfo-ab
Contributor Author

/verified

@rhods-ci-bot rhods-ci-bot added the Verified label (Verified pr in Jenkins) on Sep 16, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/model_explainability/guardrails/constants.py (2)

3-6: Clarify test-only intent for unsafe prompts; AUTOCONFIG_DETECTOR_LABEL verified

  • Add an inline comment above the constants in tests/model_explainability/guardrails/constants.py, e.g.:
    "# Test-only prompts to exercise Guardrails detectors (PII/prompt-injection/HAP). Intentionally contain unsafe content."
  • AUTOCONFIG_DETECTOR_LABEL is imported/used in tests/model_explainability/guardrails/conftest.py and tests/model_explainability/guardrails/test_guardrails.py — single canonical label found; no inconsistencies detected.

31-37: Prefer None over empty string for unknown detection_text

GuardrailsDetectionPrompt.detection_text is Optional[str] and a repo search found no code comparing against "" — use None to signal "not available" and avoid ambiguity.

File: tests/model_explainability/guardrails/constants.py (PII_OUTPUT_DETECTION_PROMPT)

 PII_OUTPUT_DETECTION_PROMPT: GuardrailsDetectionPrompt = GuardrailsDetectionPrompt(
     content="Output example email address, nothing else.",
     detector_id="regex",
     detection_name="email_address",
     detection_type="pii",
-    detection_text="",
+    detection_text=None,
 )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33abd93 and 1d37d29.

📒 Files selected for processing (1)
  • tests/model_explainability/guardrails/constants.py (1 hunks)
🔇 Additional comments (4)
tests/model_explainability/guardrails/constants.py (4)

11-14: Docstring addition LGTM

Clear purpose and scope for the container object.


23-29: PII input detection prompt is correct — keep detection_type = "pii"

Tests assert detection_type == "pii" (tests/llama_stack/safety/test_trustyai_fms_provider.py:100), so the lowercase value matches existing tests; no change needed.


16-20: Make the prompt container immutable and slotted

pyproject.toml requires-python = "==3.13.*", so dataclass(frozen=True, slots=True) and PEP 604 '|' unions are supported — apply this diff.

-@dataclass
+@dataclass(frozen=True, slots=True)
 class GuardrailsDetectionPrompt:

47-53: HAP classification: set detection_text to the expected label or None (do not mirror the full prompt)

Tests call verify_builtin_detector_unsuitable_input_response(prompt.detection_text); verify_detection asserts exact equality when detection_text is truthy, otherwise it only checks for a non-empty detection text. Choose one:

  • If the HAP detector returns a specific label, set detection_text to that label (e.g., "LABEL_1").
  • If you only want to assert presence (not exact match), set detection_text=None (this relaxes the test to require non-empty text).

Location: tests/model_explainability/guardrails/constants.py (HAP_INPUT_DETECTION_PROMPT) — verify behavior in tests/model_explainability/guardrails/utils.py -> verify_builtin_detector_unsuitable_input_response / verify_detection.

Suggested diff (if you want relaxed check):

-    detection_text=HAP_PROMPT,
+    detection_text=None,

Comment on lines +39 to +45
PROMPT_INJECTION_INPUT_DETECTION_PROMPT: GuardrailsDetectionPrompt = GuardrailsDetectionPrompt(
    content=PROMPT_INJECTION_PROMPT,
    detector_id="prompt-injection-detector",
    detection_name="detection",
    detection_type="INJECTION",
    detection_text=PROMPT_INJECTION_PROMPT,
)

⚠️ Potential issue

🧩 Analysis chain

Confirm expected field values for prompt-injection detector

Double-check detector_id="prompt-injection-detector", detection_name="detection", and detection_type="INJECTION" match the GuardrailsOrchestrator outputs for the wrapper version pinned in this PR.


🌐 Web query:

GuardrailsOrchestrator prompt-injection detector: what are the expected values for detector_id, detection_name, and detection_type returned by the latest openshift-python-wrapper?

💡 Result:

Expected values (from the OpenShift AI / Guardrails Orchestrator docs):

  • detector_id: "hap" [1].
  • detection_name: "sequence_classifier" (appears as the "detection" field in examples). [1][2]
  • detection_type: "sequence_classification". [1][2]

Sources: Red Hat OpenShift AI Guardrails Orchestrator documentation. [1] [2]


Update Guardrails detector constants to documented Orchestrator values

Replace detector_id="prompt-injection-detector" → "hap", detection_name="detection" → "sequence_classifier", and detection_type="INJECTION" → "sequence_classification" in tests/model_explainability/guardrails/constants.py (lines 39–45) so they match the OpenShift AI Guardrails Orchestrator docs.

🤖 Prompt for AI Agents
In tests/model_explainability/guardrails/constants.py around lines 39 to 45, the
GuardrailsDetectionPrompt constants use outdated values; update detector_id from
"prompt-injection-detector" to "hap", detection_name from "detection" to
"sequence_classifier", and detection_type from "INJECTION" to
"sequence_classification" so the test constants match the OpenShift AI
Guardrails Orchestrator documented values.

Collaborator

@dbasunag dbasunag left a comment


Let's take it. I will update the packages after it is merged.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (6)
tests/model_explainability/guardrails/constants.py (6)

3-6: Confirm selector label and guard sensitive prompts from logs.

  • Verify AUTOCONFIG_DETECTOR_LABEL exactly matches the label the Orchestrator watches.
  • Good switch to [email protected] (RFC 2606-safe).
  • Good switch to an RFC 2606-safe EXAMPLE_EMAIL_ADDRESS.
  • Consider ensuring PROMPT_INJECTION_PROMPT/HAP_PROMPT are not emitted to CI logs at info level.
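
One way to act on the logging concern is a standard-library `logging.Filter` that redacts known prompt strings before they reach any handler. The sketch below is illustrative only: `RedactPrompts`, `ListHandler`, the logger name, and the placeholder prompt value are all hypothetical, and the real PROMPT_INJECTION_PROMPT text is not shown in this thread.

```python
import logging


class RedactPrompts(logging.Filter):
    """Replace known sensitive prompt strings before a record is emitted."""

    def __init__(self, secrets: list[str], placeholder: str = "<redacted prompt>"):
        super().__init__()
        self._secrets = [s for s in secrets if s]
        self._placeholder = placeholder

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # message with %-args already applied
        for secret in self._secrets:
            message = message.replace(secret, self._placeholder)
        record.msg, record.args = message, None
        return True  # keep the record; only its text changes


class ListHandler(logging.Handler):
    """Collects messages in memory so the effect can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


# Placeholder value standing in for the real constant.
PROMPT_INJECTION_PROMPT = "placeholder unsafe prompt"

logger = logging.getLogger("guardrails_redaction_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = ListHandler()
logger.addFilter(RedactPrompts([PROMPT_INJECTION_PROMPT]))
logger.addHandler(handler)
logger.info("Sending prompt: %s", PROMPT_INJECTION_PROMPT)
```

A filter attached to a logger only sees records logged directly on that logger, so in a real suite it may be simpler to attach it to the handlers that write CI output.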

11-15: Clarify detection_text semantics in the docstring.

Spell out that detection_text should be the detected snippet (e.g., PII span) and should be None for pure classifications to avoid ambiguity.


31-37: Use None instead of empty string for unknown detection_text.

Empty string can be misread as “detected empty span.” Prefer None to mean “not applicable/unknown.”

Apply this diff:

-    detection_text="",
+    detection_text=None,

47-53: Classification prompt: detection_text should likely be None; verify detection_type.

For HAP, detection_text should not mirror the input text; set to None unless the detector returns a span. Confirm whether detection_type truly returns “LABEL_1” or a semantic label.

Apply this diff:

-    detection_text=HAP_PROMPT,
+    detection_text=None,

16-20: Make the dataclass immutable and tighten types.

  • Consider @dataclass(frozen=True, slots=True) to keep these “constants” immutable and lean.
  • If values are constrained, prefer Literal/Enum for detection_type and detection_name to prevent drift.
  • Confirm repo runs on Python ≥3.10 (PEP 604 unions).

Example (outside this hunk):

from dataclasses import dataclass

@dataclass(frozen=True, slots=True)
class GuardrailsDetectionPrompt:
    ...
#!/bin/bash
# Verify Python version constraints and usages of detection_type/name.
fd -a 'pyproject.toml' || true
fd -a 'tox.ini' || true
rg -nC2 -g '!**/venv/**' -e 'python-version' -e '^requires-python' || true
rg -nC2 -e 'detection_type\s*=' tests || true
rg -nC2 -e 'detector_id\s*=' tests || true
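
As a rough illustration of the Literal/Enum suggestion, the sketch below pins the detection_type strings in one place. `DetectionType` and `parse_detection_type` are hypothetical names, and the values are simply the strings quoted in this review ("pii", "INJECTION", "LABEL_1"); whether the detectors actually emit them is exactly what the review asks to confirm.

```python
from enum import Enum


class DetectionType(str, Enum):
    """Hypothetical shared constants for the values quoted in this review."""

    PII = "pii"
    INJECTION = "INJECTION"
    HAP = "LABEL_1"


def parse_detection_type(raw: str) -> DetectionType:
    """Map a detector's raw string onto the enum, failing loudly on drift."""
    try:
        return DetectionType(raw)
    except ValueError as exc:
        known = ", ".join(member.value for member in DetectionType)
        raise ValueError(f"unknown detection_type {raw!r}; known: {known}") from exc


parsed = parse_detection_type("pii")
```

Because the enum mixes in `str`, existing assertions such as `detection_type == "pii"` keep working when the constant is `DetectionType.PII`, while a casing change like "PII" raises instead of passing unnoticed.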

39-45: Normalize detection_type and detection_name to expected values.

Casing and labels differ across prompts (“pii”, “INJECTION”). Align to the detector’s actual enum/strings or define shared constants to avoid brittle tests. Also, “detection” as a detection_name is vague—consider a more specific value if that’s what the detector emits.

#!/bin/bash
# Find assertions/expectations that rely on specific detection_type/name values.
rg -nC2 -e 'INJECTION|pii|LABEL_1|prompt-injection-detector' tests || true
🔇 Additional comments (1)
tests/model_explainability/guardrails/constants.py (1)

23-29: PII input prompt looks good; confirm detector_id.

LGTM for content and expected detection_text. Please confirm “regex” exactly matches the autoconfig’d detector_id.

Contributor

@sheltoncyril sheltoncyril left a comment


/lgtm

@dbasunag dbasunag merged commit 5a39803 into opendatahub-io:main Sep 16, 2025
12 of 13 checks passed
@github-actions

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.

sheltoncyril pushed a commit to sheltoncyril/opendatahub-tests that referenced this pull request Oct 2, 2025
…o#611)

* feat: add tests for guardrails orchestrator autoconfig

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
mwaykole pushed a commit to mwaykole/opendatahub-tests that referenced this pull request Jan 23, 2026
…o#611)

* feat: add tests for guardrails orchestrator autoconfig

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>