Updating HF docs to use ModelCars #1050
base: main
Conversation
Walkthrough

Documentation updates refactor the detector deployment approach by removing the separate model storage manifest and replacing it with pre-published detectors accessed via storageUri. The changes update the Prompt Injection detector configuration to use Red Hat registry images and introduce a new HAP detector deployment alongside simplified instructions.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes

Pre-merge checks and finishing touches
❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
modules/using-a-hugging-face-prompt-injection-detector-with-the-guardrails-orchestrator.adoc (2)
304-317: HAP detector hostname in ConfigMap does not match deployment.

Line 307 references `ibm-hap-38m-detector-predictor` as the HAP detector hostname, but the HAP detector InferenceService is named `hap-detector` (line 236), which produces the service name `hap-detector-predictor`. This mismatch will cause the Orchestrator to fail service discovery. Update line 307 to use the correct hostname:

```diff
 hap:
   type: text_contents
   service:
-    hostname: ibm-hap-38m-detector-predictor
+    hostname: hap-detector-predictor
     port: 8000
```

Alternatively, if you intend to use a pre-existing `ibm-hap-38m-detector` service, ensure it is deployed separately and update the documentation accordingly.
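As a quick sanity check beyond the suggested diff, the predictor service name that KServe generates can be confirmed directly on the cluster before editing the ConfigMap; the names below match the manifests in this PR:

```shell
# List the predictor service KServe creates for the HAP InferenceService;
# for a RawDeployment named "hap-detector" this should show "hap-detector-predictor".
oc get svc -l serving.kserve.io/inferenceservice=hap-detector
```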
364-369: Fix typo on line 367.

Line 367 contains a typo: "lvel" should be "level".

```diff
-.. Carry out content filtering for a text generation LLM at the input lvel, output level, or both.
+.. Carry out content filtering for a text generation LLM at the input level, output level, or both.
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
modules/using-a-hugging-face-prompt-injection-detector-with-the-guardrails-orchestrator.adoc (4 hunks)
🔇 Additional comments (1)
modules/using-a-hugging-face-prompt-injection-detector-with-the-guardrails-orchestrator.adoc (1)
99-176: Prompt Injection detector configuration looks good overall.

The updates to the Prompt Injection detector—including the new display-name and template-name annotations, model format name, and Red Hat registry image—align well with the migration to pre-published detectors. The storageUri correctly references a remote OCI-based detector image.
. Create `hap_detector.yaml`:
+
[source,yaml]
----
apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: guardrails-detector-runtime-hap
  annotations:
    openshift.io/display-name: guardrails-detector-runtime-hap
    opendatahub.io/recommended-accelerators: '["nvidia.com/gpu"]'
    opendatahub.io/template-name: guardrails-detector-huggingface-runtime
  labels:
    opendatahub.io/dashboard: 'true'
spec:
  annotations:
    prometheus.io/port: '8080'
    prometheus.io/path: '/metrics'
  multiModel: false
  supportedModelFormats:
    - autoSelect: true
      name: guardrails-detector-hf-runtime
  containers:
    - name: kserve-container
      image: registry.redhat.io/rhoai/odh-guardrails-detector-huggingface-runtime-rhel9:v2.25
      command:
        - uvicorn
        - app:app
      args:
        - "--workers"
        - "4"
        - "--host"
        - "0.0.0.0"
        - "--port"
        - "8000"
        - "--log-config"
        - "/common/log_conf.yaml"
      env:
        - name: MODEL_DIR
          value: /mnt/models
        - name: HF_HOME
          value: /tmp/hf_home
      ports:
        - containerPort: 8000
          protocol: TCP
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: hap-detector
  labels:
    opendatahub.io/dashboard: 'true'
  annotations:
    openshift.io/display-name: hap-detector
    serving.knative.openshift.io/enablePassthrough: 'true'
    sidecar.istio.io/inject: 'true'
    sidecar.istio.io/rewriteAppHTTPProbers: 'true'
    serving.kserve.io/deploymentMode: RawDeployment
spec:
  predictor:
    maxReplicas: 1
    minReplicas: 1
    model:
      modelFormat:
        name: guardrails-detector-hf-runtime
      name: ''
      runtime: guardrails-detector-runtime-hap
      storageUri: 'oci://quay.io/trustyai_testing/detectors/deberta-v3-base-prompt-injection-v2@sha256:8737d6c7c09edf4c16dc87426624fd8ed7d118a12527a36b670be60f089da215'
      resources:
        limits:
          cpu: '1'
          memory: 2Gi
          nvidia.com/gpu: '0'
        requests:
          cpu: '1'
          memory: 2Gi
          nvidia.com/gpu: '0'
---
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: hap-detector-route
spec:
  to:
    kind: Service
    name: hap-detector-predictor
----
. Apply `hap_detector.yaml` to configure a serving runtime, inference service, and route for the HAP detector:
+
[source,bash]
----
oc apply -f hap_detector.yaml
----
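Once the route is admitted, a minimal smoke test can confirm the detector answers. This is a sketch rather than part of the documented procedure: it assumes the runtime serves the detectors API at `/api/v1/text/contents` with a `detector-id` header, as in the TrustyAI detector examples, and that the Route stays plain HTTP as defined above:

```shell
# Resolve the route hostname created for the HAP detector.
DETECTOR_HOST=$(oc get route hap-detector-route -o jsonpath='{.spec.host}')

# Send one string through the detector; a JSON list of detections
# (possibly empty) indicates the service is up and the model loaded.
curl -s "http://${DETECTOR_HOST}/api/v1/text/contents" \
  -H "Content-Type: application/json" \
  -H "detector-id: hap" \
  -d '{"contents": ["You are awful"], "detector_params": {}}'
```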
🧩 Analysis chain
HAP detector storageUri references wrong model.
The HAP detector configuration at line 255 references the same Prompt Injection model (deberta-v3-base-prompt-injection-v2), which will cause HAP detection to fail. The storageUri should point to a HAP-specific model instead.
Update the storageUri on line 255 to reference the correct HAP model. For example, it might be something like oci://quay.io/trustyai_testing/detectors/deberta-v3-base-hap-v2@sha256:... or similar, depending on your registry.
If you're unsure of the correct HAP model reference, please verify with the TrustyAI project documentation or team.
---
🏁 Script executed:

```shell
#!/bin/bash
# Search for all storageUri occurrences in the document to compare HAP vs Prompt Injection
rg -n "storageUri" modules/using-a-hugging-face-prompt-injection-detector-with-the-guardrails-orchestrator.adoc -A1 -B1
```

Length of output: 503
🏁 Script executed:

```shell
#!/bin/bash
# Search for Prompt Injection detector configuration and related model references
rg -n "prompt-injection\|prompt.injection\|Prompt Injection" modules/using-a-hugging-face-prompt-injection-detector-with-the-guardrails-orchestrator.adoc -i -C3
```

Length of output: 0

🏁 Script executed:

```shell
#!/bin/bash
# Get the full content around the lines to understand the complete picture
sed -n '1,300p' modules/using-a-hugging-face-prompt-injection-detector-with-the-guardrails-orchestrator.adoc | tail -200
```

Length of output: 5199
HAP detector storageUri references wrong model (line 255).
The HAP detector is configured with the identical storageUri as the Prompt Injection detector (deberta-v3-base-prompt-injection-v2). This will deploy the wrong model, causing HAP detection to fail. Use a HAP-specific model instead.
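As a follow-up suggestion beyond the review itself, the fix can be verified by reading the storageUri back from the live InferenceService; after the correction it should name a HAP model rather than the prompt-injection one:

```shell
# Print the model image the HAP detector actually pulls.
oc get inferenceservice hap-detector \
  -o jsonpath='{.spec.predictor.model.storageUri}{"\n"}'
```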
Description
How Has This Been Tested?
Merge criteria:
Summary by CodeRabbit