[WIP] RHAIENG-1227: RAGAS Eval Provider #1028
Conversation
Walkthrough

Adds new AsciiDoc content documenting Ragas evaluation: an assembly page and three modules covering configuring a remote provider for production, setting up an inline provider for development, and running a notebook-driven evaluation workflow. Also inserts the new assembly into the monitoring docs.
Sequence Diagram(s)

sequenceDiagram
autonumber
participant Dev as Developer / Operator
participant OC as OpenShift CLI/API
participant LS as Llama Stack Operator
participant KS as KServe
participant S3 as S3 Storage
participant KFP as Kubeflow Pipelines
rect rgb(240,248,255)
Dev->>OC: create secrets (S3, KFP token, model info)
Dev->>OC: apply ConfigMap (embedding model, endpoints, creds)
Dev->>LS: apply LlamaStackDistribution (remote inference + RAGAS)
end
rect rgb(255,250,240)
LS->>KS: request model deployment
KS->>S3: pull model / store artifacts
KS-->>LS: model endpoint ready
LS-->>Dev: expose route / pod ready
end
rect rgb(240,255,240)
Dev->>KFP: verify Pipelines token & endpoint
Dev->>LS: trigger test evaluation (Python/KFP client)
LS->>KFP: submit evaluation run
KFP-->>Dev: run status / results
end
sequenceDiagram
autonumber
participant Client as Python Client
participant API as Llama Stack API
rect rgb(248,249,255)
Client->>API: create dataset (inputs, responses, contexts)
API-->>Client: dataset_id
Client->>API: register benchmark (RAGAS scoring_functions)
API-->>Client: benchmark_id
Client->>API: run evaluation job (benchmark_config, model info)
API-->>Client: job_id
end
rect rgb(245,255,250)
Client->>API: poll job status
alt job complete
API-->>Client: results (per-metric scores)
else job failed
API-->>Client: error
end
end
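The client-side flow in this second diagram can be sketched in Python. The `llama_stack_client` calls below appear only as comments because their exact method names and signatures vary by version (they follow the snippets under review); the polling helper is plain Python and runnable as-is.

```python
import time

# Sketch of the client-side evaluation flow above: register a dataset,
# register a benchmark, run an eval job, then poll until the job finishes.
# Only the polling helper is concrete; the client calls are assumptions.

def poll_until_done(get_status, interval=1, max_wait=60):
    """Poll get_status() until it returns 'completed' or 'failed', or time out."""
    elapsed = 0
    while elapsed <= max_wait:
        status = get_status()
        if status in ("completed", "failed"):
            return status
        time.sleep(interval)
        elapsed += interval
    return "timeout"

# Hypothetical usage against a live server (names/URLs are placeholders):
#   client = LlamaStackClient(base_url="https://<llama-stack-route-url>")
#   client.datasets.register(dataset_id="my-rag-evaluation", ...)
#   job = client.eval.run_eval(benchmark_id="ragas-benchmark", ...)
#   final = poll_until_done(lambda: client.eval.job_status(job_id=job.job_id).status)

if __name__ == "__main__":
    states = iter(["in_progress", "in_progress", "completed"])
    print(poll_until_done(lambda: next(states)))  # prints "completed"
```

The injected `get_status` callable keeps the loop testable without a live server, which is also why the review suggestions below favor a bounded timeout over an unbounded `while True`.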
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Pre-merge checks and finishing touches: ❌ Failed checks (1 inconclusive), ✅ Passed checks (2 passed)
Actionable comments posted: 4
🧹 Nitpick comments (8)
modules/setting-up-ragas-inline-provider.adoc (3)
180-197: Add error handling to Python API example.

The client initialization and provider listing (lines 187–193) lack error handling. If the Llama Stack route is incorrect or the service is unavailable, users will encounter an unhelpful error.
Apply this diff to add basic error handling:

```diff
 from llama_stack_client import LlamaStackClient

 # Get the Llama Stack route URL
 # oc get route llama-stack-ragas-inline -o jsonpath='{.spec.host}'
-client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
+try:
+    client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")

-# List available providers
-providers = client.providers.list()
-print("Available eval providers:")
-for provider in providers.eval:
-    print(f" - {provider.provider_id} ({provider.provider_type})")
+    # List available providers
+    providers = client.providers.list()
+    print("Available eval providers:")
+    for provider in providers.eval:
+        print(f" - {provider.provider_id} ({provider.provider_type})")

-# Expected output should include:
-# - trustyai_ragas_inline (inline::trustyai_ragas)
+    # Expected output should include:
+    # - trustyai_ragas_inline (inline::trustyai_ragas)
+except Exception as e:
+    print(f"Error connecting to Llama Stack: {e}")
+    print("Verify that the Llama Stack route URL is correct and the service is running.")
```
61-61: Emphasize production TLS requirement in security context.

Line 61 sets `VLLM_TLS_VERIFY="false"` with a comment referencing production use (line 73). While the note is present, the security implication warrants clearer emphasis for users who might copy-paste without reading carefully.

Consider restructuring the NOTE block to frontload the security concern:

```diff
 $ export VLLM_API_TOKEN="__<token_identifier>__"
 $ oc create secret generic llama-stack-inference-model-secret \
     --from-literal INFERENCE_MODEL="$INFERENCE_MODEL" \
     --from-literal VLLM_URL="$VLLM_URL" \
     --from-literal VLLM_TLS_VERIFY="$VLLM_TLS_VERIFY" \
     --from-literal VLLM_API_TOKEN="$VLLM_API_TOKEN"
 ----
+
+[WARNING]
+====
+TLS verification is disabled for demonstration. In production, set `VLLM_TLS_VERIFY` to `true` to enable TLS certificate verification and protect against man-in-the-middle attacks.
+====
+
-[NOTE]
-====
-Replace `__<token_identifier>__` with your actual model API token. In production environments, set `VLLM_TLS_VERIFY` to `true` to enable TLS certificate verification.
-====
+[NOTE]
+====
+Replace `__<token_identifier>__` with your actual model API token.
+====
```

Also applies to: 73-73
144-151: Clarify pod readiness criteria and add timeout guidance.

The procedure instructs users to wait for pod status to show `Running` (line 151), but does not explain what "ready" means or specify a reasonable timeout. A hanging pod might silently fail.

Enhance the wait step with a more explicit condition check:

```diff
 . Wait for the deployment to complete:
 +
 [source,subs="+quotes"]
 ----
-$ oc get pods -w
+$ oc wait --for=condition=ready pod -l app=llama-stack-ragas-inline --timeout=300s
 ----
 +
-Wait until the `llama-stack-ragas-inline` pod status shows `Running`.
+Wait until the pod status shows `Ready 1/1`. If the timeout is exceeded, check pod logs for errors using:
+
+[source,subs="+quotes"]
+----
+$ oc logs -l app=llama-stack-ragas-inline --tail=50
+----
```

modules/evaluating-rag-system-quality-with-ragas.adoc (3)
77-77: Use consistent placeholder format.

Line 77 uses `<your-llama-stack-url>` while other documentation files use the `__<placeholder>__` format. This inconsistency can confuse users about which parts require substitution.

Apply this diff to align with the project's placeholder convention:

```diff
 # Initialize the client
-client = LlamaStackClient(base_url="<your-llama-stack-url>")
+client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
```
80-84: Add error handling and validation to dataset registration.

The dataset registration (lines 80–84) lacks error handling. If the dataset registration fails (e.g., invalid format, network error), users receive no guidance.

Apply this diff to add basic validation and error handling:

```diff
 # Register the dataset
-client.datasets.register(
-    dataset_id="my-rag-evaluation",
-    dataset_def=dataset,
-    provider_id="huggingface",  # or your configured provider
-)
+try:
+    client.datasets.register(
+        dataset_id="my-rag-evaluation",
+        dataset_def=dataset,
+        provider_id="huggingface",  # or your configured provider
+    )
+    print("✓ Dataset registered successfully")
+except Exception as e:
+    print(f"✗ Failed to register dataset: {e}")
+    raise
```
140-151: Add exponential backoff guidance and timeout limit to polling loop.

The job status polling (lines 140–151) uses a fixed 5-second interval with no maximum retry count or timeout. A frozen job could cause the script to run indefinitely.

Apply this diff to add timeout and clearer failure messaging:

```diff
 import time
+import sys

 # Poll for completion
+max_wait_seconds = 3600  # 1 hour timeout
+elapsed = 0
 while True:
     job_status = client.eval.job_status(job_id=response.job_id)
     if job_status.status == "completed":
         print("Evaluation completed successfully")
         break
     elif job_status.status == "failed":
         print(f"Evaluation failed: {job_status.error}")
-        break
+        sys.exit(1)
     else:
         print(f"Status: {job_status.status}, Progress: {job_status.progress}%")
+        elapsed += 5
+        if elapsed > max_wait_seconds:
+            print(f"✗ Evaluation timeout after {max_wait_seconds} seconds")
+            sys.exit(1)
     time.sleep(5)
```

modules/configuring-ragas-remote-provider-for-production.adoc (2)
126-140: Add fallback guidance if Llama Stack route is not automatically created.

The procedure assumes the external route is automatically created during LSD deployment (line 133), but provides no troubleshooting steps if the route does not exist.

Add a note clarifying what happens if the route is missing:

```diff
 . Create an external route for the Llama Stack server:
 +
 [NOTE]
 ====
 The KUBEFLOW_LLAMA_STACK_URL must be an external route that is accessible from the Kubeflow Pipeline pods.
 ====
 +
 The route is automatically created when you deploy the LlamaStackDistribution. You can get the route URL with the following command:
+
+If the route is not automatically created, verify that the LlamaStackDistribution CR was deployed successfully:
+
+[source,subs="+quotes"]
+----
+$ oc get llamastackdistribution llama-stack-ragas-remote -o jsonpath='{.status.conditions[*].message}'
+----
 +
 [source,subs="+quotes"]
 ----
 $ export LLAMA_STACK_URL=$(oc get route llama-stack-ragas-remote -o jsonpath='{.spec.host}')
 $ echo "Llama Stack URL: https://$LLAMA_STACK_URL"
+
+If the command returns an error, the LSD deployment may still be in progress. Wait a few moments and retry.
 ----
```
253-299: Add error handling and retry logic to test evaluation.

The test evaluation (lines 269–299) and provider listing (lines 253–267) lack error handling. Connection failures or invalid credentials will surface only as cryptic exceptions.

Apply this diff to add error handling and clearer diagnostic output:

```diff
 from llama_stack_client import LlamaStackClient

-client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
+try:
+    client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
+except Exception as e:
+    print(f"✗ Failed to connect to Llama Stack: {e}")
+    print("Verify that the Llama Stack URL is correct and the service is running.")
+    raise

 # List available eval providers
-providers = client.providers.list()
-print("Available eval providers:")
-for provider in providers.eval:
-    print(f" - {provider.provider_id} ({provider.provider_type})")
+try:
+    providers = client.providers.list()
+    print("✓ Available eval providers:")
+    for provider in providers.eval:
+        print(f" - {provider.provider_id} ({provider.provider_type})")
+except Exception as e:
+    print(f"✗ Failed to list providers: {e}")
+    raise

 # Expected output should include:
 # - trustyai_ragas_remote (remote::trustyai_ragas)
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
- modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
- modules/setting-up-ragas-inline-provider.adoc (1 hunks)
- monitoring-data-science-models.adoc (1 hunks)
🔇 Additional comments (1)
monitoring-data-science-models.adoc (1)
32-32: LGTM — include reference correctly positioned.

The addition integrates RAG evaluation documentation into the monitoring flow following LLM evaluation, with proper syntax and formatting consistency.
Force-pushed ecf7994 to 26aa719
Force-pushed d04705f to 36d6af2
Actionable comments posted: 3
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
🔇 Additional comments (2)
modules/configuring-ragas-remote-provider-for-production.adoc (2)
142-145: Verify the DSPA route configuration against your deployed instances.

This review suggests updating the route discovery method, but the specific patterns referenced cannot be fully verified against official OpenShift AI documentation. One documented example uses the route name `ds-pipeline-pipelines-definition`, which differs from both the current code and the suggested alternatives.

Test the proposed changes against your actual DSPA instances to confirm which route discovery method works correctly in your environment:

- Verify the current label selector returns the expected KFP route
- Test the suggested route name pattern with your DSPA instance name
- Confirm the alternative label selector works reliably

Once validated, update the documentation with the confirmed approach.
412-414: Verify/update Next steps links — modules/configuring-ragas-remote-provider-for-production.adoc (lines 412–414)
- The Next steps entry links to {odhdocshome}/working-with-rag/#evaluating-rag-system-quality-with-ragas_rag; modules/evaluating-rag-system-quality-with-ragas.adoc exists, but modules/integrating-ragas-into-cicd-pipelines.adoc and modules/monitoring-ragas-evaluation-pipelines.adoc are not present in modules/.
- Before merge: either create the two missing files, update the links to point to existing documentation (or the correct working-with-rag anchor), or replace the references with a "coming soon" placeholder.
[role='_abstract']
You can configure the RAGAS remote provider to run evaluations as distributed jobs using {productname-short} pipelines. The remote provider enables production-scale evaluations by running RAGAS in a separate Kubeflow Pipelines environment, providing resource isolation, improved scalability, and integration with CI/CD workflows.

.Prerequisites
Should we mention installing the RHOAI operator, deploying DSCI, and DSC with LlamaStack component enabled?
@adolfo-ab That's a good idea. i will add those. In the DSC, what is the exact path of the spec.component that needs to be enabled? Is it the spec.components.llamastackoperator or something else?
I've added something at line 12, and I'll update it with more correct info when I hear back on this.
Answered in L12
Actionable comments posted: 2
♻️ Duplicate comments (2)
modules/configuring-ragas-remote-provider-for-production.adoc (2)

142-145: Replace unsupported label selector with DSPA instance-name route pattern (duplicate from past review).

Line 142 uses the label selector `app=ds-pipeline-dspa` to retrieve the Kubeflow Pipelines endpoint, but this label is not a published/supported selector in the Data Science Pipelines Operator. Use the documented route naming pattern instead, `ds-pipeline-ui-<DSPA_INSTANCE_NAME>`, or query the actual route by component label.

Apply this diff to use the documented route naming pattern:

```diff
-$ export KFP_ENDPOINT=$(oc get route -n <project_name> -l app=ds-pipeline-dspa -o jsonpath='{.items[0].spec.host}')
+$ export KFP_ENDPOINT=$(oc get route ds-pipeline-ui-__<dspa_instance_name>__ -n <project_name> -o jsonpath='{.spec.host}')
```

Alternatively, if the DSPA instance name is not known, provide guidance to list available routes:

```diff
+# To find your DSPA instance name and route:
+$ oc get route -n <project_name> | grep ds-pipeline
+# Then use: oc get route ds-pipeline-ui-<your_instance_name> -n <project_name> -o jsonpath='{.spec.host}'
```
128-129: Add security advisory for disabled TLS verification (duplicate from past review, with enhancement).

Lines 128–129 set `VLLM_TLS_VERIFY="false"` with only an inline comment. While the comment acknowledges this is not production-safe, no structured security warning block is provided. Add a prominent `[IMPORTANT]` admonition to clearly warn against this practice in production.

Apply this diff to add security guidance immediately after line 128:

```diff
 $ export VLLM_TLS_VERIFY="false" # Use "true" in production
 $ export VLLM_API_TOKEN="<token_identifier>"
+
+[IMPORTANT]
+====
+*Never disable TLS verification in production environments.* Setting `VLLM_TLS_VERIFY=false`
+exposes your system to man-in-the-middle (MITM) attacks. This setting is only appropriate
+for development and testing with self-signed certificates.
+
+For production deployments:
+
+* Set `VLLM_TLS_VERIFY=true`
+* Use a valid CA-signed certificate for your inference service
+* If using self-signed certificates in production, add your internal CA to the system trust store instead of disabling verification
+* Rotate `VLLM_API_TOKEN` regularly and store it securely in a secret
+====
```

Also applies to: 217-223
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- assemblies/evaluating-rag-systems.adoc (1 hunks)
- modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
- modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
- modules/setting-up-ragas-inline-provider.adoc (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- assemblies/evaluating-rag-systems.adoc
🚧 Files skipped from review as they are similar to previous changes (2)
- modules/setting-up-ragas-inline-provider.adoc
- modules/evaluating-rag-system-quality-with-ragas.adoc
dmaniloff left a comment
Thank you @skrthomas !! Left some comments w/ some suggestions.
* You have access to a workbench or notebook environment where you can run Python code.

.Procedure
Same comment as above: this is just duplicating the contents of https://github.com/trustyai-explainability/llama-stack-provider-ragas/blob/main/demos/basic_demo.ipynb, which is going to be the more up-to-date version, so we should just link to that.
Okay @dmaniloff @adolfo-ab I re-did this procedure to focus more on the steps a user would need to take to run the Jupyter notebook instead of trying to re-create the notebook in the docs.
I think it's overkill now. IMO explaining how to create a workbench and clone the upstream repo is a bit out of scope for these docs and we're basically redirecting the user somewhere else (the upstream repo) to learn how to use the provider. Personally I think it would be better to put less emphasis on creating the workbench and show the code snippets here, briefly explaining what we're doing on each step (registering the dataset, registering the benchmark, running an evaluation, checking the results). But I leave the final decision up to you guys @dmaniloff @skrthomas
I wasn't sure what the best approach was. When I talked with the AI docs team about it, the example use case I got was this https://opendatahub.io/docs/working-with-data-in-s3-compatible-object-store/, and it seemed like they went through the steps for cloning the repo and creating the workbench. My initial inclination was to do as you say, replicate the code snippets and explain them in procedural steps, which is what I originally had (although I understand there were some inaccuracies in how the Python ultimately got copied over). In OCP docs, this is the approach we would usually take (replication rather than linking) because we don't ever link to upstream repos in OCP. In RHOAI, however, it appears that the guidance is different. The content strategist for our team advised the following when I was asking the team about which direction was best for this:
The notebooks are upstream but they have been worked on extensively by engineering teams. Also, by cloning the notebooks, users can run them in their local environment or on runtime in a cluster. That is something that they can't do in the docs.
Actionable comments posted: 1
🧹 Nitpick comments (1)
assemblies/evaluating-rag-systems.adoc (1)

5-5: AsciiDoc ID should use lowercase convention.

The anchor ID uses a capital "R" in `Ragas`, but AsciiDoc convention typically uses lowercase for IDs. Consider updating to `evaluating-rag-systems-with-ragas_{context}` for consistency with standard documentation patterns.

```diff
-[id="evaluating-rag-systems-with-Ragas_{context}"]
+[id="evaluating-rag-systems-with-ragas_{context}"]
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- assemblies/evaluating-rag-systems.adoc (1 hunks)
- modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- modules/evaluating-rag-system-quality-with-ragas.adoc
🔇 Additional comments (1)
assemblies/evaluating-rag-systems.adoc (1)
1-86: Content structure and messaging align well with prior review feedback.

The documentation correctly:
- Uses "Ragas" capitalization (not "RAGAS") as per upstream docs convention
- Distinguishes between the Ragas product and the Ragas provider pattern
- Includes external links to authoritative documentation
- Removes high-level prerequisites in favor of procedure-specific requirements
- Clearly separates inline (development) vs. remote (production) deployment modes with appropriate trade-offs
The content is well-structured, the use cases are comprehensive, and the metrics explanations are clear. The integration of this assembly into the monitoring documentation should help engineers understand RAG evaluation workflows.
Actionable comments posted: 2
🧹 Nitpick comments (1)
modules/configuring-ragas-remote-provider-for-production.adoc (1)
126-134: Consider expanding TLS security guidance with an IMPORTANT admonition block.

The inline comment on line 126 acknowledges the production concern, but past review feedback indicates an explicit IMPORTANT block should be added with comprehensive security guidance (e.g., MITM risks, proper certificate handling, credential rotation).
Consider adding the following IMPORTANT block after line 134 to enhance security guidance:

```diff
+
+[IMPORTANT]
+====
+Never disable TLS verification in production environments. Setting `VLLM_TLS_VERIFY=false`
+exposes your system to man-in-the-middle (MITM) attacks. This setting is only appropriate
+for development and testing with self-signed certificates. For production, use a valid
+certificate and set `VLLM_TLS_VERIFY=true`. Ensure credentials like `VLLM_API_TOKEN` and
+S3 keys are kept secure and rotated regularly.
+====
```
Actionable comments posted: 1
♻️ Duplicate comments (2)
modules/configuring-ragas-remote-provider-for-production.adoc (2)
315-315: Fix pod name inconsistency in deployment verification step.

Line 315 references the wrong pod name. The YAML at line 186 defines the pod as `llama-stack-pod`, but the instruction tells users to wait for `llama-stack-ragas-remote`, causing verification to fail.

```diff
-Wait until the `llama-stack-ragas-remote` pod status shows `Running`.
+Wait until the `llama-stack-pod` pod status shows `Running`.
```
126-127: Enhance TLS verification security guidance with explicit warnings.

Lines 126–127 set `VLLM_TLS_VERIFY="false"` but provide insufficient security context. Add a dedicated admonition block immediately after line 127 with production-environment warnings and safe alternatives.

Apply this diff to add comprehensive security guidance:

```diff
 $ export VLLM_API_TOKEN="<token_identifier>"

+[IMPORTANT]
+====
+Never disable TLS verification in production environments. Setting `VLLM_TLS_VERIFY=false`
+exposes your system to man-in-the-middle (MITM) attacks and is a critical security risk.
+
+For production deployments:
+
+* Set `VLLM_TLS_VERIFY=true` and use a valid CA-signed certificate, or
+* Add your internal CA to the system trust store in the container image, or
+* Use a dedicated development/testing namespace with explicit CA trust rather than disabling verification.
+
+Ensure `VLLM_API_TOKEN` and other credentials are also secured appropriately (rotated regularly, accessed only by authorized service accounts).
+====
+
 $ oc create secret generic llama-stack-inference-model-secret \
```
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
🔇 Additional comments (1)
modules/configuring-ragas-remote-provider-for-production.adoc (1)

318-320: Close unclosed `ifdef::upstream[]` block to fix AsciiDoc syntax error.

The file ends with an open `ifdef::upstream[]` block (line 318) without a closing `endif::[]` directive. This breaks AsciiDoc rendering and prevents the documentation from being built.

Add `endif::[]` after the link:

```diff
 * link:{odhdocshome}/working-with-rag/#evaluating-rag-system-quality-with-ragas_rag[Evaluating RAG system quality with Ragas metrics]
+endif::[]
```

Also verify that the "Next steps" section is complete. If additional steps or links are intended for cloud-service or self-managed environments, add corresponding `ifdef` blocks and content before the closing `endif::[]`.

Likely an incorrect or invalid review comment.
Force-pushed 3790df2 to c62de3c
Force-pushed c62de3c to caa9b7f
  name: llama-stack-ragas-inline
  namespace: <project_name>
spec:
  imageName: quay.io/trustyai/llama-stack-ragas:latest
@dmaniloff did you ever hear back about the image name we can use in the product docs? cc @adolfo-ab
quay.io/rhoai/odh-trustyai-ragas-lls-provider-dsp-rhel9:rhoai-3.0
See my comment above about the spec
Sorry, I think I got image questions mixed up. I was meaning to ask about the KUBEFLOW_BASE_IMAGE in the Ragas remote provider ConfigMap.
ifndef::upstream[]
For more information, see link:{rhoaidocshome}evaluating-rag-systems/setting-up-Ragas-inline-provider[Setting up the Ragas inline provider for development].
endif::[]
* You have access to a workbench or notebook environment where you can run Python code.
The Llama Stack server can take requests over HTTP. The Python client (which is what we use in the demo notebook to verify setup) is one of many clients that can talk to Llama Stack.

Therefore, this is not strictly a requirement. I would phrase this part along the following lines:

- "To test your setup, read the demo notebook in (link to the 0.4.2 version of the Ragas provider), which outlines the basic steps via the Python client."
- "Alternatively, you can also submit an eval by directly using the HTTP methods of the Llama Stack API (link to the Llama Stack 0.3.0 docs)."
- "If you do wish to follow the example provided in the notebook, you can do so from the Jupyter environment via the following steps" (then list your steps below; also mention that the Llama Stack pod will have to be accessible from the Jupyter env in the cluster, which may not be the case by default).
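For reference, the direct-HTTP option mentioned above might look like the following sketch. The payload shape and endpoint path here are assumptions for illustration, not the documented Llama Stack routes; check the Llama Stack API reference for your version before using them.

```python
import json

# Hypothetical sketch of submitting an eval over raw HTTP instead of the
# Python client. The JSON body shape and the endpoint path are assumptions;
# consult the Llama Stack API docs for the real routes and fields.

def build_eval_request(benchmark_id, model, sampling_params=None):
    """Assemble a JSON body for a hypothetical eval-run request."""
    return {
        "benchmark_id": benchmark_id,
        "benchmark_config": {
            "eval_candidate": {
                "type": "model",
                "model": model,
                "sampling_params": sampling_params or {"temperature": 0.0},
            }
        },
    }

body = build_eval_request("ragas-benchmark", "my-inference-model")
print(json.dumps(body, indent=2))

# Submitting would look roughly like (endpoint path is an assumption):
#   import requests
#   resp = requests.post(f"https://{route}/v1/eval/run", json=body, timeout=30)
#   job_id = resp.json()["job_id"]
```

Because the body is plain JSON, any HTTP client (curl, a CI job, a pipeline step) can submit the same request, which is the point of not making the Python client a hard prerequisite.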
+1
@dmaniloff thanks for this info. I will incorporate this into the intro! Just wanting to check: when you say link to the 3.0 Llama Stack API in the 3.0 docs, do you mean the 3.0 version of this: https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2.25/html/working_with_llama_stack/overview-of-llama-stack_rag#the_llamastackdistribution_custom_resource_api_providers
Or something else?
cc @kelbrown20
.Prerequisites
* You have cluster administrator privileges for your {openshift-platform} cluster.
* You have installed the {productname-short} Operator.
* You have a DataScienceCluster custom resource in your environment; in the `spec.components` section the `llamastack` component is set to `enabled/true`.
Suggested change:
- * You have a DataScienceCluster custom resource in your environment; in the `spec.components` section the `llamastack` component is set to `enabled/true`.
+ * You have a DataScienceCluster custom resource in your environment; in the `spec.components` section the `llamastackoperator` component should be enabled:
+
+  llamastackoperator:
+    managementState: Managed
.Prerequisites
* You have logged in to {productname-long}.
* You have created a data science project.
* You have created a data science pipeline.
Suggested change:
- * You have created a data science pipeline.
+ * You have created a data science pipeline server.
* You have created a data science project.
* You have created a data science pipeline.
* You have created a secret for your AWS credentials in your project namespace.
* You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled, either Inline or Remote provider.
Suggested change:
- * You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled, either Inline or Remote provider.
+ * You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled (either Inline or Remote).
ifndef::upstream[]
For more information, see link:{rhoaidocshome}evaluating-rag-systems/setting-up-Ragas-inline-provider[Setting up the Ragas inline provider for development].
endif::[]
* You have access to a workbench or notebook environment where you can run Python code.
+1
* You have access to a workbench or notebook environment where you can run Python code.

.Procedure
I think it's overkill now. IMO explaining how to create a workbench and clone the upstream repo is a bit out of scope for these docs and we're basically redirecting the user somewhere else (the upstream repo) to learn how to use the provider. Personally I think it would be better to put less emphasis on creating the workbench and show the code snippets here, briefly explaining what we're doing on each step (registering the dataset, registering the benchmark, running an evaluation, checking the results). But I leave the final decision up to you guys @dmaniloff @skrthomas
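The four steps that comment lists (registering the dataset, registering the benchmark, running an evaluation, checking the results) could be sketched roughly as below. Every identifier here — the base URL, dataset and benchmark IDs, model name, and the `ragas::answer_relevancy` scoring-function ID — is an illustrative assumption, and exact `llama-stack-client` method signatures vary by version, so treat this as a sketch rather than a definitive snippet:

```python
def build_benchmark_config(model: str, temperature: float = 0.0) -> dict:
    """Assemble an illustrative benchmark_config payload for an eval run."""
    return {
        "eval_candidate": {
            "type": "model",
            "model": model,
            "sampling_params": {"temperature": temperature, "max_tokens": 512},
        }
    }


def run_ragas_eval(base_url: str = "http://localhost:8321") -> None:
    # Imported inside the function so the pure helper above stays usable
    # even when llama-stack-client is not installed.
    from llama_stack_client import LlamaStackClient

    client = LlamaStackClient(base_url=base_url)

    # 1. Register a small evaluation dataset (hypothetical rows).
    client.datasets.register(
        dataset_id="ragas-demo-dataset",
        purpose="eval/question-answer",
        source={"type": "rows", "rows": [
            {"input_query": "What is RAGAS?",
             "generated_answer": "A RAG evaluation framework.",
             "retrieved_contexts": ["RAGAS evaluates RAG pipelines."]},
        ]},
    )

    # 2. Register a benchmark pointing at the dataset and a Ragas metric.
    client.benchmarks.register(
        benchmark_id="ragas-demo-benchmark",
        dataset_id="ragas-demo-dataset",
        scoring_functions=["ragas::answer_relevancy"],
    )

    # 3. Run the evaluation against a candidate model.
    job = client.eval.run_eval(
        benchmark_id="ragas-demo-benchmark",
        benchmark_config=build_benchmark_config("llama-3-1-8b-instruct"),
    )

    # 4. Check the job status and inspect results.
    status = client.eval.jobs.status(
        job_id=job.job_id, benchmark_id="ragas-demo-benchmark"
    )
    print(status)
```

Showing this inline, with one sentence per step, would let the docs stand alone while still linking out to the upstream notebook for the full walkthrough.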
metadata:
  name: llama-stack-ragas-inline
  namespace: <project_name>
spec:
Unless a lot of things have changed recently with LlamaStackDistribution, this spec is not correct, it should be something like this:
spec:
  replicas: 1
  server:
    containerSpec:
      env:
      - name: VLLM_URL
        value: <model_url>
      - name: VLLM_API_TOKEN
        value: <model_api_token (if necessary)>
      - name: INFERENCE_MODEL
        value: <model_name>
      - name: MILVUS_DB_PATH
        value: ~/.llama/milvus.db
      - name: VLLM_TLS_VERIFY
        value: "false"
      - name: FMS_ORCHESTRATOR_URL
        value: http://localhost:123
      - name: EMBEDDING_MODEL
        value: granite-embedding-125m
AFAIK the only "extra" field that we need for Ragas inline is the EMBEDDING_MODEL.
cc: @dmaniloff @Bobbins228 @Artemon-line
Okay gotcha. I updated to exactly this in the docs. @dmaniloff @Bobbins228 can you confirm that the above YAML is sufficient for the Inline LlamaStackDistribution example? I think there was a lot of incorrect spec inferred from Claude code because in the resources, only EMBEDDING_MODEL is specifically mentioned for inline. But yeah, I can see that this is really off.
I thought EMBEDDING_MODEL is needed for both inline and remote though https://trustyai.org/llama-stack-provider-ragas/llama-stack-provider-ragas/0.1/remote-provider.html
Correct, EMBEDDING_MODEL is needed for both.
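To make that shared requirement concrete, a minimal fragment that both the inline and the remote `LlamaStackDistribution` specs would carry might look like this (the embedding model name is illustrative):

```yaml
# Required for both the inline and the remote Ragas provider.
env:
- name: EMBEDDING_MODEL
  value: granite-embedding-125m  # illustrative model name
```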
  name: llama-stack-ragas-inline
  namespace: <project_name>
spec:
  imageName: quay.io/trustyai/llama-stack-ragas:latest
See my comment above about the spec
Force-pushed from 081f671 to 6b8cd49 (compare).
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
- modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
- modules/setting-up-ragas-inline-provider.adoc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- modules/configuring-ragas-remote-provider-for-production.adoc
- modules/setting-up-ragas-inline-provider.adoc
🔇 Additional comments (2)
modules/evaluating-rag-system-quality-with-ragas.adoc (2)

17-29: Prerequisites section aligns with earlier feedback. The prerequisites section properly incorporates the wording suggestions from prior review feedback ("created a data science pipeline server" and "Ragas evaluation provider enabled (Inline or Remote)") and includes appropriate conditional documentation links.

31-66: Procedure structure and workflow are sound. The step-by-step procedure effectively guides users through the evaluation workflow: workbench setup, repository cloning, notebook execution, pipeline monitoring, and results inspection. The approach of linking to the upstream notebook rather than duplicating content aligns with team discussions and the RHOAI documentation guidance on leveraging upstream repositories.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- assemblies/evaluating-rag-systems.adoc (1 hunks)
- modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
- modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
- modules/setting-up-ragas-inline-provider.adoc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- modules/setting-up-ragas-inline-provider.adoc
- modules/configuring-ragas-remote-provider-for-production.adoc
🔇 Additional comments (4)
assemblies/evaluating-rag-systems.adoc (2)

5-5: Verify ID declaration capitalization for consistency. Line 5 uses uppercase "Ragas" in the ID, but the included files (lines 79–83) use lowercase `ragas` in filenames. For consistency with typical AsciiDoc conventions and to ensure cross-references work reliably on case-sensitive systems, the ID should use lowercase `ragas`: `[id="evaluating-rag-systems-with-ragas_{context}"]`. Confirm this matches the capitalization pattern used in the target module files (modules/setting-up-ragas-inline-provider.adoc, modules/configuring-ragas-remote-provider-for-production.adoc, and modules/evaluating-rag-system-quality-with-ragas.adoc).

79-83: Include statements are correct. The filenames use the consistent lowercase `ragas` convention. The past issue flagged on line 81 appears to be resolved.

modules/evaluating-rag-system-quality-with-ragas.adoc (2)

3-3: Standardize ID declaration capitalization to lowercase. Line 3 uses uppercase "Ragas" in the ID: `evaluating-rag-system-quality-with-Ragas_{context}`. To match the lowercase convention used in related files and filenames, change this to `evaluating-rag-system-quality-with-ragas_{context}`. Confirm this aligns with how the ID is referenced in the parent assembly or other cross-reference locations.

40-40: Image reference is valid. The image file `images/open.png` exists in the repository and is correctly referenced on line 40. No changes needed.
* You have created a secret for your AWS credentials in your project namespace.
* You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled (Inline or Remote).
ifdef::upstream[]
For more information, see link:{odhdocshome}/evaluating-rag-systems/configuring-Ragas-remote-provider-for-production[Configuring the Ragas remote provider for production].
Fix capitalization in link target IDs to match actual declarations.
Lines 24 and 27 reference link targets with uppercase "Ragas" (configuring-Ragas-remote-provider-for-production and setting-up-Ragas-inline-provider), but the actual ID declarations in the target files use lowercase ragas. This will cause links to fail on case-sensitive filesystems.
Update line 24 from configuring-Ragas-remote-provider-for-production to configuring-ragas-remote-provider-for-production and line 27 from setting-up-Ragas-inline-provider to setting-up-ragas-inline-provider.
ifdef::upstream[]
- For more information, see link:{odhdocshome}/evaluating-rag-systems/configuring-Ragas-remote-provider-for-production[Configuring the Ragas remote provider for production].
+ For more information, see link:{odhdocshome}/evaluating-rag-systems/configuring-ragas-remote-provider-for-production[Configuring the Ragas remote provider for production].
endif::[]
ifndef::upstream[]
- For more information, see link:{rhoaidocshome}evaluating-rag-systems/setting-up-Ragas-inline-provider[Setting up the Ragas inline provider for development].
+ For more information, see link:{rhoaidocshome}evaluating-rag-systems/setting-up-ragas-inline-provider[Setting up the Ragas inline provider for development].
endif::[]

Also applies to: 27-27
🤖 Prompt for AI Agents
In modules/evaluating-rag-system-quality-with-ragas.adoc around lines 24 and 27,
the link target IDs use uppercase "Ragas" which don't match the actual lowercase
declarations; update line 24's target from
configuring-Ragas-remote-provider-for-production to
configuring-ragas-remote-provider-for-production and line 27's target from
setting-up-Ragas-inline-provider to setting-up-ragas-inline-provider so the link
IDs match exactly (case-sensitive).
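A quick check like the one below could catch such case mismatches before they break cross-references. The sample file stands in for the real modules/*.adoc content, and the regex is a rough heuristic, not a complete AsciiDoc parser:

```shell
# Sketch: flag uppercase "Ragas" inside link targets, which breaks
# case-sensitive anchor resolution. Replace the temp file with a real
# `grep -rnE ... modules/` in practice.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
For more information, see link:{odhdocshome}/evaluating-rag-systems/configuring-Ragas-remote-provider-for-production[Configuring the Ragas remote provider for production].
EOF

# Match a link target path segment containing "-Ragas-" before the label.
if grep -qE 'link:[^[]*-Ragas-' "$tmpfile"; then
  echo "uppercase Ragas found in link target"
fi
rm -f "$tmpfile"
```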