
Conversation


@skrthomas skrthomas commented Oct 30, 2025

Description

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that they work.

Summary by CodeRabbit

  • Documentation
    • Added comprehensive Ragas docs: remote provider production setup with prerequisites, secrets/configuration, manifests, deployment and verification steps.
    • Added inline provider setup for development with ConfigMap/manifest examples and quick-deploy guidance.
    • Added RAG evaluation guide with a Jupyter notebook demo, evaluation workflow, metric interpretation, and monitoring steps.
    • Integrated Ragas overview and evaluation content into monitoring and data‑science assemblies.


coderabbitai bot commented Oct 30, 2025

Walkthrough

Adds new AsciiDoc content documenting Ragas evaluation: an assembly page and three modules covering configuring a remote provider for production, setting up an inline provider for development, and running a notebook-driven evaluation workflow. Also inserts the new assembly into monitoring docs.

Changes

  • Remote provider production guide (modules/configuring-ragas-remote-provider-for-production.adoc): New procedure documenting prerequisites, secret creation (S3, Kubeflow Pipelines token, model info), the kubeflow-ragas-config ConfigMap, a LlamaStackDistribution (llama-stack-ragas-remote) manifest using envFrom and secret refs, commands to apply manifests, pod/route verification, health checks, and example CLI/Python test snippets.
  • Inline provider development guide (modules/setting-up-ragas-inline-provider.adoc): New procedure describing prerequisites, the ragas-inline-config.yaml ConfigMap (EMBEDDING_MODEL), a LlamaStackDistribution manifest with env vars (VLLM_URL, VLLM_API_TOKEN, INFERENCE_MODEL, MILVUS_DB_PATH, VLLM_TLS_VERIFY, FMS_ORCHESTRATOR_URL, EMBEDDING_MODEL), deployment via oc apply, pod readiness checks, verification notes, and next-step links.
  • RAGAS evaluation workflow / notebook (modules/evaluating-rag-system-quality-with-ragas.adoc): New notebook-driven evaluation guide: prerequisites (cluster/project context, pipeline server, AWS secret, Llama Stack distribution with Ragas), cloning the repo, running basic_demo.ipynb in a workbench, dataset/benchmark registration, launching and monitoring evaluation runs (KFP/API), and viewing results; includes conditional upstream/non-upstream references.
  • RAGAS assembly page (assemblies/evaluating-rag-systems.adoc): New assembly introducing Ragas concepts and core metrics (Faithfulness, Answer Relevancy, Context Precision/Recall, Answer Correctness/Similarity), deployment modes (inline vs remote), use cases, prerequisites, and include links to the procedural modules.
  • Monitoring docs integration (monitoring-data-science-models.adoc): Adds include: assemblies/evaluating-rag-systems.adoc[leveloffset=+1], inserted after evaluating-large-language-models.adoc.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Dev as Developer / Operator
    participant OC as OpenShift CLI/API
    participant LS as Llama Stack Operator
    participant KS as KServe
    participant S3 as S3 Storage
    participant KFP as Kubeflow Pipelines
    rect rgb(240,248,255)
    Dev->>OC: create secrets (S3, KFP token, model info)
    Dev->>OC: apply ConfigMap (embedding model, endpoints, creds)
    Dev->>LS: apply LlamaStackDistribution (remote inference + RAGAS)
    end
    rect rgb(255,250,240)
    LS->>KS: request model deployment
    KS->>S3: pull model / store artifacts
    KS-->>LS: model endpoint ready
    LS-->>Dev: expose route / pod ready
    end
    rect rgb(240,255,240)
    Dev->>KFP: verify Pipelines token & endpoint
    Dev->>LS: trigger test evaluation (Python/KFP client)
    LS->>KFP: submit evaluation run
    KFP-->>Dev: run status / results
    end
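
For readers who want to script the operator-side flow shown above, here is a minimal sketch. It assumes `oc` is installed and logged in to the target project; the manifest file names and the pod label selector are illustrative placeholders, not names confirmed by the PR.

import subprocess

def oc(*args: str) -> None:
    # Run an oc command and raise immediately if it exits non-zero.
    subprocess.run(["oc", *args], check=True)

# Secrets and ConfigMap first (file names here are hypothetical).
oc("apply", "-f", "ragas-secrets.yaml")
oc("apply", "-f", "kubeflow-ragas-config.yaml")

# Then the LlamaStackDistribution custom resource.
oc("apply", "-f", "llama-stack-ragas-remote.yaml")

# Block until the server pod reports Ready; the label selector is an assumption.
oc("wait", "--for=condition=ready", "pod",
   "-l", "app=llama-stack-ragas-remote", "--timeout=300s")
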
sequenceDiagram
    autonumber
    participant Client as Python Client
    participant API as Llama Stack API
    rect rgb(248,249,255)
    Client->>API: create dataset (inputs, responses, contexts)
    API-->>Client: dataset_id
    Client->>API: register benchmark (RAGAS scoring_functions)
    API-->>Client: benchmark_id
    Client->>API: run evaluation job (benchmark_config, model info)
    API-->>Client: job_id
    end
    rect rgb(245,255,250)
    Client->>API: poll job status
    alt job complete
        API-->>Client: results (per-metric scores)
    else job failed
        API-->>Client: error
    end
    end
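
A condensed Python sketch of the client-side flow in this diagram, assembled from the snippets quoted later in this review. The datasets.register and eval.job_status calls match the review excerpts; the benchmark registration, the run_eval call shape, the ragas::* scoring-function IDs, and the dataset schema are assumptions, not verified against a specific llama-stack-client release.

import time
from llama_stack_client import LlamaStackClient

# Placeholder route; substitute your Llama Stack URL.
client = LlamaStackClient(base_url="https://<llama-stack-route-url>")

# Minimal dataset payload; the real schema is defined by the provider.
dataset = {
    "inputs": ["What is RAG?"],
    "responses": ["Retrieval-augmented generation combines retrieval with generation."],
    "contexts": [["RAG retrieves documents and conditions generation on them."]],
}

# 1. Register the evaluation dataset (signature taken from the review excerpts).
client.datasets.register(
    dataset_id="my-rag-evaluation",
    dataset_def=dataset,
    provider_id="huggingface",  # or your configured provider
)

# 2. Register a benchmark pointing at Ragas scoring functions (names assumed).
client.benchmarks.register(
    benchmark_id="ragas-benchmark",
    dataset_id="my-rag-evaluation",
    scoring_functions=["ragas::faithfulness", "ragas::answer_relevancy"],
)

# 3. Launch the evaluation job (argument shape assumed).
response = client.eval.run_eval(
    benchmark_id="ragas-benchmark",
    benchmark_config={"eval_candidate": {"type": "model", "model": "<model_name>"}},
)

# 4. Poll for completion, as in the polling excerpt quoted below.
while True:
    job_status = client.eval.job_status(job_id=response.job_id)
    if job_status.status in ("completed", "failed"):
        print(f"Final status: {job_status.status}")
        break
    time.sleep(5)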

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Check consistency of ConfigMap keys and envFrom/secret references in the two LlamaStackDistribution manifests.
  • Verify CLI commands and YAML examples (secrets, ConfigMaps, manifests) in modules/configuring-ragas-remote-provider-for-production.adoc.
  • Review notebook steps and demo notebook references in modules/evaluating-rag-system-quality-with-ragas.adoc.
  • Confirm the include placement and leveloffset=+1 in monitoring-data-science-models.adoc.

Pre-merge checks

❌ Failed checks (1 inconclusive)

  • Title check (❓ Inconclusive): The title '[WIP] RHAIENG-1227:RAGAS Eval Provider' is only partially related to the changeset; it references the ticket but is vague and lacks specificity about the actual content, which adds comprehensive documentation for setting up and evaluating RAG systems with Ragas. Resolution: revise the title to clearly describe the main change, for example 'Add Ragas evaluation provider documentation for inline and remote deployment', and remove the [WIP] prefix once ready for merge.

✅ Passed checks (2 passed)

  • Description check (✅ Passed): Check skipped because CodeRabbit's high-level summary is enabled.
  • Docstring coverage (✅ Passed): No functions found in the changed files to evaluate, so the docstring coverage check was skipped.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (8)
modules/setting-up-ragas-inline-provider.adoc (3)

180-197: Add error handling to Python API example.

The client initialization and provider listing (lines 187–193) lack error handling. If the Llama Stack route is incorrect or the service is unavailable, users will encounter an unhelpful error.

Apply this diff to add basic error handling:

 from llama_stack_client import LlamaStackClient
 
 # Get the Llama Stack route URL
 # oc get route llama-stack-ragas-inline -o jsonpath='{.spec.host}'
 
-client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
+try:
+    client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
 
-# List available providers
-providers = client.providers.list()
-print("Available eval providers:")
-for provider in providers.eval:
-    print(f"  - {provider.provider_id} ({provider.provider_type})")
+    # List available providers
+    providers = client.providers.list()
+    print("Available eval providers:")
+    for provider in providers.eval:
+        print(f"  - {provider.provider_id} ({provider.provider_type})")
 
-# Expected output should include:
-#   - trustyai_ragas_inline (inline::trustyai_ragas)
+    # Expected output should include:
+    #   - trustyai_ragas_inline (inline::trustyai_ragas)
+except Exception as e:
+    print(f"Error connecting to Llama Stack: {e}")
+    print("Verify that the Llama Stack route URL is correct and the service is running.")

61-61: Emphasize production TLS requirement in security context.

Line 61 sets VLLM_TLS_VERIFY="false" with a comment referencing production use (line 73). While the note is present, the security implication warrants clearer emphasis for users who might copy-paste without reading carefully.

Consider restructuring the NOTE block to frontload the security concern:

 $ export VLLM_API_TOKEN="__<token_identifier>__"
 
 $ oc create secret generic llama-stack-inference-model-secret \
   --from-literal INFERENCE_MODEL="$INFERENCE_MODEL" \
   --from-literal VLLM_URL="$VLLM_URL" \
   --from-literal VLLM_TLS_VERIFY="$VLLM_TLS_VERIFY" \
   --from-literal VLLM_API_TOKEN="$VLLM_API_TOKEN"
 ----
 +
+[WARNING]
+====
+TLS verification is disabled for demonstration. In production, set `VLLM_TLS_VERIFY` to `true` to enable TLS certificate verification and protect against man-in-the-middle attacks.
+====
+
-[NOTE]
-====
-Replace `__<token_identifier>__` with your actual model API token. In production environments, set `VLLM_TLS_VERIFY` to `true` to enable TLS certificate verification.
-====
+[NOTE]
+====
+Replace `__<token_identifier>__` with your actual model API token.
+====

Also applies to: 73-73


144-151: Clarify pod readiness criteria and add timeout guidance.

The procedure instructs users to wait for pod status to show Running (line 151), but does not explain what "ready" means or specify a reasonable timeout. A hanging pod might silently fail.

Enhance the wait step with a more explicit condition check:

 . Wait for the deployment to complete:
 +
 [source,subs="+quotes"]
 ----
-$ oc get pods -w
+$ oc wait --for=condition=ready pod -l app=llama-stack-ragas-inline --timeout=300s
 ----
 +
-Wait until the `llama-stack-ragas-inline` pod status shows `Running`.
+Wait until the pod status shows `Ready 1/1`. If the timeout is exceeded, check pod logs for errors using:
+
+[source,subs="+quotes"]
+----
+$ oc logs -l app=llama-stack-ragas-inline --tail=50
+----
modules/evaluating-rag-system-quality-with-ragas.adoc (3)

77-77: Use consistent placeholder format.

Line 77 uses <your-llama-stack-url> while other documentation files use __<placeholder>__ format. This inconsistency can confuse users about which parts require substitution.

Apply this diff to align with the project's placeholder convention:

 # Initialize the client
-client = LlamaStackClient(base_url="<your-llama-stack-url>")
+client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")

80-84: Add error handling and validation to dataset registration.

The dataset registration (lines 80–84) lacks error handling. If the dataset registration fails (e.g., invalid format, network error), users receive no guidance.

Apply this diff to add basic validation and error handling:

 # Register the dataset
-client.datasets.register(
-    dataset_id="my-rag-evaluation",
-    dataset_def=dataset,
-    provider_id="huggingface",  # or your configured provider
-)
+try:
+    client.datasets.register(
+        dataset_id="my-rag-evaluation",
+        dataset_def=dataset,
+        provider_id="huggingface",  # or your configured provider
+    )
+    print("✓ Dataset registered successfully")
+except Exception as e:
+    print(f"✗ Failed to register dataset: {e}")
+    raise

140-151: Add exponential backoff guidance and timeout limit to polling loop.

The job status polling (lines 140–151) uses a fixed 5-second interval with no maximum retry count or timeout. A frozen job could cause the script to run indefinitely.

Apply this diff to add timeout and clearer failure messaging:

 import time
+import sys
 
 # Poll for completion
+max_wait_seconds = 3600  # 1 hour timeout
+elapsed = 0
 while True:
     job_status = client.eval.job_status(job_id=response.job_id)
 
     if job_status.status == "completed":
         print("Evaluation completed successfully")
         break
     elif job_status.status == "failed":
         print(f"Evaluation failed: {job_status.error}")
-        break
+        sys.exit(1)
     else:
         print(f"Status: {job_status.status}, Progress: {job_status.progress}%")
+        elapsed += 5
+        if elapsed > max_wait_seconds:
+            print(f"✗ Evaluation timeout after {max_wait_seconds} seconds")
+            sys.exit(1)
         time.sleep(5)
modules/configuring-ragas-remote-provider-for-production.adoc (2)

126-140: Add fallback guidance if Llama Stack route is not automatically created.

The procedure assumes the external route is automatically created during LSD deployment (line 133), but provides no troubleshooting steps if the route does not exist.

Add a note clarifying what happens if the route is missing:

 . Create an external route for the Llama Stack server:
 +
 [NOTE]
 ====
 The KUBEFLOW_LLAMA_STACK_URL must be an external route that is accessible from the Kubeflow Pipeline pods.
 ====
 +
 The route is automatically created when you deploy the LlamaStackDistribution. You can get the route URL with the following command:
 +
+If the route is not automatically created, verify that the LlamaStackDistribution CR was deployed successfully:
+
+[source,subs="+quotes"]
+----
+$ oc get llamastackdistribution llama-stack-ragas-remote -o jsonpath='{.status.conditions[*].message}'
+----
+
 [source,subs="+quotes"]
 ----
 $ export LLAMA_STACK_URL=$(oc get route llama-stack-ragas-remote -o jsonpath='{.spec.host}')
 
 $ echo "Llama Stack URL: https://$LLAMA_STACK_URL"
+
+If the command returns an error, the LSD deployment may still be in progress. Wait a few moments and retry.
 ----

253-299: Add error handling and retry logic to test evaluation.

The test evaluation (lines 269–299) and provider listing (lines 253–267) lack error handling. Connection failures or invalid credentials will surface only as cryptic exceptions.

Apply this diff to add error handling and clearer diagnostic output:

 from llama_stack_client import LlamaStackClient
 
-client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
+try:
+    client = LlamaStackClient(base_url="https://__<llama-stack-route-url>__")
+except Exception as e:
+    print(f"✗ Failed to connect to Llama Stack: {e}")
+    print("Verify that the Llama Stack URL is correct and the service is running.")
+    raise
 
 # List available eval providers
-providers = client.providers.list()
-print("Available eval providers:")
-for provider in providers.eval:
-    print(f"  - {provider.provider_id} ({provider.provider_type})")
+try:
+    providers = client.providers.list()
+    print("✓ Available eval providers:")
+    for provider in providers.eval:
+        print(f"  - {provider.provider_id} ({provider.provider_type})")
+except Exception as e:
+    print(f"✗ Failed to list providers: {e}")
+    raise
 
 # Expected output should include:
 #   - trustyai_ragas_remote (remote::trustyai_ragas)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6343c30 and ecf7994.

📒 Files selected for processing (4)
  • modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
  • modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
  • modules/setting-up-ragas-inline-provider.adoc (1 hunks)
  • monitoring-data-science-models.adoc (1 hunks)
🔇 Additional comments (1)
monitoring-data-science-models.adoc (1)

32-32: LGTM—include reference correctly positioned.

The addition integrates RAG evaluation documentation into the monitoring flow following LLM evaluation, with proper syntax and formatting consistency.

@skrthomas skrthomas marked this pull request as draft October 30, 2025 02:26
@skrthomas skrthomas changed the title RHAIENG-1227:RAGAS Eval Provider [WIP] RHAIENG-1227:RAGAS Eval Provider Oct 30, 2025
@skrthomas skrthomas marked this pull request as ready for review November 3, 2025 14:33
@skrthomas skrthomas requested a review from adolfo-ab November 3, 2025 14:33

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 36d6af2 and 31de171.

📒 Files selected for processing (1)
  • modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
🔇 Additional comments (2)
modules/configuring-ragas-remote-provider-for-production.adoc (2)

142-145: Verify the DSPA route configuration against your deployed instances.

This review suggests updating the route discovery method, but the specific patterns referenced cannot be fully verified against official OpenShift AI documentation. One documented example uses the route name ds-pipeline-pipelines-definition, which differs from both the current code and the suggested alternatives.

Test the proposed changes against your actual DSPA instances to confirm which route discovery method works correctly in your environment:

  • Verify the current label selector returns the expected KFP route
  • Test the suggested route name pattern with your DSPA instance name
  • Confirm the alternative label selector works reliably

Once validated, update the documentation with the confirmed approach.
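
One concrete way to run that validation is to enumerate the routes in the project and compare their names and labels against the candidate patterns. A small sketch, assuming `oc` is installed and logged in; the project name is a placeholder:

import json
import subprocess

project = "<project_name>"  # placeholder

# Dump every route in the project as JSON (oc supports -o json natively).
out = subprocess.run(
    ["oc", "get", "route", "-n", project, "-o", "json"],
    capture_output=True, text=True, check=True,
).stdout

# Print name, host, and labels so the ds-pipeline-* candidates can be compared.
for route in json.loads(out)["items"]:
    meta = route["metadata"]
    print(meta["name"], route["spec"]["host"], meta.get("labels", {}))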


412-414: Verify/update Next steps links — modules/configuring-ragas-remote-provider-for-production.adoc (lines 412–414)

  • The Next steps entry links to {odhdocshome}/working-with-rag/#evaluating-rag-system-quality-with-ragas_rag; modules/evaluating-rag-system-quality-with-ragas.adoc exists, but modules/integrating-ragas-into-cicd-pipelines.adoc and modules/monitoring-ragas-evaluation-pipelines.adoc are not present in modules/.
  • Before merge: either create the two missing files, update the links to point to existing documentation (or the correct working-with-rag anchor), or replace the references with a "coming soon" placeholder.

[role='_abstract']
You can configure the RAGAS remote provider to run evaluations as distributed jobs using {productname-short} pipelines. The remote provider enables production-scale evaluations by running RAGAS in a separate Kubeflow Pipelines environment, providing resource isolation, improved scalability, and integration with CI/CD workflows.

.Prerequisites


Should we mention installing the RHOAI operator, deploying the DSCI, and the DSC with the LlamaStack component enabled?

Contributor Author


@adolfo-ab That's a good idea. I will add those. In the DSC, what is the exact path of the spec.component that needs to be enabled? Is it the spec.components.llamastackoperator or something else?

Contributor Author


I've added something at line 12, and I'll update it with more correct info when I hear back on this.


Answered in L12


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (2)
modules/configuring-ragas-remote-provider-for-production.adoc (2)

142-145: Replace unsupported label selector with DSPA instance-name route pattern (Duplicate from past review).

Line 142 uses the label selector app=ds-pipeline-dspa to retrieve the Kubeflow Pipelines endpoint, but this label is not a published/supported selector in the Data Science Pipelines Operator. Use the documented route naming pattern instead: ds-pipeline-ui-<DSPA_INSTANCE_NAME> or query the actual route by component label.

Apply this diff to use the documented route naming pattern:

-$ export KFP_ENDPOINT=$(oc get route -n <project_name> -l app=ds-pipeline-dspa -o jsonpath='{.items[0].spec.host}')
+$ export KFP_ENDPOINT=$(oc get route ds-pipeline-ui-__<dspa_instance_name>__ -n <project_name> -o jsonpath='{.spec.host}')

Alternatively, if the DSPA instance name is not known, provide guidance to list available routes:

+# To find your DSPA instance name and route:
+$ oc get route -n <project_name> | grep ds-pipeline
+# Then use: oc get route ds-pipeline-ui-<your_instance_name> -n <project_name> -o jsonpath='{.spec.host}'

128-129: Add security advisory for disabled TLS verification (Duplicate from past review with enhancement).

Lines 128–129 set VLLM_TLS_VERIFY="false" with only an inline comment. While the comment acknowledges this is not production-safe, no structured security warning block is provided. Add a prominent [IMPORTANT] admonition to clearly warn against this practice in production.

Apply this diff to add security guidance immediately after line 128:

 $ export VLLM_TLS_VERIFY="false"  # Use "true" in production
 $ export VLLM_API_TOKEN="<token_identifier>"
+
+[IMPORTANT]
+====
+*Never disable TLS verification in production environments.* Setting `VLLM_TLS_VERIFY=false` 
+exposes your system to man-in-the-middle (MITM) attacks. This setting is only appropriate 
+for development and testing with self-signed certificates.
+
+For production deployments:
+
+* Set `VLLM_TLS_VERIFY=true`
+* Use a valid CA-signed certificate for your inference service
+* If using self-signed certificates in production, add your internal CA to the system trust store instead of disabling verification
+* Rotate `VLLM_API_TOKEN` regularly and store it securely in a secret
+====

Also applies to: 217-223

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 31de171 and a38d2fd.

📒 Files selected for processing (4)
  • assemblies/evaluating-rag-systems.adoc (1 hunks)
  • modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
  • modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
  • modules/setting-up-ragas-inline-provider.adoc (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • assemblies/evaluating-rag-systems.adoc
🚧 Files skipped from review as they are similar to previous changes (2)
  • modules/setting-up-ragas-inline-provider.adoc
  • modules/evaluating-rag-system-quality-with-ragas.adoc


@dmaniloff dmaniloff left a comment


Thank you @skrthomas!! Left some comments with some suggestions.


* You have access to a workbench or notebook environment where you can run Python code.

.Procedure


Same comment as above: this is just duplicating the contents of https://github.com/trustyai-explainability/llama-stack-provider-ragas/blob/main/demos/basic_demo.ipynb, which is going to be the more up-to-date version, so we should just link to that.

Contributor Author


Okay @dmaniloff @adolfo-ab, I redid this procedure to focus more on the steps a user needs to take to run the Jupyter notebook instead of trying to re-create the notebook in the docs.


I think it's overkill now. IMO explaining how to create a workbench and clone the upstream repo is a bit out of scope for these docs and we're basically redirecting the user somewhere else (the upstream repo) to learn how to use the provider. Personally I think it would be better to put less emphasis on creating the workbench and show the code snippets here, briefly explaining what we're doing on each step (registering the dataset, registering the benchmark, running an evaluation, checking the results). But I leave the final decision up to you guys @dmaniloff @skrthomas

Contributor Author

@skrthomas skrthomas Nov 7, 2025


I wasn't sure what the best approach was. When I talked with the AI docs team about it, the example use case I got was this https://opendatahub.io/docs/working-with-data-in-s3-compatible-object-store/, and it seemed like they went through the steps for cloning the repo and creating the workbench. My initial inclination was to do as you say, replicate the code snippets and explain them in procedural steps, which is what I originally had (although I understand there were some inaccuracies in how the Python ultimately got copied over). In OCP docs, this is the approach we would usually take (replication rather than linking) because we don't ever link to upstream repos in OCP. In RHOAI, however, it appears that the guidance is different. The content strategist for our team advised the following when I was asking the team about which direction was best for this:

The notebooks are upstream but they have been worked on extensively by engineering teams. Also, by cloning the notebooks, users can run them in their local environment or on runtime in a cluster. That is something that they can't do in the docs.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
assemblies/evaluating-rag-systems.adoc (1)

5-5: AsciiDoc ID should use lowercase convention.

The anchor ID uses a capital "R" in Ragas, but AsciiDoc convention typically uses lowercase for IDs. Consider updating to evaluating-rag-systems-with-ragas_{context} for consistency with standard documentation patterns.

-[id="evaluating-rag-systems-with-Ragas_{context}"]
+[id="evaluating-rag-systems-with-ragas_{context}"]
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 56d40e0 and 4afaddb.

📒 Files selected for processing (2)
  • assemblies/evaluating-rag-systems.adoc (1 hunks)
  • modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • modules/evaluating-rag-system-quality-with-ragas.adoc
🔇 Additional comments (1)
assemblies/evaluating-rag-systems.adoc (1)

1-86: Content structure and messaging align well with prior review feedback.

The documentation correctly:

  • Uses "Ragas" capitalization (not "RAGAS") as per upstream docs convention
  • Distinguishes between the Ragas product and the Ragas provider pattern
  • Includes external links to authoritative documentation
  • Removes high-level prerequisites in favor of procedure-specific requirements
  • Clearly separates inline (development) vs. remote (production) deployment modes with appropriate trade-offs

The content is well-structured, the use cases are comprehensive, and the metrics explanations are clear. The integration of this assembly into the monitoring documentation should help engineers understand RAG evaluation workflows.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
modules/configuring-ragas-remote-provider-for-production.adoc (1)

126-134: Consider expanding TLS security guidance with an IMPORTANT admonition block.

The inline comment on line 126 acknowledges the production concern, but past review feedback indicates an explicit IMPORTANT block should be added with comprehensive security guidance (e.g., MITM risks, proper certificate handling, credential rotation).

Consider adding the following IMPORTANT block after line 134 to enhance security guidance:

+
+[IMPORTANT]
+====
+Never disable TLS verification in production environments. Setting `VLLM_TLS_VERIFY=false` 
+exposes your system to man-in-the-middle (MITM) attacks. This setting is only appropriate 
+for development and testing with self-signed certificates. For production, use a valid 
+certificate and set `VLLM_TLS_VERIFY=true`. Ensure credentials like `VLLM_API_TOKEN` and 
+S3 keys are kept secure and rotated regularly.
+====
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4afaddb and 732ebab.

📒 Files selected for processing (1)
  • modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)


@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
modules/configuring-ragas-remote-provider-for-production.adoc (2)

315-315: Fix pod name inconsistency in deployment verification step.

Line 315 references the wrong pod name. The YAML at line 186 defines the pod as llama-stack-pod, but the instruction tells users to wait for llama-stack-ragas-remote, causing verification to fail.

-Wait until the `llama-stack-ragas-remote` pod status shows `Running`.
+Wait until the `llama-stack-pod` pod status shows `Running`.

126-127: Enhance TLS verification security guidance with explicit warnings.

Lines 126-127 set VLLM_TLS_VERIFY="false" but provide insufficient security context. Add a dedicated admonition block immediately after line 127 with production-environment warnings and safe alternatives.

Apply this diff to add comprehensive security guidance:

$ export VLLM_API_TOKEN="<token_identifier>"

+[IMPORTANT]
+====
+Never disable TLS verification in production environments. Setting `VLLM_TLS_VERIFY=false` 
+exposes your system to man-in-the-middle (MITM) attacks and is a critical security risk.
+
+For production deployments:
+
+* Set `VLLM_TLS_VERIFY=true` and use a valid CA-signed certificate, or
+* Add your internal CA to the system trust store in the container image, or
+* Use a dedicated development/testing namespace with explicit CA trust rather than disabling verification.
+
+Ensure `VLLM_API_TOKEN` and other credentials are also secured appropriately (rotated regularly, accessed only by authorized service accounts).
+====
+
$ oc create secret generic llama-stack-inference-model-secret \
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 732ebab and 3790df2.

📒 Files selected for processing (1)
  • modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
🔇 Additional comments (1)
modules/configuring-ragas-remote-provider-for-production.adoc (1)

318-320: Close unclosed ifdef::upstream[] block to fix AsciiDoc syntax error.

The file ends with an open ifdef::upstream[] block (line 318) without a closing endif::[] directive. This breaks AsciiDoc rendering and prevents the documentation from being built.

Add endif::[] after the link:

 * link:{odhdocshome}/working-with-rag/#evaluating-rag-system-quality-with-ragas_rag[Evaluating RAG system quality with Ragas metrics]
+endif::[]

Also verify that the "Next steps" section is complete. If additional steps or links are intended for cloud-service or self-managed environments, add corresponding ifdef blocks and content before the closing endif::[].

Likely an incorrect or invalid review comment.

  name: llama-stack-ragas-inline
  namespace: <project_name>
spec:
  imageName: quay.io/trustyai/llama-stack-ragas:latest
Contributor Author


@dmaniloff did you ever hear back about the image name we can use in the product docs? cc @adolfo-ab


quay.io/rhoai/odh-trustyai-ragas-lls-provider-dsp-rhel9:rhoai-3.0


See my comment above about the spec

Contributor Author


Sorry, I think I got image questions mixed up. I was meaning to ask about the KUBEFLOW_BASE_IMAGE in the Ragas remote provider ConfigMap.

ifndef::upstream[]
For more information, see link:{rhoaidocshome}evaluating-rag-systems/setting-up-Ragas-inline-provider[Setting up the Ragas inline provider for development].
endif::[]
* You have access to a workbench or notebook environment where you can run Python code.


The Llama Stack server can take requests over HTTP. The Python client (which is what we use in the demo notebook to verify setup) is one of many clients that can talk to Llama Stack.

Therefore, this is not strictly a requirement. I think I would phrase this part along the following lines:

  • "To test your setup, read the demo notebook in (link to the 0.4.2 version of the ragas provider), which outlines the basic steps via the Python client."
  • Alternatively, you can also submit an eval by directly using the HTTP methods of the Llama Stack API (link to the llama stack 0.3.0 docs); a sketch of this option follows the list.
  • If you do wish to follow the example provided in the notebook, you can do so from the Jupyter environment via the following steps (and then list your steps below; also mention that the Llama Stack pod will have to be accessible from the Jupyter env in the cluster, which may not be the case by default).
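
As a sketch of the second option, listing the eval providers over plain HTTP rather than through the Python client could look like the following. The /v1/providers path and the response shape are assumptions about the Llama Stack REST API, to be confirmed against the linked API docs:

import requests

base_url = "https://<llama-stack-route-url>"  # placeholder route

# GET the provider list; the /v1/providers path is assumed, not confirmed.
resp = requests.get(f"{base_url}/v1/providers", timeout=30)
resp.raise_for_status()

# The response shape is also assumed; print the raw payload for inspection.
print(resp.json())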


+1

Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dmaniloff thanks for this info. I will incorporate this into the intro! Just wanting to check: when you say link to the 3.0 Llama Stack API in the 3.0 docs, do you mean the 3.0 version of this: https://docs.redhat.com/en/documentation/red_hat_openshift_ai_self-managed/2.25/html/working_with_llama_stack/overview-of-llama-stack_rag#the_llamastackdistribution_custom_resource_api_providers

Or something else?

Contributor Author


.Prerequisites
* You have cluster administrator privileges for your {openshift-platform} cluster.
* You have installed the {productname-short} Operator.
* You have a DataScienceCluster custom resource in your environment; in the `spec.components` section the `llamastack` component is set to `enabled/true`.


Suggested change
* You have a DataScienceCluster custom resource in your environment; in the `spec.components` section the `llamastack` component is set to `enabled/true`.
* You have a DataScienceCluster custom resource in your environment; in the `spec.components` section the `llamastackoperator` component should be enabled:

  llamastackoperator:
    managementState: Managed


.Prerequisites
* You have logged in to {productname-long}.
* You have created a data science project.
* You have created a data science pipeline.


Suggested change
* You have created a data science pipeline.
* You have created a data science pipeline server.

* You have created a data science project.
* You have created a data science pipeline.
* You have created a secret for your AWS credentials in your project namespace.
* You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled, either Inline or Remote provider.


Suggested change
* You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled, either Inline or Remote provider.
* You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled (either Inline or Remote).




metadata:
  name: llama-stack-ragas-inline
  namespace: <project_name>
spec:


Unless a lot of things have changed recently with LlamaStackDistribution, this spec is not correct; it should be something like this:

spec:
  replicas: 1
  server:
    containerSpec:
      env:
      - name: VLLM_URL
        value: <model_url>
      - name: VLLM_API_TOKEN
        value: <model_api_token (if necessary)>
      - name: INFERENCE_MODEL
        value: <model_name>
      - name: MILVUS_DB_PATH
        value: ~/.llama/milvus.db
      - name: VLLM_TLS_VERIFY
        value: "false"
      - name: FMS_ORCHESTRATOR_URL
        value: http://localhost:123
      - name: EMBEDDING_MODEL
        value: granite-embedding-125m

AFAIK the only "extra" field that we need for Ragas inline is the EMBEDDING_MODEL.
cc: @dmaniloff @Bobbins228 @Artemon-line

Contributor Author


Okay, gotcha. I updated to exactly this in the docs. @dmaniloff @Bobbins228 can you confirm that the above YAML is sufficient for the Inline LlamaStackDistribution example? I think a lot of incorrect spec was inferred from Claude code because, in the resources, only EMBEDDING_MODEL is specifically mentioned for inline. But yeah, I can see that this is really off.

Contributor Author


I thought EMBEDDING_MODEL was needed for both inline and remote, though: https://trustyai.org/llama-stack-provider-ragas/llama-stack-provider-ragas/0.1/remote-provider.html


Correct, EMBEDDING_MODEL is needed for both.
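
Given that EMBEDDING_MODEL is required in both modes, one quick check is to read it back from the running container. A sketch assuming the distribution is exposed as a Deployment named llama-stack-ragas-inline (the name from the quoted manifest; the actual workload name created by the operator may differ):

import subprocess

# Read the resolved EMBEDDING_MODEL from the running container.
# The workload kind/name are assumptions based on the manifest above.
result = subprocess.run(
    ["oc", "exec", "deploy/llama-stack-ragas-inline", "--",
     "printenv", "EMBEDDING_MODEL"],
    capture_output=True, text=True,
)
if result.returncode == 0:
    print("EMBEDDING_MODEL =", result.stdout.strip())
else:
    print("EMBEDDING_MODEL is not set (or the workload name differs)")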



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 081f671 and 6b8cd49.

📒 Files selected for processing (3)
  • modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
  • modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
  • modules/setting-up-ragas-inline-provider.adoc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • modules/configuring-ragas-remote-provider-for-production.adoc
  • modules/setting-up-ragas-inline-provider.adoc
🔇 Additional comments (2)
modules/evaluating-rag-system-quality-with-ragas.adoc (2)

17-29: Prerequisites section aligns with earlier feedback.

The prerequisites section properly incorporates the wording suggestions from prior review feedback ("created a data science pipeline server" and "Ragas evaluation provider enabled (Inline or Remote)") and includes appropriate conditional documentation links.


31-66: Procedure structure and workflow are sound.

The step-by-step procedure effectively guides users through the evaluation workflow: workbench setup, repository cloning, notebook execution, pipeline monitoring, and results inspection. The approach of linking to the upstream notebook rather than duplicating content aligns with team discussions and the RHOAI documentation guidance on leveraging upstream repositories.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6b8cd49 and 0e96a57.

📒 Files selected for processing (4)
  • assemblies/evaluating-rag-systems.adoc (1 hunks)
  • modules/configuring-ragas-remote-provider-for-production.adoc (1 hunks)
  • modules/evaluating-rag-system-quality-with-ragas.adoc (1 hunks)
  • modules/setting-up-ragas-inline-provider.adoc (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • modules/setting-up-ragas-inline-provider.adoc
  • modules/configuring-ragas-remote-provider-for-production.adoc
🔇 Additional comments (4)
assemblies/evaluating-rag-systems.adoc (2)

5-5: Verify ID declaration capitalization for consistency.

Line 5 uses uppercase "Ragas" in the ID, but the included files (lines 79–83) use lowercase ragas in filenames. For consistency with typical AsciiDoc conventions and to ensure cross-references work reliably on case-sensitive systems, the ID should use lowercase ragas: [id="evaluating-rag-systems-with-ragas_{context}"].

Confirm this matches the capitalization pattern used in the target module files (modules/setting-up-ragas-inline-provider.adoc, modules/configuring-ragas-remote-provider-for-production.adoc, and modules/evaluating-rag-system-quality-with-ragas.adoc).


79-83: Include statements are correct.

The filenames use consistent lowercase ragas convention. Past issue flagged on line 81 appears to be resolved.

modules/evaluating-rag-system-quality-with-ragas.adoc (2)

3-3: Standardize ID declaration capitalization to lowercase.

Line 3 uses uppercase "Ragas" in the ID: evaluating-rag-system-quality-with-Ragas_{context}. To match the lowercase convention used in related files and filenames, change this to evaluating-rag-system-quality-with-ragas_{context}.

Confirm this aligns with how the ID is referenced in parent assembly or other cross-reference locations.


40-40: Image reference is valid.

The image file images/open.png exists in the repository and is correctly referenced on line 40. No changes needed.

* You have created a secret for your AWS credentials in your project namespace.
* You have deployed a Llama Stack distribution with the Ragas evaluation provider enabled (Inline or Remote).
ifdef::upstream[]
For more information, see link:{odhdocshome}/evaluating-rag-systems/configuring-Ragas-remote-provider-for-production[Configuring the Ragas remote provider for production].


⚠️ Potential issue | 🔴 Critical

Fix capitalization in link target IDs to match actual declarations.

Lines 24 and 27 reference link targets with uppercase "Ragas" (configuring-Ragas-remote-provider-for-production and setting-up-Ragas-inline-provider), but the actual ID declarations in the target files use lowercase ragas. This will cause links to fail on case-sensitive filesystems.

Update line 24 from configuring-Ragas-remote-provider-for-production to configuring-ragas-remote-provider-for-production and line 27 from setting-up-Ragas-inline-provider to setting-up-ragas-inline-provider.

  ifdef::upstream[]
- For more information, see link:{odhdocshome}/evaluating-rag-systems/configuring-Ragas-remote-provider-for-production[Configuring the Ragas remote provider for production].
+ For more information, see link:{odhdocshome}/evaluating-rag-systems/configuring-ragas-remote-provider-for-production[Configuring the Ragas remote provider for production].
  endif::[]
  ifndef::upstream[]
- For more information, see link:{rhoaidocshome}evaluating-rag-systems/setting-up-Ragas-inline-provider[Setting up the Ragas inline provider for development].
+ For more information, see link:{rhoaidocshome}evaluating-rag-systems/setting-up-ragas-inline-provider[Setting up the Ragas inline provider for development].
  endif::[]

Also applies to: 27-27

🤖 Prompt for AI Agents
In modules/evaluating-rag-system-quality-with-ragas.adoc around lines 24 and 27,
the link target IDs use uppercase "Ragas" which don't match the actual lowercase
declarations; update line 24's target from
configuring-Ragas-remote-provider-for-production to
configuring-ragas-remote-provider-for-production and line 27's target from
setting-up-Ragas-inline-provider to setting-up-ragas-inline-provider so the link
IDs match exactly (case-sensitive).
