Skip to content

ci(docs): add external provider info to table#62

Merged
mergify[bot] merged 2 commits intoopendatahub-io:mainfrom
nathan-weinberg:extern
Oct 7, 2025
Merged

ci(docs): add external provider info to table#62
mergify[bot] merged 2 commits intoopendatahub-io:mainfrom
nathan-weinberg:extern

Conversation

@nathan-weinberg
Copy link
Copy Markdown
Collaborator

@nathan-weinberg nathan-weinberg commented Oct 7, 2025

What does this PR do?

adds details about external providers to the auto-gen docs

Summary by CodeRabbit

  • New Features

    • Added support for a remote RAGAS evaluation provider alongside the inline option, with explicit versioning.
  • Documentation

    • Distribution table now shows an "External?" column, clearer enablement indicators, and version annotations for providers.
    • Generated distribution docs now reflect external provider status per API/provider.
  • Chores

    • Build configuration updated to register and pin the RAGAS provider.
    • Container installation expanded to explicitly include the RAGAS dependency.

@nathan-weinberg nathan-weinberg added the do-not-merge Apply to PRs that should not be merged (yet) label Oct 7, 2025
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Oct 7, 2025

Walkthrough

Adds explicit ragas provider installations in Containerfile and build.yaml (inline and remote variants). Updates distribution README to include a new "External?" column. Extends the docs generator to load external provider info from build.yaml and render it in the distribution table.

Changes

Cohort / File(s) Summary
Dependency install steps
distribution/Containerfile
Adds a separate RUN pip install llama_stack_provider_ragas==0.3.0 step (explicitly installing ragas); other installs unchanged.
Build spec: eval providers
distribution/build.yaml
Adds module: llama_stack_provider_ragas==0.3.0 to inline::trustyai_ragas and introduces remote::trustyai_ragas with module: llama_stack_provider_ragas[remote]==0.3.0.
Docs generator enhancements
scripts/gen_distro_docs.py
Adds load_external_providers_info() to parse distribution/build.yaml and compute provider → external status; updates gen_distro_table() to include an "External?" column and incorporate external status into rows.
Distribution README table
distribution/README.md
Adds an "External?" column, reflows table columns, and makes External?/Enabled-by-default values explicit (including version qualifiers for some evaluators).

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Dev as Dev
  participant Script as gen_distro_docs.py
  participant YAML as distribution/build.yaml
  participant Table as distribution/README.md

  Dev->>Script: run docs generator
  Script->>YAML: load_external_providers_info() — read provider_type + module entries
  YAML-->>Script: provider entries (inline & remote, with modules)
  Script->>Script: compute External? status map
  Script->>Script: gen_distro_table() — build rows including External? and Enabled?
  Script->>Table: write updated table with External? column
  Table-->>Dev: updated README table
  note right of Script: new: integrate external provider metadata into table
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

I hop through manifests with twitching paws,
Finding ragas listed in the cause.
Inline and remote now proudly shown,
YAML and docs in tandem grown.
A happy rabbit stamps—new seeds sown.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title clearly and concisely summarizes the main change of adding external provider information to the documentation table and is directly related to the pull request objectives without irrelevant details.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6519950 and 05b3df5.

📒 Files selected for processing (4)
  • distribution/Containerfile (1 hunks)
  • distribution/README.md (1 hunks)
  • distribution/build.yaml (1 hunks)
  • scripts/gen_distro_docs.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • distribution/Containerfile
  • distribution/build.yaml
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (3)
distribution/README.md (1)

1-37: LGTM! Auto-generated documentation reflects the new External? column correctly.

The table has been successfully updated with the new "External?" column, and the external provider information (including versions) is displayed correctly. Since this file is auto-generated by scripts/gen_distro_docs.py, any future adjustments should be made in the generator script or source configuration files.

scripts/gen_distro_docs.py (2)

39-81: Well-implemented external provider information extraction.

The load_external_providers_info() function correctly parses build.yaml to extract external provider information, including version details from module specifications. The logic properly handles cases with and without version strings, and the error handling is appropriate for a documentation generation script.


83-150: Excellent integration of external provider information into table generation.

The updates to gen_distro_table() seamlessly incorporate the external provider data:

  • The new "External?" column is properly added to the header
  • External status is fetched from build.yaml and included in each table row
  • The emoji indicators (❌/✅) for "Enabled by default?" improve readability
  • The sorting and structure remain clean and maintainable

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@cdoern
Copy link
Copy Markdown
Collaborator

cdoern commented Oct 7, 2025

thank you!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
scripts/gen_distro_docs.py (2)

58-75: Add logging for skipped provider entries.

The function silently skips provider entries that don't match the expected structure (not a list, not a dict, missing provider_type). This could hide configuration errors in build.yaml and make debugging difficult.

Consider adding debug logging to track what's being skipped:

+import logging
+
+logger = logging.getLogger(__name__)
+
 def load_external_providers_info():
     """Load build.yaml and extract external provider information."""
     # ... existing code ...
     
     for _, provider_list in providers.items():
         if isinstance(provider_list, list):
             for provider in provider_list:
                 if isinstance(provider, dict) and "provider_type" in provider:
                     # ... existing logic ...
+                else:
+                    logger.debug(f"Skipping malformed provider entry: {provider}")
+        else:
+            logger.debug(f"Skipping non-list provider entry: {provider_list}")

67-74: Consider more defensive version extraction.

The version extraction uses split("==")[-1] which correctly handles cases like package[extra]==version. However, if a module string contains multiple == (e.g., malformed or complex dependency specs), this could extract an unexpected value.

Consider adding validation or a more explicit pattern:

                         if module_field:
                             # Extract version from module field (format: package_name==version)
                             if "==" in module_field:
-                                # Handle cases like package[extra]==version
-                                version_part = module_field.split("==")[-1]
-                                external_info[provider_type] = (
-                                    f"Yes (version {version_part})"
-                                )
+                                # Handle cases like package[extra]==version
+                                version_part = module_field.split("==")[-1].strip()
+                                if version_part:
+                                    external_info[provider_type] = (
+                                        f"Yes (version {version_part})"
+                                    )
+                                else:
+                                    external_info[provider_type] = "Yes"
                             else:
                                 external_info[provider_type] = "Yes"
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ccf58c6 and 6519950.

📒 Files selected for processing (2)
  • distribution/README.md (1 hunks)
  • scripts/gen_distro_docs.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • distribution/README.md
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: nathan-weinberg
PR: opendatahub-io/llama-stack-distribution#33
File: distribution/Containerfile:17-21
Timestamp: 2025-09-15T14:25:54.837Z
Learning: In the opendatahub-io/llama-stack-distribution repository, the distribution/Containerfile is auto-generated by distribution/build.py based on configuration in build.yaml. When providers are added to build.yaml, the build script automatically regenerates the Containerfile with the required dependencies. Changes to the Containerfile should not be flagged as manual edits if they correspond to legitimate changes in the build configuration.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: build-test-push (linux/amd64)
  • GitHub Check: Summary
🔇 Additional comments (1)
scripts/gen_distro_docs.py (1)

83-150: LGTM! Clean integration of external provider information.

The modifications to gen_distro_table() are well-structured:

  • The new "External?" column is properly integrated into the table header and rows
  • External provider information is loaded once at the beginning for efficiency
  • The default value "No" is appropriate for providers not found in build.yaml
  • The emoji indicators (❌/✅) improve readability compared to text

The changes align well with the PR objective of adding external provider information to the auto-generated documentation.

@nathan-weinberg nathan-weinberg removed the do-not-merge Apply to PRs that should not be merged (yet) label Oct 7, 2025
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
Signed-off-by: Nathan Weinberg <nweinber@redhat.com>
@mergify mergify bot merged commit 8a2a51c into opendatahub-io:main Oct 7, 2025
6 checks passed
@nathan-weinberg nathan-weinberg deleted the extern branch October 7, 2025 18:07
nathan-weinberg added a commit to nathan-weinberg/llama-stack-distribution that referenced this pull request Feb 12, 2026
fix: change "huggingface-cli" to "hf" in Konflux Containerfile
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants