Merged
1 change: 1 addition & 0 deletions .github/workflows/redhat-distro-container.yml
@@ -31,6 +31,7 @@ jobs:
     runs-on: ubuntu-latest
     env:
       INFERENCE_MODEL: meta-llama/Llama-3.2-1B-Instruct
+      EMBEDDING_MODEL: granite-embedding-125m
       VLLM_URL: http://localhost:8000/v1
     strategy:
       matrix:
2 changes: 2 additions & 0 deletions distribution/Containerfile
@@ -41,6 +41,8 @@ RUN pip install \
     uvicorn
 RUN pip install \
     llama_stack_provider_lmeval==0.2.4
+RUN pip install \
+    llama_stack_provider_ragas[remote]==0.3.0
 RUN pip install \
     --extra-index-url https://test.pypi.org/simple/ llama_stack_provider_trustyai_fms==0.2.3
 RUN pip install --extra-index-url https://download.pytorch.org/whl/cpu torch 'torchao>=0.12.0' torchvision
2 changes: 2 additions & 0 deletions distribution/README.md
@@ -13,7 +13,9 @@ You can see an overview of the APIs and Providers the image ships with in the table below.
 | agents | inline::meta-reference | Yes | N/A |
 | datasetio | inline::localfs | Yes | N/A |
 | datasetio | remote::huggingface | Yes | N/A |
+| eval | inline::trustyai_ragas | No | Set the `EMBEDDING_MODEL` environment variable |
 | eval | remote::trustyai_lmeval | Yes | N/A |
+| eval | remote::trustyai_ragas | No | Set the `KUBEFLOW_LLAMA_STACK_URL` environment variable |
 | files | inline::localfs | Yes | N/A |
 | inference | inline::sentence-transformers | Yes | N/A |
 | inference | remote::azure | No | Set the `AZURE_API_KEY` environment variable |
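Per the table above, each ragas eval provider is switched on by an environment variable. A minimal launch sketch under assumed names (the image reference and the Kubeflow-side URL below are placeholders, not values from this repo):

```shell
# Placeholder image reference; substitute the real distro image.
IMAGE="quay.io/example/redhat-distro:latest"

# Setting EMBEDDING_MODEL enables inline::trustyai_ragas; setting
# KUBEFLOW_LLAMA_STACK_URL enables remote::trustyai_ragas.
run_cmd="podman run -p 8321:8321 \
  --env EMBEDDING_MODEL=granite-embedding-125m \
  --env KUBEFLOW_LLAMA_STACK_URL=http://example.invalid:8321 \
  $IMAGE"
echo "$run_cmd"
```

Leave either variable unset and the matching provider entry in run.yaml collapses away, since its `provider_id` is guarded by `${env.VAR:+...}`.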
2 changes: 2 additions & 0 deletions distribution/build.yaml
@@ -21,6 +21,8 @@ distribution_spec:
   eval:
   - provider_type: remote::trustyai_lmeval
     module: llama_stack_provider_lmeval==0.2.4
+  - provider_type: inline::trustyai_ragas

Collaborator: this only adds inline, right? what about remote::trustyai_ragas?

Collaborator (author): Yup, asked the Trusty team about that in Slack.

Contributor: this will install the package with the additional remote deps, so it will add both inline and remote providers.

Collaborator (author): Are you talking about the module line? Because my concern is around the provider_type line.

+    module: llama_stack_provider_ragas[remote]==0.3.0
   datasetio:
   - provider_type: remote::huggingface
   - provider_type: inline::localfs
17 changes: 17 additions & 0 deletions distribution/run.yaml
@@ -108,6 +108,23 @@ providers:
       config:
         use_k8s: ${env.TRUSTYAI_LMEVAL_USE_K8S:=true}
         base_url: ${env.VLLM_URL:=}
+  - provider_id: ${env.EMBEDDING_MODEL:+trustyai_ragas_inline}

Contributor suggested a change: replace
  - provider_id: ${env.EMBEDDING_MODEL:+trustyai_ragas_inline}
with
  - provider_id: ${env.EMBEDDING_MODEL:+trustyai_ragas}

+    provider_type: inline::trustyai_ragas
+    module: llama_stack_provider_ragas.inline
+    config:
+      embedding_model: ${env.EMBEDDING_MODEL:=}
+  - provider_id: ${env.KUBEFLOW_LLAMA_STACK_URL:+trustyai_ragas_remote}

Contributor suggested a change: replace
  - provider_id: ${env.KUBEFLOW_LLAMA_STACK_URL:+trustyai_ragas_remote}
with
  - provider_id: ${env.KUBEFLOW_LLAMA_STACK_URL:+trustyai_ragas}

Collaborator (author): What happens if both providers are enabled? Will this still work?

Contributor: Good point. Yes, it will work, but users will not have a way to differentiate when doing `client.benchmarks.register`. I tested this, and the provider added last via run.yaml overrides the previous one. Should we bring back the suffix to better support the case of both providers being enabled?

Collaborator (author): Yes, and I'm going to leave it in the YAML for now.

+    provider_type: remote::trustyai_ragas
+    module: llama_stack_provider_ragas.remote
+    config:
+      embedding_model: ${env.EMBEDDING_MODEL:=}
+      kubeflow_config:
+        results_s3_prefix: ${env.KUBEFLOW_RESULTS_S3_PREFIX:=}
+        s3_credentials_secret_name: ${env.KUBEFLOW_S3_CREDENTIALS_SECRET_NAME:=}
+        pipelines_endpoint: ${env.KUBEFLOW_PIPELINES_ENDPOINT:=}
+        namespace: ${env.KUBEFLOW_NAMESPACE:=}
+        llama_stack_url: ${env.KUBEFLOW_LLAMA_STACK_URL:=}
+        base_image: ${env.KUBEFLOW_BASE_IMAGE:=}
   datasetio:
   - provider_id: huggingface
     provider_type: remote::huggingface
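The `${env.VAR:+word}` and `${env.VAR:=default}` forms in this run.yaml mirror POSIX shell parameter expansion: `:+` yields `word` only when the variable is set and non-empty, which is what makes each `provider_id` conditional, while `:=` substitutes a default, here the empty string. A plain-shell sketch of the same semantics:

```shell
EMBEDDING_MODEL=granite-embedding-125m
unset KUBEFLOW_LLAMA_STACK_URL KUBEFLOW_BASE_IMAGE

# :+ substitutes the word only when the variable is set and non-empty
inline_id="${EMBEDDING_MODEL:+trustyai_ragas_inline}"
remote_id="${KUBEFLOW_LLAMA_STACK_URL:+trustyai_ragas_remote}"
echo "inline=$inline_id remote=$remote_id"   # -> inline=trustyai_ragas_inline remote=

# := assigns and substitutes a default when unset or empty
base_image="${KUBEFLOW_BASE_IMAGE:=}"
echo "base_image=[$base_image]"              # -> base_image=[]
```

So with only `EMBEDDING_MODEL` exported, the inline provider gets an id and the remote entry's id expands to nothing.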
1 change: 1 addition & 0 deletions tests/smoke.sh
@@ -10,6 +10,7 @@ function start_and_wait_for_llama_stack_container {
     --net=host \
     -p 8321:8321 \
     --env INFERENCE_MODEL="$INFERENCE_MODEL" \
+    --env EMBEDDING_MODEL="$EMBEDDING_MODEL" \
     --env VLLM_URL="$VLLM_URL" \
     --env TRUSTYAI_LMEVAL_USE_K8S=False \
     --name llama-stack \
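The `--env VAR="$VAR"` flags forward values from the CI job's environment into the container. The same pass-through can be sketched with a plain subshell, no container required:

```shell
# Export what the CI job would provide...
export INFERENCE_MODEL="meta-llama/Llama-3.2-1B-Instruct"
export EMBEDDING_MODEL="granite-embedding-125m"

# ...and read it back the way a process inside the container would.
seen=$(sh -c 'echo "$INFERENCE_MODEL|$EMBEDDING_MODEL"')
echo "$seen"   # -> meta-llama/Llama-3.2-1B-Instruct|granite-embedding-125m
```

If `EMBEDDING_MODEL` is unset in the job, the flag forwards an empty string, which in turn leaves the inline ragas provider disabled via the `${env.EMBEDDING_MODEL:+...}` guard in run.yaml.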