
Commit 5ade275 — fix: update embeddings configuration to be compatible with rhoai 3.2 (#918)

In RHOAI 3.2, the default is to use remote embeddings. This PR adds the necessary configuration to continue using sentence-transformers. In a follow-up PR, we will add the ability to test remote embeddings.

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>

1 parent 1901208 commit 5ade275

1 file changed: tests/llama_stack/conftest.py (60 additions, 54 deletions)
@@ -52,60 +52,62 @@ def llama_stack_server_config(
     vector_io_provider_deployment_config_factory: Callable[[str], list[Dict[str, str]]],
 ) -> Dict[str, Any]:
     """
     Generate server configuration for LlamaStack distribution deployment and deploy vector I/O provider resources.
 
     This fixture creates a comprehensive server configuration dictionary that includes
     container specifications, environment variables, and optional storage settings.
     The configuration is built based on test parameters and environment variables.
     Additionally, it deploys the specified vector I/O provider (e.g., Milvus) and configures
     the necessary environment variables for the provider integration.
 
     Args:
         request: Pytest fixture request object containing test parameters
         vector_io_provider_deployment_config_factory: Factory function to deploy vector I/O providers
             and return their configuration environment variables
 
     Returns:
         Dict containing server configuration with the following structure:
         - containerSpec: Container resource limits, environment variables, and port
         - distribution: Distribution name (defaults to "rh-dev")
         - storage: Optional storage size configuration
 
     Environment Variables:
         The fixture configures the following environment variables:
         - INFERENCE_MODEL: Model identifier for inference
         - VLLM_API_TOKEN: API token for VLLM service
         - VLLM_URL: URL for VLLM service endpoint
         - VLLM_TLS_VERIFY: TLS verification setting (defaults to "false")
         - FMS_ORCHESTRATOR_URL: FMS orchestrator service URL
+        - ENABLE_SENTENCE_TRANSFORMERS: Enable sentence-transformers embeddings (set to "true")
+        - EMBEDDING_PROVIDER: Embeddings provider to use (set to "sentence-transformers")
         - Vector I/O provider specific variables (deployed via factory):
             * For "milvus": MILVUS_DB_PATH
             * For "milvus-remote": MILVUS_ENDPOINT, MILVUS_TOKEN, MILVUS_CONSISTENCY_LEVEL
 
     Test Parameters:
         The fixture accepts the following optional parameters via request.param:
         - inference_model: Override for INFERENCE_MODEL environment variable
         - vllm_api_token: Override for VLLM_API_TOKEN environment variable
         - vllm_url_fixture: Fixture name to get VLLM URL from
         - fms_orchestrator_url_fixture: Fixture name to get FMS orchestrator URL from
         - vector_io_provider: Vector I/O provider type ("milvus" or "milvus-remote")
         - llama_stack_storage_size: Storage size for the deployment
         - embedding_model: Embedding model identifier for inference
         - kubeflow_llama_stack_url: LlamaStack service URL for Kubeflow
         - kubeflow_pipelines_endpoint: Kubeflow Pipelines API endpoint URL
         - kubeflow_namespace: Namespace for Kubeflow resources
         - kubeflow_base_image: Base container image for Kubeflow pipelines
         - kubeflow_results_s3_prefix: S3 prefix for storing Kubeflow results
         - kubeflow_s3_credentials_secret_name: Secret name for S3 credentials
         - kubeflow_pipelines_token: Authentication token for Kubeflow Pipelines
 
     Example:
         @pytest.mark.parametrize("llama_stack_server_config",
                                  [{"vector_io_provider": "milvus-remote"}],
                                  indirect=True)
         def test_with_remote_milvus(llama_stack_server_config):
             # Test will use remote Milvus configuration
             pass
     """
 
     env_vars = []
@@ -147,6 +149,10 @@ def test_with_remote_milvus(llama_stack_server_config):
     if embedding_model:
         env_vars.append({"name": "EMBEDDING_MODEL", "value": embedding_model})
 
+    # Use inline::sentence-transformers embeddings provider
+    env_vars.append({"name": "ENABLE_SENTENCE_TRANSFORMERS", "value": "true"})
+    env_vars.append({"name": "EMBEDDING_PROVIDER", "value": "sentence-transformers"})
+
     # Kubeflow-related environment variables
     if params.get("enable_ragas_remote"):
         # Get fixtures only when Ragas Remote/Kubeflow is enabled
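The embedding-related change above can be sketched in isolation: the fixture appends the two new entries unconditionally, so every test deployment pins the inline sentence-transformers provider even though RHOAI 3.2 defaults to remote embeddings. This is a minimal sketch under that assumption; `build_embedding_env_vars` is a hypothetical standalone helper, not a function from the fixture itself:

```python
from typing import Dict, List, Optional


def build_embedding_env_vars(embedding_model: Optional[str]) -> List[Dict[str, str]]:
    """Mirror the fixture logic: an optional EMBEDDING_MODEL override, followed
    by the two entries that select the inline::sentence-transformers provider."""
    env_vars: List[Dict[str, str]] = []
    if embedding_model:
        env_vars.append({"name": "EMBEDDING_MODEL", "value": embedding_model})
    # Use inline::sentence-transformers embeddings provider; without these,
    # RHOAI 3.2 would fall back to its remote-embeddings default.
    env_vars.append({"name": "ENABLE_SENTENCE_TRANSFORMERS", "value": "true"})
    env_vars.append({"name": "EMBEDDING_PROVIDER", "value": "sentence-transformers"})
    return env_vars


if __name__ == "__main__":
    for entry in build_embedding_env_vars("all-MiniLM-L6-v2"):
        print(entry)
```

The `{"name": ..., "value": ...}` dict shape matches the Kubernetes-style `env` list the fixture builds for the container spec, which is why the entries are appended rather than stored in a plain mapping.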
