[duplicate] Update to OpenAI API and set LLS>=0.2.23 #25
25 commits merged into trustyai-explainability:main
Conversation
- Changed llama-stack and llama-stack-client dependencies to point to GitHub repositories.
- Unified the demo notebooks into one, and the run.yaml files into one as well.
- Refactored provider specifications to return a list.
- Updated calls to embeddings and completions.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:41-45` </location>
<code_context>
- model_id=self.embedding_model_id,
- contents=texts,
- task_type=EmbeddingTaskType.document,
+ response = await self.inference_api.openai_embeddings(
+ model=self.embedding_model_id,
+ input=texts,
)
- return response.embeddings # type: ignore
+ return [data.embedding for data in response.data] # type: ignore
except Exception as e:
logger.error(f"Document embedding failed: {str(e)}")
</code_context>
<issue_to_address>
**issue:** Consider handling empty or malformed response data in embedding extraction.
Add validation to ensure response.data is a non-empty list of objects with an 'embedding' attribute to prevent runtime errors.
</issue_to_address>
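A minimal sketch of the validation this comment asks for, assuming an OpenAI-style embeddings response where `response.data` should be a list of objects exposing an `.embedding` attribute; the helper name is illustrative, not part of the provider's API:

```python
# Illustrative sketch: defensively extract embeddings from an OpenAI-style
# response instead of assuming `response.data` is well-formed.

def extract_embeddings(response) -> list[list[float]]:
    data = getattr(response, "data", None)
    if not data:
        raise ValueError("Embeddings response contained no data")
    embeddings = []
    for i, item in enumerate(data):
        embedding = getattr(item, "embedding", None)
        if embedding is None:
            raise ValueError(f"Embeddings response item {i} has no 'embedding' attribute")
        embeddings.append(embedding)
    return embeddings
```

Raising early here surfaces malformed API responses at the call site rather than letting a later `AttributeError` or empty result propagate into the Ragas evaluation.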
### Comment 2
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:140-143` </location>
<code_context>
"provider": "llama_stack",
}
+ # sampling params for this generation should be set via the benchmark config
+ # we will ignore the temperature and stop params passed in here
for _ in range(n):
- response = await self.inference_api.completion(
- model_id=self.model_id,
- content=prompt_text,
- sampling_params=gen_sampling_params,
+ response = await self.inference_api.openai_completion(
+ model=self.model_id,
+ prompt=prompt_text,
</code_context>
<issue_to_address>
**question:** The temperature and stop parameters passed to agenerate_text are now ignored.
If this behavior is intentional, please add documentation to clarify that function arguments for temperature and stop are ignored in favor of self.sampling_params.
</issue_to_address>
### Comment 3
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:168-170` </location>
<code_context>
)
+ # Extract text from OpenAI completion response
+ choice = response.choices[0] if response.choices else None
+ text = choice.text if choice else ""
+
# Store Llama Stack response info in llm_output
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Defaulting to empty string if no choices are returned may mask upstream issues.
Consider logging a warning or error when no choices are returned, as this could indicate an issue with the model or API.
```suggestion
# Extract text from OpenAI completion response
choice = response.choices[0] if response.choices else None
if not response.choices:
        logger.warning("OpenAI completion response returned no choices; this may indicate an issue with the model or API.")
text = choice.text if choice else ""
```
</issue_to_address>
- Updated LlamaStackInlineLLM and LlamaStackRemoteLLM to accept SamplingParams directly.
- Removed unused prompt logging and token-estimation methods.
- Enhanced error handling for completion responses.
- Adjusted tests to reflect changes in sampling parameters structure and usage.
- Improved integration with Kubeflow by ensuring proper sampling parameters are passed.
pyproject.toml (outdated)
```toml
[tool.hatch.metadata]
allow-direct-references = true
```
only temporary to allow git+https://github.com in the dependencies.
pyproject.toml (outdated)
```toml
"llama-stack @ git+https://github.com/llamastack/llama-stack.git",
"llama-stack-client @ git+https://github.com/llamastack/llama-stack-client-python.git",
```
replace w/ 0.3.0 once ready.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:39` </location>
<code_context>
async def aembed_documents(self, texts: list[str]) -> list[list[float]]:
"""Embed documents using Llama Stack inference API."""
try:
</code_context>
<issue_to_address>
**suggestion:** Consider handling empty or malformed embedding responses.
Currently, response.data is used without validation, which may cause exceptions if the API returns unexpected data. Please add checks to handle missing or malformed response data.
</issue_to_address>
### Comment 2
<location> `src/llama_stack_provider_ragas/remote/wrappers_remote.py:173-174` </location>
<code_context>
+ stop=self.sampling_params.stop if self.sampling_params else None,
)
+ if not response.choices:
+ logger.warning("Completion response returned no choices")
+
+ # Extract text from OpenAI completion response
</code_context>
<issue_to_address>
**suggestion:** Warns on empty choices but still appends empty generation.
Consider skipping the append or raising an error instead of adding an empty generation when no choices are returned.
</issue_to_address>
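A minimal sketch of the "skip or raise" alternative described here, assuming an OpenAI-style completion response; raising keeps an empty generation from being silently appended, and the exception type is illustrative:

```python
# Illustrative sketch: fail loudly on an empty choices list instead of
# appending an empty generation to the results.

def extract_generation_text(response) -> str:
    """Return the text of the first completion choice.

    Raises RuntimeError when the response has no choices, so the caller
    never records an empty generation by accident.
    """
    choices = getattr(response, "choices", None)
    if not choices:
        raise RuntimeError("Completion response returned no choices")
    return choices[0].text
```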
### Comment 3
<location> `src/llama_stack_provider_ragas/remote/kubeflow/components.py:111-116` </location>
<code_context>
+ # sampling_params is passed in from the benchmark config as model_dump()
+ # we need to convert it back to a SamplingParams object
+ sampling_params_obj = SamplingParams.model_validate(sampling_params)
+
llm = LlamaStackRemoteLLM(
base_url=llama_stack_base_url,
model_id=model,
- sampling_params=sampling_params,
+ sampling_params=sampling_params_obj,
)
embeddings = LlamaStackRemoteEmbeddings(
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Converts dict sampling_params to SamplingParams object.
If sampling_params is malformed, model_validate may raise an exception. Please add error handling or pre-validation to ensure robustness.
Suggested implementation:
```python
from llama_stack.apis.inference import SamplingParams
from pydantic import ValidationError
```
```python
# sampling_params is passed in from the benchmark config as model_dump()
# we need to convert it back to a SamplingParams object
try:
sampling_params_obj = SamplingParams.model_validate(sampling_params)
except ValidationError as e:
logger.error(f"Invalid sampling_params: {e}")
raise
```
</issue_to_address>
### Comment 4
<location> `pyproject.toml:28` </location>
<code_context>
keywords = ["llama-stack", "ragas", "evaluation"]
dependencies = [
"setuptools-scm",
- "llama-stack==0.2.23",
+ "llama-stack @ git+https://github.com/llamastack/llama-stack.git",
+ "llama-stack-client @ git+https://github.com/llamastack/llama-stack-client-python.git",
"greenlet==3.2.4", # inline/files/localfs errors saying greenlet not found
</code_context>
<issue_to_address>
**suggestion:** Direct references to git repositories for dependencies.
Pin the git dependencies to a specific commit or tag to ensure build reproducibility and stability.
Suggested implementation:
```
"llama-stack @ git+https://github.com/llamastack/llama-stack.git@<commit-or-tag>",
"llama-stack-client @ git+https://github.com/llamastack/llama-stack-client-python.git@<commit-or-tag>",
```
Replace `<commit-or-tag>` with the actual commit hash or tag you want to pin for each repository. For example, if you want to pin to commit `abc1234`, use:
`git+https://github.com/llamastack/llama-stack.git@abc1234`
</issue_to_address>
@nathan-weinberg I will post a PR to the distro to reflect these naming (provider_id/type) changes.
nathan-weinberg left a comment:
Will do a more in-depth review once LLS 0.3.0 has released
ruivieira left a comment:
@dmaniloff Changes LGTM, thanks.
Just added #27 as a reminder.
Thanks @dmaniloff for the quick fix 🙏🏽. We just need this to be usable with LLS 0.2.23. Would it make sense to have a new patch release for this purpose (i.e. 0.3.2)? I am not sure it would be needed for LLS 0.3.0, as the dependency causing the conflict (ibm-watson) is going to be dropped in LLS 0.3.0. Please correct me if I'm wrong. cc @ruivieira
…rade pyarrow to >=21.0.0, and add datasets dependency. (commit d5eaa31)