
[duplicate] Update to OpenAI API and set LLS>=0.2.23#25

Merged
25 commits merged into trustyai-explainability:main from dmaniloff:upgrade-to-lls-0.3.0
Oct 17, 2025

Conversation

@dmaniloff
Collaborator

@dmaniloff dmaniloff commented Oct 10, 2025

  • Updated calls to embeddings and completions to use the OpenAI spec.
  • Refactored provider specifications to return a list with inline and remote providers.
    • The remote provider is included only if its dependencies are installed.
  • Unified the demo notebooks into a single notebook, and the run.yaml files into a single file.
  • Updated tests.

- Changed llama-stack and llama-stack-client dependencies to point to GitHub repositories.
- Unified demo notebooks into one, and run.yaml into one as well.
- Refactored provider specifications to return a list.
- Updated calls to embeddings and completions.
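The embeddings migration described above can be sketched as follows. This is a minimal, hypothetical stand-in: the stub client and function names are illustrative, and only the `openai_embeddings(model=..., input=...)` call shape follows the OpenAI spec used in this PR.

```python
import asyncio
from types import SimpleNamespace

async def embed_documents(inference_api, model_id: str, texts: list[str]) -> list[list[float]]:
    """Embed documents via the OpenAI-spec embeddings endpoint."""
    response = await inference_api.openai_embeddings(model=model_id, input=texts)
    # OpenAI-spec responses carry one data item per input, each with an `embedding` list
    return [item.embedding for item in response.data]

# Hypothetical stub standing in for the Llama Stack inference API
class StubInference:
    async def openai_embeddings(self, model, input):
        data = [SimpleNamespace(embedding=[0.0, 0.0, 0.0]) for _ in input]
        return SimpleNamespace(data=data)

vectors = asyncio.run(embed_documents(StubInference(), "some-model", ["a", "b"]))
```

The key difference from the pre-OpenAI call is that results live under `response.data[i].embedding` rather than a top-level `response.embeddings`.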
@dmaniloff dmaniloff marked this pull request as draft October 10, 2025 16:00
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:41-45` </location>
<code_context>
-                model_id=self.embedding_model_id,
-                contents=texts,
-                task_type=EmbeddingTaskType.document,
+            response = await self.inference_api.openai_embeddings(
+                model=self.embedding_model_id,
+                input=texts,
             )
-            return response.embeddings  # type: ignore
+            return [data.embedding for data in response.data]  # type: ignore
         except Exception as e:
             logger.error(f"Document embedding failed: {str(e)}")
</code_context>

<issue_to_address>
**issue:** Consider handling empty or malformed response data in embedding extraction.

Add validation to ensure response.data is a non-empty list of objects with an 'embedding' attribute to prevent runtime errors.
</issue_to_address>
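One way to add that validation, as a sketch (the helper name and error handling are assumptions, not the provider's actual code):

```python
from types import SimpleNamespace

def extract_embeddings(response) -> list[list[float]]:
    """Validate and extract embedding vectors from an OpenAI-spec embeddings response."""
    data = getattr(response, "data", None)
    if not data:
        raise ValueError("Embeddings response contained no data")
    vectors = []
    for item in data:
        embedding = getattr(item, "embedding", None)
        if embedding is None:
            raise ValueError("Embeddings response item is missing an 'embedding'")
        vectors.append(embedding)
    return vectors

# Well-formed response: one data item per input text
ok = SimpleNamespace(data=[SimpleNamespace(embedding=[0.1, 0.2])])
vectors = extract_embeddings(ok)
```

Raising a `ValueError` here keeps the failure close to its cause instead of letting an `AttributeError` surface later in Ragas.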

### Comment 2
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:140-143` </location>
<code_context>
                 "provider": "llama_stack",
             }

+            # sampling params for this generation should be set via the benchmark config
+            # we will ignore the temperature and stop params passed in here
             for _ in range(n):
-                response = await self.inference_api.completion(
-                    model_id=self.model_id,
-                    content=prompt_text,
-                    sampling_params=gen_sampling_params,
+                response = await self.inference_api.openai_completion(
+                    model=self.model_id,
+                    prompt=prompt_text,
</code_context>

<issue_to_address>
**question:** The temperature and stop parameters passed to agenerate_text are now ignored.

If this behavior is intentional, please add documentation to clarify that function arguments for temperature and stop are ignored in favor of self.sampling_params.
</issue_to_address>
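If the behavior is intentional, the documentation could look like the following sketch; the class and method names are hypothetical, and only the precedence rule (benchmark config wins over per-call arguments) comes from the review comment.

```python
import logging

logger = logging.getLogger(__name__)

class SamplingConfigHolder:
    """Hypothetical wrapper illustrating the documented precedence rule."""

    def __init__(self, sampling_params: dict):
        # Sampling params come from the benchmark config
        self.sampling_params = sampling_params

    def effective_sampling(self, temperature=None, stop=None) -> dict:
        """Return the sampling settings actually used for generation.

        Note: per-call `temperature` and `stop` are ignored; generation is
        controlled by the benchmark config via `self.sampling_params`.
        """
        if temperature is not None or stop is not None:
            logger.debug("Ignoring per-call temperature/stop in favor of benchmark config")
        return self.sampling_params

holder = SamplingConfigHolder({"temperature": 0.2, "stop": ["\n"]})
params = holder.effective_sampling(temperature=0.9, stop=["END"])
```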

### Comment 3
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:168-170` </location>
<code_context>
                 )

+                # Extract text from OpenAI completion response
+                choice = response.choices[0] if response.choices else None
+                text = choice.text if choice else ""
+
                 # Store Llama Stack response info in llm_output
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Defaulting to empty string if no choices are returned may mask upstream issues.

Consider logging a warning or error when no choices are returned, as this could indicate an issue with the model or API.

```suggestion
                # Extract text from OpenAI completion response
                choice = response.choices[0] if response.choices else None
                if not response.choices:
                    import logging
                    logging.warning("OpenAI completion response returned no choices. This may indicate an issue with the model or API.")
                text = choice.text if choice else ""
```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

- Updated LlamaStackInlineLLM and LlamaStackRemoteLLM to accept SamplingParams directly.
- Removed unused prompt logging and token estimation methods.
- Enhanced error handling for completion responses.
- Adjusted tests to reflect changes in sampling parameters structure and usage.
- Improved integration with Kubeflow by ensuring proper sampling parameters are passed.
pyproject.toml Outdated
Comment on lines +88 to +90
[tool.hatch.metadata]
allow-direct-references = true

Collaborator Author


Only temporary, to allow git+https://github.com references in the dependencies.

pyproject.toml Outdated
Comment on lines +28 to +29
"llama-stack @ git+https://github.com/llamastack/llama-stack.git",
"llama-stack-client @ git+https://github.com/llamastack/llama-stack-client-python.git",
Collaborator Author


Replace with 0.3.0 once it's released.

@trustyai-explainability trustyai-explainability deleted a comment from sourcery-ai bot Oct 13, 2025
@dmaniloff dmaniloff changed the title from "upgrade to llama-stack 0.3.0." to "Upgrade to llama-stack 0.3.0." Oct 13, 2025
@dmaniloff dmaniloff marked this pull request as ready for review October 13, 2025 16:25
Contributor

@sourcery-ai sourcery-ai bot left a comment


Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `src/llama_stack_provider_ragas/inline/wrappers_inline.py:39` </location>
<code_context>
     async def aembed_documents(self, texts: list[str]) -> list[list[float]]:
         """Embed documents using Llama Stack inference API."""
         try:
</code_context>

<issue_to_address>
**suggestion:** Consider handling empty or malformed embedding responses.

Currently, response.data is used without validation, which may cause exceptions if the API returns unexpected data. Please add checks to handle missing or malformed response data.
</issue_to_address>

### Comment 2
<location> `src/llama_stack_provider_ragas/remote/wrappers_remote.py:173-174` </location>
<code_context>
+                    stop=self.sampling_params.stop if self.sampling_params else None,
                 )

+                if not response.choices:
+                    logger.warning("Completion response returned no choices")
+
+                # Extract text from OpenAI completion response
</code_context>

<issue_to_address>
**suggestion:** Warns on empty choices but still appends empty generation.

Consider skipping the append or raising an error instead of adding an empty generation when no choices are returned.
</issue_to_address>
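A sketch of the suggested handling; the function name and the `strict` flag are assumptions, not the provider's actual code.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)

def extract_completion_text(response, strict: bool = False):
    """Return the completion text, or None when the response has no choices.

    With strict=True an empty response raises instead, so callers never
    silently append an empty generation.
    """
    choices = getattr(response, "choices", None)
    if not choices:
        if strict:
            raise RuntimeError("Completion response returned no choices")
        logger.warning("Completion response returned no choices; skipping generation")
        return None
    return choices[0].text

# Callers can then skip appending a generation when None comes back
text = extract_completion_text(SimpleNamespace(choices=[SimpleNamespace(text="hi")]))
empty = extract_completion_text(SimpleNamespace(choices=[]))
```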

### Comment 3
<location> `src/llama_stack_provider_ragas/remote/kubeflow/components.py:111-116` </location>
<code_context>

+    # sampling_params is passed in from the benchmark config as model_dump()
+    # we need to convert it back to a SamplingParams object
+    sampling_params_obj = SamplingParams.model_validate(sampling_params)
+
     llm = LlamaStackRemoteLLM(
         base_url=llama_stack_base_url,
         model_id=model,
-        sampling_params=sampling_params,
+        sampling_params=sampling_params_obj,
     )
     embeddings = LlamaStackRemoteEmbeddings(
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Converts dict sampling_params to SamplingParams object.

If sampling_params is malformed, model_validate may raise an exception. Please add error handling or pre-validation to ensure robustness.

Suggested implementation:

```python
    from llama_stack.apis.inference import SamplingParams
    from pydantic import ValidationError

```

```python
    # sampling_params is passed in from the benchmark config as model_dump()
    # we need to convert it back to a SamplingParams object
    try:
        sampling_params_obj = SamplingParams.model_validate(sampling_params)
    except ValidationError as e:
        logger.error(f"Invalid sampling_params: {e}")
        raise

```
</issue_to_address>

### Comment 4
<location> `pyproject.toml:28` </location>
<code_context>
 keywords = ["llama-stack", "ragas", "evaluation"]
 dependencies = [
     "setuptools-scm",
-    "llama-stack==0.2.23",
+    "llama-stack @ git+https://github.com/llamastack/llama-stack.git",
+    "llama-stack-client @ git+https://github.com/llamastack/llama-stack-client-python.git",
     "greenlet==3.2.4", # inline/files/localfs errors saying greenlet not found
</code_context>

<issue_to_address>
**suggestion:** Direct references to git repositories for dependencies.

Pin the git dependencies to a specific commit or tag to ensure build reproducibility and stability.

Suggested implementation:

```
    "llama-stack @ git+https://github.com/llamastack/llama-stack.git@<commit-or-tag>",
    "llama-stack-client @ git+https://github.com/llamastack/llama-stack-client-python.git@<commit-or-tag>",

```

Replace `<commit-or-tag>` with the actual commit hash or tag you want to pin for each repository. For example, if you want to pin to commit `abc1234`, use:
`git+https://github.com/llamastack/llama-stack.git@abc1234`
</issue_to_address>


Comment on lines +173 to +174
if not response.choices:
logger.warning("Completion response returned no choices")
Contributor


suggestion: Warns on empty choices but still appends empty generation.

Consider skipping the append or raising an error instead of adding an empty generation when no choices are returned.

@trustyai-explainability trustyai-explainability deleted a comment from sourcery-ai bot Oct 13, 2025
Collaborator Author


@nathan-weinberg I will post a PR to the distro to reflect these naming (provider_id/type) changes.

Contributor

@nathan-weinberg nathan-weinberg left a comment


Will do a more in-depth review once LLS 0.3.0 has released

@ruivieira ruivieira added the enhancement New feature or request label Oct 14, 2025
Member

@ruivieira ruivieira left a comment


@dmaniloff Changes LGTM, thanks.

Just added #27 as a reminder.

@dmaniloff dmaniloff requested a review from Elbehery October 14, 2025 13:24
@Elbehery
Collaborator

Thanks @dmaniloff for the quick fix 🙏🏽

We just need this to work with LLS 0.2.23.

Would it make sense to cut a new patch release for this purpose? (i.e. 0.3.2)

I'm not sure this would be needed for LLS 0.3.0, since the conflict-causing dependency (ibm-watson) is going to be dropped in LLS 0.3.0.

Please correct me if I'm wrong.

cc @ruivieira

@dmaniloff dmaniloff marked this pull request as draft October 15, 2025 15:56
@dmaniloff dmaniloff closed this pull request by merging all changes into trustyai-explainability:main in d5eaa31 Oct 17, 2025
@dmaniloff dmaniloff changed the title from "Upgrade to llama-stack 0.3.0." to "[duplicate] Update to OpenAI API and set LLS>=0.2.23" Oct 23, 2025