
Enhance vector store testing using Dataset#1296

Merged
jgarciao merged 8 commits into opendatahub-io:main from jgarciao:enhance-vector-store-dataset
Mar 26, 2026

Conversation

@jgarciao
Contributor

@jgarciao jgarciao commented Mar 25, 2026

Refactors vector store tests to use questions and answers stored in tests/llama_stack/dataset, via the Dataset class defined in tests/llama_stack/datasets.py.

The modified README.md files contain more information about how to use it.

Summary by CodeRabbit

  • Tests

    • Added a class-scoped dataset fixture and a structured dataset loader to drive parametrized tests.
    • Introduced a finance corpus and 60 Q&A ground-truth records for IBM 2025 quarters; removed legacy earnings constants.
    • Updated vector-store tests to be dataset-driven, require exact per-document uploads, accept/validate file attributes, enforce dataset vs doc-sources exclusivity, and improved document-ingestion error messaging.
  • Documentation

    • Expanded dataset docs with folder layout, JSON/JSONL schemas, usage examples, and loading semantics.

@jgarciao jgarciao requested a review from a team as a code owner March 25, 2026 11:49
@github-actions

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, comment /wip; to remove the WIP state, comment /wip cancel.
  • To block merging of a PR, comment /hold; to un-block merging, comment /hold cancel.
  • To mark a PR as approved, comment /lgtm; to remove approval, comment /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified; to un-verify, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name>. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR will be created and linked to the current PR.
  • To build and push an image to quay, comment /build-push-pr-image. This creates an image tagged
    pr-<pr_number> in the quay repository; the tag is deleted when the PR is merged or closed.
Supported labels

{'/lgtm', '/wip', '/build-push-pr-image', '/verified', '/hold', '/cherry-pick'}

@coderabbitai
Contributor

coderabbitai bot commented Mar 25, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.
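For reference, the pause threshold mentioned above would live in the repository's CodeRabbit configuration. A minimal .coderabbit.yaml sketch, assuming the setting path quoted above; the value 10 is illustrative and the exact schema should be checked against CodeRabbit's documentation:

```yaml
reviews:
  auto_review:
    # Pause automatic reviews after this many commits have been
    # reviewed on an actively developed branch (illustrative value).
    auto_pause_after_reviewed_commits: 10
```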

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Adds a path-based Dataset abstraction and test corpora for IBM finance Q1–Q4 2025, introduces a class-scoped dataset pytest fixture, extends vector-store utilities to upload dataset documents (with attributes), refactors tests to consume datasets, and removes several IBM-specific constants.
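Based on this walkthrough, a minimal sketch of what such a path-based Dataset abstraction might look like. The class and method names follow the summary above; the field choices, in-memory construction, and list return type are assumptions, not the PR's actual implementation (which loads records from JSON/JSONL files):

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

@dataclass(frozen=True)
class DatasetDocumentQA:
    """One ground-truth Q&A record (fields assumed from the walkthrough)."""
    question: str
    answer: str
    retrieval_mode: str  # e.g. "vector" or "keyword"

@dataclass(frozen=True)
class DatasetDocument:
    """A corpus document plus the attributes uploaded alongside it."""
    path: Path
    attributes: dict

@dataclass(frozen=True)
class Dataset:
    name: str
    documents: tuple      # tuples keep the frozen dataclass immutable
    qa_records: tuple

    def load_qa(self, retrieval_mode: Optional[str] = None) -> list:
        """Return QA records, optionally filtered by retrieval mode."""
        if retrieval_mode is None:
            return list(self.qa_records)
        return [r for r in self.qa_records if r.retrieval_mode == retrieval_mode]
```

A test can then derive its queries from `dataset.load_qa(retrieval_mode="vector")` instead of hard-coded constants.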

Changes

  • Fixtures & test harness — tests/llama_stack/conftest.py
    Added a class-scoped dataset pytest fixture (indirect param required). Updated the vector_store fixture to accept an indirect dataset param, enforce mutual exclusivity with doc_sources, and trigger dataset-driven ingestion; docstring and examples updated.
  • Dataset loader & instances — tests/llama_stack/datasets.py
    New module with immutable dataclasses DatasetDocumentQA, DatasetDocument, and Dataset; loader helpers _load_document_qa / _load_documents_from_manifest; Dataset.load_qa(); and exported dataset instances FINANCE_DATASET, IBM_2025_Q4_EARNINGS, and IBM_2025_Q4_EARNINGS_ENCRYPTED. Note: manifest-based path resolution may allow path traversal if manifest entries contain ".." — review for CWE-22 (Path Traversal) and validate/normalize input (CWE-20).
  • Dataset files & docs — tests/llama_stack/dataset/corpus/finance/documents.json, tests/llama_stack/dataset/ground_truth/finance_qa.jsonl, tests/llama_stack/dataset/README.md
    Added a 4-entry document manifest and a 60-record QA JSONL ground-truth file; expanded the README with layout, schemas, loader usage, and examples.
  • Utilities — tests/llama_stack/utils.py
    Threaded optional attributes through file-create, poll, and assertion helpers; added vector_store_upload_dataset() to upload dataset documents with attributes; minor log/message fixes and a control-flow tweak in the readiness helper. Verify attribute validation to avoid untrusted metadata injection (CWE-20).
  • Tests updated — tests/llama_stack/vector_io/test_vector_stores.py, tests/llama_stack/vector_io/upgrade/test_upgrade_vector_store_rag.py
    Refactored tests to parametrize with dataset objects, derive search queries from dataset.load_qa(), assert uploaded-file counts equal len(dataset.documents), and replace direct document-constant usage with dataset references.
  • Constants removal — tests/llama_stack/constants.py
    Removed IBM earnings-specific constants: IBM_2025_Q4_EARNINGS_DOC_ENCRYPTED, IBM_2025_Q4_EARNINGS_DOC_UNENCRYPTED, and IBM_EARNINGS_SEARCH_QUERIES_BY_MODE.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 warning)

  • Description check — ⚠️ Warning. The description is sparse and does not follow the provided template structure; it lacks required sections including Summary, Related Issues (with Fixes/JIRA), testing checkboxes, and Additional Requirements. Resolution: expand the description to match the template — add a detailed Summary explaining why the refactoring was needed, link any related issues under Related Issues, check the testing indicators (Locally/Jenkins), and complete Additional Requirements if applicable.

✅ Passed checks (1 passed)

  • Title check — ✅ Passed. The title clearly and directly summarizes the main change (refactoring vector store tests to use a Dataset abstraction), which aligns with the core objective across multiple modified test files.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
tests/llama_stack/utils.py (1)

82-82: Loose typing on attributes parameter.

attributes: Any is inconsistent with vector_store_create_file_from_path (line 265) which uses dict[str, str | int | float | bool] | None. Align the types for consistency and type-checker benefits.

Fix type annotation
-    attributes: Any,
+    attributes: dict[str, str | int | float | bool] | None = None,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/utils.py` at line 82, Update the loose typing for the
attributes parameter to match vector_store_create_file_from_path: replace
attributes: Any with attributes: dict[str, str | int | float | bool] | None in
the function signature (and any related internal type hints) so the parameter
uses the same precise union mapping as vector_store_create_file_from_path for
consistency and better type checking.
tests/llama_stack/vector_io/test_vector_stores.py (1)

199-199: Unhandled StopIteration if no vector-mode QA records exist.

next() on an empty generator raises StopIteration, which pytest won't report as a clear assertion failure. While unlikely with controlled test data, consider providing a default or explicit assertion.

Option: Use next() with default and assert
-        vector_question = next(r.question for r in dataset.load_qa(retrieval_mode="vector"))
+        vector_records = list(dataset.load_qa(retrieval_mode="vector"))
+        assert vector_records, "Dataset has no QA records with retrieval_mode='vector'"
+        vector_question = vector_records[0].question
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/vector_io/test_vector_stores.py` at line 199, The test
currently assigns vector_question = next(r.question for r in
dataset.load_qa(retrieval_mode="vector")) which will raise StopIteration if no
vector-mode QA records are returned; change this to use a safe default and an
explicit assertion: call next(..., None) on the generator returned by
dataset.load_qa(retrieval_mode="vector") to get a possibly None vector_question,
then assert that vector_question is not None (with a descriptive message) before
using it, so missing vector-mode results produce a clear test failure instead of
an unhandled exception.
tests/llama_stack/conftest.py (1)

836-850: Silent precedence when both dataset and doc_sources are provided.

The docstring states these parameters are "mutually exclusive," but providing both causes dataset to silently win. Consider logging a warning or raising an error if both are non-None to catch test misconfiguration early.

Optional: Add explicit check
+        if dataset and doc_sources:
+            LOGGER.warning(
+                "Both 'dataset' and 'doc_sources' provided to vector_store fixture; "
+                "using 'dataset' (doc_sources ignored)"
+            )
         if dataset or doc_sources:
             try:
                 if dataset:
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/conftest.py` around lines 836 - 850, The code silently
prefers dataset when both dataset and doc_sources are provided in the block that
calls vector_store_upload_dataset and vector_store_upload_doc_sources; add an
explicit check at the start of that branch to detect both non-None and either
raise a ValueError (to fail fast) or emit a clear warning via the test logger
indicating the mutually exclusive parameters were both supplied, then proceed
only with one path (or bail out). Reference the dataset and doc_sources
variables and the call sites vector_store_upload_dataset and
vector_store_upload_doc_sources so you can add the guard immediately before
those calls.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/llama_stack/dataset/corpus/finance/documents.json`:
- Line 47: The "publication_date" value for the Q4 2025 document is wrong
(1738022400); update the JSON object's "publication_date" field for the Q4 2025
entry to the correct Unix timestamp for the expected release date (e.g.,
1769558400 for Jan 28, 2026) so the test data matches Q4 2025 timing.

In `@tests/llama_stack/utils.py`:
- Line 293: Fix the typo in the log messages where the filename separator is
missing: update the LOGGER.info calls that reference uploaded_file.filename and
vector_store.id (the line with LOGGER.info(f"Adding uploaded file
(filename{uploaded_file.filename} to vector store {vector_store.id}") and the
similar call around line 303) to include "filename=" before the interpolated
uploaded_file.filename so the f-string becomes
"...(filename={uploaded_file.filename}..." to produce correct logs.

---

Nitpick comments:
In `@tests/llama_stack/conftest.py`:
- Around line 836-850: The code silently prefers dataset when both dataset and
doc_sources are provided in the block that calls vector_store_upload_dataset and
vector_store_upload_doc_sources; add an explicit check at the start of that
branch to detect both non-None and either raise a ValueError (to fail fast) or
emit a clear warning via the test logger indicating the mutually exclusive
parameters were both supplied, then proceed only with one path (or bail out).
Reference the dataset and doc_sources variables and the call sites
vector_store_upload_dataset and vector_store_upload_doc_sources so you can add
the guard immediately before those calls.

In `@tests/llama_stack/utils.py`:
- Line 82: Update the loose typing for the attributes parameter to match
vector_store_create_file_from_path: replace attributes: Any with attributes:
dict[str, str | int | float | bool] | None in the function signature (and any
related internal type hints) so the parameter uses the same precise union
mapping as vector_store_create_file_from_path for consistency and better type
checking.

In `@tests/llama_stack/vector_io/test_vector_stores.py`:
- Line 199: The test currently assigns vector_question = next(r.question for r
in dataset.load_qa(retrieval_mode="vector")) which will raise StopIteration if
no vector-mode QA records are returned; change this to use a safe default and an
explicit assertion: call next(..., None) on the generator returned by
dataset.load_qa(retrieval_mode="vector") to get a possibly None vector_question,
then assert that vector_question is not None (with a descriptive message) before
using it, so missing vector-mode results produce a clear test failure instead of
an unhandled exception.
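The publication_date correction above is easy to sanity-check by converting both Unix timestamps to UTC dates — the value in the manifest was exactly one year early:

```python
from datetime import datetime, timezone

def to_utc_date(ts: int) -> str:
    """Render a Unix timestamp as a YYYY-MM-DD date in UTC."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")

print(to_utc_date(1738022400))  # 2025-01-28 (the incorrect manifest value)
print(to_utc_date(1769558400))  # 2026-01-28 (the expected Q4 2025 release date)
```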

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 3ed501ad-d4f6-4583-8f7d-4e706efe6371

📥 Commits

Reviewing files that changed from the base of the PR and between 84eb111 and f63a80e.

📒 Files selected for processing (9)
  • tests/llama_stack/conftest.py
  • tests/llama_stack/constants.py
  • tests/llama_stack/dataset/README.md
  • tests/llama_stack/dataset/corpus/finance/documents.json
  • tests/llama_stack/dataset/ground_truth/finance_qa.jsonl
  • tests/llama_stack/datasets.py
  • tests/llama_stack/utils.py
  • tests/llama_stack/vector_io/test_vector_stores.py
  • tests/llama_stack/vector_io/upgrade/test_upgrade_vector_store_rag.py
💤 Files with no reviewable changes (1)
  • tests/llama_stack/constants.py

@jgarciao jgarciao force-pushed the enhance-vector-store-dataset branch from f63a80e to f4de94e on March 25, 2026 12:59
@jgarciao
Contributor Author

/build-push-pr-image

@github-actions

Status of building tag pr-1296: success.
Status of pushing tag pr-1296 to image registry: success.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/llama_stack/vector_io/test_vector_stores.py (1)

122-131: ⚠️ Potential issue | 🟠 Major

Add a bounded wait before exact completed-file count assertion to prevent flaky races.

Line 128 asserts an exact count immediately after listing filter="completed". If ingestion/indexing visibility is asynchronous, this will intermittently fail despite successful uploads. Poll until expected count (with timeout), then assert.

Suggested fix
+import time
...
-        completed_files = list(
-            unprivileged_llama_stack_client.vector_stores.files.list(
-                vector_store_id=store_id,
-                filter="completed",
-            )
-        )
-        assert len(completed_files) == len(dataset.documents), (
+        expected = len(dataset.documents)
+        deadline = time.time() + 60
+        while True:
+            completed_files = list(
+                unprivileged_llama_stack_client.vector_stores.files.list(
+                    vector_store_id=store_id,
+                    filter="completed",
+                )
+            )
+            if len(completed_files) == expected or time.time() >= deadline:
+                break
+            time.sleep(2)
+
+        assert len(completed_files) == expected, (
             f"Expected {len(dataset.documents)} completed vector store file(s) in {store_id!r} after upload, "
             f"found {len(completed_files)}"
         )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/vector_io/test_vector_stores.py` around lines 122 - 131,
The test currently asserts exact completed file count immediately after calling
unprivileged_llama_stack_client.vector_stores.files.list with
filter="completed", which can race with asynchronous ingestion; change the test
to poll until len(completed_files) == len(dataset.documents) (or until a short
timeout, e.g. few seconds) by repeatedly calling
unprivileged_llama_stack_client.vector_stores.files.list(vector_store_id=store_id,
filter="completed") with a small sleep between attempts and then assert the
count once satisfied (fail if timeout reached); locate this logic near the
existing completed_files block in
tests/llama_stack/vector_io/test_vector_stores.py and update the assertion to
use the polling loop referencing store_id and dataset.documents.
🧹 Nitpick comments (1)
tests/llama_stack/vector_io/test_vector_stores.py (1)

18-30: Remove duplicated dataset sources in test parametrization to avoid drift bugs.

dataset is carried both in vector_store["dataset"] and as a separate parametrized argument. A future edit can accidentally mismatch them and produce false failures (fixture uploads one dataset, assertions read another). Keep a single source of truth per test case.

Suggested refactor
-    "unprivileged_model_namespace, llama_stack_server_config, vector_store, dataset",
+    "unprivileged_model_namespace, llama_stack_server_config, vector_store",
...
-            {"vector_io_provider": "milvus", "dataset": IBM_2025_Q4_EARNINGS},
-            IBM_2025_Q4_EARNINGS,
+            {"vector_io_provider": "milvus", "dataset": IBM_2025_Q4_EARNINGS},
...
     def test_vector_stores_file_upload(
         self,
         unprivileged_llama_stack_client: LlamaStackClient,
         vector_store: VectorStore,
-        dataset: Dataset,
     ) -> None:
+        dataset: Dataset = vector_store.metadata["dataset"]  # or expose via fixture return type

Also applies to: 41-43, 53-55, 65-67, 77-79, 89-91

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/vector_io/test_vector_stores.py` around lines 18 - 30, The
test parametrization duplicates the dataset by including it inside the
vector_store dict and also as the separate "dataset" param, which risks drift;
remove the "dataset": ... entries from each vector_store dict in
tests/llama_stack/vector_io/test_vector_stores.py (the pytest.param entries that
currently include a vector_store dict and a separate dataset argument) so the
single source of truth is the standalone "dataset" parameter, and update the
corresponding pytest.param tuples at the other occurrences (the subsequent cases
mentioned) to rely only on the top-level dataset argument while leaving
vector_store keys like "vector_io_provider" unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/llama_stack/vector_io/test_vector_stores.py`:
- Around line 158-161: The test forcibly sets search_modes = ["vector"] when
provider_id == "faiss", but then immediately does queries =
queries_by_mode[search_mode] which raises KeyError if there are no vector QA
entries; add an explicit assertion before the loop such as assert "vector" in
queries_by_mode when provider_id == "faiss" to fail the test with a clear
message (reference symbols: provider_id, search_modes, queries_by_mode,
search_mode).

---

Outside diff comments:
In `@tests/llama_stack/vector_io/test_vector_stores.py`:
- Around line 122-131: The test currently asserts exact completed file count
immediately after calling
unprivileged_llama_stack_client.vector_stores.files.list with
filter="completed", which can race with asynchronous ingestion; change the test
to poll until len(completed_files) == len(dataset.documents) (or until a short
timeout, e.g. few seconds) by repeatedly calling
unprivileged_llama_stack_client.vector_stores.files.list(vector_store_id=store_id,
filter="completed") with a small sleep between attempts and then assert the
count once satisfied (fail if timeout reached); locate this logic near the
existing completed_files block in
tests/llama_stack/vector_io/test_vector_stores.py and update the assertion to
use the polling loop referencing store_id and dataset.documents.

---

Nitpick comments:
In `@tests/llama_stack/vector_io/test_vector_stores.py`:
- Around line 18-30: The test parametrization duplicates the dataset by
including it inside the vector_store dict and also as the separate "dataset"
param, which risks drift; remove the "dataset": ... entries from each
vector_store dict in tests/llama_stack/vector_io/test_vector_stores.py (the
pytest.param entries that currently include a vector_store dict and a separate
dataset argument) so the single source of truth is the standalone "dataset"
parameter, and update the corresponding pytest.param tuples at the other
occurrences (the subsequent cases mentioned) to rely only on the top-level
dataset argument while leaving vector_store keys like "vector_io_provider"
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 18ebf898-e65b-4303-8e30-59c7877a6459

📥 Commits

Reviewing files that changed from the base of the PR and between f63a80e and 51b511e.

📒 Files selected for processing (4)
  • tests/llama_stack/conftest.py
  • tests/llama_stack/dataset/corpus/finance/documents.json
  • tests/llama_stack/utils.py
  • tests/llama_stack/vector_io/test_vector_stores.py
✅ Files skipped from review due to trivial changes (1)
  • tests/llama_stack/dataset/corpus/finance/documents.json
🚧 Files skipped from review as they are similar to previous changes (2)
  • tests/llama_stack/conftest.py
  • tests/llama_stack/utils.py

Introduces a lightweight dataset loader (datasets.py) with DatasetDocumentQA
and DatasetDocumentMetadata dataclasses, a documents.json manifest for
per-file attribute metadata, and QA records across IBM quarterly earnings
press releases.

Refactor the vector store fixture to support Dataset instances for document
uploads, improving test setup flexibility. Update vector store upload
functions to handle datasets and their associated attributes. Remove
deprecated constants and adjust test cases to use the new dataset structure.

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Made-with: Cursor
@jgarciao jgarciao force-pushed the enhance-vector-store-dataset branch from ccc4497 to 8a51abe on March 25, 2026 17:37
@jgarciao
Contributor Author

/build-push-pr-image

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/llama_stack/conftest.py`:
- Around line 813-815: Replace the silent-priority behavior where both dataset
and doc_sources are read from params (the lines with dataset: Dataset | None =
params.get("dataset") and doc_sources: list[str] | None =
params.get("doc_sources")) with an explicit exclusivity check: after reading
both values, if both are not None raise a ValueError (or pytest.fail) explaining
that only one of "dataset" or "doc_sources" may be provided; apply the same
check to the other occurrence handling these variables (the block around the
second occurrence) so tests fail fast on misconfiguration instead of silently
preferring dataset.
- Around line 725-737: The dataset fixture should validate that request.param
exists and is the expected Dataset type and fail fast with a clear error; update
the dataset fixture (the function named dataset that takes request:
FixtureRequest and returns Dataset) to first check that hasattr(request,
"param") (or catch AttributeError), then verify isinstance(request.param,
Dataset), and if either check fails raise a descriptive error (e.g.,
pytest.UsageError or ValueError) stating that the fixture must be
indirect-parametrized with a Dataset instance so failures are explicit.

In `@tests/llama_stack/dataset/README.md`:
- Around line 116-119: Rewrite the step list in
tests/llama_stack/dataset/README.md to remove the repetitive "Create" lead-in
for steps 2–4: keep step 1 as-is, then use a single lead-in sentence such as
"Then add the following files and configuration for the subject:" and list the
items `corpus/<subject>/documents.json` (with schema note),
`ground_truth/<subject>_qa.jsonl` (with schema note), and a new Dataset entry in
tests/llama_stack/datasets.py following the FINANCE_DATASET pattern; ensure each
bullet stays concise and preserves the original file names and guidance.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 85014930-7c1f-4230-9483-768c1010e72b

📥 Commits

Reviewing files that changed from the base of the PR and between 51b511e and 8a51abe.

📒 Files selected for processing (9)
  • tests/llama_stack/conftest.py
  • tests/llama_stack/constants.py
  • tests/llama_stack/dataset/README.md
  • tests/llama_stack/dataset/corpus/finance/documents.json
  • tests/llama_stack/dataset/ground_truth/finance_qa.jsonl
  • tests/llama_stack/datasets.py
  • tests/llama_stack/utils.py
  • tests/llama_stack/vector_io/test_vector_stores.py
  • tests/llama_stack/vector_io/upgrade/test_upgrade_vector_store_rag.py
💤 Files with no reviewable changes (1)
  • tests/llama_stack/constants.py
✅ Files skipped from review due to trivial changes (1)
  • tests/llama_stack/dataset/corpus/finance/documents.json
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/llama_stack/vector_io/upgrade/test_upgrade_vector_store_rag.py
  • tests/llama_stack/dataset/ground_truth/finance_qa.jsonl
  • tests/llama_stack/vector_io/test_vector_stores.py
  • tests/llama_stack/datasets.py
  • tests/llama_stack/utils.py

@github-actions

Status of building tag pr-1296: success.
Status of pushing tag pr-1296 to image registry: success.

Refactor the vector store fixture to support Dataset instances
for document uploads, improving test setup flexibility.
Update vector store upload functions to handle datasets and
their associated attributes. Remove deprecated constants and
adjust test cases to use the new dataset structure.

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Made-with: Cursor
@jgarciao jgarciao force-pushed the enhance-vector-store-dataset branch from 8a51abe to 70609e5 on March 25, 2026 17:54
@jgarciao
Contributor Author

/build-push-pr-image

@github-actions

Status of building tag pr-1296: success.
Status of pushing tag pr-1296 to image registry: success.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/llama_stack/vector_io/test_vector_stores.py`:
- Around line 156-157: After filtering search_modes for FAISS, ensure we don't
silently proceed with an empty list: check the variable search_modes (after the
provider_id == "faiss" branch) and either assert it's non-empty or call
pytest.skip with a clear message so the test fails or is skipped rather than
silently passing; update the block that contains provider_id and search_modes to
include this check immediately after the filtering so the subsequent for loop
over search_modes is guarded.
- Line 202: The test risks a StopIteration when calling next(...) over
dataset.load_qa(retrieval_mode="vector"); update the test to safely handle empty
results by consuming the generator into a list or using next(..., None) and then
assert the result is not None with a clear failure message (reference:
dataset.load_qa, vector_question variable, retrieval_mode="vector") so the test
fails with a descriptive assertion instead of raising StopIteration.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: fd21a939-0e1b-4ef8-b0c1-dfd725add207

📥 Commits

Reviewing files that changed from the base of the PR and between 8a51abe and 70609e5.

📒 Files selected for processing (3)
  • tests/llama_stack/conftest.py
  • tests/llama_stack/dataset/README.md
  • tests/llama_stack/vector_io/test_vector_stores.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/llama_stack/dataset/README.md

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Contributor

@Ygnas Ygnas left a comment


/lgtm

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Made-with: Cursor
using f-strings to fix structlog not applying %s/%d-style substitution

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Made-with: Cursor
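The commit message above refers to structlog's behavior of not applying printf-style (%s/%d) substitution to event strings the way stdlib logging does, so a %-style call leaves the placeholder literal in the output. A stdlib-only sketch of the failure mode — EventLogger is a hypothetical stand-in mimicking that behavior, not structlog itself:

```python
class EventLogger:
    """Stand-in for a logger that stores the event string verbatim,
    ignoring positional args instead of interpolating them."""
    def __init__(self):
        self.events = []

    def info(self, event, *args):
        # No %-substitution is applied, mirroring the reported bug.
        self.events.append(event)

log = EventLogger()
filename = "ibm_q4_2025.pdf"

log.info("Uploading %s", filename)   # '%s' stays literal in the log
log.info(f"Uploading {filename}")    # f-string interpolates up front

print(log.events[0])  # Uploading %s
print(log.events[1])  # Uploading ibm_q4_2025.pdf
```

Switching call sites to f-strings sidesteps the problem because interpolation happens before the logger ever sees the message.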
Contributor

@Ygnas Ygnas left a comment


/lgtm

@jgarciao jgarciao enabled auto-merge (squash) March 26, 2026 14:00
Contributor

@ChristianZaccaria ChristianZaccaria left a comment


/lgtm /approve

This is a great PR to move to a dataset-driven model, where document corpus and QA ground truth are defined once and reused across providers and tests. This also offers greater modularity for the tests.

Great work Jorge! - I only left one nit. Thanks!

@jgarciao jgarciao merged commit e81fb1f into opendatahub-io:main Mar 26, 2026
9 checks passed
@github-actions

Status of building tag latest: success.
Status of pushing tag latest to image registry: success.
