feat(llama_stack): centralize vector/RAG config and shared helpers (#1266)
* feat(llama_stack): centralize vector/RAG config and shared helpers
- Fix automation for product bug https://redhat.atlassian.net/browse/RHAIENG-3816
- Move Postgres, vLLM, embedding, and AWS-related defaults into constants (env overrides)
- Add IBM 2025 earnings PDFs (encrypted/unencrypted) and finance query sets per search mode
- Add vector_store_create_and_poll, file-from-URL/path helpers, and upload assertions in utils
- Replace vector_store_with_example_docs with a doc_sources parameter on the vector_store fixture
- Refactor conftest and vector store + upgrade RAG tests to use the new constants and helpers
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Made-with: Cursor
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
* fix: enhance doc_sources handling in vector_store fixture
- Improved error handling for doc_sources input, ensuring it is a list
and paths are validated against the repository root.
- Added logging for successful and failed ingestion of document sources.
- Streamlined the process for uploading files from URLs and local paths,
including directory handling.
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
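The `doc_sources` checks described in this fix (input must be a list, local paths validated against the repository root) could look roughly like the sketch below. It is illustrative only, not the fixture's actual code: `classify_doc_sources` and its `repo_root` argument are hypothetical names, and directory expansion, uploading, and logging are left out.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_doc_sources(doc_sources, repo_root):
    """Split doc_sources into URLs and repo-local paths (hypothetical helper)."""
    # Reject a bare string or any other non-list input up front.
    if not isinstance(doc_sources, list):
        raise TypeError(
            f"doc_sources must be a list, got {type(doc_sources).__name__}"
        )

    root = Path(repo_root).resolve()
    urls, paths = [], []
    for source in doc_sources:
        # Anything with an http(s) scheme is treated as a remote document.
        if urlparse(str(source)).scheme in ("http", "https"):
            urls.append(str(source))
            continue
        candidate = (root / source).resolve()
        # Refuse paths that escape the repository root (e.g. "../outside.pdf").
        # Path.is_relative_to requires Python 3.9 or newer.
        if not candidate.is_relative_to(root):
            raise ValueError(f"{source!r} resolves outside the repository root")
        paths.append(candidate)
    return urls, paths
```

Resolving both the root and the candidate before comparing them means `..` segments and absolute paths are caught by the same check, which is one plausible way to read "validated against the repository root".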
* fix: delete unused constant
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
---------
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
These files are for **internal Open Data Hub / OpenShift AI integration tests** only. We use them to hit **[Llama Stack](https://github.com/meta-llama/llama-stack) vector store APIs** (think ingest, indexing, search, and the plumbing around that), not as a shipped dataset or for model training.

## IBM finance PDFs (`corpus/finance/`)

The PDFs here are IBM **quarterly earnings press releases** (the same material IBM posts for investors). If you need to replace or refresh them, download the official PDFs from IBM’s site:

[Quarterly earnings announcements](https://www.ibm.com/investor/financial-reporting/quarterly-earnings) (choose year and quarter, then open the press release PDF).

## PDF edge cases (`corpus/pdf-testing/`)

This folder is for **weird PDFs on purpose**: password-protected files, digitally signed ones (e.g. PAdES), and similar cases so we can test how ingestion and parsers behave when the file is not a plain “print to PDF” document.

## Small print

Not for external distribution as a “dataset.” PDFs stay under their publishers’ terms; don’t reuse them outside this test context without checking those terms.