(feat): Auto rag migrate to responses api phase1 #115

Merged
openshift-merge-bot[bot] merged 14 commits into opendatahub-io:main from filip-komarzyniec:AutoRAG-migrate-to-responses-API-phase1 on Jun 10, 2026

@filip-komarzyniec filip-komarzyniec commented Jun 1, 2026

Description of your changes:

Aligned the produced pattern.json file with the ADR document: https://github.com/LukaszCmielowski/architecture-decision-records/blob/autox_docs_updates/documentation/components/autorag/features/rag_pattern_inference.md, meaning:

  • added a responses_template key representing the payload for Responses API calls,
  • changed the vector_store key to vector_store_binding, with related changes,
  • added a create_model_response artifact (an interactive script) to each generated pattern,
  • deleted the build_responses_request artifact,
  • simplified and improved SSL certificate handling for managed-cluster scenarios,
  • made the necessary changes to unit tests.

Successful pipeline run: https://rh-ai.apps.rosa.ai-eng-gpu.socc.p3.openshiftapps.com/develop-train/pipelines/runs/ai-eng-cracow/runs/b213397d-0151-43d2-a30c-35bcba7e7482

Checklist:

Pre-Submission Checklist

Additional Checklist Items for New or Updated Components/Pipelines

  • metadata.yaml includes fresh lastVerified timestamp
  • All required files are present and complete
  • OWNERS file lists appropriate maintainers
  • README provides clear documentation with usage examples
  • Component follows snake_case naming convention
  • No security vulnerabilities in dependencies
  • Containerfile included if using a custom base image

Summary by CodeRabbit

  • Removed Features

    • Removed the deployment component that generated packaged Responses request bodies (examples, tests, READMEs, and the related pipeline step).
  • New Features

    • Added shared RAG notebook templates for indexing and inference workflows.
    • Added an interactive CLI script template for sending OpenAI-compatible responses requests.
  • Improvements

    • Refactored RAG template generation and OGX client initialization with improved SSL / self-signed-certificate handling and updated wiring.

@coderabbitai

coderabbitai Bot commented Jun 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR:

  • removes the deployment prepare_responses component and its docs/tests/OWNERS entries;
  • refactors training/rag_templates_optimization to load templates from components/training/autorag/shared and replaces the OGX SSL handling with an httpx probe/retry that can instantiate OgxClient with verify=False for self-signed certs;
  • changes pattern JSON to emit settings.vector_store_binding and settings.responses_template;
  • adds shared ogx_indexing and ogx_inference notebook templates and a create_model_response CLI script template;
  • updates unit tests to mock httpx probes;
  • removes the prepare-responses pipeline step;
  • updates pyproject packaging and package-data.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The description is largely complete with specific changes documented, an ADR reference, successful pipeline run verification, and pre-submission checklist completion noted. Required sections are present. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Title check | ✅ Passed | Title accurately describes the main objective: migrating AutoRAG to use the Responses API, with phase 1 indicating this is the first part of a multi-phase effort. |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@components/training/autorag/rag_templates_optimization/component.py`:
- Around line 696-700: In _build_pattern_json replace the incorrect use of the
built-in eval with the function parameter name: use evaluation_result.collection
(and any other properties accessed via eval) to reference the passed-in
evaluation_result; update the "vector_store_binding" dict entry (and any other
occurrences inside _build_pattern_json) so provider_id uses
vector_io_provider_id, provider_type uses getattr(provider, "provider_type",
"Unknown"), and vector_store_id uses evaluation_result.collection to avoid
shadowing the built-in eval.
- Around line 671-679: The variable generation_system_message_text is
accidentally created as a 1-tuple because of the trailing comma in the
generation.get call; change the assignment so generation_system_message_text is
a plain string (remove the extra comma/tuple wrapping around the default value)
so that responses_template.instructions and generation.system_message_text
receive a string rather than a tuple.
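
As an illustration of the trailing-comma finding above, here is a minimal sketch of how a stray comma turns a string assignment into a 1-tuple. The `generation` dict and the default text are hypothetical stand-ins, not the component's real values.

```python
# Hypothetical stand-in for the generation settings; keys and default are illustrative only.
generation = {"model_id": "example-model"}
DEFAULT_SYSTEM_MESSAGE = "Answer using only the retrieved context."

# Buggy form: the trailing comma wraps the result in a 1-tuple.
buggy = generation.get("system_message_text", DEFAULT_SYSTEM_MESSAGE),

# Fixed form: no trailing comma, so the value stays a plain string.
fixed = generation.get("system_message_text", DEFAULT_SYSTEM_MESSAGE)

print(type(buggy).__name__)  # tuple
print(type(fixed).__name__)  # str
```

Serializing the buggy value to JSON would produce a one-element list instead of a string, which is presumably why it matters for fields such as responses_template.instructions.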

In
`@components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb`:
- Around line 316-342: The call to create_ogx_client is passing the arguments in
reverse and re-reading env vars (discarding the getpass/default fallback);
change the invocation so it uses the already-resolved variables for base URL and
API key (the variables created earlier) and pass them in the correct order
matching the signature create_ogx_client(base_url, api_key) so the SSL probe
gets the URL and the OgxClient receives the API key.
- Around line 326-340: The except block for httpx.ConnectError in the OGX client
initialization currently only handles self-signed-certificate errors and
silently falls through for other ConnectError cases; update the except
httpx.ConnectError as e branch (around the httpx.get(base_url) probe and the
return OgxClient(...) logic) to re-raise the caught exception when the regex
search(r"\\bself.*signed.*certificate\\b", str(e)) does not match, so that
non-cert connection failures (DNS/refused/etc.) are not masked—keep the existing
self-signed handling and only return an OgxClient with verify=False when the
regex matches and the verify-free probe succeeds.
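
The probe-and-fallback behaviour requested above can be sketched roughly as follows. This is not the notebook's actual `create_ogx_client`; the helper name `choose_verify`, its parameters, and the default timeout are assumptions made for illustration. The probe is injected so the sketch runs without network access.

```python
import re

# Same pattern the review quotes, written as a normal (not JSON-escaped) raw string.
SELF_SIGNED = re.compile(r"\bself.*signed.*certificate\b")


def choose_verify(probe, base_url, connect_error=ConnectionError, timeout=5.0):
    """Return True if a verified probe succeeds, False if only an unverified one does.

    `probe` is any callable shaped like httpx.get(url, verify=..., timeout=...).
    Connection errors that are not about self-signed certificates are re-raised.
    """
    try:
        probe(base_url, verify=True, timeout=timeout)
        return True
    except connect_error as exc:
        if not SELF_SIGNED.search(str(exc)):
            raise  # DNS failure, refused connection, etc. must not be masked
        # Retry without verification; this raises too if the server is unreachable.
        probe(base_url, verify=False, timeout=timeout)
        return False
```

With httpx, this could be called as `choose_verify(httpx.get, base_url, connect_error=httpx.ConnectError)`, and the returned flag would then decide whether the client is built with verification disabled. Passing an explicit timeout keeps the probe from hanging on an unresponsive endpoint.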

In
`@components/training/autorag/shared/notebook_templates/ogx_inference_template.ipynb`:
- Around line 105-131: The create_ogx_client call has its arguments swapped and
swallows non-self-signed ConnectError; fix by calling create_ogx_client with
(base_url, api_key) using os.getenv("OGX_CLIENT_BASE_URL") then
os.getenv("OGX_CLIENT_API_KEY") (so the getpass/default logic for
OGX_CLIENT_BASE_URL is respected), and inside create_ogx_client ensure the
except httpx.ConnectError as e branch only handles self-signed certificates
(matching via search) and re-raise the original ConnectError for all other cases
instead of suppressing it.

In
`@components/training/autorag/shared/script_templates/create_model_response.py.templ`:
- Around line 67-70: The code directly indexes assistant_last_message[0] and
output_text[0] after building them from response.json(), which can raise
IndexError/KeyError on unexpected payloads; update the parsing around
response.json(), assistant_last_message and output_text to validate the
structure before indexing (check that "output" exists and is a list, that there
is at least one message with role == "assistant", and that that message has a
"content" list containing an object with "type" == "output_text"), and handle
missing/empty cases by logging a clear error or falling back to a safe default
instead of indexing blindly.
- Around line 86-91: The except Exception block currently calls client.close()
but then allows the outer while loop to continue, causing subsequent
client.post(...) calls to run against a closed client; either remove the
client.close() from that except so the final client.close() after the loop
handles cleanup, or if the intent is to stop processing on any unexpected error,
call client.close() and then break or return immediately from the loop; locate
the while loop and the except Exception that references client.close() (and the
client.post calls) and implement one of these fixes so iterations never continue
using a closed client.
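
Both script-template findings above can be pictured together in one small sketch: the payload is validated before anything is indexed, and the client is closed exactly once. The function names, the minimal `{"input": query}` request body, and the fallback text are hypothetical; only the payload shape (an `output` list containing an assistant message with `output_text` content) follows the review comment.

```python
def extract_output_text(payload):
    """Return the assistant's output_text from a Responses-style payload, or None."""
    output = payload.get("output") if isinstance(payload, dict) else None
    if not isinstance(output, list):
        return None
    for item in reversed(output):  # prefer the last assistant message
        if not isinstance(item, dict) or item.get("role") != "assistant":
            continue
        content = item.get("content")
        if not isinstance(content, list):
            continue
        for part in content:
            if isinstance(part, dict) and part.get("type") == "output_text":
                return part.get("text")
    return None


def chat(client, url, queries):
    """Send each query; stop on the first unexpected error and close the client once."""
    answers = []
    try:
        for query in queries:
            try:
                response = client.post(url, json={"input": query})
                text = extract_output_text(response.json())
            except Exception as exc:
                print(f"Request failed: {exc}")
                break  # never reuse the client after an unexpected failure
            answers.append(text if text is not None else "<no output_text in response>")
    finally:
        client.close()
    return answers
```

Here `client` can be anything with `post(url, json=...)` and `close()` methods, such as an `httpx.Client`. Closing in `finally` rather than inside the `except` branch means no later iteration can run against a closed client.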

In `@pyproject.toml`:
- Line 94: The pyproject.toml lists
"kfp_components.components.training.autorag.shared" but that package isn't
discovered, causing CI failures; either add a Python package marker by creating
components/training/autorag/shared/__init__.py so discovery finds the package
(if it should be importable), or remove the
"kfp_components.components.training.autorag.shared" entry from pyproject.toml
and instead include its non-code assets via an existing discovered package
(e.g., kfp_components.components.training.autorag) using
package_data/MANIFEST.in or the pyproject include mechanism; update only the
package entry or add the __init__.py accordingly so package discovery passes.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b7fae005-309e-42be-9eee-27f5739b14a4

📥 Commits

Reviewing files that changed from the base of the PR and between ed02191 and b7218c2.

📒 Files selected for processing (23)
  • components/deployment/autorag/OWNERS
  • components/deployment/autorag/README.md
  • components/deployment/autorag/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/OWNERS
  • components/deployment/autorag/build_responses_request_bodies/README.md
  • components/deployment/autorag/build_responses_request_bodies/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/component.py
  • components/deployment/autorag/build_responses_request_bodies/example_pipelines.py
  • components/deployment/autorag/build_responses_request_bodies/metadata.yaml
  • components/deployment/autorag/build_responses_request_bodies/tests/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/tests/openai_responses_request_validate.py
  • components/deployment/autorag/build_responses_request_bodies/tests/test_component_unit.py
  • components/deployment/autorag/build_responses_request_bodies/tests/test_openai_responses_api_compliance.py
  • components/training/autorag/rag_templates_optimization/component.py
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_indexing_template.ipynb
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_inference_template.ipynb
  • components/training/autorag/rag_templates_optimization/tests/test_component_unit.py
  • components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb
  • components/training/autorag/shared/notebook_templates/ogx_inference_template.ipynb
  • components/training/autorag/shared/script_templates/create_model_response.py.templ
  • pipelines/training/autorag/documents_rag_optimization_pipeline/pipeline.py
  • pipelines/training/autorag/documents_rag_optimization_pipeline/tests/test_pipeline_unit.py
  • pyproject.toml
💤 Files with no reviewable changes (17)
  • components/deployment/autorag/build_responses_request_bodies/OWNERS
  • components/deployment/autorag/__init__.py
  • components/deployment/autorag/OWNERS
  • components/deployment/autorag/build_responses_request_bodies/README.md
  • components/deployment/autorag/build_responses_request_bodies/__init__.py
  • components/deployment/autorag/README.md
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_indexing_template.ipynb
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_inference_template.ipynb
  • pipelines/training/autorag/documents_rag_optimization_pipeline/tests/test_pipeline_unit.py
  • components/deployment/autorag/build_responses_request_bodies/metadata.yaml
  • components/deployment/autorag/build_responses_request_bodies/component.py
  • components/deployment/autorag/build_responses_request_bodies/tests/openai_responses_request_validate.py
  • components/deployment/autorag/build_responses_request_bodies/example_pipelines.py
  • pipelines/training/autorag/documents_rag_optimization_pipeline/pipeline.py
  • components/deployment/autorag/build_responses_request_bodies/tests/test_openai_responses_api_compliance.py
  • components/deployment/autorag/build_responses_request_bodies/tests/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/tests/test_component_unit.py

Comment thread components/training/autorag/rag_templates_optimization/component.py Outdated
Comment thread components/training/autorag/rag_templates_optimization/component.py
Comment thread components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb Outdated
Comment thread pyproject.toml
@filip-komarzyniec filip-komarzyniec force-pushed the AutoRAG-migrate-to-responses-API-phase1 branch from b7218c2 to f7b7fd3 Compare June 1, 2026 16:27
@MichalSteczko

/retest

@LukaszCmielowski

LukaszCmielowski commented Jun 2, 2026

@MichalSteczko please run the e2e tests, including the generated Python script and the responses template in pattern.json.

@MichalSteczko MichalSteczko left a comment

Fix the key error.
Please add general error handling inside the while True loop to avoid infinite looping.

Please include the messages of the caught errors.

Add SSL error handling for the case where verification is not disabled and the server does not support a verified connection.

Artifacts generated by the pipeline (FYI @LukaszCmielowski):
responses_artifacts.zip

Comment thread components/training/autorag/rag_templates_optimization/component.py

@LukaszCmielowski LukaszCmielowski left a comment

Looks like pattern.json is still incorrect:

pattern (1).json

  1. nulls are still there
        "timeout": null,
        "model_type": null,
        "provider_id": null,
        "provider_resource_id": null
      }
    },
  2. Retrieval method is simple with hybrid params; however, the responses template has no hybrid params attached:
    "retrieval": {
      "method": "simple",
      "number_of_chunks": 3,
      "search_mode": "hybrid",
      "ranker_strategy": "rrf",
      "ranker_k": 60,
      "ranker_alpha": 1
    },
      "tools": [
        {
          "type": "file_search",
          "vector_store_ids": [
            "vs_c40a084f-6eca-43bf-8184-a0c5ad0d5308"
          ]
        }
      ],
  3. In the responses template, the instruction looks like the result of a hallucination; please check what was there in 3.4.

  4. a) stream and store: I suspect the values should be the other way round. b) Make sure input as flat text works; wouldn't it be better to pass messages there, as in 3.4? That would allow a system message and an instruction if required.

      "stream": false,
      "store": true,
      "input": "<user_query_placeholder>",

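For the first point (nulls left in pattern.json), one possible approach is a recursive filter applied before serialization. This is only a sketch: the helper name is hypothetical, and the fragment's outer `model` key and `model_id` value are made up for illustration; only the four null-valued keys come from the quoted output.

```python
def drop_nulls(value):
    """Recursively remove dict entries whose value is None.

    None items inside lists are left in place; only dict keys are filtered.
    """
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


# Illustrative fragment modeled on the keys quoted above.
settings = {
    "model": {
        "model_id": "example",
        "timeout": None,
        "model_type": None,
        "provider_id": None,
        "provider_resource_id": None,
    }
}
cleaned = drop_nulls(settings)
print(cleaned)  # {'model': {'model_id': 'example'}}
```
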
@filip-komarzyniec filip-komarzyniec force-pushed the AutoRAG-migrate-to-responses-API-phase1 branch from d2062b0 to 5d48957 Compare June 2, 2026 15:36
Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

…ponents
…ierarchy restructuring; project metadata related update
… ADR doc
@filip-komarzyniec filip-komarzyniec force-pushed the AutoRAG-migrate-to-responses-API-phase1 branch from 5d48957 to 2a2044f Compare June 2, 2026 18:35

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb (1)

342-342: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

getpass fallback is still discarded—os.getenv() bypasses resolved variables.

Line 314 resolves OGX_CLIENT_API_KEY with a getpass fallback, but line 342 calls os.getenv("OGX_CLIENT_API_KEY") directly. If the env var is unset, os.getenv() returns None and the interactive prompt is never used. Same issue for OGX_CLIENT_BASE_URL default.

The previous review fixed the argument order but not the variable usage.

🐛 Use the resolved variables
-client = create_ogx_client(os.getenv("OGX_CLIENT_BASE_URL"), os.getenv("OGX_CLIENT_API_KEY"))
+client = create_ogx_client(OGX_CLIENT_BASE_URL, OGX_CLIENT_API_KEY)
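
To show why re-reading the environment discards the fallback, here is a minimal sketch of resolving a setting once and reusing the resolved value. The helper `resolve_setting` and its parameters are hypothetical; the notebook's own resolution code is not shown in this thread.

```python
import os
from getpass import getpass


def resolve_setting(name, default=None, environ=None, ask=getpass):
    """Return the env value if set, else the default, else prompt interactively."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if value:
        return value
    if default is not None:
        return default
    return ask(f"{name}: ")
```

In a notebook this would be used as `OGX_CLIENT_API_KEY = resolve_setting("OGX_CLIENT_API_KEY")`, after which the resolved variable, not a second `os.getenv()` call, is passed to `create_ogx_client`. A second `os.getenv()` returns None whenever the value came from the prompt or the default.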
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb`
at line 342, The call to create_ogx_client currently uses
os.getenv("OGX_CLIENT_BASE_URL") and os.getenv("OGX_CLIENT_API_KEY") which
bypasses the previously resolved values (including the getpass fallback); update
the invocation of create_ogx_client to pass the resolved variables (e.g., the
local variables that were set earlier for the base URL and API key, such as
ogx_base_url and ogx_api_key or whatever names were used at line 314) instead of
calling os.getenv() again so the interactive fallback is honored.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb`:
- Around line 326-331: The httpx.get calls in the notebook (the plain
httpx.get(base_url) probe and the retry httpx.get(base_url, verify=False)) lack
timeouts and can hang; update both calls to pass a sensible timeout (e.g.,
timeout=5 or timeout=10 seconds) so the requests fail fast on unresponsive
endpoints and handle the resulting exceptions consistently (preserving the
existing ConnectError handling flow).


ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 3e9bd6fc-9d30-4796-8c02-ee2035df494c

📥 Commits

Reviewing files that changed from the base of the PR and between 5d48957 and 2a2044f.

📒 Files selected for processing (24)
  • components/deployment/autorag/OWNERS
  • components/deployment/autorag/README.md
  • components/deployment/autorag/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/OWNERS
  • components/deployment/autorag/build_responses_request_bodies/README.md
  • components/deployment/autorag/build_responses_request_bodies/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/component.py
  • components/deployment/autorag/build_responses_request_bodies/example_pipelines.py
  • components/deployment/autorag/build_responses_request_bodies/metadata.yaml
  • components/deployment/autorag/build_responses_request_bodies/tests/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/tests/openai_responses_request_validate.py
  • components/deployment/autorag/build_responses_request_bodies/tests/test_component_unit.py
  • components/deployment/autorag/build_responses_request_bodies/tests/test_openai_responses_api_compliance.py
  • components/training/autorag/rag_templates_optimization/component.py
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_indexing_template.ipynb
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_inference_template.ipynb
  • components/training/autorag/rag_templates_optimization/tests/test_component_unit.py
  • components/training/autorag/shared/__init__.py
  • components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb
  • components/training/autorag/shared/notebook_templates/ogx_inference_template.ipynb
  • components/training/autorag/shared/script_templates/create_model_response.py.templ
  • pipelines/training/autorag/documents_rag_optimization_pipeline/pipeline.py
  • pipelines/training/autorag/documents_rag_optimization_pipeline/tests/test_pipeline_unit.py
  • pyproject.toml
💤 Files with no reviewable changes (17)
  • components/deployment/autorag/README.md
  • components/deployment/autorag/build_responses_request_bodies/README.md
  • components/deployment/autorag/build_responses_request_bodies/metadata.yaml
  • components/deployment/autorag/build_responses_request_bodies/example_pipelines.py
  • pipelines/training/autorag/documents_rag_optimization_pipeline/tests/test_pipeline_unit.py
  • components/deployment/autorag/build_responses_request_bodies/tests/test_component_unit.py
  • components/deployment/autorag/build_responses_request_bodies/__init__.py
  • pipelines/training/autorag/documents_rag_optimization_pipeline/pipeline.py
  • components/deployment/autorag/OWNERS
  • components/deployment/autorag/build_responses_request_bodies/OWNERS
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_indexing_template.ipynb
  • components/deployment/autorag/build_responses_request_bodies/tests/test_openai_responses_api_compliance.py
  • components/deployment/autorag/build_responses_request_bodies/component.py
  • components/deployment/autorag/__init__.py
  • components/training/autorag/rag_templates_optimization/notebook_templates/ogx_inference_template.ipynb
  • components/deployment/autorag/build_responses_request_bodies/tests/__init__.py
  • components/deployment/autorag/build_responses_request_bodies/tests/openai_responses_request_validate.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • components/training/autorag/shared/script_templates/create_model_response.py.templ
  • pyproject.toml
  • components/training/autorag/shared/notebook_templates/ogx_inference_template.ipynb
  • components/training/autorag/rag_templates_optimization/tests/test_component_unit.py
  • components/training/autorag/rag_templates_optimization/component.py

Comment thread components/training/autorag/shared/notebook_templates/ogx_indexing_template.ipynb Outdated
@filip-komarzyniec

filip-komarzyniec commented Jun 2, 2026

@LukaszCmielowski

  1. removed

  2. hybrid search should now be correctly reflected in the responses_template section of the generated pattern,

  3. It's exactly the same as before. The system_text differs by model and originates from ai4rag. Most of the system_text templates there have not been modified in the last 5 months (https://github.com/IBM/ai4rag/blob/main/ai4rag/search_space/src/model_props.py).

  4. streaming is currently not supported in ai4rag, so it has to be disabled in the Responses API call. store defaults to true in the OpenAI API; however, judging by the source code of the OGX Python client, it seems to be disabled there by default, so I'll change it accordingly.
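
As a rough picture of how a consumer might turn the responses_template into a request, the sketch below swaps the `<user_query_placeholder>` value quoted earlier in the thread for a real query and keeps streaming off. The helper name and the sample template contents (including the vector store id) are hypothetical, not taken from the generated artifacts.

```python
import copy

PLACEHOLDER = "<user_query_placeholder>"


def build_request(responses_template, user_query):
    """Return a copy of the template with the input placeholder replaced by the query."""
    request = copy.deepcopy(responses_template)
    if request.get("input") == PLACEHOLDER:
        request["input"] = user_query
    request["stream"] = False  # streaming is not supported, per the comment above
    return request


# Hypothetical template; only the placeholder and the file_search tool type follow the thread.
template = {
    "stream": False,
    "input": PLACEHOLDER,
    "tools": [{"type": "file_search", "vector_store_ids": ["vs_example"]}],
}
request = build_request(template, "What is in the knowledge base?")
print(request["input"])
```

The deep copy keeps the template reusable across queries, since nested values such as the tools list are not shared with the request.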

Recent successful run with the changes from the PR: https://rh-ai.apps.rosa.ai-eng-gpu.socc.p3.openshiftapps.com/develop-train/pipelines/runs/ai-eng-cracow/runs/f6258582-d8af-4041-9b7d-3807e2518a36, which produced the attached artifacts (only the relevant ones are included):

create_model_response_final.py
pattern_final.json

Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

@LukaszCmielowski LukaszCmielowski left a comment

There are still some issues in pattern.json:

  1. "instructions" is missing the first sentence of "system_message_text", the one about the language.
  2. There is a mismatch in the hybrid params: "retrieval" has both the ranker_k and ranker_alpha properties. I suspect only one is required here and the responses property is correct; where the fix should go is to be confirmed.
  3. I have the feeling that in 3.4 the instruction contained a prompt saying that the file_search results should be used to create the final answer; to be confirmed.
  4. Could you please share the responses_body.json content (from before this PR's changes) for reference? If I'm not mistaken, a messages template was used there instead of flat instructions.

@filip-komarzyniec

  1. Right, this occurred yesterday during the rebase onto the newest language-detection-related changes. Let me align the system instructions.

  2. It has always been the case that, no matter which reranker is chosen, all parameters are emitted in the pattern. In the long run, any changes should go directly into the ai4rag library.

  3. As I highlighted in my previous comment, the system message templates differ depending on the model and have stayed mostly unchanged for the last 5 months. Any related changes should also land in the ai4rag library.

  4. pattern_body.json file from 3.4:
      {
        "model": "vllm-inference-gpu-apertus/redhataiapertus-8b-instruct-25",
        "stream": false,
        "store": true,
        "input": [
          {
            "type": "message",
            "role": "user",
            "content": [
              {
                "type": "input_text",
                "text": "What information is available in the indexed knowledge base?"
              }
            ]
          }
        ],
        "metadata": {
          "rag_pattern_name": "Pattern2",
          "rag_pattern_iteration": "1",
          "vector_datasource_type": "milvus-remote",
          "embedding_model_id": "vllm-embedding/bge-m3"
        },
        "instructions": "Please answer the question I provide in the user question below, using only information found in file_search results. If the question is unanswerable, please say you cannot answer. Respond in the same language as the user question.",
        "tools": [
          {
            "type": "file_search",
            "vector_store_ids": [
              "vs_65ade6ee-65c5-40ab-84e6-22dc969784e1"
            ],
            "max_num_results": 5,
            "ranking_options": {
              "ranker": "weighted",
              "alpha": 0.5,
              "impact_factor": 0.0
            }
          }
        ],
        "tool_choice": {
          "type": "file_search"
        },
        "include": [
          "file_search_call.results"
        ]
      }
    

@DorotaDR

DorotaDR commented Jun 9, 2026

Looks fine to me, but I'd like to hold off on official approval until we hear back from @LukaszCmielowski.

@DorotaDR DorotaDR left a comment

Unfortunately, I need to request changes once again, as I noticed an error during tests on a cluster with a self-signed cert.

Comment thread components/training/autorag/rag_templates_optimization/component.py
…-managed clusters

Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
…on self-managed clusters

Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

@DorotaDR DorotaDR left a comment

/approve

@openshift-ci openshift-ci Bot added the lgtm label Jun 10, 2026
Comment thread pyproject.toml
@openshift-ci openshift-ci Bot removed the lgtm label Jun 10, 2026
Signed-off-by: Filip Komarzyniec <fkomarzy@redhat.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
@openshift-ci openshift-ci Bot added the lgtm label Jun 10, 2026
@openshift-ci

openshift-ci Bot commented Jun 10, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DorotaDR, LukaszCmielowski, MichalSteczko


@LukaszCmielowski

@nsingla fyi - there are changes in pyproject.toml related to autox components.

@LukaszCmielowski

/ok-to-test

@LukaszCmielowski

/ok-to-test cancel

@LukaszCmielowski

/ok-to-test

@LukaszCmielowski

/lgtm

@LukaszCmielowski

/lgtm cancel

@openshift-ci openshift-ci Bot removed the lgtm label Jun 10, 2026
@LukaszCmielowski

/lgtm

@openshift-ci openshift-ci Bot added the lgtm label Jun 10, 2026
@openshift-merge-bot openshift-merge-bot Bot merged commit 4cab58f into opendatahub-io:main Jun 10, 2026
33 of 36 checks passed