
Support llama stack tests on disconnected using https_proxy #1336

Open
jgarciao wants to merge 10 commits into opendatahub-io:main from jgarciao:support-disconnected-llama-stack

Conversation

@jgarciao
Contributor

@jgarciao jgarciao commented Apr 1, 2026

If defined, the SQUID_HTTPS_PROXY env var is used so the llama-stack and RAG tests can run on disconnected clusters, going through an https_proxy to reach the external LLMs.

The env var will be set by Jenkins when running on disconnected clusters. The PRs adding it are still in review, but this PR can be merged already: it works on connected clusters as long as the env var is not defined.

Summary by CodeRabbit

  • Tests
    • Enhanced test infrastructure with improved timeout configurations for vector store and cluster operations.
    • Added support for proxy and SSL verification configuration in test environment.
    • Improved resource request management for embedding provider scenarios.
    • Refined storage configuration handling for air-gapped cluster environments.

@jgarciao jgarciao requested a review from a team as a code owner April 1, 2026 12:49
@github-actions

github-actions bot commented Apr 1, 2026

The following are automatically added/executed:

  • PR size label.
  • Run pre-commit
  • Run tox
  • Add PR author as the PR assignee
  • Build image based on the PR

Available user actions:

  • To mark a PR as WIP, comment /wip; to remove the label, comment /wip cancel.
  • To block merging of a PR, comment /hold; to un-block merging, comment /hold cancel.
  • To mark a PR as approved, comment /lgtm; to remove approval, comment /lgtm cancel.
    The lgtm label is removed on each new commit push.
  • To mark a PR as verified, comment /verified; to un-verify, comment /verified cancel.
    The verified label is removed on each new commit push.
  • To cherry-pick a merged PR, comment /cherry-pick <target_branch_name>. If <target_branch_name> is valid
    and the current PR is merged, a cherry-picked PR will be created and linked to the current PR.
  • To build and push an image to quay, comment /build-push-pr-image. This creates an image tagged
    pr-<pr_number> in the quay repository; the tag is deleted when the PR is merged or closed.
Supported labels

{'/build-push-pr-image', '/hold', '/wip', '/cherry-pick', '/lgtm', '/verified'}

@coderabbitai
Contributor

coderabbitai bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adjusts Llama Stack test harness: adds HTTPS proxy and client-verify constants, injects disconnected-cluster and TLS/proxy settings into server config, increases client/polling timeouts, tweaks CPU defaults, skips storage on air-gapped clusters, removes a FAISS-specific env var, and normalizes vector-store helper parameters.

Changes

Cohort / File(s) Summary
Configuration Constants
tests/llama_stack/constants.py
Added HTTPS_PROXY: str (from SQUID_HTTPS_PROXY) and LLS_CLIENT_VERIFY_SSL: bool (from LLS_CLIENT_VERIFY_SSL) for outbound HTTP/S behavior.
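The two constants described above could be sketched as follows. This is an illustrative reconstruction, not the repository's exact code; the parsing and defaults are assumptions (the review comments later in this thread discuss what the actual defaults should be):

```python
import os

# Proxy URL used on disconnected clusters; an empty string means "no proxy configured".
HTTPS_PROXY: str = os.getenv("SQUID_HTTPS_PROXY", "")

# Client-side TLS verification toggle; any value other than an explicit "false"
# keeps verification enabled.
LLS_CLIENT_VERIFY_SSL: bool = os.getenv("LLS_CLIENT_VERIFY_SSL", "true").lower() != "false"
```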
Test Server Configuration & Fixtures
tests/llama_stack/conftest.py
Added is_disconnected_cluster fixture; use LLS_CLIENT_VERIFY_SSL for client TLS; increased httpx.Client and LlamaStackClient timeouts to 300s; set CPU requests/limits default to 2/4 and 4/8 for sentence-transformers; when disconnected, inject HTTPS_PROXY, REQUESTS_CA_BUNDLE, tlsConfig.caBundle (ConfigMap) and offline envs for sentence-transformers; skip applying server_config["storage"] when cluster is disconnected (emit warning).
Test Utilities
tests/llama_stack/utils.py
vector_store_create_and_poll(...) now passes timeout=240 and coerces attributes to dict(attributes) or None; wait_for_unique_llama_stack_pod(...) retry wait_timeout increased from 60s to 240s.
Fixtures (FAISS env cleanup)
tests/fixtures/vector_io.py
FAISS branch now only sets ENABLE_FAISS=faiss; removed hardcoded FAISS_KVSTORE_DB_PATH environment variable previously added for faiss provider.
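The disconnected-cluster injection described for tests/llama_stack/conftest.py above could look roughly like the sketch below. Function and variable names are illustrative, and the CA-bundle path and the HF_HUB_OFFLINE variable are assumptions standing in for the "offline envs for sentence-transformers":

```python
def build_extra_env_vars(is_disconnected_cluster: bool, https_proxy: str) -> list[dict[str, str]]:
    """Return the extra container env entries to inject on disconnected clusters."""
    env_vars: list[dict[str, str]] = []
    if is_disconnected_cluster and https_proxy:
        env_vars.append({"name": "HTTPS_PROXY", "value": https_proxy})
        # Point the requests library at the cluster CA bundle mounted from a ConfigMap
        # (path is an assumption for illustration).
        env_vars.append({"name": "REQUESTS_CA_BUNDLE", "value": "/etc/ssl/certs/ca-bundle.crt"})
        # Keep sentence-transformers from reaching out to huggingface.co.
        env_vars.append({"name": "HF_HUB_OFFLINE", "value": "1"})
    return env_vars
```

On connected clusters, or when no proxy is configured, the list stays empty and the server config is unchanged.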

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Actionable Items

  • Validate HTTPS_PROXY input (scheme/host/port) and reject unusual schemes to avoid SSRF/redirects. Flag: CWE-918.
  • Treat HTTPS_PROXY as sensitive; strip credentials or avoid logging to prevent exposure. Flag: CWE-532.
  • Ensure LLS_CLIENT_VERIFY_SSL defaults to verification enabled in production; require explicit opt-in to disable to avoid MITM. Flag: CWE-295.
  • Confirm ConfigMap name/keys for tlsConfig.caBundle are not controllable by untrusted input to prevent arbitrary trust-store injection. Flag: CWE-73.
  • When skipping server_config["storage"] on disconnected clusters, fail-fast or clearly surface a test-level warning to avoid silent misconfiguration (do not silently continue).
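The first actionable item (validating HTTPS_PROXY before use, CWE-918) could be addressed with a small check like this sketch, which rejects unexpected schemes, missing hosts, and malformed ports:

```python
from urllib.parse import urlsplit

def validate_proxy_url(proxy: str) -> str:
    """Validate scheme/host/port of a proxy URL; return it unchanged if acceptable."""
    parts = urlsplit(proxy)
    if parts.scheme not in ("http", "https"):
        raise ValueError(f"unsupported proxy scheme: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("proxy URL has no host")
    # Accessing .port raises ValueError for a non-numeric or out-of-range port.
    parts.port
    return proxy
```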
🚥 Pre-merge checks | ✅ 1 | ❌ 1

❌ Failed checks (1 warning)

Check name: Description check
Status: ⚠️ Warning
Explanation: The description is incomplete. It lacks the required template structure, related issues/JIRA links, and testing verification checkboxes.
Resolution: Add the standard PR template sections, including Related Issues, testing verification checkboxes, and any additional requirements about test images or markers.
✅ Passed checks (1 passed)
Check name: Title check
Status: ✅ Passed
Explanation: The title accurately describes the main change: adding support for HTTPS proxy configuration to enable llama stack tests on disconnected clusters.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@jgarciao jgarciao force-pushed the support-disconnected-llama-stack branch from ca8b9ca to 03f426f on April 2, 2026 16:19
@github-actions github-actions bot added size/m and removed size/xl labels Apr 2, 2026
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/llama_stack/conftest.py`:
- Around line 290-295: The fixture currently removes server_config["storage"]
when params.get("llama_stack_storage_size") is set on disconnected clusters;
instead, preserve the requested storage config and skip the test on disconnected
runs: when params.get("llama_stack_storage_size") is truthy and
IS_DISCONNECTED_CLUSTER is true, call pytest.skip with a clear message (e.g.,
referencing RHAIENG-1819) rather than mutating server_config; ensure pytest is
imported in tests/llama_stack/conftest.py and leave the
storage_size/server_config["storage"] assignment intact for connected runs.
- Around line 250-252: The current code logs and injects the raw HTTPS_PROXY
string into env_vars which may contain credentials; stop emitting the proxy
value and move the proxy into a Kubernetes Secret reference: replace the inline
env_vars.append({"name": "HTTPS_PROXY", "value": HTTPS_PROXY}) with an env var
entry that uses a secretKeyRef (e.g.,
{"name":"HTTPS_PROXY","valueFrom":{"secretKeyRef":{"name":<llama_stack_secret_name>,"key":"https-proxy"}}}),
change the LOGGER.info call to a generic "HTTPS proxy enabled" message (do not
interpolate HTTPS_PROXY), and add "https-proxy" to
LLAMA_STACK_DISTRIBUTION_SECRET_DATA so the secret contains the proxy string for
reference by the secretKeyRef; update any code that builds server_config to
reference the secretKeyRef instead of embedding the raw proxy.
- Around line 646-652: The readiness setup uses httpx.Client(verify=verifySSL,
timeout=300) which makes all requests (including the
health/version/models/vector_stores/files checks performed by
wait_for_llama_stack_client_ready) honor a 300s timeout and can exhaust the
retry decorator's 90s wait_timeout; change the default httpx.Client timeout to a
short value (e.g., 10s) used for readiness checks and reserve the long 300s
timeout only for operations that need it by configuring LlamaStackClient or
per-request timeouts (adjust where http_client is created and where
LlamaStackClient is instantiated so readiness calls use the shorter timeout
while file/vector-store calls use 300s).

In `@tests/llama_stack/constants.py`:
- Line 32: The new constant HTTPS_PROXY lacks a type annotation; update the
declaration in tests/llama_stack/constants.py to include an explicit type so
mypy stays happy, for example change the line with HTTPS_PROXY to use a str
annotation (HTTPS_PROXY: str = os.getenv("SQUID_HTTPS_PROXY", "") or otherwise
ensure the result is typed as str) referencing the HTTPS_PROXY symbol.
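The secret-based injection suggested in the second inline comment above could be sketched as below. The secret name and key are illustrative placeholders, not the repository's actual names:

```python
def proxy_env_from_secret(secret_name: str) -> dict:
    """Build an env entry that references the proxy from a Kubernetes Secret
    instead of embedding the raw (possibly credentialed) proxy string."""
    return {
        "name": "HTTPS_PROXY",
        "valueFrom": {"secretKeyRef": {"name": secret_name, "key": "https-proxy"}},
    }
```

With this shape, the pod spec never carries the proxy value inline, and log statements can say "HTTPS proxy enabled" without interpolating it.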
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 598d6eef-fc16-4491-98e7-b900eeebc37f

📥 Commits

Reviewing files that changed from the base of the PR and between 92eed4c and a9d47d5.

📒 Files selected for processing (4)
  • tests/fixtures/vector_io.py
  • tests/llama_stack/conftest.py
  • tests/llama_stack/constants.py
  • tests/llama_stack/utils.py
💤 Files with no reviewable changes (1)
  • tests/fixtures/vector_io.py

env_vars_vector_io = vector_io_provider_deployment_config_factory(provider_name=vector_io_provider)
env_vars.extend(env_vars_vector_io)

if IS_DISCONNECTED_CLUSTER and HTTPS_PROXY:
Collaborator


@jgarciao for disconnected clusters, we have utility is_disconnected_cluster in utilities/infra.py. Can that be used instead of one extra environment variable?

Contributor Author

@jgarciao jgarciao Apr 7, 2026


I was thinking of modifying the is_disconnected_cluster fixture to use IS_DISCONNECTED_CLUSTER once the new env var is set by Jenkins. It would be faster and probably more reliable than the current implementation.

Collaborator


I don't think that is a good idea, especially since we can learn from the cluster itself whether it is connected or not. That approach also works for local runs, etc.

Contributor Author


In the latest commit I've removed IS_DISCONNECTED_CLUSTER, and it now uses is_disconnected_cluster.

Note: I had to define an is_disconnected_cluster fixture because llama_stack_server_config doesn't take a client parameter (it needs to support privileged and unprivileged clients)

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/llama_stack/constants.py`:
- Around line 34-35: LLS_CLIENT_VERIFY_SSL currently defaults to false which
disables SSL verification for all tests; change it to default to True
(LLS_CLIENT_VERIFY_SSL: bool) and only set it to False when explicitly opted out
or when IS_DISCONNECTED_CLUSTER is true—i.e., read env var if present but
otherwise set True, and if you want cluster-based behavior, compute
LLS_CLIENT_VERIFY_SSL = False only when IS_DISCONNECTED_CLUSTER is True and no
explicit env override; update the constant name usage accordingly so
httpx.Client instantiation reads this boolean.
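The secure default described in that comment (verification on unless explicitly opted out) could be computed as in this sketch; the accepted falsy strings are an assumption:

```python
def resolve_verify_ssl(environ: dict[str, str]) -> bool:
    """SSL verification defaults to enabled; disabling requires an explicit opt-out."""
    raw = environ.get("LLS_CLIENT_VERIFY_SSL")
    if raw is not None:
        return raw.strip().lower() not in ("0", "false", "no")
    # No explicit override: keep verification enabled.
    return True
```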
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 5bf58c99-eb59-4dd6-a6ae-bc687e86eeca

📥 Commits

Reviewing files that changed from the base of the PR and between a9d47d5 and 52f5bca.

📒 Files selected for processing (2)
  • tests/llama_stack/conftest.py
  • tests/llama_stack/constants.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/llama_stack/conftest.py

Contributor

@Bobbins228 Bobbins228 left a comment


/lgtm

jgarciao added 10 commits April 7, 2026 18:24
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Increase timeout as a workaround for known bug RHAIENG-1819
affecting the test execution

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Needed to avoid problems when using sentence-transformers

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
As tlsconfig is used on disconnected clusters, there is test instability
due to the usage of tlsconfig. Adding a workaround for it

Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
Signed-off-by: Jorge Garcia Oncins <jgarciao@redhat.com>
@jgarciao jgarciao force-pushed the support-disconnected-llama-stack branch from 9c8290f to 9bb2a64 on April 7, 2026 16:24
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/llama_stack/conftest.py`:
- Around line 258-260: When is_disconnected_cluster is true and HTTPS_PROXY is
added to env_vars, also append a NO_PROXY entry to exempt local/internal hosts
so embedded/internal HTTPS endpoints (LLS_CORE_VLLM_EMBEDDING_URL,
LLS_CORE_VLLM_URL) are bypassed by the proxy; update the block that logs and
appends HTTPS_PROXY (referencing LOGGER, HTTPS_PROXY, and env_vars) to also
append {"name": "NO_PROXY", "value": "localhost,127.0.0.1"} (or configurable
exempt hosts) and include this in the log message.
- Line 20: The test module imports the package but not its infra submodule,
causing AttributeError when the llama_stack_server_config fixture calls
utilities.infra.is_disconnected_cluster; explicitly import the submodule instead
of the bare package (e.g., replace the top-level "import utilities" with an
explicit "from utilities import infra" or "import utilities.infra as infra") and
update the fixture references (llama_stack_server_config and any direct calls to
utilities.infra) to use the imported infra symbol so utilities.infra is
available at evaluation time.

In `@tests/llama_stack/utils.py`:
- Around line 105-110: Replace the hard-coded timeout=240 passed to
llama_stack_client.vector_stores.files.create with a value derived from the
function's wait_timeout budget: record a start time (time.monotonic()), compute
request_timeout = max(1, int(wait_timeout - (time.monotonic() - start))) before
calling vector_stores.files.create, and pass timeout=request_timeout (instead of
240); ensure you handle request_timeout <= 0 by raising or setting a minimal
timeout so the subsequent polling/deadline logic still enforces the overall
wait_timeout contract for the operation involving vector_store_id, file_id and
attributes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 8a233da3-eaa3-4957-b7c2-27d2e62dcf83

📥 Commits

Reviewing files that changed from the base of the PR and between 508a527 and 9bb2a64.

📒 Files selected for processing (4)
  • tests/fixtures/vector_io.py
  • tests/llama_stack/conftest.py
  • tests/llama_stack/constants.py
  • tests/llama_stack/utils.py
💤 Files with no reviewable changes (1)
  • tests/fixtures/vector_io.py
✅ Files skipped from review due to trivial changes (1)
  • tests/llama_stack/constants.py

from ocp_resources.service import Service
from semver import Version

import utilities
Contributor


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "Checking whether the utilities package exposes the infra submodule:"
fd -a '__init__.py' utilities | while read -r file; do
  echo "== $file =="
  rg -nC2 '\binfra\b|__getattr__|__all__' "$file" || true
done

Repository: opendatahub-io/opendatahub-tests

Length of output: 367


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Contents of utilities/__init__.py ==="
cat -n utilities/__init__.py

echo ""
echo "=== Check for utilities/infra module/package ==="
fd -a 'infra' utilities

echo ""
echo "=== Actual usage in conftest.py around line 20 and 76-79 ==="
sed -n '15,85p' tests/llama_stack/conftest.py | cat -n

Repository: opendatahub-io/opendatahub-tests

Length of output: 3119


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Full import section of conftest.py ==="
head -n 50 tests/llama_stack/conftest.py | cat -n

echo ""
echo "=== Search for 'import utilities.infra' or 'from utilities import infra' ==="
rg 'import utilities\.infra|from utilities import infra' tests/llama_stack/conftest.py

echo ""
echo "=== Check if is_disconnected_cluster fixture is actually used ==="
rg 'is_disconnected_cluster' tests/llama_stack/conftest.py

Repository: opendatahub-io/opendatahub-tests

Length of output: 2309


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for uses of is_disconnected_cluster fixture ==="
rg 'is_disconnected_cluster' tests/llama_stack/ -l

echo ""
echo "=== Count usages ==="
rg 'is_disconnected_cluster' tests/llama_stack/ | wc -l

Repository: opendatahub-io/opendatahub-tests

Length of output: 188


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Search for test functions using is_disconnected_cluster as parameter ==="
rg 'def test.*is_disconnected_cluster' tests/llama_stack/

echo ""
echo "=== Show all lines with is_disconnected_cluster in conftest.py ==="
rg -n 'is_disconnected_cluster' tests/llama_stack/conftest.py

Repository: opendatahub-io/opendatahub-tests

Length of output: 152


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== All lines with is_disconnected_cluster in conftest.py with context ==="
rg -n 'is_disconnected_cluster' tests/llama_stack/conftest.py -B2 -A2

Repository: opendatahub-io/opendatahub-tests

Length of output: 1824


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Check if utilities/infra.py has is_disconnected_cluster function ==="
rg -n 'def is_disconnected_cluster' utilities/infra.py

Repository: opendatahub-io/opendatahub-tests

Length of output: 212


Fix broken utilities.infra import causing AttributeError at fixture evaluation.

Line 79 accesses utilities.infra.is_disconnected_cluster(...) but import utilities (line 20) does not import the infra submodule—utilities/__init__.py is empty. This causes AttributeError: module 'utilities' has no attribute 'infra' when the fixture is evaluated by llama_stack_server_config (line 89).

Required fix
-import utilities
+from utilities.infra import is_disconnected_cluster as detect_disconnected_cluster
@@
 @pytest.fixture(scope="class")
 def is_disconnected_cluster(admin_client: DynamicClient) -> bool:
     """Whether the target cluster is disconnected (air-gapped)."""
-    return utilities.infra.is_disconnected_cluster(client=admin_client)
+    return detect_disconnected_cluster(client=admin_client)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/conftest.py` at line 20, The test module imports the
package but not its infra submodule, causing AttributeError when the
llama_stack_server_config fixture calls utilities.infra.is_disconnected_cluster;
explicitly import the submodule instead of the bare package (e.g., replace the
top-level "import utilities" with an explicit "from utilities import infra" or
"import utilities.infra as infra") and update the fixture references
(llama_stack_server_config and any direct calls to utilities.infra) to use the
imported infra symbol so utilities.infra is available at evaluation time.

Comment on lines +258 to +260
if is_disconnected_cluster and HTTPS_PROXY:
    LOGGER.info(f"Setting proxy and tlsconfig configuration (https_proxy:{HTTPS_PROXY})")
    env_vars.append({"name": "HTTPS_PROXY", "value": HTTPS_PROXY})
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -nC2 '"HTTPS_PROXY"|"NO_PROXY"|"https_proxy"|"no_proxy"|LLS_CORE_VLLM_URL|LLS_CORE_VLLM_EMBEDDING_URL' tests/llama_stack

Repository: opendatahub-io/opendatahub-tests

Length of output: 3832


🏁 Script executed:

#!/bin/bash
# Get the full context around lines 258-260 in conftest.py
head -n 280 tests/llama_stack/conftest.py | tail -n 40

Repository: opendatahub-io/opendatahub-tests

Length of output: 2080


🏁 Script executed:

#!/bin/bash
# Search for NO_PROXY/no_proxy in conftest.py
rg -n 'NO_PROXY|no_proxy' tests/llama_stack/conftest.py

Repository: opendatahub-io/opendatahub-tests

Length of output: 58


🏁 Script executed:

#!/bin/bash
# Check what HTTPS_PROXY constant value is
rg -n 'HTTPS_PROXY\s*=' tests/llama_stack/conftest.py

Repository: opendatahub-io/opendatahub-tests

Length of output: 58


🏁 Script executed:

#!/bin/bash
# Check imports in conftest.py to find HTTPS_PROXY
head -n 50 tests/llama_stack/conftest.py | grep -n HTTPS_PROXY

Repository: opendatahub-io/opendatahub-tests

Length of output: 95


🏁 Script executed:

#!/bin/bash
# Check all references to LLS_CORE_VLLM_URL in constants.py
grep -n "LLS_CORE_VLLM_URL" tests/llama_stack/constants.py

Repository: opendatahub-io/opendatahub-tests

Length of output: 133


🏁 Script executed:

#!/bin/bash
# Get the full constants.py to understand defaults
cat tests/llama_stack/constants.py | head -n 65

Repository: opendatahub-io/opendatahub-tests

Length of output: 2579


Add NO_PROXY exemption when setting HTTPS_PROXY for disconnected clusters.

When HTTPS_PROXY is set, all outbound HTTPS requests use the proxy. LLS_CORE_VLLM_EMBEDDING_URL defaults to https://nomic-embed-text-v1-5.example.com:443/v1, and LLS_CORE_VLLM_URL may also be HTTPS. Both will be routed through the proxy without a NO_PROXY exemption, causing connectivity failures in disconnected clusters where the proxy cannot reach these internal endpoints. Set NO_PROXY to "localhost,127.0.0.1" or the appropriate exempt hosts alongside HTTPS_PROXY.
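The fix described above could derive the NO_PROXY value from the internal endpoint URLs rather than hard-coding hosts; the sketch below is illustrative:

```python
from urllib.parse import urlsplit

def no_proxy_entry(internal_urls: list[str]) -> dict[str, str]:
    """Build a NO_PROXY env entry exempting localhost plus the hosts of internal URLs."""
    hosts = ["localhost", "127.0.0.1"]
    for url in internal_urls:
        host = urlsplit(url).hostname
        if host and host not in hosts:
            hosts.append(host)
    return {"name": "NO_PROXY", "value": ",".join(hosts)}
```

Appending this entry alongside HTTPS_PROXY keeps in-cluster traffic off the proxy while external LLM traffic still goes through it.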

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/conftest.py` around lines 258 - 260, When
is_disconnected_cluster is true and HTTPS_PROXY is added to env_vars, also
append a NO_PROXY entry to exempt local/internal hosts so embedded/internal
HTTPS endpoints (LLS_CORE_VLLM_EMBEDDING_URL, LLS_CORE_VLLM_URL) are bypassed by
the proxy; update the block that logs and appends HTTPS_PROXY (referencing
LOGGER, HTTPS_PROXY, and env_vars) to also append {"name": "NO_PROXY", "value":
"localhost,127.0.0.1"} (or configurable exempt hosts) and include this in the
log message.

Comment on lines 105 to 110
 vs_file = llama_stack_client.vector_stores.files.create(
-    vector_store_id=vector_store_id, file_id=file_id, attributes=attributes
+    vector_store_id=vector_store_id,
+    file_id=file_id,
+    timeout=240,  # Increased timeout for slow processing (e.g., sentence-transformers)
+    attributes=dict(attributes) if attributes else attributes,
 )
Contributor


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate the file
find . -name "utils.py" -path "*/llama_stack/*" | head -20

Repository: opendatahub-io/opendatahub-tests

Length of output: 138


🏁 Script executed:

# Read the relevant section of the file
cat -n tests/llama_stack/utils.py | sed -n '85,130p'

Repository: opendatahub-io/opendatahub-tests

Length of output: 2329


🏁 Script executed:

rg -n "vector_store_create_and_poll" tests/

Repository: opendatahub-io/opendatahub-tests

Length of output: 214


🏁 Script executed:

# Also check the function signature to see default wait_timeout
cat -n tests/llama_stack/utils.py | sed -n '75,98p'

Repository: opendatahub-io/opendatahub-tests

Length of output: 1187


🏁 Script executed:

cat -n tests/llama_stack/utils.py | sed -n '290,310p'

Repository: opendatahub-io/opendatahub-tests

Length of output: 1300


🏁 Script executed:

# Also check if there are any other uses or integration points
rg -B 5 -A 5 "vector_store_create_and_poll\(" tests/llama_stack/utils.py | head -40

Repository: opendatahub-io/opendatahub-tests

Length of output: 1049


🏁 Script executed:

# Check if there are other callers or integration tests that rely on the wait_timeout contract
rg -B 3 -A 3 "wait_timeout" tests/llama_stack/utils.py | head -50

Repository: opendatahub-io/opendatahub-tests

Length of output: 1799


🏁 Script executed:

# Verify the actual execution path and timing behavior
cat -n tests/llama_stack/utils.py | sed -n '112,122p'

Repository: opendatahub-io/opendatahub-tests

Length of output: 719


Request timeout must respect wait_timeout contract.

The hard-coded timeout=240 in the create() call (line 108) is independent of wait_timeout, causing total execution to potentially exceed the documented budget. The deadline is set after create completes (line 112), allowing the function to run up to 240s + wait_timeout. With default wait_timeout=240, a caller gets up to 480s instead of the promised 240s.

Tie the request timeout to the polling timeout so total execution respects the wait_timeout contract.

Proposed fix
 def vector_store_create_and_poll(
     llama_stack_client: LlamaStackClient,
     vector_store_id: str,
     file_id: str,
     *,
     attributes: dict[str, str | int | float | bool] | None = None,
     poll_interval_sec: float = 5.0,
     wait_timeout: float = 240.0,
+    request_timeout: float | None = None,
 ) -> VectorStoreFile:
+    create_timeout = wait_timeout if request_timeout is None else min(request_timeout, wait_timeout)
     vs_file = llama_stack_client.vector_stores.files.create(
         vector_store_id=vector_store_id,
         file_id=file_id,
-        timeout=240,  # Increased timeout for slow processing (e.g., sentence-transformers)
+        timeout=create_timeout,
         attributes=dict(attributes) if attributes else attributes,
     )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def vector_store_create_and_poll(
    llama_stack_client: LlamaStackClient,
    vector_store_id: str,
    file_id: str,
    *,
    attributes: dict[str, str | int | float | bool] | None = None,
    poll_interval_sec: float = 5.0,
    wait_timeout: float = 240.0,
    request_timeout: float | None = None,
) -> VectorStoreFile:
    create_timeout = wait_timeout if request_timeout is None else min(request_timeout, wait_timeout)
    vs_file = llama_stack_client.vector_stores.files.create(
        vector_store_id=vector_store_id,
        file_id=file_id,
        timeout=create_timeout,
        attributes=dict(attributes) if attributes else attributes,
    )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/llama_stack/utils.py` around lines 105 - 110, Replace the hard-coded
timeout=240 passed to llama_stack_client.vector_stores.files.create with a value
derived from the function's wait_timeout budget: record a start time
(time.monotonic()), compute request_timeout = max(1, int(wait_timeout -
(time.monotonic() - start))) before calling vector_stores.files.create, and pass
timeout=request_timeout (instead of 240); ensure you handle request_timeout <= 0
by raising or setting a minimal timeout so the subsequent polling/deadline logic
still enforces the overall wait_timeout contract for the operation involving
vector_store_id, file_id and attributes.
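The budget-derived timeout the prompt describes could be factored out as in this sketch; the function name is illustrative:

```python
import time

def remaining_budget(start: float, wait_timeout: float) -> int:
    """Return the whole-second timeout left in the wait_timeout budget,
    measured from a time.monotonic() start mark; raise once exhausted."""
    elapsed = time.monotonic() - start
    if elapsed >= wait_timeout:
        raise TimeoutError("wait_timeout budget exhausted before the create request")
    return max(1, int(wait_timeout - elapsed))
```

Passing `timeout=remaining_budget(start, wait_timeout)` to the create call keeps the request and the subsequent polling within one overall deadline.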


4 participants