
Sync upstream -> midstream #102

Merged
saichandrapandraju merged 2 commits into opendatahub-io:main from trustyai-explainability:main
Apr 9, 2026

Conversation

@saichandrapandraju (Collaborator) commented Apr 9, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for configurable HuggingFace cache path to optimize evaluation runs in disconnected environments.
    • Introduced attack_success_rate metric to evaluation results for enhanced reporting.
    • Enhanced S3 credential handling for improved results download reliability.
  • Tests

    • Added test coverage for HuggingFace cache path configuration scenarios.

saichandrapandraju and others added 2 commits April 8, 2026 21:01
Add hf_cache_path parameter for air-gapped clusters where the
multilingual.TranslationIntent probe cannot download Helsinki-NLP
translation models from HuggingFace.

- KFP mode: hf_cache_path is an S3 key prefix (or fully-qualified
  s3://bucket/prefix URI). The garak_scan component downloads the
  cache from S3 into a temp dir and sets HF_HUB_CACHE before running.
  Bare prefixes are stripped of leading slashes; empty prefixes log a
  warning instead of blocking.
- Simple mode: hf_cache_path is a local mount path (evalhub handles
  the download). We set HF_HUB_CACHE via env to run_garak_scan.
- Default (empty): unchanged behavior, HF downloads as before.

Also includes:
- Read AWS_S3_BUCKET and AWS_S3_ENDPOINT from K8s Data Connection
  secret as fallback (user config > secret > env var).
- Emit overall attack_success_rate as first metric in job results.
- Pin eval-hub-sdk to ==0.1.4.

Made-with: Cursor

feat: support disconnected mode with pre-downloaded HF model cache
@coderabbitai Bot commented Apr 9, 2026

📝 Walkthrough

This pull request introduces Hugging Face cache path configuration across the Garak evaluation pipeline. Changes include:

  • tightening the eval-hub-sdk[adapter] dependency to an exact pinned version
  • adding hf_cache_path parameter propagation through the KFP pipeline and simple execution paths
  • implementing S3-to-local cache download logic in the KFP component
  • deriving S3 credentials from multiple fallback sources
  • extracting additional environment variables from Kubernetes secrets
  • emitting a new attack_success_rate metric during result parsing

Tests validate environment variable propagation and updated result parsing behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Security and Design Issues

CWE-22 (Path Traversal via User-Controlled Input): In kfp_pipeline.py, the hf_cache_path parameter is user-controlled and written directly into S3 URI construction (s3://bucket/{hf_cache_path}) without validation. An attacker providing a crafted path could inject traversal sequences (e.g., ../) or malformed S3 prefixes. Validate and sanitize hf_cache_path against path traversal patterns before use.
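As a sketch of the validation called for here, a guard like the following could reject traversal sequences in bare (non-s3://) prefixes before the URI is built; the function name and character allowlist are illustrative, not from the PR:

```python
import re

def validate_hf_cache_path(raw: str) -> str:
    """Illustrative guard: normalize a bare S3 key prefix and reject traversal.

    Assumes bare prefixes only; full s3:// URIs would need their own parsing.
    """
    path = raw.strip().lstrip("/")
    # Reject empty prefixes, ".." segments, and characters outside a
    # conservative allowlist for S3 key prefixes.
    if not path or ".." in path.split("/") or not re.fullmatch(r"[A-Za-z0-9._/-]+", path):
        raise ValueError(f"invalid hf_cache_path: {raw!r}")
    return path
```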

CWE-200 (Unintended Object Enumeration via S3 Prefix): S3 object download logic in kfp_pipeline.py iterates all objects under a computed prefix. If hf_cache_path is attacker-controlled, this could enumerate or download unintended objects from the S3 bucket. Implement strict prefix allowlisting or validate the path structure.

CWE-798 (Hardcoded Credentials / Credential Precedence): In garak_adapter.py, S3 credentials are derived from kfp_config fields, then fallback to decoded secret values, then environment variables. This precedence creates an implicit security model: document the intended priority and ensure tests validate that earlier sources override later ones. Verify that kfp_config values (if user-supplied) don't override secret-backed credentials unintentionally.
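The intended precedence (user config > secret > env var) could be captured in a single helper, which makes the ordering explicit and easy to test. This is a hypothetical sketch, not the adapter's actual code:

```python
import os
from typing import Mapping, Optional

def resolve_s3_setting(
    name: str,
    user_config: Mapping[str, str],
    secret_values: Mapping[str, str],
) -> Optional[str]:
    """Return the first non-empty value: user config, then secret, then env var."""
    for value in (user_config.get(name), secret_values.get(name), os.environ.get(name)):
        if value:
            return value
    return None
```

A test asserting that a user-config value wins over a populated secret, and a secret over the environment, would pin down the documented priority.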

Pinned Dependency Risk: Setting eval-hub-sdk[adapter]==0.1.4 (exact pin) prevents automatic security patch adoption. If a vulnerability is discovered in 0.1.4, manual intervention is required. Consider pinning to a range (e.g., >=0.1.4,<0.2) or establish a patch management policy.
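If a range pin is preferred, a pyproject.toml entry along these lines (hypothetical, mirroring the suggestion above) would pick up 0.1.x patch releases while excluding the 0.2 series:

```toml
[project]
dependencies = [
    # Allows 0.1.x security patches; blocks potentially breaking 0.2 releases.
    "eval-hub-sdk[adapter]>=0.1.4,<0.2",
]
```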

Missing Input Validation: attack_success_rate extraction from overall_summary["attack_success_rate"] uses float() coercion without bounds checking. A malicious or corrupted result file could supply extremely large values. Validate numeric ranges before storing metrics.
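A bounds check of the kind proposed could look like this sketch (the function name is assumed; the valid range 0.0–1.0 is the natural one for a rate):

```python
def parse_attack_success_rate(overall_summary: dict) -> float:
    """Coerce the metric to float and reject out-of-range or NaN values."""
    value = float(overall_summary["attack_success_rate"])
    # The range check also rejects NaN, since NaN comparisons evaluate False.
    if not (0.0 <= value <= 1.0):
        raise ValueError(f"attack_success_rate out of range: {value!r}")
    return value
```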

🚥 Pre-merge checks: ✅ 1 passed | ❌ 1 failed

❌ Failed checks (1 warning)

Check name: Title check
Status: ⚠️ Warning
Explanation: The title 'Sync upstream -> midstream' is vague and does not clearly describe the actual changes, which involve adding HuggingFace cache path support, tightening a dependency constraint, and enhancing results parsing with attack success rate metrics.
Resolution: Use a specific, descriptive title that captures the primary changes, e.g., 'Add HuggingFace cache path support to Garak evaluation pipeline' or 'Support HF_HUB_CACHE configuration in eval-hub adapter'.
✅ Passed checks (1 passed)
Check name: Description Check
Status: ✅ Passed
Explanation: Check skipped - CodeRabbit's high-level summary is enabled.


@coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
tests/test_evalhub_adapter.py (1)

192-271: Cover the KFP cache import path with regression tests.

These additions only exercise _run_simple. The new hf_cache_path branch in src/llama_stack_provider_trustyai_garak/evalhub/kfp_pipeline.py is still untested for the cases that matter most here: empty prefixes and ../ object keys.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_evalhub_adapter.py` around lines 192 - 271, Add regression tests
that exercise the KFP cache import branch in kfp_pipeline.py by creating jobs
with parameters containing hf_cache_path and two edge-case object keys (an empty
prefix and a "../" parent-path key), then invoking the code path that handles
KFP imports; monkeypatch the KFP import helper (referenced as import_kfp_cache
in src/llama_stack_provider_trustyai_garak/evalhub/kfp_pipeline.py) to capture
its inputs and assert the adapter transforms/normalizes the object key and
prefix correctly instead of delegating raw values (use the same test harness
style as the existing test_simple_mode_passes_hf_cache_env tests:
_load_evalhub_garak_adapter, adapter.run_benchmark_job, monkeypatch.setattr for
the import helper, and assertions on the captured call).
ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: f822d985-79a5-4bcb-9864-61d657efa5d9

📥 Commits

Reviewing files that changed from the base of the PR and between a8704af and 47fccc8.

📒 Files selected for processing (7)
  • pyproject.toml
  • src/llama_stack_provider_trustyai_garak/evalhub/garak_adapter.py
  • src/llama_stack_provider_trustyai_garak/evalhub/kfp_pipeline.py
  • src/llama_stack_provider_trustyai_garak/remote/garak_remote_eval.py
  • src/llama_stack_provider_trustyai_garak/remote/kfp_utils/components.py
  • src/llama_stack_provider_trustyai_garak/remote/kfp_utils/pipeline.py
  • tests/test_evalhub_adapter.py

Comment on lines +419 to +430

env: dict[str, str] = {}
hf_cache = (config.parameters or {}).get("hf_cache_path", "")
if hf_cache:
    env["HF_HUB_CACHE"] = hf_cache
    logger.info("Using HF cache from mounted path: %s", hf_cache)

result = run_garak_scan(
    config_file=config_file,
    timeout_seconds=timeout,
    log_file=log_file,
    report_prefix=report_prefix,
    env=env if env else None,

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
python - <<'PY'
import os
import subprocess
import sys

env = os.environ.copy()
env["HF_HUB_CACHE"] = True  # emulate a non-string JSON value

try:
    subprocess.Popen([sys.executable, "-c", "print('ok')"], env=env).wait()
except Exception as exc:
    print(type(exc).__name__, exc)
PY

Repository: opendatahub-io/llama-stack-provider-trustyai-garak

Length of output: 155


🏁 Script executed:

cd /tmp && find . -name "garak_adapter.py" -type f 2>/dev/null | head -5

Repository: opendatahub-io/llama-stack-provider-trustyai-garak

Length of output: 76


🏁 Script executed:

git ls-files | grep -E "garak_adapter\.py|garak_runner\.py"

Repository: opendatahub-io/llama-stack-provider-trustyai-garak

Length of output: 219


🏁 Script executed:

# Read the file around the specified lines
head -430 src/llama_stack_provider_trustyai_garak/evalhub/garak_adapter.py | tail -20

Repository: opendatahub-io/llama-stack-provider-trustyai-garak

Length of output: 808


🏁 Script executed:

# Check the run_garak_scan function signature and usage
rg -A 15 "def run_garak_scan" src/llama_stack_provider_trustyai_garak/

Repository: opendatahub-io/llama-stack-provider-trustyai-garak

Length of output: 1571


🏁 Script executed:

# Check garak_runner.py for subprocess.Popen usage with env
rg -B 5 -A 10 "subprocess.Popen" src/llama_stack_provider_trustyai_garak/

Repository: opendatahub-io/llama-stack-provider-trustyai-garak

Length of output: 1450


Validate and coerce hf_cache_path to string before passing to subprocess environment.

The config.parameters dict can contain non-string JSON values (booleans, numbers). The code retrieves hf_cache_path without type validation and passes it directly to env["HF_HUB_CACHE"], violating the dict[str, str] type contract. When subprocess.Popen is called with a non-string env value, it raises TypeError. Additionally, whitespace-only values pass the truthiness check and become broken cache paths instead of being ignored (CWE-20: Improper Input Validation).

Suggested fix
-        env: dict[str, str] = {}
-        hf_cache = (config.parameters or {}).get("hf_cache_path", "")
-        if hf_cache:
-            env["HF_HUB_CACHE"] = hf_cache
-            logger.info("Using HF cache from mounted path: %s", hf_cache)
+        env: dict[str, str] = {}
+        hf_cache_raw = (config.parameters or {}).get("hf_cache_path")
+        if hf_cache_raw is not None:
+            if not isinstance(hf_cache_raw, str):
+                raise ValueError("hf_cache_path must be a string")
+            hf_cache = hf_cache_raw.strip()
+            if hf_cache:
+                env["HF_HUB_CACHE"] = hf_cache
+                logger.info("Using HF cache from mounted path: %s", hf_cache)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llama_stack_provider_trustyai_garak/evalhub/garak_adapter.py` around
lines 419 - 430, The code reads hf_cache_path from config.parameters and assigns
it directly into env["HF_HUB_CACHE"] (used when calling run_garak_scan), which
can be non-string or whitespace-only and will break subprocess env expectations;
validate and coerce the value to a proper non-empty string before inserting into
env: retrieve hf_cache = (config.parameters or {}).get("hf_cache_path", None),
if hf_cache is not None convert to str(hf_cache).strip(), and only set
env["HF_HUB_CACHE"] when the stripped string is non-empty; keep env typed as
dict[str,str] and pass env if non-empty else None to run_garak_scan.

Comment on lines +473 to +519

if hf_cache_path and hf_cache_path.strip():
    from llama_stack_provider_trustyai_garak.evalhub.s3_utils import create_s3_client

    if hf_cache_path.startswith("s3://"):
        parts = hf_cache_path[len("s3://") :].split("/", 1)
        bucket = parts[0]
        prefix = parts[1] if len(parts) > 1 else ""
    else:
        bucket = os.environ.get("AWS_S3_BUCKET", "")
        prefix = hf_cache_path.lstrip("/")

    if not bucket:
        raise GarakError(
            "Cannot determine S3 bucket for HF cache. "
            "Provide a full s3://bucket/prefix URI in hf_cache_path, "
            "or set AWS_S3_BUCKET."
        )

    if not prefix:
        log.warning(
            "hf_cache_path has no sub-prefix; downloading all objects from bucket '%s'.",
            bucket,
        )

    hf_cache_dir = Path(tempfile.mkdtemp(prefix="hf-cache-"))
    s3 = create_s3_client()
    downloaded = 0

    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            rel = obj["Key"][len(prefix) :].lstrip("/")
            if not rel:
                continue
            dest = hf_cache_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            s3.download_file(bucket, obj["Key"], str(dest))
            downloaded += 1

    os.environ["HF_HUB_CACHE"] = str(hf_cache_dir)
    log.info(
        "Populated HF cache from s3://%s/%s -> %s (%d files)",
        bucket,
        prefix,
        hf_cache_dir,
        downloaded,
    )

⚠️ Potential issue | 🔴 Critical

Reject empty cache prefixes and sanitize S3-derived paths before writing files.

This trusts both hf_cache_path and S3 object keys. hf_cache_path="s3://bucket" currently mirrors the whole bucket into the pod (CWE-200/CWE-400), and a key like prefix/../../scan.log escapes hf_cache_dir via hf_cache_dir / rel (CWE-22), giving an attacker with write access to that bucket/prefix arbitrary file overwrite inside the container.

Suggested fix
     if hf_cache_path and hf_cache_path.strip():
         from llama_stack_provider_trustyai_garak.evalhub.s3_utils import create_s3_client

         if hf_cache_path.startswith("s3://"):
             parts = hf_cache_path[len("s3://") :].split("/", 1)
             bucket = parts[0]
             prefix = parts[1] if len(parts) > 1 else ""
         else:
             bucket = os.environ.get("AWS_S3_BUCKET", "")
             prefix = hf_cache_path.lstrip("/")

+        prefix = prefix.strip("/")
         if not bucket:
             raise GarakError(
                 "Cannot determine S3 bucket for HF cache. "
                 "Provide a full s3://bucket/prefix URI in hf_cache_path, "
                 "or set AWS_S3_BUCKET."
             )

-        if not prefix:
-            log.warning(
-                "hf_cache_path has no sub-prefix; downloading all objects from bucket '%s'.",
-                bucket,
-            )
+        if not prefix:
+            raise GarakError("hf_cache_path must include a non-empty S3 prefix")
+
+        prefix = f"{prefix}/"

         hf_cache_dir = Path(tempfile.mkdtemp(prefix="hf-cache-"))
+        cache_root = hf_cache_dir.resolve()
         s3 = create_s3_client()
         downloaded = 0

         paginator = s3.get_paginator("list_objects_v2")
         for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
             for obj in page.get("Contents", []):
-                rel = obj["Key"][len(prefix) :].lstrip("/")
+                key = obj["Key"]
+                if not key.startswith(prefix):
+                    continue
+                rel = key[len(prefix) :]
                 if not rel:
                     continue
-                dest = hf_cache_dir / rel
+                dest = (hf_cache_dir / rel).resolve()
+                if cache_root not in dest.parents:
+                    raise GarakError(f"Refusing to write cache object outside {cache_root}: {key}")
                 dest.parent.mkdir(parents=True, exist_ok=True)
-                s3.download_file(bucket, obj["Key"], str(dest))
+                s3.download_file(bucket, key, str(dest))
                 downloaded += 1

As per coding guidelines, "Validate file paths (prevent path traversal)".
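The containment check in the diff above can be isolated into a small helper; this sketch (names assumed) returns None for keys that would escape the cache root, matching the cited guideline:

```python
from pathlib import Path
from typing import Optional

def safe_dest(cache_root: Path, rel: str) -> Optional[Path]:
    """Resolve rel under cache_root; None if empty or escaping via traversal."""
    dest = (cache_root / rel).resolve()
    root = cache_root.resolve()
    # Path.is_relative_to requires Python 3.9+.
    if dest == root or not dest.is_relative_to(root):
        return None
    return dest
```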

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/llama_stack_provider_trustyai_garak/evalhub/kfp_pipeline.py` around lines
473 - 519, The code trusts hf_cache_path and S3 keys, allowing mirroring entire
bucket when prefix is empty and enabling path traversal via object keys; require
a non-empty S3 prefix (reject hf_cache_path like "s3://bucket" or when prefix ==
""), and sanitize each object key before writing by computing rel =
obj["Key"][len(prefix):].lstrip("/") then skipping keys that are empty or
contain ".." or absolute segments; resolve dest = (hf_cache_dir / rel).resolve()
and ensure dest.is_relative_to(hf_cache_dir.resolve()) (or check
str(dest).startswith(str(hf_cache_dir.resolve()) + os.sep)) before creating
parents and calling s3.download_file, otherwise skip and log a warning; keep
references to hf_cache_path, prefix, rel, hf_cache_dir, dest, and
s3.download_file to locate the changes.

@saichandrapandraju merged commit b2f5ade into opendatahub-io:main Apr 9, 2026
7 of 8 checks passed