# AI Agent Context

## What This Repo Is

This repo contains **two independent integrations** for running
[Garak](https://github.com/NVIDIA/garak) LLM red-teaming scans. They share
core logic but serve different orchestration surfaces:

1. **Llama Stack Provider** — An out-of-tree eval provider for the
   [Llama Stack](https://llamastack.github.io/) framework. Exposes garak
   through the Llama Stack `benchmarks.register` / `eval.run_eval` API.

2. **Eval-Hub Adapter** — A `FrameworkAdapter` for the eval-hub SDK.
   Completely independent of Llama Stack. Used by the RHOAI evaluation
   platform to orchestrate garak scans via K8s jobs.

## Four Execution Modes

```
     Llama Stack              Eval-Hub
  (Llama Stack API)        (eval-hub SDK)
 ┌────────┬────────┐      ┌────────┬────────┐
 │ Inline │ Remote │      │ Simple │  KFP   │
 │        │  KFP   │      │ (pod)  │ (pod + │
 │        │        │      │        │  KFP)  │
 └────────┴────────┘      └────────┴────────┘
  local     KFP            in-pod   K8s job
  garak     pipelines      garak    submits to
                                    KFP, polls
```

| Mode | Code Location | How Garak Runs | Intents Support |
|------|---------------|----------------|-----------------|
| **Llama Stack Inline** | `inline/` | Locally in the Llama Stack server process | No |
| **Llama Stack Remote KFP** | `remote/` | As KFP pipeline steps on Kubernetes | **Yes** |
| **Eval-Hub Simple** | `evalhub/` (simple mode) | Directly in the eval-hub K8s job pod | No |
| **Eval-Hub KFP** | `evalhub/` (KFP mode) | K8s job submits to KFP, polls status, pulls artifacts via S3 | **Yes** |

**Intents** is a key upcoming feature — it uses SDG (synthetic data generation),
TAPIntent probes, and MulticlassJudge detectors to test model behavior against
policy taxonomies. Only the two KFP-based modes support it, because it requires
the six-step pipeline (`core/pipeline_steps.py`) running as KFP components.

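The six-step flow can be pictured as a plain-Python data pipeline. This is an illustrative sketch only: the function names and payload shapes below are hypothetical stand-ins, not the real signatures — in the actual integrations these steps run as KFP components defined in `core/pipeline_steps.py`.

```python
# Hypothetical sketch of the intents pipeline data flow:
# validate -> taxonomy -> SDG -> prompts -> scan -> parse.

def validate(config: dict) -> dict:
    # Stand-in validation: require a target endpoint.
    if "endpoint" not in config:
        raise ValueError("config must name a target endpoint")
    return config

def load_taxonomy(config: dict) -> list[str]:
    # Stand-in: pull policy categories from the config.
    return config.get("taxonomy", [])

def generate_seed_data(categories: list[str]) -> list[dict]:
    # Stand-in for SDG (synthetic data generation).
    return [{"category": c, "seed": f"seed for {c}"} for c in categories]

def build_prompts(seeds: list[dict]) -> list[str]:
    return [s["seed"] for s in seeds]

def scan(prompts: list[str]) -> list[dict]:
    # Stand-in for the garak scan step.
    return [{"prompt": p, "hit": False} for p in prompts]

def parse(results: list[dict]) -> dict:
    # Stand-in for report parsing/scoring.
    return {"total": len(results), "hits": sum(r["hit"] for r in results)}

def run_pipeline(config: dict) -> dict:
    cfg = validate(config)
    categories = load_taxonomy(cfg)
    seeds = generate_seed_data(categories)
    prompts = build_prompts(seeds)
    results = scan(prompts)
    return parse(results)
```

The point is the strict linear hand-off between steps: each stage consumes the previous stage's output, which is why the flow maps naturally onto a KFP DAG of components.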
## Code Layout

```
src/llama_stack_provider_trustyai_garak/
├── core/                      # Shared logic used by ALL modes
│   ├── config_resolution.py   # Deep-merge user overrides onto benchmark profiles
│   ├── command_builder.py     # Build garak CLI args for OpenAI-compatible endpoints
│   ├── garak_runner.py        # Subprocess runner for garak CLI
│   └── pipeline_steps.py      # Six-step pipeline (validate→taxonomy→SDG→prompts→scan→parse)
│
├── inline/                    # Llama Stack Inline mode
│   ├── garak_eval.py          # Async adapter wrapping garak subprocess
│   └── provider.py            # Provider spec with pip dependencies
│
├── remote/                    # Llama Stack Remote KFP mode
│   ├── garak_remote_eval.py   # Async adapter managing KFP job lifecycle
│   └── kfp_utils/             # KFP pipeline DAG and @dsl.component steps
│
├── evalhub/                   # Eval-Hub integration (NO Llama Stack dependency)
│   ├── garak_adapter.py       # FrameworkAdapter: benchmark resolution, intents overlay, callbacks
│   ├── kfp_adapter.py         # KFP-specific adapter (forces KFP execution mode)
│   ├── kfp_pipeline.py        # Eval-hub KFP pipeline with S3 artifact flow
│   └── s3_utils.py            # S3/Data Connection client
│
├── base_eval.py               # Shared Llama Stack eval lifecycle (NOT used by eval-hub)
├── garak_command_config.py    # Pydantic models for garak YAML config
├── intents.py                 # Policy taxonomy dataset loading (SDG/intents flows)
├── sdg.py                     # Synthetic data generation via sdg-hub
├── result_utils.py            # Parse garak outputs, TBSA scoring, HTML reports
└── resources/                 # Jinja2 templates and Vega chart specs
```

## Key Conventions

- **Config merging**: User overrides are deep-merged onto benchmark profiles via
  `deep_merge_dicts` in `core/config_resolution.py`. Only leaf values are replaced.
- **Intents model overlay**: When `intents_models` is provided, model endpoints
  are applied with the `x.get("key") or default` pattern — it fills empty slots but
  preserves user-configured values. `api_key` is always forced to `__FROM_ENV__`
  (K8s Secret injection).
- **Benchmark profiles**: Predefined configs live in `base_eval.py` (Llama Stack)
  and `evalhub/garak_adapter.py` (eval-hub). The `intents` profile is the most
  complex — it includes TAPIntent, MulticlassJudge, and SDG configuration.
- **Provider specs**: `inline/provider.py` and `remote/provider.py` define Llama
  Stack provider specs. `pip_packages` is auto-populated from `get_garak_version()`.

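The first two conventions can be sketched in a few lines of Python. This mirrors the described behavior in spirit only — the real implementations live in `core/config_resolution.py` and the eval-hub adapter, and the field names in the overlay example are hypothetical.

```python
# Sketch of the two config-merge conventions, assuming the behavior
# described above (not copied from the repo's actual source).

def deep_merge_dicts(base: dict, override: dict) -> dict:
    """Recursively merge override onto base; only leaf values are replaced."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge_dicts(merged[key], value)  # recurse into dicts
        else:
            merged[key] = value  # leaf: the override wins
    return merged

def overlay_intents_model(model: dict, defaults: dict) -> dict:
    """The `x.get("key") or default` pattern: fill empty slots, keep user values.
    Field names here are illustrative; api_key is always forced to __FROM_ENV__
    so the real key is injected from a K8s Secret at runtime."""
    return {
        "base_url": model.get("base_url") or defaults["base_url"],
        "name": model.get("name") or defaults["name"],
        "api_key": "__FROM_ENV__",
    }
```

Note the asymmetry: deep-merge replaces only the leaves the user actually set, while the overlay treats empty strings and missing keys alike (falsy values fall through to the default).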
## Build & Install

```bash
pip install -e .              # Core (Llama Stack remote mode)
pip install -e ".[inline]"    # With garak for local scans
pip install -e ".[dev]"       # Dev (tests + ruff + pre-commit)
```

## Running Tests

```bash
make test       # All tests (no cluster/GPU/network needed)
make coverage   # With coverage report
make lint       # ruff check
```

Tests are 100% unit tests. Garak is mocked — it does not need to be installed.

## Debugging

- `GARAK_SCAN_DIR` — controls where scan artifacts land
- `LOG_LEVEL=DEBUG` — verbose eval-hub adapter logging
- `scan.log` in scan directory — garak subprocess output
- `__FROM_ENV__` in configs — placeholder for K8s Secret api_key injection
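
Put together, a local debugging session might set these variables before launching a scan. The directory path and the `<scan-id>` subdirectory below are illustrative placeholders, not defaults shipped by the repo.

```shell
# Hypothetical debugging setup; the path is an example, not a default.
export GARAK_SCAN_DIR=/tmp/garak-scans   # scan artifacts land here
export LOG_LEVEL=DEBUG                   # verbose eval-hub adapter logging
mkdir -p "$GARAK_SCAN_DIR"

# After a run, inspect the garak subprocess output:
#   less "$GARAK_SCAN_DIR"/<scan-id>/scan.log
```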