sync: main to incubation by github-actions[bot] · Pull Request #69 · opendatahub-io/llama-stack-provider-trustyai-garak

github-actions · 2026-03-12T22:00:57Z

sync-branches: New code has just landed in main, so let's bring incubation up to speed!

* feat: eval-hub-sdk integration poc
* fix: update probe_tags parameter type
* fix(garak): install eval-hub SDK in Containerfile
* fix(garak): Update Job Spec to new location
* fix(garak): align adapter with recent evalhub SDK contracts
* fix(garak): resolve benchmark_id to probe profile in adapter
* fix(garak): read registry settings from env vars directly
* fix(garak): align adapter with eval-hub SDK latest (OCIArtifactSpec, DefaultCallbacks)
* fix(garak): instantiate GarakScanConfig to access Pydantic model fields
* feat(garak): Enhanced GarakAdapter to build and utilize the new configuration structure
* Empty commit

---------

Co-authored-by: saichandrapandraju <saichandrapandraju@gmail.com>

The Jinja2 template and Vega chart specs under resources/ were not included when building the package because setuptools only discovers Python packages by default. Add package-data configuration so that resources/* is bundled with the distribution.

…el logic. parse_detector now mirrors _is_rejected from earlystop.py: any single safe score from any detector in any generation makes the attempt "refused". Only when every score exceeds the threshold is the attempt "complied". The Vega chart's max-across-attempts aggregation then matches _update_attempt_status.

- Friendly attack and scenario names
- Charts cosmetic improvements
- Strip newlines from stubs to prevent the prompts being split on stub loading (expects one per line).
- Add funnel property tests verifying refused(stage N) == total(stage N+1) and max-across-attempts aggregation.
- Add per-strategy subsections, each containing a summary table and a variant breakdown appropriate to the probe type.
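
The refusal rule and the max-across-attempts aggregation described in this commit message can be illustrated with a minimal sketch. It is not the project's `parse_detector` or `_update_attempt_status`; the function names, the input shape (detector name mapped to a list of per-generation scores), the 0.5 threshold, and the choice to reject empty input are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical ranking used for aggregation: "complied" outranks "refused".
_RANK = {"refused": 0, "complied": 1}


def classify_attempt(scores_by_detector: Dict[str, List[float]],
                     threshold: float = 0.5) -> str:
    """Label one attempt from its detector scores.

    A single score at or below the threshold (a "safe" score) from any
    detector in any generation makes the attempt "refused". The attempt is
    "complied" only when every score exceeds the threshold.
    """
    all_scores = [s for scores in scores_by_detector.values() for s in scores]
    if not all_scores:
        # Assumption of this sketch: refuse to guess when there are no scores.
        raise ValueError("no detector scores to classify")
    if any(score <= threshold for score in all_scores):
        return "refused"
    return "complied"


def aggregate_attempts(outcomes: List[str]) -> str:
    """Max-across-attempts: one "complied" attempt marks the group "complied"."""
    return max(outcomes, key=_RANK.__getitem__)
```

For example, `classify_attempt({"d1": [0.9, 0.2], "d2": [0.95]})` yields "refused" because of the single 0.2 score, while a group of attempts labelled `["refused", "complied"]` aggregates to "complied".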

Implements test coverage for issue trustyai-explainability#113 to verify that shields work correctly with intents (ART) benchmarks. The test suite covers:

- Configuration tests for shield_ids and shield_config
- Validation tests for shields API requirements and error handling
- Integration tests for the full workflow (register, validate, build command)
- Tests verifying function-based generator configuration with shield mappings

Also fixes a pytest configuration typo (python_paths → pythonpath).

Fixes trustyai-explainability#113

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED

Addresses code review comments by:

- Extracting a shared `create_adapter` context manager to reduce the repeated GarakRemoteEvalAdapter initialization pattern with patch.object calls
- Adding a `create_benchmark_config` factory helper for reusable test BenchmarkConfig instances
- Removing the duplicate `remote_config` fixture from the TestIntentsWithShieldsValidation class (it now uses the module-level fixture)
- Updating all test methods to use the new helpers

This makes the test suite more maintainable and DRY while keeping all 7 tests passing.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
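
The helper pattern named in this commit message (a context manager that wraps `patch.object` setup, plus a config factory) might look roughly like the sketch below. It does not reproduce the repository's helpers: `StandInAdapter`, its `_build_client` method, the dict-based config, and the default values are hypothetical stand-ins, since the real GarakRemoteEvalAdapter constructor and BenchmarkConfig fields are not shown here.

```python
from contextlib import contextmanager
from unittest.mock import patch


class StandInAdapter:
    """Hypothetical stand-in for the real adapter class."""

    def __init__(self, config: dict):
        self.config = config
        self.client = self._build_client()

    def _build_client(self):
        # Placeholder for setup that a unit test should not perform.
        raise RuntimeError("would contact a remote service")


@contextmanager
def create_adapter(config: dict):
    """Yield an adapter built while client construction is patched out."""
    with patch.object(StandInAdapter, "_build_client",
                      return_value="fake-client"):
        yield StandInAdapter(config)


def create_benchmark_config(**overrides) -> dict:
    """Factory helper: defaults that an individual test can override."""
    config = {"benchmark_id": "intents", "shield_ids": []}
    config.update(overrides)
    return config
```

A test then shrinks to `with create_adapter(create_benchmark_config(shield_ids=["s1"])) as adapter: ...`, keeping the patching in one place instead of repeating it in every test method.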

tarilabs

main -> incubation sync approval.

saichandrapandraju and others added 30 commits February 17, 2026 19:24

feat: update docs and demos

3852873

Merge pull request #90 from saichandrapandraju/docs-update

fc7d1a2

feat(evalhub): Add preliminary KFP execution mode for evalhub garak i…

43f9e45

…ntegration

address sourcery comments

279d94f

Update default Garak provider image to the latest version

002c13e

Merge pull request trustyai-explainability#96 from trustyai-explainab…

998e34f

…ility/evalhub-kfp-poc feat(evalhub): Add preliminary KFP execution mode to eval-hub Garak adapter

feat(evalhub): Add dedicated KFP entrypoint for EvalHub adapter

52521fc

feat: added pre-process step that takes a dataset in input and genera…

4622fad

…tes Garak typology and intent stubs

feat: added basic Automated Red Teaming report

a2bca0a

fix: failing test in Github Actions

967b1b5

feat: bring latest vega chart from AIMI

63db430

feat: added second chart in report

1c334a5

feat: using PatternFly for UI styling, added high level stats

6cacec6

feat: added probes from run setup

f372545

better example jsonl run

feat: add happy path KFP integration with latest garak provider changes

3c811ec

fix: parsing latest output from Garak and charts in the report

fa3a0b9

fix: report navigation and header

e0226c8

fix: vega chart and better test example

035ce54

fix: parse_generations_from_report_content for ART report + test

773c276

fix: failing test after pointing Garak to our midstream

9bda683

Source intent description from dataset column with configurable argument

7db76bb

Sanitize category/intent ids to match Garak's validation

14cc746

feat: Add user-provided intents dataset flow + fix metric calc with i…

dacfa00

…ntents

fix: match KFP asr metric log with html report for intents probes

35b09c4

fix: update intent_spec field to default to an empty string for nativ…

2595156

…e probes to work

feat: Integrate Synthetic Data Generation (SDG) support for intent da…

8fec24c

…tasets

Updated test_intents_aggregates_match_high_level_stats to match renam…

97136b5

…ed high_level_stats labels ("Jailbroken questions" / "Safe questions") and pass earlystop_data for full-pipeline comparison.

saichandrapandraju and others added 23 commits March 11, 2026 12:05

feat(evalhub): Add MLflow artifact saving functionality

bf130ea

add warning if _read_s3_credentials_from_secret returns empty

b11737b

Merge pull request trustyai-explainability#122 from trustyai-explaina…

6ffa665

…bility/mlflow-callback feat(evalhub): Add MLflow artifact saving functionality

Merge pull request trustyai-explainability#121 from trustyai-explaina…

8bb2276

…bility/fix-api-keys fix: Secure model API key handling via Kubernetes Secrets with volume mount

Merge pull request trustyai-explainability#123 from trustyai-explaina…

897238c

…bility/test-intents-shields Add comprehensive tests for intents benchmarks with shields

Merge pull request trustyai-explainability#120 from trustyai-explaina…

7a6f77b

…bility/remove-evalhub-kfp-prefix Remove EVALHUB_ prefix from KFP environment variables

Update default detector for Garak intents

c1fab0a

Update TAPIntent probe defaults

0d3f590

Update intents benchmark name and description

ea90052

Update intents benchmark description

18b8727

Merge pull request trustyai-explainability#124 from trustyai-explaina…

31f081c

…bility/art-defaults Update default detector for Garak intents

introduce _GarakCallbacks to surface S3 artifact URLs in job response

9e4d403

bump version to 0.3.0

f745ca0

limit lls to 0.6.0

410399c

Merge pull request trustyai-explainability#126 from trustyai-explaina…

e9ee137

…bility/bump-0.3.0 bump version to 0.3.0

add requirements-inline-extra.txt and update pyproject.toml to fix py…

6eba57f

…pi release

fix tests

a276bb4

Merge pull request trustyai-explainability#127 from trustyai-explaina…

10fa8bc

…bility/pypi-publish-fix add requirements-inline-extra.txt and update pyproject.toml to fix pypi release

log artifact reporting failures and fallback to default reporting method

1a3bf14

Merge pull request trustyai-explainability#125 from trustyai-explaina…

fbd360a

…bility/artifacts-evalhub introduce _GarakCallbacks to surface S3 artifact URLs in job response

Merge pull request #67 from trustyai-explainability/main

d5f489d

[pull] main from trustyai-explainability:main

github-actions bot added the bot/sync-incubation and tide/merge-method-merge labels Mar 12, 2026

saichandrapandraju approved these changes Mar 12, 2026

saichandrapandraju requested review from ruivieira and tarilabs March 12, 2026 22:09

tarilabs approved these changes Mar 13, 2026

saichandrapandraju merged commit 22bb9fe into incubation Mar 13, 2026
3 of 4 checks passed

saichandrapandraju merged 85 commits into incubation from main

6 participants
