sync: main to incubation #69

Merged
saichandrapandraju merged 85 commits into incubation from main on Mar 13, 2026

@github-actions

sync-branches: New code has just landed in main, so let's bring incubation up to speed!

saichandrapandraju and others added 30 commits February 17, 2026 19:24
* feat: eval-hub-sdk integration poc

* fix: update probe_tags parameter type

* fix(garak): install eval-hub SDK in Containerfile

* fix(garak): Update Job Spec to new location

* fix(garak): align adapter with recent evalhub SDK contracts

* fix(garak): resolve benchmark_id to probe profile in adapter

* fix(garak): read registry settings from env vars directly

* fix(garak): align adapter with eval-hub SDK latest (OCIArtifactSpec, DefaultCallbacks)

* fix(garak): instantiate GarakScanConfig to access Pydantic model fields

* feat(garak): Enhanced GarakAdapter to build and utilize the new configuration structure

* Empty commit

---------

Co-authored-by: saichandrapandraju <saichandrapandraju@gmail.com>
…ility/evalhub-kfp-poc

feat(evalhub): Add preliminary KFP execution mode to eval-hub Garak adapter
better example jsonl run
The Jinja2 template and Vega chart specs under resources/ were not included when building the package because setuptools only discovers Python packages by default. Add package-data configuration so that resources/* is bundled with the distribution.
…el logic. parse_detector now mirrors _is_rejected from earlystop.py: any single safe score from any detector in any generation makes the attempt "refused". Only when every score exceeds the threshold is the attempt "complied". The Vega chart's max-across-attempts aggregation then matches _update_attempt_status.
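
The refused/complied rule described above can be sketched roughly as follows. This is an illustration only, not the actual `parse_detector` or `_is_rejected` code: the function name `classify_attempt`, the input layout (detector name mapped to one score per generation), the 0.5 threshold, and the handling of an attempt with no scores are all assumptions.

```python
from typing import Mapping, Sequence

THRESHOLD = 0.5  # assumed cut-off; the real value is not stated in this PR


def classify_attempt(
    detector_scores: Mapping[str, Sequence[float]],
    threshold: float = THRESHOLD,
) -> str:
    """Return "refused" if any score is safe, otherwise "complied"."""
    # Flatten every detector's per-generation scores into one list.
    scores = [s for per_gen in detector_scores.values() for s in per_gen]
    if not scores:
        # Assumption: with no scores there is no evidence of compliance.
        return "refused"
    # Only when every single score exceeds the threshold did the model comply.
    return "complied" if all(s > threshold for s in scores) else "refused"
```

Under these assumptions a score exactly at the threshold counts as safe, so the attempt is "refused".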

- Friendly attack and scenario names
- Charts cosmetic improvements
- Strip newlines from stubs to prevent prompts from being split during stub loading (the loader expects one prompt per line).
- Add funnel property tests verifying refused(stage N) == total(stage N+1) and max-across-attempts aggregation.
- Add per-strategy subsections, each containing a summary table and a variant breakdown appropriate to the probe type.
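
For illustration, the funnel property and the max-across-attempts aggregation mentioned in the list above might be checked along these lines. The `Stage` dataclass, its field names, and both helper functions are hypothetical and are not taken from the repository's tests.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str      # hypothetical stage label
    total: int     # prompts entering this stage
    complied: int  # prompts where at least one attempt complied

    @property
    def refused(self) -> int:
        return self.total - self.complied


def prompt_complied(attempt_outcomes: List[str]) -> bool:
    # Max-across-attempts: a single complied attempt marks the whole prompt.
    return max((o == "complied" for o in attempt_outcomes), default=False)


def funnel_is_consistent(stages: List[Stage]) -> bool:
    # Every prompt refused at stage N should be retried at stage N+1.
    return all(a.refused == b.total for a, b in zip(stages, stages[1:]))
```

A funnel such as 10 total / 4 complied followed by 6 total / 1 complied passes the check, because the 6 refused prompts of the first stage are exactly the input of the second.
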
…ed high_level_stats labels ("Jailbroken questions" / "Safe questions") and pass earlystop_data for full-pipeline comparison.
saichandrapandraju and others added 23 commits March 11, 2026 12:05
…bility/mlflow-callback

feat(evalhub): Add MLflow artifact saving functionality
Implements test coverage for issue trustyai-explainability#113 to verify that shields work correctly
with intents (ART) benchmarks. The test suite covers:

- Configuration tests for shield_ids and shield_config
- Validation tests for shields API requirements and error handling
- Integration tests for the full workflow (register, validate, build command)
- Tests verify function-based generator configuration with shield mappings

Also fixes pytest configuration typo (python_paths → pythonpath).

Fixes trustyai-explainability#113

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
…bility/fix-api-keys

fix: Secure model API key handling via Kubernetes Secrets with volume mount
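
As a rough sketch of what reading a key from a Secret mounted as a volume can look like on the consuming side: the mount path, the `MODEL_API_KEY_FILE` environment variable, and the `read_api_key` helper below are assumptions for illustration and may not match what this commit actually does.

```python
import os
from pathlib import Path
from typing import Optional

# Hypothetical mount point; the real path is defined by the pod spec.
DEFAULT_KEY_PATH = "/mnt/secrets/model/api-key"


def read_api_key(path: Optional[str] = None) -> str:
    """Read the API key from a file instead of a plain environment variable."""
    # MODEL_API_KEY_FILE is an assumed variable name, used only as an override.
    key_file = Path(path or os.environ.get("MODEL_API_KEY_FILE", DEFAULT_KEY_PATH))
    # Secret files often end with a trailing newline, so strip whitespace.
    key = key_file.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"API key file {key_file} is empty")
    return key
```

Reading from a file keeps the key value itself out of the process environment and the job spec.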
Addresses code review comments by:
- Extracting shared `create_adapter` context manager to reduce repeated
  GarakRemoteEvalAdapter initialization pattern with patch.object calls
- Adding `create_benchmark_config` factory helper for reusable test
  BenchmarkConfig instances
- Removing duplicate `remote_config` fixture from
  TestIntentsWithShieldsValidation class (uses module-level fixture)
- Updating all test methods to use the new helpers

This makes the test suite more maintainable and DRY while keeping all
7 tests passing.
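
A shared context manager of this kind could look roughly like the sketch below. `FakeAdapter` is a stand-in for `GarakRemoteEvalAdapter`, and the `_build_client` method being patched is hypothetical; the real constructor and the attributes the tests patch are not shown in this PR.

```python
from contextlib import contextmanager
from typing import Iterator
from unittest.mock import MagicMock, patch


class FakeAdapter:
    """Stand-in for GarakRemoteEvalAdapter (assumed shape, for illustration)."""

    def __init__(self) -> None:
        self.client = self._build_client()

    def _build_client(self):
        # Hypothetical collaborator that would perform network I/O.
        raise RuntimeError("would contact a remote service")


@contextmanager
def create_adapter() -> Iterator[FakeAdapter]:
    # Patch the expensive collaborator only for the lifetime of the with-block.
    with patch.object(FakeAdapter, "_build_client", return_value=MagicMock()):
        yield FakeAdapter()
```

Each test can then write `with create_adapter() as adapter:` instead of repeating the `patch.object` setup, and the patch is undone when the block exits.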

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

rh-pre-commit.version: 2.3.2
rh-pre-commit.check-secrets: ENABLED
…bility/test-intents-shields

Add comprehensive tests for intents benchmarks with shields
…bility/remove-evalhub-kfp-prefix

Remove EVALHUB_ prefix from KFP environment variables
…bility/art-defaults

Update default detector for Garak intents
…bility/pypi-publish-fix

add requirements-inline-extra.txt and update pyproject.toml to fix pypi release
…bility/artifacts-evalhub

introduce _GarakCallbacks to surface S3 artifact URLs in job response
[pull] main from trustyai-explainability:main
Member

@tarilabs tarilabs left a comment

main -> incubation sync approval.

@saichandrapandraju saichandrapandraju merged commit 22bb9fe into incubation Mar 13, 2026
3 of 4 checks passed

6 participants