Proposal: optional audit-safe metadata for federated evaluation runs #688

## Proposal

Add an optional, audit-safe federated evaluation metadata sidecar for MedPerf runs.

MedPerf already centers privacy-preserving federated evaluation, benchmark committee governance, and transparent reporting. A small optional metadata envelope would make it easier for sites, benchmark committees, and reviewers to understand what happened in a run without exposing patient data, PHI/PII, private filesystem paths, tokens, or full sensitive arguments.

This would be a docs/example-first addition, not a scoring change and not an AANA dependency.

## Suggested Shape

```json
{
  "schema_version": "medperf.federated_eval_audit.v1",
  "benchmark_uid": "benchmark:example",
  "dataset_uid": "dataset:redacted-or-hashed",
  "model_uid": "model:example",
  "result_uid": "result:example",
  "site_ref": "site:redacted-or-hashed",
  "workflow_stage": "dataset_preparation | association_test | model_execution | metrics_evaluation | result_submission",
  "container_refs": [
    {
      "kind": "data_preparator | model | metrics",
      "image_digest": "sha256:..."
    }
  ],
  "artifacts": {
    "result_paths": ["relative/or/redacted/path"],
    "metadata_paths": ["relative/or/redacted/path"]
  },
  "privacy_controls": {
    "raw_patient_data_logged": false,
    "phi_or_pii_in_public_log": false,
    "redaction_status": "safe_for_public_log"
  },
  "evidence_refs": [
    {
      "source_id": "local-run:redacted-id",
      "kind": "federated_eval_run",
      "trust_tier": "site_reported",
      "redaction_status": "safe_for_public_log"
    }
  ],
  "claim_status": "diagnostic | committee_reviewed | reportable"
}
```
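As a rough illustration of how a site or reviewer might sanity-check a record of this shape, here is a minimal Python sketch. It treats the pipe-separated strings in the example above as enumerations of allowed values. The function name `validate_audit_record`, the digest format check, and the rule that both privacy flags must be `false` are assumptions made for this sketch, not part of any existing MedPerf API.

```python
import re

# Allowed values copied from the pipe-separated options in the example record.
WORKFLOW_STAGES = {
    "dataset_preparation", "association_test", "model_execution",
    "metrics_evaluation", "result_submission",
}
CONTAINER_KINDS = {"data_preparator", "model", "metrics"}
CLAIM_STATUSES = {"diagnostic", "committee_reviewed", "reportable"}

# Assumption: image digests are sha256 with 64 lowercase hex characters.
DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def validate_audit_record(record: dict) -> list[str]:
    """Return human-readable problems; an empty list means the record passed."""
    problems = []
    if record.get("schema_version") != "medperf.federated_eval_audit.v1":
        problems.append("unexpected schema_version")
    if record.get("workflow_stage") not in WORKFLOW_STAGES:
        problems.append("workflow_stage is not one of the listed stages")
    if record.get("claim_status") not in CLAIM_STATUSES:
        problems.append("claim_status is not one of the listed statuses")
    for i, ref in enumerate(record.get("container_refs", [])):
        if ref.get("kind") not in CONTAINER_KINDS:
            problems.append(f"container_refs[{i}].kind is not recognized")
        if not DIGEST_RE.match(str(ref.get("image_digest", ""))):
            problems.append(f"container_refs[{i}].image_digest is not a sha256 digest")
    # Hypothetical rule: a public record must explicitly set both flags to false.
    controls = record.get("privacy_controls", {})
    for flag in ("raw_patient_data_logged", "phi_or_pii_in_public_log"):
        if controls.get(flag) is not False:
            problems.append(f"privacy_controls.{flag} must be false")
    return problems
```

Returning a list of problems rather than raising on the first one lets a site fix a record in a single pass. A JSON Schema file would serve the same purpose if maintainers prefer a declarative check.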

## Why This Helps

- Gives benchmark committees a lightweight provenance record for runs and artifacts.
- Helps sites attest that public logs and results do not contain raw patient data, PHI/PII, tokens, or private paths.
- Separates diagnostic/internal runs from committee-reviewed/reportable results.
- Supports reproducibility review without changing benchmark scoring or requiring new runtime dependencies.
- Aligns with MedPerf's federated evaluation model, where useful audit records should be safe to share across organizational boundaries.
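To make the `redacted-or-hashed` placeholders in `dataset_uid` and `site_ref` concrete, one option is a keyed hash, so that a record stays linkable across runs from the same site while the underlying identifier is not exposed. The sketch below is only one possible approach. The helper name `pseudonymous_ref`, the site-held secret, and the 16-character truncation are all assumptions for illustration and are not existing MedPerf behavior.

```python
import hashlib
import hmac


def pseudonymous_ref(prefix: str, identifier: str, site_secret: bytes) -> str:
    """Derive a stable reference such as 'site:<hex>' from a private identifier.

    A keyed hash (HMAC-SHA256) is used instead of a plain hash because short,
    guessable identifiers could otherwise be recovered by hashing candidates.
    """
    digest = hmac.new(site_secret, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncating to 16 hex characters is an arbitrary readability choice.
    return f"{prefix}:{digest[:16]}"
```

The secret would stay with the site. Anyone without it cannot confirm a guess at the original identifier, but the same site produces the same reference on every run.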

## Initial Scope

A minimal first PR could add:

1. `docs/examples/federated_eval_audit.example.json` or a similarly placed example file.
2. A short docs note explaining that this metadata is optional and non-normative.
3. Guidance that public audit records must be redacted and must not include raw patient data, PHI/PII, tokens, private filesystem paths, private account IDs, or full sensitive arguments.
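The redaction guidance in the list above could be backed by a small pre-publication check. The following Python sketch walks a decoded record and flags string values that look like absolute paths or embedded credentials. The pattern set and the function name `find_redaction_issues` are hypothetical and intentionally narrow. A check like this cannot detect PHI/PII in general and would complement, not replace, site review.

```python
import re

# Hypothetical patterns; a real deployment would tune these per site.
SUSPICIOUS_PATTERNS = {
    # Unix absolute paths, home-relative paths, and Windows drive paths.
    "absolute_path": re.compile(r"^(/|~|[A-Za-z]:[\\/])"),
    "bearer_token": re.compile(r"(?i)\bbearer\s+\S+"),
    "secret_assignment": re.compile(
        r"(?i)\b(token|password|secret|api[_-]?key)\s*[=:]\s*\S+"
    ),
}


def find_redaction_issues(value, path="$"):
    """Walk a decoded JSON value and yield (json_path, pattern_name) pairs."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from find_redaction_issues(child, f"{path}.{key}")
    elif isinstance(value, list):
        for index, child in enumerate(value):
            yield from find_redaction_issues(child, f"{path}[{index}]")
    elif isinstance(value, str):
        for name, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(value):
                yield (path, name)
```

A site could run this over the sidecar before publishing and refuse to upload if anything is reported, for example `issues = list(find_redaction_issues(record))`.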

If maintainers think this belongs in a different MedPerf concept, naming scheme, or workflow stage, I can adjust the proposal before opening a PR.
