Model Transfer and Attestation Lifecycle
Models get moved around. How should we handle their supply chain security artifacts?
Let’s follow a model from its original training, to Hugging Face, to consumption by a corporate user, and along its path through downstream systems.
For each stage, we’ll show two depictions of the model file layout: one using standalone sigstore bundles and another using a unified JSON Lines file.
Recall that, in both cases, an individual attestation is always a sigstore bundle. In the standalone layout, each sigstore bundle is its own file. In the unified .jsonl layout, each line of the file is a sigstore bundle. Either way, every sigstore bundle contains an in-toto statement whose subject is the root hash of the model, which can be computed with OMS tooling (see also #565), and whose predicate is the OMS signature predicate, one of the well-known in-toto predicates, or a custom predicate bespoke to the producer.
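As a rough illustration of the structure just described, the Python sketch below reads the contents of a unified .jsonl file and lists the predicate type and subject digests found on each line. It assumes each bundle carries its in-toto statement as the base64-encoded payload of a `dsseEnvelope` field. It performs no signature verification, and the function names are hypothetical, not part of OMS tooling.

```python
import base64
import json


def read_statements(jsonl_text):
    """Yield the in-toto statement carried by each bundle line.

    Assumption: every non-empty line is a sigstore bundle whose
    "dsseEnvelope"."payload" is a base64-encoded in-toto statement.
    Nothing here checks signatures.
    """
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between bundles
        bundle = json.loads(line)
        payload = base64.b64decode(bundle["dsseEnvelope"]["payload"])
        yield json.loads(payload)


def summarize(jsonl_text):
    """Return (predicateType, [subject digests]) for every attestation."""
    return [
        (stmt["predicateType"], [subj["digest"] for subj in stmt["subject"]])
        for stmt in read_statements(jsonl_text)
    ]
```

Calling `summarize(open("model.sigstore.jsonl").read())` would give one entry per attestation, which is enough to see at a glance which kinds of attestations travel with the model.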
Proposal: we should encourage the Unified Bundle Layout, which uses a single model.sigstore.jsonl file to which new signed attestations are appended.
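To make the append operation concrete, here is a minimal Python sketch. It assumes the bundle has already been produced and signed by some other tool and is available as a parsed JSON object; `append_bundle` is a hypothetical helper, not an existing OMS or sigstore API.

```python
import json
import os
from pathlib import Path


def append_bundle(jsonl_path, bundle):
    """Append one already-signed bundle as a single line of the .jsonl file."""
    path = Path(jsonl_path)
    # json.dumps without indent never emits a raw newline, so the
    # bundle is guaranteed to occupy exactly one line.
    line = json.dumps(bundle, separators=(",", ":"))

    # If the existing file does not end with a newline, add one first so
    # the new bundle does not get glued onto the previous line.
    needs_newline = False
    if path.exists() and path.stat().st_size > 0:
        with path.open("rb") as f:
            f.seek(-1, os.SEEK_END)
            needs_newline = f.read(1) != b"\n"

    with path.open("a", encoding="utf-8", newline="\n") as f:
        if needs_newline:
            f.write("\n")
        f.write(line + "\n")
```

Because existing lines are never rewritten, earlier attestations stay byte-for-byte identical as new ones are added.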
Origin
A model may originally be produced on a workstation or in a pipeline. There, its producer can sign it, generating an OMS signature as a sigstore bundle.
Individual Files Layout                 Unified Bundle Layout
──────────────────────────────          ────────────────────────────────────
origin-workspace/                       origin-workspace/
├── model.safetensors                   ├── model.safetensors
├── config.json                         ├── config.json
└── model.sig                           └── model.sigstore.jsonl
    # OMS signature                         # Contains:
    # (signed by [email protected])         # - OMS signature (signed by [email protected])
SBOM and SLSA
The producer may also generate an SBOM attestation and a SLSA provenance attestation, both as sigstore bundles.
Individual Files Layout                 Unified Bundle Layout
──────────────────────────────          ────────────────────────────────────
origin-workspace/                       origin-workspace/
├── model.safetensors                   ├── model.safetensors
├── config.json                         ├── config.json
├── model.sig                           └── model.sigstore.jsonl
├── model.slsa.sigstore.json                # Appended with:
│   # NEW: SLSA provenance                  # - SLSA provenance (signed by [email protected])
│   # (signed by [email protected])         # - SPDX SBOM (signed by [email protected])
└── model.spdx.sigstore.json
    # NEW: SPDX SBOM
    # (signed by [email protected])
Public Registry
The producer then pushes to Hugging Face, which, through some feature of the Hugging Face Hub, signs the model files that it received. (See also Note 1 and Note 2 at the bottom.)
Individual Files Layout                 Unified Bundle Layout
──────────────────────────────          ────────────────────────────────────
huggingface.co/author/model-name/       huggingface.co/author/model-name/
├── model.safetensors                   ├── model.safetensors
├── config.json                         ├── config.json
├── model.sig                           └── model.sigstore.jsonl
├── model.slsa.sigstore.json                # Appended with:
├── model.spdx.sigstore.json                # - Registry OMS signature (signed by [email protected])
└── model.huggingface.sig
    # NEW: Registry OMS signature
    # (signed by [email protected])
Enterprise Security
An enterprise security team or pipeline pulls the model, verifies all upstream attestations, and generates a VSA. (See also Note 3 at the bottom.)
The enterprise security team may choose to drop some of the upstream attestations from the repo or from the model.sigstore.jsonl file as they see fit.
Individual Files Layout                 Unified Bundle Layout
──────────────────────────────          ────────────────────────────────────
local-workspace/                        local-workspace/
├── model.safetensors                   ├── model.safetensors
├── config.json                         ├── config.json
├── model.sig                           └── model.sigstore.jsonl
├── model.slsa.sigstore.json                # Appended with:
├── model.spdx.sigstore.json                # - Security team OMS Signature (signed by [email protected])
├── model.huggingface.sig                   # - VSA (signed by [email protected])
├── model.mlsec.sigstore.json               # All or some upstream attestations preserved
│   # NEW: Security team OMS signature
│   # (signed by [email protected])
└── model.vsa.sigstore.json
    # NEW: VSA
    # (signed by [email protected])
    # References all or some upstream attestations
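Dropping upstream attestations from the unified file amounts to removing whole lines. The Python sketch below keeps only the bundles whose predicate type is on an allow-list. It assumes the in-toto statement sits base64-encoded in each bundle's `dsseEnvelope` payload; the helper names are hypothetical, and the sketch neither verifies signatures nor accounts for the all-or-nothing constraint described in Note 4.

```python
import base64
import json


def predicate_type(bundle_line):
    """Return the predicateType of the in-toto statement in one bundle line."""
    bundle = json.loads(bundle_line)
    payload = base64.b64decode(bundle["dsseEnvelope"]["payload"])
    return json.loads(payload)["predicateType"]


def keep_attestations(jsonl_text, keep_types):
    """Return new .jsonl text containing only bundles with an allowed type.

    Kept lines are copied verbatim rather than re-serialized, so the
    retained bundles remain exactly what their signers produced.
    """
    kept = [
        line
        for line in jsonl_text.splitlines()
        if line.strip() and predicate_type(line) in keep_types
    ]
    return "".join(line + "\n" for line in kept)
```

Filtering by signer identity rather than predicate type would follow the same shape, but requires inspecting the verification material in each bundle, which is out of scope for this sketch.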
Enterprise Registry
The enterprise security team pushes the model to the corporate registry for use by others in the company. They push their own VSA and signatures. They may preserve all, some, or none of the upstream attestations. (See Note 4 at the bottom.)
Individual Files Layout                 Unified Bundle Layout
──────────────────────────────          ────────────────────────────────────
enterprise.com/approved/model-name/     enterprise.com/approved/model-name/
├── model.safetensors                   ├── model.safetensors
├── config.json                         ├── config.json
├── model.sig                           └── model.sigstore.jsonl
├── model.slsa.sigstore.json                # Appended with:
├── model.spdx.sigstore.json                # - Registry OMS Signature (signed by [email protected])
├── model.huggingface.sig
├── model.mlsec.sigstore.json
├── model.vsa.sigstore.json
└── model.corp-registry.sigstore.json
    # NEW: Registry OMS Signature
    # (signed by [email protected])
Enterprise Quality
An enterprise quality team assesses models that land in the corporate registry. The upstream model comes with some performance information in its model card, but the team wants to run its own assessments. It pushes its test results back to the registry as a test result attestation.
Individual Files Layout                 Unified Bundle Layout
──────────────────────────────          ────────────────────────────────────
enterprise.com/approved/model-name/     enterprise.com/approved/model-name/
├── model.safetensors                   ├── model.safetensors
├── config.json                         ├── config.json
├── model.sig                           └── model.sigstore.jsonl
├── model.slsa.sigstore.json                # Appended with:
├── model.spdx.sigstore.json                # - Test Result Attestation (signed by [email protected])
├── model.huggingface.sig
├── model.mlsec.sigstore.json
├── model.vsa.sigstore.json
├── model.corp-registry.sigstore.json
└── model.quality.sigstore.json
    # NEW: Test Result Attestation
    # (signed by [email protected])
Inference in Production
Finally, a development team selects the model and refers to it from their AI-enabled application. Before the model is served, the inference server or an admission controller in the production environment verifies that the model bears correct signatures and attestations from both the security team and the quality team.
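The admission decision described above could be sketched as a simple policy check. This Python example assumes that signature verification has already happened elsewhere and produced a list of verified attestations; the `VerifiedAttestation` type, the function names, and the signer identities used in the usage note are all hypothetical.

```python
from typing import NamedTuple


class VerifiedAttestation(NamedTuple):
    """One attestation whose signature was already verified elsewhere."""
    signer: str          # identity taken from the verified signing certificate
    predicate_type: str  # predicateType of the in-toto statement
    subject_digest: str  # model root hash the statement was issued for


def missing_requirements(attestations, model_digest, required):
    """Return the required (signer, predicate_type) pairs that are unmet."""
    present = {
        (a.signer, a.predicate_type)
        for a in attestations
        if a.subject_digest == model_digest  # ignore statements about other models
    }
    return [req for req in required if req not in present]


def admit(attestations, model_digest, required):
    """Admit the model only if every required attestation is present."""
    return not missing_requirements(attestations, model_digest, required)
```

A policy for this scenario might require a VSA from a placeholder identity such as `security@example.com` and a test result attestation from `quality@example.com`; returning the missing pairs, rather than only a boolean, lets the controller report why a model was rejected.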
Notes
- Note 1: Hugging Face doesn’t currently have this feature, but engaging with them to make something like this happen is on the OMS Roadmap.
- Note 2: When a registry receives an upload, it needs to decide which files/hashes it will include in its OMS signature. See problem in the OMS Signatures Tab on .jsonl.
- Note 3: VSA is part of SLSA, but the idea may be generalized in the in-toto documentation as an SVR, see in-toto/attestation!470.
- Note 4: If the original upstream signature includes some of the upstream attestations, it becomes an all-or-nothing decision. The team must retain all if they retain the upstream signature. If they drop one or more upstream attestations, they’ll need to drop the upstream signature as well.
OCI Registry Variant
If the model is copied to an OCI registry, the recommendation is to copy the original model files as layers in an OCI artifact using a tool like oras push or podman artifact. The supply-chain security artifacts (sigstore bundles) should not be included directly. Instead, they should be pushed to the registry as independent OCI artifacts and linked to the model through the OCI 1.1 Referrers API using a command like oras attach. This matters because adding subsequent attestations then does not change the digest of the image manifest of the model OCI artifact, which serves as an otherwise unchanging identifier for the model in the registry.
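The two steps above might be scripted along these lines. The Python sketch only builds the `oras push` and `oras attach` argument lists and leaves execution (for example via `subprocess.run`) to the caller. The registry reference, file names, and artifact type shown in the usage note are assumptions for illustration.

```python
def oras_push_cmd(ref, model_files):
    """Build the command that pushes the model files as one OCI artifact."""
    return ["oras", "push", ref, *model_files]


def oras_attach_cmd(ref, bundle_file, artifact_type):
    """Build the command that attaches one sigstore bundle as a referrer.

    The bundle becomes its own artifact that points at the model, so the
    model's manifest digest is left unchanged.
    """
    return ["oras", "attach", "--artifact-type", artifact_type, ref, bundle_file]


def commands_for(ref, model_files, bundle_files, artifact_type):
    """One push for the model, then one attach per attestation bundle."""
    cmds = [oras_push_cmd(ref, model_files)]
    cmds += [oras_attach_cmd(ref, b, artifact_type) for b in bundle_files]
    return cmds
```

For example, `commands_for("registry.example.com/approved/model-name:v1", ["model.safetensors", "config.json"], ["model.vsa.sigstore.json"], "application/vnd.dev.sigstore.bundle.v0.3+json")` yields one push followed by one attach. The artifact type string here is an assumed value and should be checked against whatever the consuming tooling expects.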