Models can undergo transformation processes that create new model artifacts from existing ones. Common scenarios include:
- Fine-tuning: Adapting a foundation model to a specific domain or task using additional training data
- Quantization: Reducing model precision (e.g., from FP32 to INT8) to optimize for inference performance
- Distillation: Training a smaller "student" model to mimic a larger "teacher" model
- Compression: Removing weights or layers to reduce model size
- Merging: Combining multiple models or LoRA adapters into a single artifact
In each case, we create new model files—a child artifact derived from one or more parent artifacts.
How should supply chain security metadata artifacts relate across the lineage boundary?
OMS Signatures
How should the OMS (OpenSSF Model Signing) signature of the child relate to the OMS signature of the parent?
The child model consists of entirely new files with new content. A quantized INT8 model bears little binary resemblance to its FP32 parent, even though they perform the same function. A fine-tuned model has different weights than its foundation model base.
Position: The child's OMS signature should be completely independent of the parent's OMS signature.
A robust build process may verify the parent's OMS signature at the beginning of the pipeline—confirming authenticity and integrity before beginning the transformation. You want to ensure you're starting from a trusted artifact. However, once that verification passes and the transformation completes, the child artifact is a new, independent entity. The child's OMS signature should cryptographically bind to the child's files, not reference or include the parent's signature.
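This pipeline shape can be sketched in Python. The helper names below are hypothetical (the real OpenSSF model_signing library manages manifests and signatures itself); the sketch only illustrates the position that the child's signature material is derived solely from the child's files:

```python
import hashlib
from pathlib import Path

def file_manifest(model_dir: Path) -> dict[str, str]:
    """Hash every file under the model directory, as an OMS manifest does."""
    return {
        str(p.relative_to(model_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(model_dir.rglob("*"))
        if p.is_file()
    }

def transform_pipeline(parent_dir: Path, child_dir: Path, transform) -> dict[str, str]:
    # 1. Check the parent BEFORE transforming (stand-in for verifying
    #    the parent's OMS signature against its manifest).
    parent_manifest = file_manifest(parent_dir)
    assert parent_manifest, "parent model must contain files to verify"

    # 2. Transform: writes entirely new files into child_dir
    #    (e.g. quantization, fine-tuning).
    transform(parent_dir, child_dir)

    # 3. The child's manifest covers the child's OWN files only; the
    #    parent's manifest and signature are neither referenced nor embedded.
    return file_manifest(child_dir)
```

The returned child manifest is what a child OMS signature would bind to; nothing in it depends on the parent's signature.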
See also #586.
AI SBOMs
How should the AI SBOM of the child relate to the AI SBOM of the parent?
An AI SBOM inventories the components that comprise a model: training datasets, foundation models, dependency libraries, data preprocessing tools, etc. When you create a child model from a parent, the parent becomes a component of the child's supply chain.
Position: The child's AI SBOM should reference the parent model as a component, using the DESCENDANT_OF / ANCESTOR_OF relationship types from SPDX or the ancestors / descendants fields of the pedigree element in CycloneDX.
For example, for a fine-tuned model, the AI SBOM should reflect:
- The foundation model as a component
- The fine-tuning dataset as a component
- Any new dependencies introduced during fine-tuning
For a quantized model, the AI SBOM might be nearly identical to the parent's SBOM, with metadata noting the quantization process.
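As a sketch, the parent relationship can be encoded either as an SPDX 2.3 relationship or via the CycloneDX pedigree element. All identifiers, names, and bom-refs below are illustrative, not taken from a real SBOM:

```python
# SPDX 2.3 style: a relationship entry linking child to parent.
spdx_relationship = {
    "spdxElementId": "SPDXRef-finetuned-model",        # illustrative ID
    "relationshipType": "DESCENDANT_OF",
    "relatedSpdxElement": "SPDXRef-foundation-model",  # illustrative ID
}

# CycloneDX style: the child component carries a pedigree whose
# ancestors list contains the parent model.
cyclonedx_component = {
    "type": "machine-learning-model",
    "bom-ref": "finetuned-model",      # illustrative bom-ref
    "name": "example-finetuned",       # illustrative name
    "pedigree": {
        "ancestors": [
            {"type": "machine-learning-model", "name": "example-foundation"}
        ]
    },
}
```

Either form lets downstream consumers walk the lineage from child to foundation model without coupling the signatures or provenance records themselves.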
SLSA Provenance
How should the SLSA provenance of the child relate to the parent?
SLSA treats ML transformations as builds; when you fine-tune a model, that's a build process. The foundation model is a dependency of that build, just as a Python library would be.
SLSA is not transitive, at least not yet; the new dependency track draft defers verification of dependency provenance to a future iteration of the document.
Position: The child’s SLSA provenance is completely independent of the parent’s SLSA provenance. The child’s build process may choose to verify the provenance of its parent model, but this practice does not correspond with any requirements in SLSA and in any case, the provenance records should remain decoupled.
Note: slsa-framework/slsa#978 makes the case that the SLSA provenance attestation for the child model should reference the parent model as an external dependency (one of resolvedDependencies) of the build process. In our view, however, the right place for a parent model reference is the SBOM of the child model, not the SLSA build provenance.
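For reference, this is a skeleton of a SLSA v1.0 provenance statement showing where issue #978 would place the parent model (under resolvedDependencies); the position above is to record that relationship in the child's SBOM instead. The buildType, URIs, and digest placeholders are illustrative:

```python
# Skeleton of an in-toto statement with a SLSA v1.0 provenance predicate.
# All example.com URIs and <...> digests are made up for illustration.
provenance = {
    "_type": "https://in-toto.io/Statement/v1",
    "subject": [
        {"name": "model.int8.safetensors",
         "digest": {"sha256": "<child-file-digest>"}}
    ],
    "predicateType": "https://slsa.dev/provenance/v1",
    "predicate": {
        "buildDefinition": {
            "buildType": "https://example.com/quantization/v1",
            "externalParameters": {"precision": "int8"},
            # Issue #978 would record the parent model here:
            "resolvedDependencies": [
                {"uri": "https://example.com/models/parent-fp32",
                 "digest": {"sha256": "<parent-file-digest>"}}
            ],
        },
        "runDetails": {"builder": {"id": "https://example.com/builder"}},
    },
}
```

Even in this encoding, the child's provenance stands alone: verifying it does not require resolving or verifying the parent's provenance record.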
Conclusion
The AI SBOM of a child model is the critical artifact for representing the foundation models and contributing datasets that were used in the long chain of builds and transformations that resulted in a final model.
The SLSA provenance of a child model describes how the artifact was built, but it does not describe the transitive foundation model and dataset dependencies that contributed to the production of the model.
The OMS signature is critical for determining whether model files have been tampered with after signing.
See also #587 for an illustration of how to encode AI SBOMs and provenance attestations in a model repository.