@SequeI commented Nov 24, 2025

Summary

This feature introduces support for handling OCI model manifests (obtained via tools like skopeo inspect --raw) directly in the signing and verification process.

I had two reasons for adding this:

  1. It avoids the time and network cost of fetching multi-gigabyte AI model layers merely to compute a hash for signing. By signing the small, immutable OCI manifest, we establish trust in the artifact's structure without pulling the data.
  2. It standardizes verification across different model states: a signature created from an OCI manifest can be cross-verified against a local model directory.
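
To make the first point concrete, here is a minimal sketch of reading layer digests out of a manifest saved with `skopeo inspect --raw`, without touching any layer data. The helper name `layer_digests` and the sample manifest are hypothetical and are not taken from this PR; only the `layers[].digest` layout follows the OCI image manifest format.

```python
import json


def layer_digests(manifest_text: str) -> list[tuple[str, str]]:
    """Return (algorithm, hex digest) pairs for each layer in a manifest."""
    manifest = json.loads(manifest_text)
    if not isinstance(manifest, dict) or "layers" not in manifest:
        raise ValueError("not an OCI image manifest: missing 'layers'")
    pairs = []
    for layer in manifest["layers"]:
        # OCI digests have the form "<algorithm>:<hex>".
        algorithm, _, hex_digest = layer["digest"].partition(":")
        if not algorithm or not hex_digest:
            raise ValueError(f"malformed digest: {layer['digest']!r}")
        pairs.append((algorithm, hex_digest))
    return pairs


# Hypothetical two-layer manifest standing in for `skopeo inspect --raw` output.
sample = json.dumps(
    {
        "schemaVersion": 2,
        "mediaType": "application/vnd.oci.image.manifest.v1+json",
        "layers": [
            {
                "mediaType": "application/vnd.oci.image.layer.v1.tar",
                "digest": "sha256:" + "a" * 64,
                "size": 1024,
            },
            {
                "mediaType": "application/vnd.oci.image.layer.v1.tar",
                "digest": "sha256:" + "b" * 64,
                "size": 2048,
            },
        ],
    }
)

for algorithm, hex_digest in layer_digests(sample):
    print(algorithm, hex_digest[:12])
```

The digests are already present in the manifest, so nothing here needs network access beyond the initial `skopeo` call. How the PR itself maps these digests into the signed payload is not shown in this conversation.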

Closes #568

Thank you :)

❯ skopeo inspect --raw docker://quay.io/asiek/oci_model@sha256:6907c004bc8efa7a3b1a45c8a1035a4f2f37002dcf58216ad168704b7238cc11 > test.json

❯ hatch run python -m model_signing sign test.json
Waiting for browser interaction...
Opening in existing browser session.
Signing succeeded

❯ hatch run python -m model_signing verify sigstore --identity <>@redhat.com --identity_provider https://accounts.google.com --signature model.sig bert-base-uncased
Verification succeeded

Checklist
  • All commits are signed-off, using DCO
  • All new code has docstrings and type annotations
  • All new code is covered by tests. Aim for at least 90% coverage. CI is configured to highlight lines not covered by tests.
  • Public facing changes are paired with documentation changes
  • Release note has been added to CHANGELOG.md if needed

@SequeI SequeI requested review from a team as code owners November 24, 2025 10:46

def _detect_path_type(
    path: pathlib.Path,
) -> tuple[Optional[pathlib.Path], Optional[pathlib.Path]]:
Contributor

Would it make sense to introduce a path type constant?
Example: path_type="MODEL_DIR" or "OCI"; path=/something/xyz

Trade-off:
Possible downside: we would have to maintain the different constants.
On the other hand, from an extensibility point of view, if signing and verifying another kind of model path is introduced in the future, it would be easy to segregate the path-specific logic.

Thoughts?

Collaborator

A simple enum for the type of the path would work here. Then we can make this function return the detected type.

We can also just wrap the path into one of two classes (one for real paths, one for the manifest path) and move the reading behavior into a method in the class.

Right now we're doing json.load twice for the same file, maybe we should not do that?

Contributor Author

Great point, looks much cleaner now using Enum instead. Thank you for the suggestion!
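
For readers following the thread, the enum-based detection discussed above could look roughly like the sketch below. This is not the code that landed in the PR: `PathType`, its members, and the public name `detect_path_type` are hypothetical, and the `"layers"`/`"schemaVersion"` check is copied from the diff context quoted in this review. The function returns the parsed manifest together with the detected type, so the file is read with `json.load` only once.

```python
import enum
import json
import pathlib
from typing import Any, Optional


class PathType(enum.Enum):
    """Hypothetical path kinds, named after the reviewer's suggestion."""

    MODEL_DIR = "model_dir"
    OCI_MANIFEST = "oci_manifest"


def detect_path_type(
    path: pathlib.Path,
) -> tuple[PathType, Optional[dict[str, Any]]]:
    """Classify `path`; return the parsed manifest so callers need not re-read it."""
    if path.is_dir():
        return PathType.MODEL_DIR, None
    try:
        with open(path, "r", encoding="utf-8") as f:
            data = json.load(f)
    except (OSError, ValueError):
        # Unreadable, non-UTF-8, or non-JSON file: treat it as an ordinary model path.
        return PathType.MODEL_DIR, None
    if isinstance(data, dict) and ("layers" in data or "schemaVersion" in data):
        return PathType.OCI_MANIFEST, data
    return PathType.MODEL_DIR, None
```

A caller can then branch on the enum value instead of checking which element of a tuple of optional paths is set.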

with open(path, "r", encoding="utf-8") as f:
    data = json.load(f)
if isinstance(data, dict) and (
    "layers" in data or "schemaVersion" in data
Contributor

Pydantic schema validation would be a good alternative.
It gives us a schema we can always validate against, and makes it easier to keep config versioning in step with model-transparency versioning.
If the underlying schema changes in the future, model-transparency can validate against specific schema versions and still go through with the signing/verifying process.

Thoughts?

Collaborator

It would be great to migrate to pydantic, +1.

For the 1.0 release we wanted to get to a stable API, but for the next stable release we should migrate to pydantic and to our own error hierarchy rather than just reusing the Python one. That would make it easier to distinguish errors in the signing/verification flows from errors due to incorrect usage.
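
As an illustration of the error-hierarchy idea, here is a minimal sketch using only the standard library. Every class and function name below is hypothetical; none of them exist in the project today, and the eventual design may differ.

```python
class ModelSigningError(Exception):
    """Hypothetical base class for every error raised by the library."""


class UsageError(ModelSigningError):
    """The caller passed bad input, for example a malformed manifest."""


class SigningError(ModelSigningError):
    """The signing flow itself failed."""


class VerificationError(ModelSigningError):
    """The verification flow failed, for example on a digest mismatch."""


def check_manifest(data: object) -> dict:
    """Reject input that is not shaped like an OCI manifest."""
    if not isinstance(data, dict) or "layers" not in data:
        raise UsageError("expected an OCI manifest with a 'layers' field")
    return data


def describe(exc: Exception) -> str:
    """Show how a caller could tell the error classes apart."""
    if isinstance(exc, UsageError):
        return "incorrect usage"
    if isinstance(exc, ModelSigningError):
        return "signing/verification failure"
    return "unexpected error"


try:
    check_manifest(["not", "a", "manifest"])
except ModelSigningError as exc:
    print(describe(exc))
```

With a shared base class, callers can catch everything the library raises in one `except` clause while still separating misuse from genuine signing or verification failures.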


Add the ability to sign and verify model artifacts directly from their manifest
JSON files without requiring the model files on disk. This enables signing
model artifacts in registries without pulling them, and supports cross-verification
between OCI manifests and local model directories.

Signed-off-by: SequeI <[email protected]>
Development

Successfully merging this pull request may close these issues.

Allow providing pre-computed file digests instead of requiring full model directory input

3 participants