[OpenCLIP] Tighten numerical equivalence against pinned ONNX artifacts #76

@tazarov

Description

Outcome

Establish strict numerical equivalence for embeddings/openclip against the pinned OpenCLIP ONNX bundle, so CI proves same-artifact parity rather than broad cross-framework compatibility.

Problem

#68 delivered end-to-end OpenCLIP support, but the current hosted parity flow is not a strict numerical-equivalence guarantee:

  • the hosted golden dataset is generated from the Hugging Face PyTorch/Transformers pipeline rather than the pinned ONNX export
  • CI currently uses a loose OpenCLIP golden tolerance to accommodate cross-framework drift
  • image preprocessing parity is not validated at intermediate stages such as pixel_values

Scope

  • Generate OpenCLIP golden data from the exact pinned text_model.onnx and vision_model.onnx artifacts via ONNX Runtime.
  • Add stronger parity coverage for intermediate representations:
    • input_ids
    • attention_mask
    • pixel_values
    • raw model outputs before L2 normalization
    • normalized embeddings
    • similarity logits
  • Make Go image preprocessing behavior match the pinned OpenCLIP preprocessor contract closely enough for strict parity.
  • Tighten CI tolerance after parity is demonstrated.
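The stage-by-stage checks listed above could share two small comparison helpers: an exact check for integer tensors (input_ids, attention_mask) and a maximum-absolute-difference check for float tensors (pixel_values, raw outputs, normalized embeddings, logits). The sketch below is illustrative only; `firstMismatch` and `maxAbsDiff` are hypothetical names, not functions that exist in embeddings/openclip, and it assumes tensors are available as flat slices.

```go
package main

import (
	"fmt"
	"math"
)

// firstMismatch returns the index of the first differing element between two
// integer tensors, or -1 if they are identical. If one slice is a strict
// prefix of the other, it returns the length of the shorter slice.
// (Hypothetical helper, intended for input_ids and attention_mask.)
func firstMismatch(got, want []int64) int {
	n := len(got)
	if len(want) < n {
		n = len(want)
	}
	for i := 0; i < n; i++ {
		if got[i] != want[i] {
			return i
		}
	}
	if len(got) != len(want) {
		return n
	}
	return -1
}

// maxAbsDiff returns the largest element-wise absolute difference between two
// float tensors. A NaN on either side yields +Inf so that any tolerance check
// fails instead of silently passing.
// (Hypothetical helper, intended for pixel_values, embeddings, and logits.)
func maxAbsDiff(got, want []float32) (float64, error) {
	if len(got) != len(want) {
		return 0, fmt.Errorf("length mismatch: got %d, want %d", len(got), len(want))
	}
	var worst float64
	for i := range got {
		d := math.Abs(float64(got[i]) - float64(want[i]))
		if math.IsNaN(d) {
			return math.Inf(1), nil
		}
		if d > worst {
			worst = d
		}
	}
	return worst, nil
}
```

Running the exact check on tokenizer output before any float comparison keeps a tokenization mismatch from showing up as unexplained embedding drift.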

Work Breakdown

  • Add an ONNX Runtime-based OpenCLIP golden-data generator that uses the pinned artifact bundle rather than CLIPModel.from_pretrained(...).
  • Produce a full-vector reference dataset or equivalent debug artifact so parity checks are not limited to the current prefix-only comparison.
  • Add tokenization parity coverage for input_ids and attention_mask against the pinned tokenizer contract.
  • Add preprocessing parity coverage for pixel_values, including resize/crop/normalize behavior from preprocessor_config.json.
  • Align Go image preprocessing with the pinned preprocessor semantics closely enough to remove the current large drift allowance.
  • Tighten TestOpenCLIPGoldenDatasetParity to a strict tolerance once the above is in place.
  • Update TESTING.md with the new regeneration and verification workflow.
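As a rough illustration of the normalize step in the pixel_values item above, the following sketch rescales 8-bit RGB data to [0, 1], applies per-channel mean/std normalization, and writes the result in channel-first (CHW) order. It is a minimal sketch under stated assumptions: the function name is hypothetical, the input is assumed to be already resized and cropped, and the mean and std values are assumed to come from the pinned preprocessor_config.json. The order of operations and the float width used for each step are assumptions that would need to be checked against the pinned preprocessor, since they are a likely source of small drift.

```go
package main

import "fmt"

// normalizeHWCToCHW converts interleaved 8-bit RGB pixels (HWC layout) into a
// normalized float32 tensor in CHW layout:
//
//	out[c][y][x] = (rgb[y][x][c]/255 - mean[c]) / std[c]
//
// Hypothetical helper; mean and std are assumed to be read from the pinned
// preprocessor_config.json rather than hard-coded.
func normalizeHWCToCHW(rgb []uint8, w, h int, mean, std [3]float32) ([]float32, error) {
	if w <= 0 || h <= 0 || len(rgb) != 3*w*h {
		return nil, fmt.Errorf("expected %d bytes for %dx%d RGB, got %d", 3*w*h, w, h, len(rgb))
	}
	for c := 0; c < 3; c++ {
		if std[c] == 0 {
			return nil, fmt.Errorf("std for channel %d is zero", c)
		}
	}
	plane := w * h
	out := make([]float32, 3*plane)
	for y := 0; y < h; y++ {
		for x := 0; x < w; x++ {
			src := 3 * (y*w + x)
			for c := 0; c < 3; c++ {
				v := float32(rgb[src+c]) / 255 // rescale to [0, 1]
				out[c*plane+y*w+x] = (v - mean[c]) / std[c]
			}
		}
	}
	return out, nil
}
```

A parity test could feed the same decoded pixels to this function and to the reference pipeline, then compare the resulting pixel_values before any model inference runs.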

Definition of Done

  • Golden OpenCLIP reference data is generated from the pinned ONNX artifacts, not the PyTorch reference path.
  • Intermediate parity tests isolate tokenization and preprocessing mismatches before final embedding/logit comparisons.
  • embeddings/openclip passes strict numerical parity checks in CI at a low tolerance without the current loose override.
  • Documentation explains how to regenerate and verify the strict parity dataset.
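For the final comparison stages (raw outputs, normalized embeddings, similarity logits), a reference implementation of the post-model math helps separate model-output drift from post-processing drift. The sketch below shows L2 normalization and a scaled dot-product logit. The function names are hypothetical, and `logitScale` is a placeholder parameter: this issue does not state what scale the pinned bundle uses, so it would need to be taken from the pinned artifacts.

```go
package main

import (
	"fmt"
	"math"
)

// l2Normalize returns v scaled to unit Euclidean length. The norm is
// accumulated in float64 to limit rounding error; a zero vector is returned
// unchanged (all zeros). Hypothetical helper.
func l2Normalize(v []float32) []float32 {
	var sum float64
	for _, x := range v {
		sum += float64(x) * float64(x)
	}
	out := make([]float32, len(v))
	norm := math.Sqrt(sum)
	if norm == 0 {
		return out
	}
	for i, x := range v {
		out[i] = float32(float64(x) / norm)
	}
	return out
}

// similarityLogit returns logitScale * dot(text, image) for two embeddings
// that are assumed to be L2-normalized already. Hypothetical helper;
// logitScale must come from the pinned model, not from this sketch.
func similarityLogit(text, image []float32, logitScale float64) (float64, error) {
	if len(text) != len(image) {
		return 0, fmt.Errorf("dimension mismatch: %d vs %d", len(text), len(image))
	}
	var dot float64
	for i := range text {
		dot += float64(text[i]) * float64(image[i])
	}
	return logitScale * dot, nil
}
```

Comparing raw outputs first, then normalized embeddings, then logits means a failure at a later stage can be attributed to that stage alone.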

Related

  • Follow-up to #68

Labels: enhancement
