@@ -219,6 +219,79 @@ func main() {
 }
 ```
 
+### Optional OpenCLIP Embeddings Layer (`embeddings/openclip`)
+
+For local CLIP text + image embeddings, use:
+`github.com/amikos-tech/pure-onnx/embeddings/openclip`.
+
+Expected artifacts from the OpenCLIP export tooling:
+- `text_model.onnx`
+- `vision_model.onnx`
+- `tokenizer.json`
+- `preprocessor_config.json`
+
+Defaults are aligned with the pinned OpenCLIP export contract:
+- text inputs: `input_ids`, `attention_mask`; output: `text_embeds`
+- vision input: `pixel_values`; output: `image_embeds`
+- sequence length `77`, image size `224`, embedding width `512`
+- L2 normalization enabled by default (toggle with `WithoutL2Normalization()`)
+- per-modality LRU session cache (default `8` per modality, configurable)
+
+Built-in bootstrap can download and cache the default model bundle:
+- repo: `amikos/openclip-vit-b-32-laion2b-s34b-b79k-onnx`
+- revision: `248a2ed76a7189fc080e654e36930171331ef085`
+- cache directory env var: `ONNXRUNTIME_OPENCLIP_CACHE_DIR` (defaults to user cache, e.g. `~/.cache/onnx-purego/openclip`)
+- optional auth token env var: `HF_TOKEN` (adds Hugging Face bearer token for gated/private downloads)
+
+When `HF_TOKEN` is set, downloads require `https://` base URLs to avoid leaking credentials.
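
The HTTPS requirement above can be pictured as a guard that runs before any authenticated request. This is only a sketch under stated assumptions: `checkBaseURL` is a hypothetical name, the base URL in `main` is an example value, and the actual check inside the bootstrap code is not shown in this change.

```go
package main

import (
	"fmt"
	"net/url"
	"os"
)

// checkBaseURL is a hypothetical guard, not the library's code: it rejects
// any non-HTTPS base URL whenever a token would be attached to the request.
func checkBaseURL(baseURL, token string) error {
	if token == "" {
		return nil // no credentials to protect
	}
	u, err := url.Parse(baseURL)
	if err != nil {
		return fmt.Errorf("parse base URL: %w", err)
	}
	if u.Scheme != "https" {
		return fmt.Errorf("refusing to send token to non-HTTPS base URL %q", baseURL)
	}
	return nil
}

func main() {
	token := os.Getenv("HF_TOKEN")
	// The base URL below is an assumed example value.
	if err := checkBaseURL("https://huggingface.co", token); err != nil {
		fmt.Println("blocked:", err)
		return
	}
	fmt.Println("base URL accepted")
}
```

Failing before the request is built keeps the bearer token off plaintext connections.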
+
+```go
+package main
+
+import (
+	"log"
+
+	"github.com/amikos-tech/pure-onnx/embeddings/openclip"
+	"github.com/amikos-tech/pure-onnx/ort"
+)
+
+func main() {
+	if err := ort.SetSharedLibraryPath("/path/to/libonnxruntime.so"); err != nil {
+		log.Fatal(err)
+	}
+	if err := ort.InitializeEnvironment(); err != nil {
+		log.Fatal(err)
+	}
+	defer ort.DestroyEnvironment()
+
+	assets, err := openclip.EnsureDefaultAssets()
+	if err != nil {
+		log.Fatal(err)
+	}
+
+	embedder, err := openclip.NewEmbedder(
+		assets.TextModelPath,
+		assets.VisionModelPath,
+		assets.TokenizerPath,
+		assets.PreprocessorConfigPath,
+	)
+	if err != nil {
+		log.Fatal(err)
+	}
+	defer embedder.Close()
+
+	textEmbeds, err := embedder.EmbedTexts([]string{"a photo of a cat", "a photo of a dog"})
+	if err != nil {
+		log.Fatal(err)
+	}
+	_ = textEmbeds // [][]float32
+}
+```
+
+Similarity helpers are also available:
+- `openclip.CosineSimilarity(a, b)`
+- `openclip.CLIPSimilarityLogits(imageEmbeddings, textEmbeddings, openclip.DefaultCLIPLogitScale)`
+
 ### OpenCLIP ONNX Export Tooling (`tools/openclip_export_onnx.py`)
 
 To generate pinned OpenCLIP ONNX artifacts (split text + vision encoders):