Commit c0d56dc

feat(openclip): add bootstrap downloader hardening and tests
2 parents 033b9e2 + 9520ab9 commit c0d56dc

6 files changed

Lines changed: 3347 additions & 0 deletions

README.md

Lines changed: 73 additions & 0 deletions
@@ -219,6 +219,79 @@ func main() {
}
```

### Optional OpenCLIP Embeddings Layer (`embeddings/openclip`)

For local CLIP text + image embeddings, use:
`github.com/amikos-tech/pure-onnx/embeddings/openclip`.

Expected artifacts from the OpenCLIP export tooling:
- `text_model.onnx`
- `vision_model.onnx`
- `tokenizer.json`
- `preprocessor_config.json`

Defaults are aligned with the pinned OpenCLIP export contract:
- text inputs: `input_ids`, `attention_mask`; output: `text_embeds`
- vision input: `pixel_values`; output: `image_embeds`
- sequence length `77`, image size `224`, embedding width `512`
- L2 normalization enabled by default (toggle with `WithoutL2Normalization()`)
- per-modality LRU session cache (default `8` per modality, configurable)
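
As a rough illustration of what the L2 normalization default implies, the sketch below scales a vector to unit Euclidean length. `l2Normalize` is a hypothetical helper written for this example, not a function exported by `embeddings/openclip`, and the package's own implementation may differ.

```go
package main

import (
	"fmt"
	"math"
)

// l2Normalize is a hypothetical helper (an assumption for illustration, not
// part of the openclip package). It returns a copy of v scaled to unit
// Euclidean length. A zero vector yields a zero vector of the same length.
func l2Normalize(v []float32) []float32 {
	var sumSq float64
	for _, x := range v {
		sumSq += float64(x) * float64(x)
	}
	out := make([]float32, len(v))
	if sumSq == 0 {
		return out
	}
	norm := math.Sqrt(sumSq)
	for i, x := range v {
		out[i] = float32(float64(x) / norm)
	}
	return out
}

func main() {
	// A 3-4-5 triangle keeps the arithmetic easy to follow.
	fmt.Println(l2Normalize([]float32{3, 4}))
}
```

With unit-length embeddings, a plain dot product between two vectors equals their cosine similarity, which is presumably why normalization is on by default.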

Built-in bootstrap can download and cache the default model bundle:
- repo: `amikos/openclip-vit-b-32-laion2b-s34b-b79k-onnx`
- revision: `248a2ed76a7189fc080e654e36930171331ef085`
- cache directory env var: `ONNXRUNTIME_OPENCLIP_CACHE_DIR` (defaults to user cache, e.g. `~/.cache/onnx-purego/openclip`)
- optional auth token env var: `HF_TOKEN` (adds Hugging Face bearer token for gated/private downloads)
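
One plausible way to resolve the cache directory from the settings above is sketched below. `resolveCacheDir` and its `getenv` parameter are hypothetical names introduced for this example; the `onnx-purego/openclip` path segments are taken from the example default above, and the actual lookup in the package may differ.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// cacheDirEnv is the environment variable named in the README.
const cacheDirEnv = "ONNXRUNTIME_OPENCLIP_CACHE_DIR"

// resolveCacheDir is a hypothetical helper (an assumption, not the package's
// API). It prefers the environment override and otherwise falls back to the
// per-user cache directory reported by the standard library.
func resolveCacheDir(getenv func(string) string) (string, error) {
	if dir := getenv(cacheDirEnv); dir != "" {
		return dir, nil
	}
	base, err := os.UserCacheDir()
	if err != nil {
		return "", fmt.Errorf("resolve user cache dir: %w", err)
	}
	return filepath.Join(base, "onnx-purego", "openclip"), nil
}

func main() {
	dir, err := resolveCacheDir(os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(dir)
}
```

Passing the lookup function as a parameter keeps the helper easy to test without mutating the process environment.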

When `HF_TOKEN` is set, downloads require `https://` base URLs to avoid leaking credentials.
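
The HTTPS requirement can be pictured as a small guard that runs before any request carries the token. This is a minimal sketch under assumptions: `checkTokenTransport` is a hypothetical function, and the real downloader may structure or word its check differently.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// checkTokenTransport is a hypothetical guard (an assumption for
// illustration). It rejects non-https base URLs whenever a bearer token
// would be attached, so the token is never sent in cleartext.
func checkTokenTransport(baseURL, token string) error {
	if token == "" {
		return nil // no credentials to protect
	}
	u, err := url.Parse(baseURL)
	if err != nil {
		return fmt.Errorf("parse base URL: %w", err)
	}
	if u.Scheme != "https" {
		return errors.New("refusing to send auth token over a non-https base URL")
	}
	return nil
}

func main() {
	// example.com and the token value are placeholders.
	fmt.Println(checkTokenTransport("http://example.com", "hf_example"))
	fmt.Println(checkTokenTransport("https://huggingface.co", "hf_example"))
}
```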

```go
package main

import (
	"log"

	"github.com/amikos-tech/pure-onnx/embeddings/openclip"
	"github.com/amikos-tech/pure-onnx/ort"
)

func main() {
	if err := ort.SetSharedLibraryPath("/path/to/libonnxruntime.so"); err != nil {
		log.Fatal(err)
	}
	if err := ort.InitializeEnvironment(); err != nil {
		log.Fatal(err)
	}
	defer ort.DestroyEnvironment()

	assets, err := openclip.EnsureDefaultAssets()
	if err != nil {
		log.Fatal(err)
	}

	embedder, err := openclip.NewEmbedder(
		assets.TextModelPath,
		assets.VisionModelPath,
		assets.TokenizerPath,
		assets.PreprocessorConfigPath,
	)
	if err != nil {
		log.Fatal(err)
	}
	defer embedder.Close()

	textEmbeds, err := embedder.EmbedTexts([]string{"a photo of a cat", "a photo of a dog"})
	if err != nil {
		log.Fatal(err)
	}
	_ = textEmbeds // [][]float32
}
```

Similarity helpers are also available:
- `openclip.CosineSimilarity(a, b)`
- `openclip.CLIPSimilarityLogits(imageEmbeddings, textEmbeddings, openclip.DefaultCLIPLogitScale)`
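
To make the math behind these helpers concrete, the standalone sketch below computes cosine similarity and scaled image-to-text logits. It does not call the package: `cosineSimilarity` and `similarityLogits` are hypothetical stand-ins, their signatures and return types are assumptions, and the scale of `100` is only an illustrative value, not the confirmed value of `openclip.DefaultCLIPLogitScale`.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// cosineSimilarity is a hypothetical stand-in (an assumption, not the
// package function). It returns dot(a, b) / (|a| * |b|).
func cosineSimilarity(a, b []float32) (float64, error) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, errors.New("vectors must be non-empty and equal in length")
	}
	var dot, normA, normB float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		normA += x * x
		normB += y * y
	}
	if normA == 0 || normB == 0 {
		return 0, errors.New("cosine similarity is undefined for a zero vector")
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB)), nil
}

// similarityLogits is a hypothetical stand-in. Row i, column j holds
// scale * cosine(images[i], texts[j]).
func similarityLogits(images, texts [][]float32, scale float64) ([][]float64, error) {
	out := make([][]float64, len(images))
	for i, img := range images {
		out[i] = make([]float64, len(texts))
		for j, txt := range texts {
			s, err := cosineSimilarity(img, txt)
			if err != nil {
				return nil, err
			}
			out[i][j] = scale * s
		}
	}
	return out, nil
}

func main() {
	// Toy 2-D embeddings: the image matches the first text exactly
	// and is orthogonal to the second.
	images := [][]float32{{1, 0}}
	texts := [][]float32{{1, 0}, {0, 1}}
	logits, err := similarityLogits(images, texts, 100) // 100 is an assumed scale
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(logits)
}
```

Each row can then be passed through a softmax to turn the logits into per-image probabilities over the candidate texts.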

### OpenCLIP ONNX Export Tooling (`tools/openclip_export_onnx.py`)

To generate pinned OpenCLIP ONNX artifacts (split text + vision encoders):