@@ -87,7 +87,7 @@ This approach offers several benefits including, but not limited to:
## Sample Reference Implementation

In this tutorial we provide a reference implementation for a Semantic Cache in
- [semantic_caching.py.](./artifacts/semantic_caching.py) There are 3 key
+ [semantic_caching.py](./artifacts/semantic_caching.py). There are 3 key
dependencies:
* [SentenceTransformer](https://sbert.net/): a Python framework for computing
dense vector representations (embeddings) of sentences, paragraphs, and images.
@@ -104,7 +104,7 @@ clustering of dense vectors.
algorithms.
- Alternatives include [annoy](https://github.com/spotify/annoy), or
[cuVS](https://github.com/rapidsai/cuvs). However, note that cuVS already
- has an integration in Faiss, more on this can be found [here.](https://docs.rapids.ai/api/cuvs/nightly/integrations/faiss/)
+ has an integration in Faiss, more on this can be found [here](https://docs.rapids.ai/api/cuvs/nightly/integrations/faiss/).
* [Theine](https://github.com/Yiling-J/theine): High performance in-memory
cache.
- We will use it as our exact match cache backend. After the most similar
@@ -151,15 +151,15 @@ section. However, for those interested in understanding the specifics,
let's explore what this patch includes.

The patch introduces a new script,
- [semantic_caching.py.](./artifacts/semantic_caching.py), which is added to the
+ [semantic_caching.py](./artifacts/semantic_caching.py), which is added to the
appropriate directory. This script implements the core logic for our
semantic caching functionality.

Next, the patch integrates semantic caching into the model. Let's walk through
these changes step-by-step.

Firstly, it imports the necessary classes from
- [semantic_caching.py.](./artifacts/semantic_caching.py) into the codebase:
+ [semantic_caching.py](./artifacts/semantic_caching.py) into the codebase:

``` diff
...
@@ -353,7 +353,7 @@ supported feature in Triton Inference Server.

We value your input! If you're interested in seeing semantic caching as a
supported feature in future releases, we invite you to join the ongoing
- [discussion.](https://github.com/triton-inference-server/server/discussions/7742)
+ [discussion](https://github.com/triton-inference-server/server/discussions/7742).
Provide details about why you think semantic caching would
be valuable for your use case. Your feedback helps shape our product roadmap,
and we appreciate your contributions to making our software better for everyone.