Commit ff96fe9

Update caching.md
Update document for semantic cache
1 parent 8402191 commit ff96fe9

1 file changed

+31
-12
lines changed

caching.md

Lines changed: 31 additions & 12 deletions
@@ -1,24 +1,43 @@
1-
# Caching
1+
# Caching in Portkey
22

3-
Portkey supports caching across text & chat completions. When the exact same request comes in to Portkey, we can return the response from our cache.
3+
Portkey offers two types of caching to enhance performance and optimize response retrieval: a fixed string matching cache (simple) and a semantic cache.
44

5-
This could be useful if you have fixed input prompts or are testing the app with the same inputs.
5+
## Fixed String Matching Cache
6+
The fixed string matching cache is the traditional caching mechanism where an exact match is performed on the input prompts. If the exact same request is received again, Portkey can directly return the response from the cache without executing the model.
67

7-
### Enabling cache
8-
9-
To enable caching, pass the following headers in your requests.
8+
### Enabling Fixed String Matching Cache
9+
To enable the fixed string matching cache, include the following headers in your requests:
1010

1111
```sh
12-
"x-portkey-cache": true
13-
"Cache-Control": "max-age:1000"
12+
"x-portkey-cache": "simple"
13+
"Cache-Control": "max-age:1000"
1414
```
15+
The `x-portkey-cache` header enables or disables cache storage and retrieval. The `Cache-Control` header accepts the `max-age` parameter in seconds, which specifies the maximum age of the cached response. If the `Cache-Control` header is not provided, Portkey automatically caches requests for `7*24*60*60` seconds (7 days) whenever caching is enabled through `x-portkey-cache`.
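The headers described above can be assembled with a small helper. This is a minimal sketch, not part of any Portkey SDK: `build_cache_headers` and `DEFAULT_MAX_AGE_SECONDS` are hypothetical names, and the `max-age:<seconds>` value format is copied from the example in this document rather than verified against the service.

```python
from typing import Dict, Optional

# 7 days, the default this document says applies when Cache-Control is omitted.
DEFAULT_MAX_AGE_SECONDS = 7 * 24 * 60 * 60


def build_cache_headers(mode: str, max_age: Optional[int] = None) -> Dict[str, str]:
    """Return the cache-related request headers (hypothetical helper)."""
    if mode not in ("simple", "semantic"):
        raise ValueError(f"unsupported cache mode: {mode!r}")
    headers = {"x-portkey-cache": mode}
    if max_age is not None:
        if max_age <= 0:
            raise ValueError("max_age must be a positive number of seconds")
        # Value format taken from the example in this document.
        headers["Cache-Control"] = f"max-age:{max_age}"
    return headers


# Explicit max-age of 1000 seconds.
print(build_cache_headers("simple", max_age=1000))
# No Cache-Control header: the server-side default would apply.
print(build_cache_headers("simple"))
```

Leaving `max_age` unset omits `Cache-Control` entirely, which is the case where the 7-day default applies.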
1516

16-
The `x-portkey-cache` enables or disables cache storage and retrieval. The `Cache-Control` header accepts `max-age` in seconds. The minimum value for `Cache-Control` is 30. If you don't provide this header, we will automatically cache requests for `7*24*60*60 seconds` (7 days) when the `x-portkey-cache` is set to `true`.
17+
### Invalidating Fixed String Matching Cache
18+
You can force a refresh of the fixed string matching cache with the `x-portkey-cache-force-refresh` header. Setting it to `true` invalidates the cached entry and stores a fresh response in its place.
1719

18-
### Invalidating Cache
20+
```sh
21+
"x-portkey-cache-force-refresh": true
22+
```
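To make the exact-match and force-refresh behavior concrete, here is a minimal in-memory sketch. It is an illustration only and does not describe Portkey's internals: the `ExactMatchCache` class, the SHA-256 key over a canonical JSON form of the request, and the `compute` callback that stands in for the model call are all assumptions.

```python
import hashlib
import json
import time
from typing import Callable, Dict, Tuple

DEFAULT_MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # 7 days


class ExactMatchCache:
    """Toy exact-match cache with a max age and force refresh (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (expiry timestamp, cached response)
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(request: dict) -> str:
        # Canonical JSON so that identical requests always hash identically.
        canonical = json.dumps(request, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def get_or_compute(
        self,
        request: dict,
        compute: Callable[[dict], str],
        max_age: int = DEFAULT_MAX_AGE_SECONDS,
        force_refresh: bool = False,
    ) -> Tuple[str, bool]:
        """Return (response, was_cache_hit)."""
        key = self._key(request)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and not force_refresh and entry[0] > now:
            return entry[1], True
        # Miss, expired entry, or forced refresh: recompute and overwrite.
        response = compute(request)
        self._store[key] = (now + max_age, response)
        return response, False


calls = []

def fake_model(request: dict) -> str:
    # Stand-in for a real model call; counts invocations.
    calls.append(request)
    return f"response #{len(calls)}"

cache = ExactMatchCache()
req = {"prompt": "Hello"}
print(cache.get_or_compute(req, fake_model))                      # ('response #1', False)
print(cache.get_or_compute(req, fake_model))                      # ('response #1', True)
print(cache.get_or_compute(req, fake_model, force_refresh=True))  # ('response #2', False)
```

The forced refresh bypasses the stored entry and overwrites it, so later identical requests are served the new response.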
1923

20-
You can choose to force refresh cache by using the `x-portkey-cache-force-refresh` header. Setting it to `true` ensures that the cache is invalidated, and a new value is stored in the cache.
24+
## Semantic Cache
25+
The semantic cache in Portkey goes beyond exact string matching and takes into account the contextual similarity between input prompts. It uses cosine similarity to determine if the similarity between the input and a cached request exceeds a certain threshold. If the similarity threshold is met, Portkey retrieves the response from the cache.
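The threshold check can be sketched as follows. This is an illustration under stated assumptions, not Portkey's implementation: prompts are assumed to have already been turned into embedding vectors by some model, `semantic_lookup` is a hypothetical name, and the `0.95` threshold is an arbitrary example value (the document does not state the real one). A score equal to the threshold is treated as a match here.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # define similarity with a zero vector as 0
    return dot / (norm_a * norm_b)


def semantic_lookup(
    query: Sequence[float],
    entries: List[Tuple[Sequence[float], str]],
    threshold: float = 0.95,  # hypothetical value
) -> Optional[str]:
    """Return the best-matching cached response, or None on a cache miss."""
    best_score = -1.0
    best_response: Optional[str] = None
    for embedding, response in entries:
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_score, best_response = score, response
    return best_response if best_score >= threshold else None


# Two cached prompts with toy 2-dimensional embeddings.
cached = [([1.0, 0.0], "answer A"), ([0.0, 1.0], "answer B")]
print(semantic_lookup([0.99, 0.05], cached))  # close to the first entry: hit
print(semantic_lookup([1.0, 1.0], cached))    # equally far from both: miss
```

A higher threshold makes the cache behave more like exact matching; a lower one returns cached responses for looser paraphrases, at the risk of serving an answer to a different question.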
26+
27+
### Enabling Semantic Cache
28+
To enable the semantic cache feature, use the following header in your requests:
2129

2230
```sh
23-
"x-portkey-cache-force-refresh": true
31+
"x-portkey-cache": "semantic"
2432
```
33+
34+
Setting the x-portkey-cache header to "semantic" enables the semantic cache functionality.
35+
36+
### Implementation Details
37+
The `Cache-Control` header still applies to the semantic cache and controls the maximum age of a cached response.
38+
39+
If you wish to force refresh the semantic cache and invalidate existing entries, you can use the x-portkey-cache-force-refresh header as described earlier.
40+
41+
Because the semantic cache matches on contextual similarity rather than exact text, prompts that are worded differently but are sufficiently similar can be served from the cache.
42+
43+
Choose the appropriate caching mechanism based on your use case to improve performance and minimize unnecessary model executions in Portkey.
