@@ -92,6 +92,11 @@ In case Disaggregated Prefill is enabled, you should also define the following e
- Toggle P/D mode: `PD_ENABLED=true`
- Threshold: `PD_PROMPT_LEN_THRESHOLD=<value>`
+
+### Prefix Aware Scorer Configuration
+
+- `PREFIX_SCORER_CACHE_CAPACITY` - the maximum number of blocks the LRU cache can store. Each block maps a chunk of a prompt to the set of pods estimated to hold the prompt prefix that ends at that chunk.
+- `PREFIX_SCORER_CACHE_BLOCK_SIZE` - the length of the prompt chunk that each block is keyed by.
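
To make the interaction between the two settings concrete, here is a minimal Python sketch of a block-keyed LRU cache. It is an illustration only, not the scorer's actual implementation: the `PrefixBlockCache` class, the `chunk_keys` helper, the SHA-256 keying, the fallback values `4` and `8`, and the choice to measure block size in characters are all assumptions made for this example.

```python
import hashlib
import os
from collections import OrderedDict

# Fallback values are illustrative; the real defaults are not stated here.
CAPACITY = int(os.environ.get("PREFIX_SCORER_CACHE_CAPACITY", "4"))
BLOCK_SIZE = int(os.environ.get("PREFIX_SCORER_CACHE_BLOCK_SIZE", "8"))


def chunk_keys(prompt, block_size):
    """Return one key per full chunk of the prompt.

    Each key hashes the whole prefix that ends at that chunk (an assumed
    keying scheme), so equal keys imply equal prefixes.
    """
    keys = []
    for end in range(block_size, len(prompt) + 1, block_size):
        prefix = prompt[:end].encode("utf-8")
        keys.append(hashlib.sha256(prefix).hexdigest())
    return keys


class PrefixBlockCache:
    """Hypothetical LRU cache: block key -> set of pod names."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def record(self, prompt, pod, block_size):
        """Note that `pod` has served `prompt`, one block per full chunk."""
        for key in chunk_keys(prompt, block_size):
            pods = self.blocks.pop(key, set())
            pods.add(pod)
            self.blocks[key] = pods  # re-insert as most recently used
            while len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used

    def matched_blocks(self, prompt, pod, block_size):
        """Count leading blocks of `prompt` that `pod` is recorded for."""
        count = 0
        for key in chunk_keys(prompt, block_size):
            if pod not in self.blocks.get(key, ()):
                break
            count += 1
        return count
```

In this sketch, a larger block size yields fewer, coarser blocks per prompt, while a larger capacity lets more blocks survive before the least recently used ones are evicted. A capacity smaller than the number of blocks in a single prompt evicts that prompt's earliest blocks, which makes its prefix unmatchable.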