Hi authors,
Thank you for your great work on this project. I'm currently reading through your paper and have a question regarding the "Compatibility with KVCache" section. I would appreciate it if you could provide some clarification.
I am trying to understand the rationale behind the eviction operation described in formula (2).
$$\text{KVCache} \leftarrow \text{KVCache} \setminus \{\, t_{i} \mid t_{i} > \min\left(t_{f_{a}}, t_{f_{b}}\right) \,\}$$
My questions are:
- Purpose and Potential Side Effects: The formula states that all tokens with timestamps greater than $\min(t_{f_a}, t_{f_b})$ are erased from the KVCache. This seems to imply that not only the tokens of the evicted frames, but also tokens from subsequent frames or even user text prompts, might be removed. Wouldn't this lose important contextual information and negatively impact the model's subsequent responses? What is the specific motivation for erasing all tokens after this timestamp, rather than only the tokens corresponding to the specific frames being evicted?
- Code Implementation and Conceptual Alignment: I've been looking through the repository, but I couldn't find the specific code that implements this KVCache eviction logic.
  - Could you please point me to the relevant file(s) and line(s) of code where this operation is handled?
  - Additionally, could you elaborate a bit more on how the memory bank "aligns closely" with the KVCache architecturally? Understanding the implementation might help clarify this.
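To make the first question concrete, here is a minimal sketch of how I am currently reading formula (2), next to the selective eviction I expected. This is only my interpretation and not code from the repository: `CacheEntry`, `evict_suffix`, `evict_selected`, the source labels, and the toy timestamps are all hypothetical names and values I made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheEntry:
    timestamp: int  # position of the token in the stream (toy value)
    source: str     # hypothetical label: which frame or prompt the token came from


def evict_suffix(cache, t_fa, t_fb):
    """Literal reading of formula (2): drop every entry whose
    timestamp is strictly greater than min(t_fa, t_fb)."""
    cutoff = min(t_fa, t_fb)
    return [e for e in cache if e.timestamp <= cutoff]


def evict_selected(cache, evicted_sources):
    """What I expected instead: drop only the entries that belong
    to the frames being evicted, keeping everything else."""
    return [e for e in cache if e.source not in evicted_sources]


# Toy cache: one entry per frame/prompt, in arrival order.
cache = [
    CacheEntry(0, "frame_0"),
    CacheEntry(1, "frame_1"),
    CacheEntry(2, "frame_2"),
    CacheEntry(3, "user_text"),
    CacheEntry(4, "frame_3"),
]

# Suppose frames 1 and 2 are the pair (f_a, f_b) chosen for eviction.
after_suffix = evict_suffix(cache, t_fa=1, t_fb=2)
after_selected = evict_selected(cache, {"frame_1", "frame_2"})

print([e.source for e in after_suffix])
print([e.source for e in after_selected])
```

Under this reading, the suffix version also drops `user_text` and `frame_3`, which is the side effect I am asking about, while the selective version keeps them. If the intent is that the dropped suffix is re-encoded afterwards, it would help to have that confirmed.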
Thank you for your time and for sharing your innovative work. Any insights you could provide would be extremely helpful for my understanding.
Best regards