added parameter for KV cache tokens contribution to LLM impacts calcu… #200
Adds the contribution of KV cache tokens to GPU power consumption and model latency, based on the calculations in the references below, which detail how transformer LLMs interact with the KV cache.
https://kipp.ly/transformer-inference-arithmetic/
https://kipp.ly/transformer-param-count/
I'm not an expert on this topic yet; I'm still learning how transformers work in detail so that I can improve how KV cache emissions are calculated. This PR is a first step after 4 months of looking for the right information on how to measure cache emissions.
I'm already using EcoLogits to analyse my team's Cursor (https://cursor.com/) emissions. Our usage logs 4 types of tokens:
A quick calculation suggests a 25x increase in our team's estimated emissions once Cache Read tokens (currently omitted) are included using the "1/6 factor" method.
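As a rough illustration of that arithmetic, here is a minimal sketch under stated assumptions. It reads the "1/6 factor" as weighting each cache-read token at one sixth of a generated token, and it assumes the current baseline counts output tokens only. The names `KV_CACHE_FACTOR`, `effective_tokens` and `increase_ratio` are hypothetical and are not taken from this PR or from the EcoLogits API.

```python
# Assumption: one cache-read token is weighted as 1/6 of a generated token.
KV_CACHE_FACTOR = 1 / 6


def effective_tokens(output_tokens: int, cache_read_tokens: int,
                     kv_cache_factor: float = KV_CACHE_FACTOR) -> float:
    """Token count used for the impact estimate once cache reads are included."""
    return output_tokens + kv_cache_factor * cache_read_tokens


def increase_ratio(output_tokens: int, cache_read_tokens: int,
                   kv_cache_factor: float = KV_CACHE_FACTOR) -> float:
    """Ratio of the cache-aware estimate to the output-only baseline.

    Assumes impacts scale linearly with the effective token count.
    """
    if output_tokens <= 0:
        raise ValueError("output_tokens must be positive")
    return effective_tokens(output_tokens, cache_read_tokens,
                            kv_cache_factor) / output_tokens
```

Under these assumptions, a ratio near 25 corresponds to cache reads of roughly 144 times the output tokens, since 1 + 144/6 = 25. The token counts in any such check are made up for illustration and are not the team's actual usage.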
I'm open to further discussion on this topic to help research and improve the calculation method for large context windows (mainly to compare the contribution of developers using LLMs for code generation with the environmental cost of actually using the applications they develop).