Use |
I was wondering about this too. I was shipping llama-server logs to Loki in scale-to-zero scenarios to check how many layers were being offloaded. using |
We used to get details on how much memory was loaded on each GPU, the size of the compute buffers, the KV cache, etc. Those were very useful and are all gone. Why? Is there any way to get them back?