What happened to the detailed starting/loading output? #23926

segmond · 2026-05-30T23:55:59Z

segmond
May 30, 2026

We used to get details on how much memory was loaded on each GPU, the size of the compute buffers, the KV cache, etc. Those were very useful and are all gone. Why? Is there any way to get them back?

am17an · 2026-05-31T04:22:13Z

am17an
May 31, 2026
Collaborator

Use -lv 4

1 reply

Kangaroux May 31, 2026

This works, but the fitting step logs so much extra garbage with it on. The old logging, which just printed the memory usage at each step and the final memory usage, was much nicer. I'm kinda surprised it hasn't been reverted yet, given how much I've seen this get talked about.

weikinhuang · 2026-05-31T13:23:05Z

weikinhuang
May 31, 2026

I was wondering this too. I was using llama-server logging to Loki in scale-to-zero scenarios to check how many layers were offloaded. Using -lv 4 produces far more logging than is worth saving. I suggest moving just those details, the number of layers offloaded and the memory usage, back to lv 3.
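As a stopgap until the log levels are sorted out, the -lv 4 output could be filtered before it is shipped anywhere. Below is a minimal sketch of that idea, not anything provided by llama.cpp: it reads log lines on stdin and keeps only the ones matching a few keywords. The keywords in `KEEP_PATTERNS` and the function names are assumptions made for illustration, so check them against what your build actually prints.

```python
import re
import sys

# Hypothetical keywords (assumptions, not a documented log format).
# Adjust them to match the lines your build actually prints.
KEEP_PATTERNS = [r"offloaded", r"buffer size", r"KV"]


def make_filter(patterns):
    """Return a predicate that is True when a line matches any pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]

    def keep(line):
        return any(rx.search(line) for rx in compiled)

    return keep


def filter_lines(lines, patterns=KEEP_PATTERNS):
    """Keep only the lines that match at least one pattern, in order."""
    keep = make_filter(patterns)
    return [line for line in lines if keep(line)]


if __name__ == "__main__":
    # Stream stdin to stdout, dropping every line that does not match.
    for line in filter_lines(sys.stdin):
        sys.stdout.write(line)
```

Saved as, say, keep_memory_lines.py (a made-up name), it would sit at the end of a pipe from the server started with -lv 4. If the log goes to stderr rather than stdout, that stream needs to be redirected into the pipe first.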

1 reply

segmond Jun 1, 2026
Author

I suspect the mess happened here - 994118a
