Dynamic maybeQuantizeKVCache causes cache COW in TokenIterator.

While working on the prompt cache together with the quantized KV cache, I noticed that maybeQuantizeKVCache converts the SimpleKVCache into a quantized KV cache. However, the cache reference that was passed to TokenIterator still points to the original SimpleKVCache. This appears to cause a copy-on-write (COW) situation in which TokenIterator starts maintaining its own cache instead of using the passed-in one. As a result, external code can't access the updated KV cache from TokenIterator.
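
To illustrate what I think is happening, here is a minimal, self-contained sketch. Every type and function in it (`Cache`, `PlainCache`, `QuantizedCache`, `maybeQuantize`, `DemoIterator`, `runDemo`) is a hypothetical stand-in written for this report, not the real library code, and the quantization threshold of 2 is arbitrary. It assumes the iterator is a struct that stores its own copy of the cache array and that quantization replaces array elements with new objects.

```swift
// Hypothetical stand-ins; these are NOT the real library types.
protocol Cache: AnyObject {
    var offset: Int { get }
    func update(tokens: Int)
}

final class PlainCache: Cache {
    private(set) var offset = 0
    func update(tokens: Int) { offset += tokens }
}

final class QuantizedCache: Cache {
    private(set) var offset: Int
    init(from plain: Cache) { offset = plain.offset }
    func update(tokens: Int) { offset += tokens }
}

// Replaces each plain cache with a NEW quantized object once its offset
// exceeds `start`. Only the array passed in is modified.
func maybeQuantize(_ caches: inout [Cache], start: Int) {
    for i in caches.indices where caches[i] is PlainCache && caches[i].offset > start {
        caches[i] = QuantizedCache(from: caches[i])
    }
}

struct DemoIterator {
    // The array is a value copy; its elements are shared object
    // references only until an element is replaced.
    var caches: [Cache]

    mutating func step() {
        for c in caches { c.update(tokens: 1) }
        maybeQuantize(&caches, start: 2)
    }
}

func runDemo() -> (external: Int, iteratorInternal: Int) {
    let external: [Cache] = [PlainCache()]
    var it = DemoIterator(caches: external)
    for _ in 0..<5 { it.step() }
    return (external[0].offset, it.caches[0].offset)
}
```

In this sketch, `runDemo()` returns an external offset of 3 and an iterator-internal offset of 5: the caller's array stops seeing updates at the step where the plain cache is swapped for a quantized one. If the real code behaves the same way, one possible direction would be to hand the replaced cache back to the caller or to share the cache through a reference-type container, but I haven't verified which fits the existing API.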

Dynamic maybeQuantizeKVCache causes cache COW in TokenIterator. #341
