Comparison with Standard Token Insertion + KV Eviction? #3

@RealJosephus

Hi, thanks for the impressive work!

The streaming benchmarks currently compare CASA only against "Full Insertion", where the KV cache grows without bound. Have you also compared it against a Standard Insertion + KV Eviction baseline, i.e. image tokens are inserted and processed normally (including through the FFNs), but old visual KVs are dropped from the cache?

CASA has a clear compute advantage because it skips the FFNs, but the two methods share similar memory characteristics and similar RoPE handling (both keep "gaps" in the position IDs).

Do you have any performance comparisons against a simple KV eviction strategy?
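
To make the baseline concrete, here is a rough sketch of the eviction policy I have in mind. It only tracks cache bookkeeping (no model, no tensors), and every name in it (`EvictingKVCache`, `max_visual_frames`, etc.) is hypothetical and not taken from the CASA codebase. It assumes a simple "keep the most recent N frames" policy and that position IDs are assigned once at insertion and never renumbered, which is what produces the gaps.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    position_id: int     # assigned at insertion time, never renumbered
    is_visual: bool      # True for image tokens, False for text tokens
    frame_id: int = -1   # source frame for visual tokens (-1 for text)


@dataclass
class EvictingKVCache:
    """Bookkeeping-only stand-in for a KV cache with visual-token eviction."""
    max_visual_frames: int                        # assumed budget: frames kept
    entries: list = field(default_factory=list)
    frames: deque = field(default_factory=deque)  # frame ids, oldest first
    next_position: int = 0

    def append_text(self, n_tokens: int) -> None:
        for _ in range(n_tokens):
            self.entries.append(CacheEntry(self.next_position, False))
            self.next_position += 1

    def append_frame(self, frame_id: int, n_tokens: int) -> None:
        # Visual tokens are inserted like ordinary tokens (full forward pass).
        for _ in range(n_tokens):
            self.entries.append(CacheEntry(self.next_position, True, frame_id))
            self.next_position += 1
        self.frames.append(frame_id)
        # Evict the oldest frames once the budget is exceeded.
        while len(self.frames) > self.max_visual_frames:
            old = self.frames.popleft()
            self.entries = [
                e for e in self.entries
                if not (e.is_visual and e.frame_id == old)
            ]

    def position_ids(self) -> list:
        return [e.position_id for e in self.entries]


cache = EvictingKVCache(max_visual_frames=2)
cache.append_text(2)                  # positions 0-1
for frame in range(3):
    cache.append_frame(frame, 3)      # 3 visual tokens per frame
print(cache.position_ids())           # [0, 1, 5, 6, 7, 8, 9, 10]
```

After the third frame arrives, frame 0 is evicted: the cache stays bounded, text KVs are kept, and positions 2-4 are left as a gap rather than being reassigned.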

Thanks!
