fix(generation): fp4 byte-size and stale v1→v2 MLA warning #264

Open
Gpgabriel25 wants to merge 1 commit into erfanzar:main from Gpgabriel25:pr/generation-mixin-fixes
@Gpgabriel25

Changes

fp4-safe byte-size calculation

_create_mixed_standard_ragged_page_cache_configs and
_create_mixed_standard_unified_attention_cache_configs both computed the
per-element byte size as jnp.finfo(kvdtype).bits // 8, the same bug fixed
in the caching layer (see PR #XX). That expression floors to 0 for 4-bit
types, and jnp.finfo does not accept integer dtypes. Replaced with
jnp.dtype(kvdtype).itemsize so fp4 / integer KV dtypes work correctly
through the generation mixin too.
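
The truncation behind this bug can be reproduced without JAX. The sketch below is stdlib-only and illustrative: the bit widths are hard-coded stand-ins for what a float-info query would report, and floor_bytes / cache_bytes are hypothetical helper names, not functions from the generation mixin.

```python
# Bit widths hard-coded as stand-ins for a float-info query, so the sketch
# runs without JAX. All names here are illustrative assumptions.
FLOAT_BITS = {"float32": 32, "bfloat16": 16, "float8": 8, "float4": 4}


def floor_bytes(bits: int) -> int:
    """Old pattern: bits // 8, which truncates sub-byte widths to zero."""
    return bits // 8


def cache_bytes(num_elements: int, bytes_per_element: int) -> int:
    """Total bytes for a buffer holding num_elements entries."""
    return num_elements * bytes_per_element


# Per-dtype byte sizes as the old expression would compute them.
sizes = {name: floor_bytes(bits) for name, bits in FLOAT_BITS.items()}

# A 1024-element fp4 buffer is sized as zero bytes under the old pattern.
fp4_cache = cache_bytes(1024, sizes["float4"])
print(sizes, fp4_cache)
```

Any size derived from a zero per-element byte count is itself zero, which is why the fix asks the dtype for its storage size instead of deriving it from the bit width.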

Correct v1→v2 in mixed-MLA log warning

The warning that fires when mixed MLA/non-MLA layers are detected was
logging multi_latent_ragged_page_attention_v1 as the target mechanism.
The actual target is v2. Fixed the string literal.

Tests

New: test_esurge_compatible_model_logs_v2_for_mixed_mla verifies that
the emitted warning and the resulting attn_mechanism /
decode_attn_mechanism / mla_attn_mechanism kwargs all reference
multi_latent_ragged_page_attention_v2. Result: 1 passed.
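
The shape of that check can be sketched with the standard logging module alone. This is not the test from the PR: pick_mechanisms is a hypothetical stand-in for the mixin code path, and the logger name and message wording are assumptions. Only the three kwarg names and the v2 mechanism string come from the description above.

```python
import logging

logger = logging.getLogger("mixed_mla_sketch")  # assumed logger name
TARGET = "multi_latent_ragged_page_attention_v2"


def pick_mechanisms(has_mla: bool, has_non_mla: bool) -> dict:
    """Hypothetical stand-in: warn on mixed layers, return the v2 kwargs."""
    if has_mla and has_non_mla:
        logger.warning("Mixed MLA/non-MLA layers detected; using %s.", TARGET)
    return {
        "attn_mechanism": TARGET,
        "decode_attn_mechanism": TARGET,
        "mla_attn_mechanism": TARGET,
    }


class ListHandler(logging.Handler):
    """Collects formatted log messages so they can be inspected afterwards."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


handler = ListHandler()
logger.addHandler(handler)
try:
    kwargs = pick_mechanisms(has_mla=True, has_non_mla=True)
finally:
    logger.removeHandler(handler)

# The warning text and every returned kwarg should name the same mechanism.
assert any(TARGET in message for message in handler.messages)
assert all(value == TARGET for value in kwargs.values())
```

Checking the log text and the kwargs against one shared constant is what catches a stale string literal: a message that still said v1 would fail the first assertion even though the kwargs were correct.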

@Gpgabriel25 force-pushed the pr/generation-mixin-fixes branch from 718ce5c to 806f8ee on April 24, 2026 17:58
@erfanzar
Owner

Hi @Gpgabriel25, and thanks for contributing to EasyDeL.

All of your PRs are solid, but it will take time to merge them, since I want to keep the git worktree unchanged until I merge the 'vnext' branch.
