Pull requests: mlc-ai/mlc-llm
Add Qwen3.5 GatedDeltaNet hybrid model + kHybrid KVStateKind
#3449 opened Mar 8, 2026 by mitiskuma
[WebGPU] Add --enable-subgroups flag for optional subgroup support
#3431 opened Feb 24, 2026 by ksgr5566
[Compiler] Add kv_cache_dtype override plumbing and KV-cache metadata
#3421 opened Feb 11, 2026 by MagellaX
feat(cuda): Implement custom TVM schedule for fused QKV-split and RoPE
#3397 opened Dec 11, 2025 by Chandan-Sugreevu
Add Comprehensive QAT Training Framework for MLC-LLM
#3258 opened Jun 23, 2025 by alohachen
7 of 9 tasks
Perf: load weights, create KV cache, initialize tokenizer in parallel
#3215 opened Apr 27, 2025 by Bekaboo
[Serving] Support tool function calls under strict format constraints
#3190 opened Mar 26, 2025 by Irfnfnkemed
[SERVE][CPP][Android] add native executable program to benchmark models
#2987 opened Oct 18, 2024 by pfk-beta