Pull requests: jd-opensource/xllm
refactor: remove useless function call and refine tests.
#996
opened Mar 4, 2026 by
liutongxuan
feat: add flags for multi_round pipeline to return logprob.
#993
opened Mar 4, 2026 by
DragonFive
feat: support embedding interface for all generate LLM models.
#990
opened Mar 4, 2026 by
RobbieLeung
bugfix: fix mbox model(qwen2.5) multi round core in xattention
#988
opened Mar 4, 2026 by
DragonFive
feat: add gtest coverage for cuda act_and_mul activation kernel.
#987
opened Mar 3, 2026 by
yingxudeng
feat: add qwen35 and qwen35-thinking reasoning parser detectors.
#985
opened Mar 3, 2026 by
yingxudeng
feat: support qwen3.5 tool-call parser with qwen3_coder detector.
#982
opened Mar 3, 2026 by
yingxudeng
feat: implement fused RMSNorm and static FP8 quantization for improved performance[2/N].
#976
opened Mar 3, 2026 by
yingxudeng
feat: support DeepSeek V3.2 W4A8 MoE for mlu and add smoke test.
#969
opened Mar 2, 2026 by
phantomlei3
feat: improve offline inference interface and fix several tp and vlm bugs.
#968
opened Mar 2, 2026 by
weizhehuang0827
feat: add rope_in_place tilelang kernel for npu device.
#964
opened Feb 28, 2026 by
zhang-minchao
Draft
feat: implement bidirectional remote host to local device kv cache transfer and batch offload.
#939
opened Feb 26, 2026 by
Kang-Meng
feat: add REC multi-round two-stage xattention support with CUDA Graph integration.
#933
opened Feb 24, 2026 by
LMX-xin