Merged
Changes from 115 commits
Commits
118 commits
bf1b92f
Speculative decoding simple implementation
Feb 2, 2026
194e0e4
Speculative decoding vectorized implementation
Feb 5, 2026
e911a17
Added comments and cleaned up code
Feb 6, 2026
6aab08f
Bug fix
Feb 6, 2026
fceb983
Bug fix
Feb 6, 2026
6aeaced
Bug fix
Feb 6, 2026
f5d52db
Rebase to main
shanmugamr1992 Feb 22, 2026
537a1fe
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Feb 23, 2026
b2718b8
WIP MTP for mamba
santhnm2 Feb 23, 2026
d2ac237
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Feb 24, 2026
043e7c1
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Feb 24, 2026
397fd54
WIP debugging
santhnm2 Feb 24, 2026
af60402
Add SGLang kernels
santhnm2 Feb 24, 2026
7795737
More debugging
santhnm2 Feb 24, 2026
ae03747
Working causal_conv1d_update triton kernel
santhnm2 Feb 25, 2026
5915cc2
Mamba almost working
santhnm2 Feb 25, 2026
a011c73
Fix non-consecutive acceptance bug
santhnm2 Feb 25, 2026
06b08d7
More progress
santhnm2 Feb 26, 2026
6b8835a
Working with cuda graphs
santhnm2 Feb 27, 2026
9057442
Fix cuda graphs and chunked prefill
santhnm2 Feb 27, 2026
cfc6282
Merge with main
santhnm2 Feb 28, 2026
4d5fe5d
Add speculative decode unit tests
santhnm2 Mar 2, 2026
8e3710f
Minor fix
santhnm2 Mar 2, 2026
867a137
Minimize diff
santhnm2 Mar 2, 2026
9727533
Formatting
santhnm2 Mar 2, 2026
9917e35
Linting
santhnm2 Mar 2, 2026
eff0fa1
Linting / copyright
santhnm2 Mar 2, 2026
905c7e3
Merge branch 'main' into spec_mamba
santhnm2 Mar 2, 2026
0d05f8b
Linting
santhnm2 Mar 2, 2026
6b86d00
Merge with main
santhnm2 Mar 3, 2026
19921fc
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 4, 2026
42cb956
Merge with main
santhnm2 Mar 5, 2026
5947e3a
Bug fixes
santhnm2 Mar 5, 2026
789f6e8
More fixes
santhnm2 Mar 5, 2026
56b84f5
Minor fixes
santhnm2 Mar 5, 2026
1d028e2
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 5, 2026
2962941
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 5, 2026
1a852c6
Add softplus
santhnm2 Mar 5, 2026
3549711
Remove dead code
santhnm2 Mar 5, 2026
7faee83
Fix flaky test
santhnm2 Mar 5, 2026
fc806ef
More flaky test fixes
santhnm2 Mar 5, 2026
fadbc0c
Address claude's comments
santhnm2 Mar 5, 2026
5f3c141
Chunked prefill fix
santhnm2 Mar 6, 2026
a51d979
Move cache_seqlens_decode into mamba_metadata.py
santhnm2 Mar 6, 2026
307fad5
Add triton kernels (possibly revert)
santhnm2 Mar 6, 2026
e0b0a8c
Fix bug
santhnm2 Mar 6, 2026
f4112ea
Undo formatting changes
santhnm2 Mar 6, 2026
83a9726
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 6, 2026
33e1910
Cleanup
santhnm2 Mar 6, 2026
9b4dadf
Add unit tests
santhnm2 Mar 6, 2026
1277af4
Merge remote-tracking branch 'upstream' into spec_mamba
santhnm2 Mar 6, 2026
e59d6e9
Test cache_seqlens in mamba_metadata.py
santhnm2 Mar 6, 2026
213e5d7
Add spec decode + prefix caching unit tests
santhnm2 Mar 6, 2026
d26450b
Fix speculative decode engine test
santhnm2 Mar 6, 2026
d942c0f
Enable prefix caching in the config
santhnm2 Mar 6, 2026
06aa6d2
Merge with main
santhnm2 Mar 7, 2026
c616203
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 9, 2026
8b3fd10
Fix tests
santhnm2 Mar 9, 2026
e6f61d6
Linting
santhnm2 Mar 9, 2026
5e02618
Address review comments
santhnm2 Mar 9, 2026
ff8721d
Update clones
santhnm2 Mar 9, 2026
83f526c
Linting
santhnm2 Mar 9, 2026
1194663
Update clones
santhnm2 Mar 9, 2026
72b1f68
Delete extraneous tokens after stop sequence
santhnm2 Mar 9, 2026
22e8db3
Add engine test for deleting speculative tokens after stop token
santhnm2 Mar 9, 2026
7ce9546
Address review comments
santhnm2 Mar 9, 2026
89e55c0
Remove restriction on materialize_only_last_token_logits
santhnm2 Mar 9, 2026
a878759
Revert circular buffer logic for conv
santhnm2 Mar 9, 2026
47195a1
Linting
santhnm2 Mar 9, 2026
6d7da58
Remove references to cache_seqlens
santhnm2 Mar 9, 2026
b65dbbe
Linting and misc review comments
santhnm2 Mar 9, 2026
712824f
Linting
santhnm2 Mar 9, 2026
61e16e5
Revert materialize_only_last_token_logits changes
santhnm2 Mar 9, 2026
456853c
Formatting
santhnm2 Mar 9, 2026
c3e697f
Minor fix
santhnm2 Mar 9, 2026
ab28eb7
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 9, 2026
955f404
Remove outdated assertion on test
santhnm2 Mar 9, 2026
1a0584d
Nits
santhnm2 Mar 9, 2026
00a7dcc
Fix event tracking for speculative tokens
santhnm2 Mar 9, 2026
8204991
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 9, 2026
9ca68d0
Update text_generation_controller tests
santhnm2 Mar 9, 2026
ed7667c
Fix text generation controller tests
santhnm2 Mar 9, 2026
b92f79c
Fix new_speculative_tokens + eviction, add tests
santhnm2 Mar 9, 2026
9eb639a
Log speculative token acceptance rates
santhnm2 Mar 9, 2026
675aa01
Don't overcount spec proposed tokens for prefill requests
santhnm2 Mar 9, 2026
30d3b62
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 9, 2026
4f119d2
Linting
santhnm2 Mar 9, 2026
fe1372c
Linting
santhnm2 Mar 10, 2026
3be6e4d
Fixing logprobs, stop words and track_generated_token_events
Mar 10, 2026
e3e5ca2
Fix dynamic_engine unit tests
santhnm2 Mar 10, 2026
f656155
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 10, 2026
0ecfb4e
Fix speculative decoding
santhnm2 Mar 10, 2026
a267a9c
Restore dynamic_engine unit test changes
santhnm2 Mar 10, 2026
9922180
Bug fix
santhnm2 Mar 10, 2026
42126a8
Merge santhnm2/spec_mamba into spec_mamba
Mar 10, 2026
d49ebdb
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 10, 2026
4097bf1
Minimize diff
santhnm2 Mar 10, 2026
c444759
Merge pull request #10 from shanmugamr1992/shanmugamr/fix-logprobs-st…
santhnm2 Mar 10, 2026
8033c52
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 10, 2026
482acdb
Formatting
santhnm2 Mar 10, 2026
915c03e
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 11, 2026
70a5808
Fix hang
santhnm2 Mar 11, 2026
b0f2099
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 11, 2026
d17ad10
Fix dummy spec decode
santhnm2 Mar 11, 2026
fffddd6
Linting
santhnm2 Mar 11, 2026
6a08b28
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 11, 2026
3296f19
Bug fix
santhnm2 Mar 11, 2026
fade26c
Fix static inference
santhnm2 Mar 11, 2026
b578a6a
Fix for mamba model also
santhnm2 Mar 11, 2026
277dfba
Merge remote-tracking branch 'upstream/main' into spec_mamba
santhnm2 Mar 11, 2026
2169c01
Unit test fix
santhnm2 Mar 11, 2026
69946c7
Merge with main
santhnm2 Mar 12, 2026
d2cfd7e
Fix prompt length bug
santhnm2 Mar 12, 2026
736c4f1
Mamba test fix
santhnm2 Mar 12, 2026
ab0e5b4
Fix tests
santhnm2 Mar 12, 2026
d94dd8b
Remove conv1d test
santhnm2 Mar 12, 2026
b92a711
Merge remote-tracking branch 'upstream/main' into spec_mamba_unit_tests
santhnm2 Mar 13, 2026
f67591f
Change to setup_class
santhnm2 Mar 13, 2026