Open
68 commits
711d217
Qwen3 MOE quick fix
eshoguli Nov 20, 2025
a5bae60
Merge branch 'ping1jing2:main' into memory_and_nz_fix
OrangeRedeng Dec 26, 2025
2b24ec3
Add nz support for MOE
OrangeRedeng Dec 26, 2025
5eda2b9
Merge branch 'ping1jing2:main' into memory_and_nz_fix
OrangeRedeng Dec 26, 2025
0195e52
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Dec 26, 2025
2ec286a
Update python/sglang/srt/layers/quantization/unquant.py
OrangeRedeng Dec 29, 2025
cecaea0
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Dec 29, 2025
d9a3818
Update unquant.py
OrangeRedeng Dec 29, 2025
1ad1ca1
Update unquant.py
OrangeRedeng Dec 29, 2025
e5484c9
Fix lint issue
OrangeRedeng Dec 29, 2025
153b9b7
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 12, 2026
d898855
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 13, 2026
25a0e56
Remove a non-used env ENABLE_ASCEND_MOE_NZ variable from ascend_npu_q…
OrangeRedeng Jan 13, 2026
61830a2
Remove a non-used env ENABLE_MOE_NZ variable from ascend_npu_qwen3_ex…
OrangeRedeng Jan 13, 2026
f586b40
Update NZ converison
OrangeRedeng Jan 13, 2026
ecfafa1
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 13, 2026
4d38ade
Remove unnecessary function
OrangeRedeng Jan 13, 2026
2f4608d
Update layer.py
OrangeRedeng Jan 13, 2026
3699288
Update unquant.py
OrangeRedeng Jan 13, 2026
3092b31
Update layer.py
OrangeRedeng Jan 13, 2026
fe2aed7
Update layer.py
OrangeRedeng Jan 13, 2026
1054c9d
Update fused_moe_method_npu.py
OrangeRedeng Jan 13, 2026
da5158b
Update fused_moe_method_npu.py
OrangeRedeng Jan 14, 2026
0162b74
Update fused_moe_method_npu.py
OrangeRedeng Jan 14, 2026
019e2d6
Update fused_moe_method_npu.py
OrangeRedeng Jan 14, 2026
8018ee9
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 14, 2026
d02b451
Create test_ascend_memory_consumption.py‎
OrangeRedeng Jan 14, 2026
312ad28
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 15, 2026
edbba1b
Fix lint issue
OrangeRedeng Jan 15, 2026
fa13828
Fix lint issue
OrangeRedeng Jan 15, 2026
c78449b
Fix lint issue
OrangeRedeng Jan 15, 2026
5b0b787
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 16, 2026
5888bd6
Update fused_moe_method_npu.py
OrangeRedeng Jan 16, 2026
a2b332d
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 16, 2026
ab233ad
Update test_ascend_memory_consumption.py‎
OrangeRedeng Jan 16, 2026
5ceeab1
Update run_suite.py
OrangeRedeng Jan 16, 2026
5cfa59c
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 16, 2026
0d1cfd1
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 19, 2026
a3d2798
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 20, 2026
929e03b
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 22, 2026
f5462d1
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 26, 2026
b8d8285
Move transpose(1,2) from forward_npu() to process_weights
OrangeRedeng Jan 26, 2026
a80de0b
Quickfix
OrangeRedeng Jan 26, 2026
495ee00
Merge branch 'main' into memory_and_nz_fix
iforgetmyname Jan 27, 2026
47a5d8a
Delete test/srt/ascend/test_ascend_memory_consumption.py‎
OrangeRedeng Jan 27, 2026
87d6963
Rename test_ascend_memory_consumption.py‎
OrangeRedeng Jan 27, 2026
105e063
Delete test/srt/ascend/test_ascend_memory_consumption.py‎
OrangeRedeng Jan 27, 2026
c501e4e
Add test_ascend_memory_consumption.py
OrangeRedeng Jan 27, 2026
4824eb4
Update run_suite.py
OrangeRedeng Jan 27, 2026
f398506
Move test to test/registered
OrangeRedeng Jan 27, 2026
bacb1ee
Move test to test/registered
OrangeRedeng Jan 27, 2026
5929c9b
Delete test/srt/ascend/test_ascend_memory_consumption.py
OrangeRedeng Jan 27, 2026
8fbe3a1
Fix lint issue
OrangeRedeng Jan 27, 2026
f15c406
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 27, 2026
b08d02c
Merge branch 'main' into memory_and_nz_fix
OrangeRedeng Jan 27, 2026
d33144c
Merge branch 'main' into memory_and_nz_fix
iforgetmyname Jan 28, 2026
57fb4a1
Merge branch 'main' into memory_and_nz_fix
iforgetmyname Jan 28, 2026
a5c0dc9
Fix test_ascend_memory_consumption.py
OrangeRedeng Jan 30, 2026
5b00360
Merge branch 'main' into test_ascend_memory_consumption.py-bugfix
OrangeRedeng Jan 30, 2026
1b5f0ef
Update model path
OrangeRedeng Feb 3, 2026
21e688b
Merge branch 'main' into test_ascend_memory_consumption.py-bugfix
OrangeRedeng Feb 3, 2026
4ea6b2d
Merge branch 'main' into test_ascend_memory_consumption.py-bugfix
OrangeRedeng Feb 3, 2026
bc3a5c3
Update test_ascend_memory_consumption.py
OrangeRedeng Feb 13, 2026
fdfa79e
Merge branch 'main' into test_ascend_memory_consumption.py-bugfix
OrangeRedeng Feb 13, 2026
6d4ca35
Update test_ascend_memory_consumption.py
OrangeRedeng Feb 13, 2026
3177c79
Merge branch 'main' into test_ascend_memory_consumption.py-bugfix
ping1jing2 Feb 14, 2026
2ab996d
Merge branch 'main' into test_ascend_memory_consumption.py-bugfix
ping1jing2 Feb 16, 2026
8b45efe
Merge branch 'main' into test_ascend_memory_consumption.py-bugfix
ping1jing2 Feb 19, 2026
7 changes: 3 additions & 4 deletions test/registered/ascend/test_ascend_memory_consumption.py
@@ -19,9 +19,8 @@
 
 register_npu_ci(
     est_time=400,
-    suite="nightly-1-npu-a3",
+    suite="nightly-2-npu-a3",
     nightly=True,
-    disabled="run failed",
 )
 
 if "ASCEND_RT_VISIBLE_DEVICES" not in os.environ:
@@ -36,7 +35,7 @@ class TestMemoryConsumptionAscend(CustomTestCase):
 
     def test_memory_consumption(self):
 
-        model = "nytopop/Qwen3-30B-A3B.w8a8"
+        model = "/root/.cache/modelscope/hub/models/Qwen/Qwen3-30B-A3B-w8a8"
         base_url = DEFAULT_URL_FOR_TEST
 
         ### Calculate initial used memory
@@ -71,7 +70,7 @@ def test_memory_consumption(self):
         used_memory_after_server_starting = (
             total_npu_memory - free_npu_memory - initial_used_memory
         ) / (1 << 30)
-        self.assertLessEqual(float(used_memory_after_server_starting), 16.00)
+        self.assertLessEqual(float(used_memory_after_server_starting), 17.00)
 
         # Clean up everything
         kill_process_tree(process.pid)
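
The assertion changed in the last hunk compares a memory delta, expressed in GiB, against a fixed budget. A minimal sketch of that arithmetic follows, assuming all three inputs are byte counts; the helper name `used_memory_gib` and the sample numbers are hypothetical and do not come from the test file.

```python
GIB = 1 << 30  # bytes per GiB, the same divisor the test uses


def used_memory_gib(total_bytes: int, free_bytes: int, initial_used_bytes: int) -> float:
    """Memory attributable to the server, in GiB.

    Mirrors the expression in the diff: memory currently in use
    (total minus free), minus what was already in use before the
    server was launched, converted from bytes to GiB.
    """
    return (total_bytes - free_bytes - initial_used_bytes) / GIB


# Made-up sample values, for illustration only.
total = 64 * GIB    # device capacity
free = 46 * GIB     # free memory after the server has started
initial = 2 * GIB   # memory in use before the server was launched

used = used_memory_gib(total, free, initial)
assert used <= 17.00  # the relaxed budget introduced by this change
```

Subtracting the baseline keeps memory held by other processes on the device out of the figure being asserted on.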