[https://nvbugs/5989920][test] Unwaive DeepSeekV3 nvfp4 mtp3_fp8kv_chunked test #12533

Open
yizhang-nv wants to merge 3 commits into NVIDIA:main from yizhang-nv:unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked

Conversation

@yizhang-nv
Member

@yizhang-nv yizhang-nv commented Mar 25, 2026

Summary by CodeRabbit

  • Tests
    • Re-enabled a previously skipped test case, indicating the underlying issue has been resolved and test coverage is being expanded.

Description

Unwaive TestDeepSeekV32::test_nvfp4_multi_gpus_piecewise_cuda_graph[mtp3_fp8kv_chunked] (nvbugs/5989920) to re-enable it in CI.

Test Coverage

  • The unwaived test itself: accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_nvfp4_multi_gpus_piecewise_cuda_graph[mtp3_fp8kv_chunked]

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@yizhang-nv
Member Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #40306 [ run ] triggered by Bot. Commit: dc1cf29 Link to invocation

@yizhang-nv changed the title from "[None][test] Unwaive DeepSeekV3 nvfp4 mtp3_fp8kv_chunked test" to "[https://nvbugs/5989920][test] Unwaive DeepSeekV3 nvfp4 mtp3_fp8kv_chunked test" on Mar 25, 2026
@tensorrt-cicd
Collaborator

PR_Github #40306 [ run ] completed with state SUCCESS. Commit: dc1cf29
/LLM/main/L0_MergeRequest_PR pipeline #31417 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv force-pushed the unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked branch from dc1cf29 to a397535 on March 30, 2026 02:39
@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@coderabbitai
Contributor

coderabbitai Bot commented Mar 30, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 51de01b8-cd20-4dfa-a1eb-81716f3c6eb2

📥 Commits

Reviewing files that changed from the base of the PR and between a34c981 and a397535.

📒 Files selected for processing (1)
  • tests/integration/test_lists/waives.txt
💤 Files with no reviewable changes (1)
  • tests/integration/test_lists/waives.txt

📝 Walkthrough

A skip/waive entry for a specific parametrized test in tests/integration/test_lists/waives.txt was removed, meaning the test accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::test_nvfp4_multi_gpus_piecewise_cuda_graph[mtp3_fp8kv_chunked] is no longer listed to be skipped.
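The change described above can be pictured as a simple line filter. The following is a minimal sketch, not the project's tooling: the helper name `unwaive`, the sample entries, and the assumed line layout (a test ID optionally followed by a space and an annotation such as a skip reason) are all illustrative assumptions and are not confirmed by this PR.

```python
from typing import Iterable, List

# Test ID taken from the PR description.
TEST_ID = (
    "accuracy/test_llm_api_pytorch.py::TestDeepSeekV32::"
    "test_nvfp4_multi_gpus_piecewise_cuda_graph[mtp3_fp8kv_chunked]"
)


def unwaive(lines: Iterable[str], test_id: str) -> List[str]:
    """Return the lines with every entry for test_id dropped.

    Assumption: an entry is either the bare test ID or the test ID
    followed by a space and an annotation (for example a skip reason).
    """
    kept = []
    for line in lines:
        entry = line.strip()
        if entry == test_id or entry.startswith(test_id + " "):
            continue  # this is the waive entry being removed
        kept.append(line)
    return kept


# Hypothetical file content, for illustration only.
sample = [
    "some/other_test.py::test_kept SKIP (hypothetical reason)\n",
    TEST_ID + " SKIP (https://nvbugs/5989920)\n",
]

remaining = unwaive(sample, TEST_ID)
print(len(remaining))
```

Matching on the full ID plus a trailing space, rather than on a prefix alone, keeps other parametrized variants of the same test in the list.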

Changes

Cohort / File(s): Test Configuration (tests/integration/test_lists/waives.txt)
Summary: Removed a waive entry for a parametrized test variant, allowing it to run instead of being skipped.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check: ✅ Passed. The title clearly references the specific test being unwaived and the associated bug number, directly matching the changeset.
  • Description check: ✅ Passed. The description follows the template, includes the bug reference, explains what is being done and why, specifies test coverage, and completes the PR checklist.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

@tensorrt-cicd
Collaborator

PR_Github #40632 [ run ] triggered by Bot. Commit: a397535 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #40632 [ run ] completed with state SUCCESS. Commit: a397535
/LLM/main/L0_MergeRequest_PR pipeline #31671 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #40683 [ run ] triggered by Bot. Commit: a397535 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #40683 [ run ] completed with state SUCCESS. Commit: a397535
/LLM/main/L0_MergeRequest_PR pipeline #31713 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@liji-nv
Collaborator

liji-nv commented Apr 3, 2026

The PR is currently blocked by #12659

@yizhang-nv force-pushed the unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked branch from a2ecf4a to ac31507 on April 16, 2026 07:11
@yizhang-nv
Member Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #43690 [ run ] triggered by Bot. Commit: ac31507 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #43690 [ run ] completed with state FAILURE. Commit: ac31507
/LLM/main/L0_MergeRequest_PR pipeline #34174 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@yizhang-nv force-pushed the unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked branch from ac31507 to 0a2c4f0 on April 17, 2026 09:18
@tensorrt-cicd
Collaborator

PR_Github #44011 [ run ] triggered by Bot. Commit: 0a2c4f0 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44011 [ run ] completed with state FAILURE. Commit: 0a2c4f0
/LLM/main/L0_MergeRequest_PR pipeline #34450 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv force-pushed the unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked branch from 0a2c4f0 to a337977 on April 20, 2026 05:15
@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44306 [ run ] triggered by Bot. Commit: a337977 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44306 [ run ] completed with state FAILURE. Commit: a337977
/LLM/main/L0_MergeRequest_PR pipeline #34727 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv force-pushed the unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked branch from a337977 to 85ab435 on April 20, 2026 09:03
@yizhang-nv
Member Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44422 [ run ] triggered by Bot. Commit: 85ab435 Link to invocation

@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44461 [ run ] triggered by Bot. Commit: 85ab435 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44422 [ run ] completed with state ABORTED. Commit: 85ab435

Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44461 [ run ] completed with state SUCCESS. Commit: 85ab435
/LLM/main/L0_MergeRequest_PR pipeline #34864 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44587 [ run ] triggered by Bot. Commit: 85ab435 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44587 [ run ] completed with state SUCCESS. Commit: 85ab435
/LLM/main/L0_MergeRequest_PR pipeline #34972 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv force-pushed the unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked branch from 85ab435 to 9473334 on April 21, 2026 10:05
@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44702 [ run ] triggered by Bot. Commit: 9473334 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #44702 [ run ] completed with state SUCCESS. Commit: 9473334
/LLM/main/L0_MergeRequest_PR pipeline #35066 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

Unwaive TestDeepSeekV32::test_nvfp4_multi_gpus_piecewise_cuda_graph[mtp3_fp8kv_chunked]
(nvbugs/5989920) to re-enable it in CI.

Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
@yizhang-nv force-pushed the unwaive-deepseekv3-nvfp4-mtp3-fp8kv-chunked branch from 9473334 to 41a4405 on April 22, 2026 07:47
@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #44934 [ run ] triggered by Bot. Commit: 41a4405 Link to invocation

@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #45079 [ run ] triggered by Bot. Commit: 41a4405 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45079 [ run ] completed with state SUCCESS. Commit: 41a4405
/LLM/main/L0_MergeRequest_PR pipeline #35376 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
@yizhang-nv
Member Author

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #45627 [ run ] triggered by Bot. Commit: d277abb Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #45627 [ run ] completed with state SUCCESS. Commit: d277abb
/LLM/main/L0_MergeRequest_PR pipeline #35840 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
@yizhang-nv
Member Author

yizhang-nv commented May 25, 2026

/bot run --only-multi-gpu-test --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #50207 [ run ] triggered by Bot. Commit: e3cf242 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50208 [ run ] triggered by Bot. Commit: e3cf242 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50208 [ run ] completed with state SUCCESS. Commit: e3cf242
/LLM/main/L0_MergeRequest_PR pipeline #39746 (Partly Tested) completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

3 participants