[TRTLLM-10303][feat] Deprecate trtllm-serve CLI options#12106
[TRTLLM-10303][feat] Deprecate trtllm-serve CLI options#12106JunyiXu-nv wants to merge 1 commit intoNVIDIA:mainfrom
Conversation
…parallel_size, metrics-log-interval, and fail_fast_on_attention_window_too_large - TRTLLM-10303: Deprecate --moe_cluster_parallel_size / --cluster_size (smart router feature no longer supported) - TRTLLM-10230: Deprecate --metrics-log-interval in disaggregated command (not connected to any functionality) - TRTLLM-10228: Deprecate --fail_fast_on_attention_window_too_large, default to True (only affects TRT backend which is being removed) All deprecated options emit DeprecationWarning when used and have updated help text. API stability references updated accordingly. Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com> Made-with: Cursor Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com> Made-with: Cursor
📝 WalkthroughWalkthroughThis PR deprecates three CLI options ( Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tensorrt_llm/commands/serve.py`:
- Around line 1011-1015: The help text for the CLI option metrics_log_interval
incorrectly claims the option is disconnected while its value is still threaded
into OpenAIDisaggServer(metrics_interval_secs=metrics_log_interval); either stop
passing metrics_log_interval into OpenAIDisaggServer (remove the argument or
pass None/default) or soften the help text to indicate it is deprecated but
still forwarded to the server (e.g., "Deprecated: will be removed in a future
release; currently forwarded to OpenAIDisaggServer as metrics_interval_secs").
Update both occurrences where the flag is defined and where it's passed to
OpenAIDisaggServer to keep behavior and messaging consistent.
In `@tensorrt_llm/llmapi/llm_args.py`:
- Around line 2233-2236: Add a runtime deprecation warning in the model-layer
validators so explicitly set deprecated fields (e.g., moe_cluster_parallel_size
and the other deprecated fields around lines 2627-2631) trigger a warning when
callers provide them; update BaseLlmArgs (and TrtLlmArgs if it overrides
validation) to include a root_validator or field-specific `@validator` that checks
if the field value is not None and then emits a deprecation warning (use
warnings.warn(..., DeprecationWarning) or the project logging facility) while
preserving the existing value, and ensure the validator names reference the
exact field names (moe_cluster_parallel_size and the other deprecated field
names) so config-driven and direct API usage both surface the deprecation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b143a766-b418-4c82-a2a6-db463801f9ca
📒 Files selected for processing (3)
tensorrt_llm/commands/serve.pytensorrt_llm/llmapi/llm_args.pytests/unittest/api_stability/references/llm.yaml
|
/bot run |
|
PR_Github #38562 [ run ] triggered by Bot. Commit: |
|
PR_Github #38562 [ run ] completed with state
|
All deprecated options emit DeprecationWarning when used and have updated help text. API stability references updated accordingly.
Summary by CodeRabbit
Description
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.