# Fix Speculative decoding test cases #288

vaibhavjainwiz wants to merge 3 commits into opendatahub-io:main from vaibhavjainwiz:fix_spec_dec
Conversation
## Walkthrough
The changes update two test files for speculative decoding in model serving. Previously, multiple command-line arguments were used to configure speculative decoding parameters. These have now been consolidated into a single `--speculative_config` argument, which accepts a JSON string encoding all relevant parameters. This affects how the configuration is passed to the serving runtime but does not alter the underlying test logic or exported entities.
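The consolidation described above can be sketched as follows. This is an illustrative example, not the actual test code: the model path and parameter values mirror the ones in the test files, and the serving runtime's real parsing may differ.

```python
import json

# The formerly separate flags (--speculative-model, --num-speculative-tokens)
# become keys of a single JSON-encoded --speculative_config value.
spec_config = {
    "model": "/mnt/models/granite-7b-instruct-accelerator",  # draft model path
    "num_speculative_tokens": 5,
}
arg = f"--speculative_config={json.dumps(spec_config)}"

# The runtime can recover every parameter by parsing the JSON payload.
payload = arg.split("=", 1)[1]
assert json.loads(payload) == spec_config
```

Building the argument with `json.dumps` rather than hand-writing the JSON string avoids quoting mistakes when parameters are added later.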
## Changes
| File(s) | Change Summary |
|-------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------|
| tests/model_serving/model_runtime/vllm/speculative_decoding/test_granite_7b_lab_draft.py | Replaces separate command-line flags for speculative decoding with a single `--speculative_config` JSON argument. |
| tests/model_serving/model_runtime/vllm/speculative_decoding/test_granite_7b_lab_ngram.py | Consolidates three separate speculative decoding flags into one JSON-based `--speculative_config` argument. |
## Poem
> In the warren of code, where arguments roam,
> Now JSON config brings them all home.
> No more flags to scatter or lose,
> Just one string for all you choose!
> A hop, a skip, one config to bind—
> Speculative rabbits, streamlined!
> 🐇📜
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/model_serving/model_runtime/vllm/speculative_decoding/test_granite_7b_lab_draft.py (1)

20-20: Consolidation of speculative decoding parameters looks good. The change consolidates separate speculative decoding parameters into a single JSON-structured config, improving consistency and alignment with modern configuration approaches.

For improved maintainability, consider defining the speculative config as a Python dictionary first, then converting it to a JSON string:

```diff
-    '--speculative_config={"model": "/mnt/models/granite-7b-instruct-accelerator","num_speculative_tokens": 5}',
+    # Configuration for speculative decoding (draft model)
+    f'--speculative_config={json.dumps({"model": "/mnt/models/granite-7b-instruct-accelerator", "num_speculative_tokens": 5})}',
```

Don't forget to add `import json` at the top of the file if you implement this suggestion.

tests/model_serving/model_runtime/vllm/speculative_decoding/test_granite_7b_lab_ngram.py (1)

20-20: Consolidation of speculative decoding parameters looks good. The change consolidates three separate parameters into a single JSON config, consistent with the similar change in the draft test file, and improves the configuration approach.

For improved maintainability, consider the same dictionary-first approach:

```diff
-    '--speculative_config={"model": "ngram","num_speculative_tokens": 5, "prompt_lookup_max": 4}',
+    # Configuration for speculative decoding (ngram model)
+    f'--speculative_config={json.dumps({"model": "ngram", "num_speculative_tokens": 5, "prompt_lookup_max": 4})}',
```

Don't forget to add `import json` at the top of the file if you implement this suggestion.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- tests/model_serving/model_runtime/vllm/speculative_decoding/test_granite_7b_lab_draft.py (1 hunks)
- tests/model_serving/model_runtime/vllm/speculative_decoding/test_granite_7b_lab_ngram.py (1 hunks)
Caution: No docstrings were generated.
for more information, see https://pre-commit.ci
```diff
     "--dtype=float16",
-    "--speculative-model=/mnt/models/granite-7b-instruct-accelerator",
-    "--num-speculative-tokens=5",
+    '--speculative_config={"model": "/mnt/models/granite-7b-instruct-accelerator", "num_speculative_tokens": 5}',
```
Suggested change:

```diff
-    '--speculative_config={"model": "/mnt/models/granite-7b-instruct-accelerator", "num_speculative_tokens": 5}',
+    "--speculative_config",
+    '{"model": "/mnt/models/granite-7b-instruct-accelerator", "num_speculative_tokens": 5}',
```
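The two invocation styles discussed here (`--flag=value` versus `--flag value`) are typically equivalent to the CLI parser. A minimal argparse sketch illustrates this; it is a stand-in for the serving runtime's actual parser, which is an assumption here, not confirmed from the source.

```python
import argparse
import json

# Minimal parser standing in for the serving runtime's CLI (illustrative only).
parser = argparse.ArgumentParser()
parser.add_argument("--speculative_config")

config = '{"model": "/mnt/models/granite-7b-instruct-accelerator", "num_speculative_tokens": 5}'

# argparse accepts both the joined and the split argument styles.
ns_joined = parser.parse_args([f"--speculative_config={config}"])
ns_split = parser.parse_args(["--speculative_config", config])

assert json.loads(ns_joined.speculative_config) == json.loads(ns_split.speculative_config)
```

Since both forms yield the same parsed value, the choice between them is mainly a readability question for the test file.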
@tarukumar
There is something wrong with the pre-commit hook configured on the opendatahub-tests repo. I made all the changes as per your suggestion, but the pre-commit hook made a new commit on my PR and changed the value.
39314d0
Created new PR: #294
Description
How Has This Been Tested?
Merge criteria:
Summary by CodeRabbit