integration-vllm-test #2258
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2258
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit d152dac with merge base 1017c7e.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
stack-info: PR: #2258, branch: drisspg/stack/58
test/integration/test_vllm.py
Outdated
os.environ["VLLM_USE_V1"] = "1" | ||
os.environ["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0" | ||
os.environ["VLLM_TEST_STANDALONE_COMPILE"] = "1" |
Are you missing VLLM_DISABLE_COMPILE_CACHE? Can you write down, for each flag, whether it is always required vs. temporary, etc.?
I made a note here: #2239
You can see that VLLM_TEST_STANDALONE_COMPILE, which will ultimately be the default, makes it so that we can compile subclasses.
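To spell out the ask above, here is one possible annotated version of the flag setup; the required-vs-temporary notes are assumptions drawn from this thread and #2239, not confirmed vLLM documentation:

```python
import os

# Sketch only: annotations are assumptions from the review discussion.
os.environ["VLLM_USE_V1"] = "1"  # target the V1 engine for this test
os.environ["VLLM_ENABLE_V1_MULTIPROCESSING"] = "0"  # keep the engine in-process so the test stays deterministic
os.environ["VLLM_TEST_STANDALONE_COMPILE"] = "1"  # temporary: lets tensor subclasses compile; expected to become the default
os.environ["VLLM_DISABLE_COMPILE_CACHE"] = "1"  # suggested in review: skip vLLM's compile cache for the test run
```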
This is a good point; I think we should actually just set this in the AO integration upstream.
You mean modify these flags when people use AO?
Yeah, in the AO integration, if someone is loading an AO model we should just set this flag. I will make a PR.
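A rough sketch of what that upstream change might look like; the helper name and the config check are illustrative placeholders, not the actual PR:

```python
import os

def maybe_enable_standalone_compile(model_config) -> None:
    # Hypothetical helper: if the loaded model is torchao-quantized, set the
    # flag needed to compile tensor subclasses. Names here are illustrative.
    if getattr(model_config, "quantization", None) == "torchao":
        # Temporary until standalone compile becomes the vLLM default.
        os.environ.setdefault("VLLM_TEST_STANDALONE_COMPILE", "1")
```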
test/integration/test_vllm.py
Outdated
print(f"Quick test - Input: {test_input}, Output: {decoded}") | ||
|
||
# Save quantized model | ||
print(f"Saving quantized model to {output_dir}...") |
Should we delete these debug prints in the end?
Stacked PRs:
integration-vllm-test