update: add env variable for main container (vllm) when DRA is enabled for Intel-xe #189
af1ab6f to 5b87b65
@yuanwu2017 do you have any comments on the suggestion to add this environment variable as a default? I believe you contributed these originally. I am not familiar with which default env variables make sense.
@zdtsw I don't think this is the root cause of llm-d/llm-d#620. Is modelservice being used? Remove any GPU resource requests/limits from the values file and try without them (and set
PR518 and PR380 can fix it. Issue620 is caused by the llm-d-modelservice upgrade in llm-d: the accelerator type changed from "intel" to "intel-i915". DRA is being enabled in llm-d-modelservice, so enabling DRA can also fix Issue620.
I think it is OK to add default envs for a specific device, but I do not understand how this patch fixes the issue. If the DRA device is enabled in values.yaml, these env values should not take effect. @zdtsw @poussa
@kalantar Sorry for the bad description in this PR.
The Anyway, 1) this PR is against the old DRA implementation and 2) it is not correct, since it touches the device plugins, not DRA.
As for the env
It should apply to DRA now too, as long as we are using the same keys.
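The point about "the same keys" can be illustrated with a small sketch. Everything here is hypothetical: the `DEFAULT_ENV` table, the `container_env` helper, and the `dra_enabled` flag are illustrative names, not the chart's actual templates or values layout. The sketch only assumes that default env variables are looked up by accelerator type, so the result is the same whether the device is requested through the device plugin or through DRA.

```python
# Hypothetical model of per-accelerator default env variables.
# The table contents and helper name are assumptions for illustration,
# not the actual llm-d-modelservice chart logic.
DEFAULT_ENV = {
    "intel-xe": {"VLLM_WORKER_MULTIPROC_METHOD": "spawn"},
}


def container_env(accelerator_type, dra_enabled, user_env=None):
    """Return the env for the main container.

    The lookup is keyed only by accelerator type; dra_enabled is accepted
    but intentionally unused, to show that the same defaults apply with or
    without DRA as long as the key is the same.
    """
    env = dict(DEFAULT_ENV.get(accelerator_type, {}))
    env.update(user_env or {})  # user-supplied values override defaults
    return env


if __name__ == "__main__":
    print(container_env("intel-xe", dra_enabled=True))
    print(container_env("intel-xe", dra_enabled=False))
```

Under this assumption, an accelerator type with no entry in the table simply gets no default env variables, and anything the user sets explicitly wins over the defaults.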
# @schema
# additionalProperties: true
# @schema
Why is this annotation needed?
Good catch, it is left over from my original change before the DRA refactor. Let me remove it.
@zdtsw please also bump the chart version in Chart.yaml and run
5b87b65 to 096a804
Updated.
This looks good, let's resolve the conflict and we can merge.
620fb82 to efa9f28
- add VLLM_WORKER_MULTIPROC_METHOD: spawn for Intel-xe
- bump version
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
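For readers unfamiliar with what a setting like `VLLM_WORKER_MULTIPROC_METHOD: spawn` controls, the following is a minimal sketch of how a Python process might choose a multiprocessing start method from that environment variable. It is not vLLM's actual code: `pick_start_method` is a hypothetical helper, and falling back to the platform default when the variable is unset is an assumption made for this example.

```python
import multiprocessing
import os


def pick_start_method(env=None):
    """Hypothetical helper: read the start method from the environment.

    Falls back to the platform's default start method when the variable
    is unset (an assumption for this sketch, not vLLM's documented default).
    """
    env = os.environ if env is None else env
    available = multiprocessing.get_all_start_methods()  # default is listed first
    method = env.get("VLLM_WORKER_MULTIPROC_METHOD", available[0])
    if method not in available:
        raise ValueError(f"unsupported start method: {method!r}")
    return method


if __name__ == "__main__":
    # Simulate the env that the chart would inject into the main container.
    method = pick_start_method({"VLLM_WORKER_MULTIPROC_METHOD": "spawn"})
    ctx = multiprocessing.get_context(method)
    print(ctx.get_start_method())
```

With `spawn`, each worker starts in a fresh interpreter instead of inheriting the parent's state through `fork`, which is a common choice when the parent may already have initialized a device runtime.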
efa9f28 to eda8b88
Signed-off-by: Wen Zhou <wenzhou@redhat.com>
Thanks, it is rebased.
Description