[XPU] Add XPU Dockerfile and related docs #1162
yma11 wants to merge 3 commits into vllm-project:main from
Conversation
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7e035f4ede
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
docker/Dockerfile.xpu
Outdated
FROM final AS vllm-openai
RUN cd ${COMMON_WORKDIR}/vllm-omni
ENTRYPOINT ["vllm", "serve", "--omni"]
Clear inherited CMD for vllm-openai image
The vllm-openai stage sets an ENTRYPOINT but does not override the CMD inherited from the final stage (CMD ["/bin/bash"] above). When users run the image without extra args, Docker passes the inherited CMD to the entrypoint, effectively executing vllm serve --omni /bin/bash, which vllm serve will treat as a positional argument and fail. To avoid this, explicitly set CMD [] (or a valid default argument list) in this stage.
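A minimal sketch of that suggestion, reusing the stage names from the diff above (illustrative only, not the final Dockerfile):

```dockerfile
FROM final AS vllm-openai
ENTRYPOINT ["vllm", "serve", "--omni"]
# Clear the CMD ["/bin/bash"] inherited from the final stage so it is not
# passed to the entrypoint as a stray positional argument.
CMD []
```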
docker/Dockerfile.xpu
Outdated
@@ -0,0 +1,28 @@
ARG BASE_IMAGE=gar-registry.caas.intel.com/pytorch/pytorch-ipex-spr:multi-bmg_release_2601_py2.10_ww2605.5
Should we use the multi-arc release, or is this a temporary setting?
Could we base this on an upstream vLLM XPU image? From my perspective, doing this creates technical debt for us, and debt always has to be paid back with interest.
I am planning to build images based on upstream commits. For example, if the coming vLLM-omni release is based on vLLM v0.15.0, I will build the image using https://github.com/vllm-project/vllm/blob/releases/v0.15.0/docker/Dockerfile.xpu.
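As a rough illustration of that flow (tags, paths, and the BASE_IMAGE build arg are assumptions, not the actual CI recipe):

```bash
# Build the XPU base image from the upstream vLLM release Dockerfile.
git clone --branch releases/v0.15.0 https://github.com/vllm-project/vllm.git
docker build -f vllm/docker/Dockerfile.xpu -t vllm-xpu:v0.15.0 vllm

# Build the omni image on top of the locally built base.
docker build -f docker/Dockerfile.xpu \
    --build-arg BASE_IMAGE=vllm-xpu:v0.15.0 \
    -t vllm-omni-xpu:latest .
```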
Updated. @yao-matrix @xuechendi please take a further look.
Signed-off-by: Yan Ma <yan.ma@intel.com>
Signed-off-by: Yan Ma <yan.ma@intel.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 74cc42df39
ENV VLLM_WORKER_MULTIPROC_METHOD=spawn
ENTRYPOINT ["vllm", "serve", "--omni"]
Source oneAPI env for vllm-openai entrypoint
The image installs oneAPI/oneCCL and only wires its environment into /root/.bashrc (via setvars.sh), but the vllm-openai stage runs vllm serve --omni directly. Non-interactive entrypoints do not source .bashrc, so the oneAPI environment (e.g., library paths for libccl.so) is missing when users run the container with the default entrypoint, which can lead to runtime linker errors or CCL not being found. Consider exporting the needed env vars with ENV or wrapping the entrypoint in bash -lc 'source /root/.bashrc && …' so the runtime environment matches what you prepared.
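One possible shape for that, via a small wrapper entrypoint (the /opt/intel/oneapi/setvars.sh path is an assumption about the base image layout; the real Dockerfile may differ):

```dockerfile
# Hypothetical wrapper: source the oneAPI environment before starting the
# server, so non-interactive container runs get the same env as a login shell.
RUN printf '%s\n' \
        '#!/bin/bash' \
        'source /opt/intel/oneapi/setvars.sh --force > /dev/null' \
        'exec vllm serve --omni "$@"' \
        > /usr/local/bin/vllm-omni-entrypoint && \
    chmod +x /usr/local/bin/vllm-omni-entrypoint
ENTRYPOINT ["/usr/local/bin/vllm-omni-entrypoint"]
# Clear the inherited CMD so no default /bin/bash argument is appended.
CMD []
```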
xuechendi left a comment
Verified with Intel GPUs (1500 and B60); both work.
@ywang96, could you help review and merge?
--8<-- "docs/getting_started/installation/gpu/rocm.inc.md:requirements"
=== "Intel XPU"
Is XPU categorized as a type of GPU?
@@ -0,0 +1,110 @@
FROM intel/deep-learning-essentials:2025.2.2-0-devel-ubuntu24.04 AS vllm-base
How long does it take to build an image in the CI environment?
The first build will take about 10 minutes due to the vLLM docker build. We will provide our Intel GPU node to Buildkite, so we can keep the vLLM base image cached to accelerate the CI docker build time.
For the CI test, please check #400 and the follow-up design RFC for details. We expect that the e2e CI time for all other platforms should not exceed the CUDA e2e time.
Got it, thanks for the info!
Is it OK to get this PR in first? We will accelerate the docker build time in the next CI PR.
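A rough sketch of the base-image caching idea mentioned above, using BuildKit's registry cache (registry and tag names are placeholders, not the actual Buildkite configuration):

```bash
# Reuse layers from a previously pushed cache so CI does not rebuild the
# vLLM XPU base from scratch on every run.
docker buildx build \
    -f docker/Dockerfile.xpu \
    --cache-from type=registry,ref=example-registry.io/vllm-omni/xpu-buildcache \
    --cache-to type=registry,ref=example-registry.io/vllm-omni/xpu-buildcache,mode=max \
    -t vllm-omni-xpu:ci .
```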
@xuechendi, are there plans to ship an XPU docker image from the vllm main repo?
docker/Dockerfile.xpu
Outdated
ENV VLLM_WORKER_MULTIPROC_METHOD=spawn
RUN --mount=type=cache,target=/root/.cache/pip \
    --mount=type=bind,source=.git,target=.git
This bind mount is not needed, since we git clone vllm inside the image and the .git here belongs to vllm-omni.
Makes sense. Removed this line.
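For reference, a sketch of how that step might look with only the pip cache mount kept (the install command itself is illustrative, not the exact line in the Dockerfile):

```dockerfile
# Keep the pip cache between builds; the .git bind mount is dropped as
# suggested above, since it belongs to vllm-omni rather than vllm.
RUN --mount=type=cache,target=/root/.cache/pip \
    cd ${COMMON_WORKDIR}/vllm-omni && \
    pip install .
```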
Signed-off-by: Yan Ma <yan.ma@intel.com>
Purpose
This PR adds an XPU Dockerfile and updates the related docs.
Since no vLLM XPU image is published, we build the omni image directly from the vLLM Dockerfile.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.md and examples for a new model.
BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)