9 changes: 5 additions & 4 deletions test/dlc_tests/sanity/test_boottime_container_security.py
Author comment: I removed this check because vLLM installs DeepGEMM scripts into the tmp folder.
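
A rough illustration of the failure mode (a sketch, not the actual test code; the image URI and the exact find invocation are assumptions): the stray-file check fails any image whose /tmp is non-empty, and vLLM images trip it because DeepGEMM helper scripts land under /tmp at build time.

# Hypothetical repro, assuming a locally pullable image; placeholder URI only:
IMAGE="763104351884.dkr.ecr.us-west-2.amazonaws.com/vllm:example-tag"
docker run --rm --entrypoint bash "$IMAGE" -c \
    'stray=$(find /tmp -mindepth 1); [ -z "$stray" ] || { echo "stray files in /tmp:"; echo "$stray"; exit 1; }'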

@@ -7,6 +7,10 @@
 @pytest.mark.model("N/A")
 @pytest.mark.canary("Run security test regularly on production images")
 def test_security(image):
+    if "vllm" in image:
+        pytest.skip(
+            "vLLM images do not require pip check as they are managed by vLLM devs. Skipping test."
+        )
     repo_name, image_tag = image.split("/")[-1].split(":")
     container_name = f"{repo_name}-{image_tag}-security"

@@ -20,10 +24,7 @@ def test_security(image):
     )
     try:
         docker_exec_cmd = f"docker exec -i {container_name}"
-        if "vllm" in image:
-            run_command = f"python3 /test/bin/security_checks.py"
-        else:
-            run_command = f"python /test/bin/security_checks.py"
+        run_command = f"python /test/bin/security_checks.py"

         run(f"{docker_exec_cmd} {run_command} --image_uri {image}", hide=True)
     finally:
5 changes: 5 additions & 0 deletions test/dlc_tests/sanity/test_pre_release.py
Author comment: Same as above; this test errors out on the tmp folder.

@@ -112,6 +112,11 @@ def test_stray_files(image):

     :param image: ECR image URI
     """
+    if "vllm" in image:
+        pytest.skip(
+            "vLLM images do not require pip check as they are managed by vLLM devs. Skipping test."
+        )
+
     ctx = Context()
     container_name = get_container_name("test_tmp_dirs", image)
     start_container(container_name, image, ctx)
2 changes: 1 addition & 1 deletion test/vllm/ec2/test_artifacts/test_ec2.py
Author comment: benchmark_serving.py is deprecated, so we need to use the vllm bench serve command.
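
For reference, a sketch of the replacement invocation. Flags beyond those visible in this diff (the dataset options and prompt count) are assumptions based on the old benchmark_serving.py interface, and the ShareGPT path is a guess tied to the setup script's download step; a vLLM 0.10+ install and a server already listening on localhost:8000 are presumed.

# Benchmark an already-running server with the built-in CLI:
vllm bench serve \
    --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
    --backend vllm \
    --base-url "http://localhost:8000" \
    --dataset-name sharegpt \
    --dataset-path /fsx/vllm-dlc/ShareGPT_V3_unfiltered_cleaned_split.json \
    --num-prompts 200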

@@ -47,7 +47,7 @@ def setup_env(connection):
 def create_benchmark_command() -> str:
     """Create command for running benchmark"""
     return f"""
-    python3 /fsx/vllm-dlc/vllm/benchmarks/benchmark_serving.py \
+    vllm bench serve \
     --model deepseek-ai/DeepSeek-R1-Distill-Qwen-32B \
     --backend vllm \
     --base-url "http://localhost:8000" \
2 changes: 1 addition & 1 deletion test/vllm/ec2/utils/setup_fsx_vllm.sh
@@ -63,7 +63,7 @@ check_error "Failed to set permissions"
 cd /fsx/vllm-dlc
 git clone https://github.com/vllm-project/vllm.git
 cd vllm
-git checkout tags/v0.10.1.1
+git checkout tags/v0.10.2

 # Download ShareGPT dataset
 log "Downloading ShareGPT dataset..."
19 changes: 8 additions & 11 deletions vllm/arm64/gpu/Dockerfile.arm64
Author comment: I want to call this out: we are using the latest PyTorch, latest CUDA, and latest vLLM version, which is a win.
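
A quick way to sanity-check that claim inside the built image (a sketch; the exact version strings are expectations derived from this Dockerfile, not verified output):

# Print the installed stack; with this Dockerfile it should report
# torch 2.8.0 (CUDA 12.9) and vllm 0.10.2.
python3 -c "import torch, vllm; print(torch.__version__, torch.version.cuda, vllm.__version__)"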

@@ -1,4 +1,4 @@
-ARG CUDA_VERSION=12.8.1
+ARG CUDA_VERSION=12.9.0
 ARG IMAGE_DISTRO=ubuntu22.04
 ARG PYTHON_VERSION=3.12

@@ -41,9 +41,9 @@ ENV LD_LIBRARY_PATH=${CUDA_HOME}/lib64:${LD_LIBRARY_PATH}

 RUN apt-get update && apt install -y wget

-ARG TORCH_URL=https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.7.0/arm64/cu128/torch-2.7.0%2Bcu128-cp312-cp312-manylinux_2_28_aarch64.whl
-ARG TORCHVISION_URL=https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.7.0/arm64/cu128/torchvision-0.22.0%2Bcu128-cp312-cp312-linux_aarch64.whl
-ARG TORCHAUDIO_URL=https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.7.0/arm64/cu128/torchaudio-2.7.0%2Bcu128-cp312-cp312-linux_aarch64.whl
+ARG TORCH_URL=https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.8.0/arm64/cu129/torch-2.8.0%2Bcu129-cp312-cp312-manylinux_2_28_aarch64.whl
+ARG TORCHVISION_URL=https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.8.0/arm64/cu129/torchvision-0.23.0%2Bcu129-cp312-cp312-linux_aarch64.whl
+ARG TORCHAUDIO_URL=https://framework-binaries.s3.us-west-2.amazonaws.com/pytorch/v2.8.0/arm64/cu129/torchaudio-2.8.0%2Bcu129-cp312-cp312-linux_aarch64.whl

 RUN uv pip install --no-cache-dir -U \
     ${TORCH_URL} \
@@ -56,7 +56,7 @@ RUN uv pip install --extra-index-url https://download.pytorch.org/whl/nightly/py
 FROM base AS build-base
 RUN mkdir /wheels

-RUN uv pip install -U build cmake ninja pybind11 setuptools wheel requests numpy
+RUN uv pip install -U build cmake ninja pybind11 setuptools setuptools_scm wheel requests numpy torch==2.8.0
 RUN export MAX_JOBS=15

 ###############################################################################
@@ -75,18 +75,18 @@ RUN git clone https://github.com/facebookresearch/xformers.git && \
 FROM build-base AS build-vllm
 RUN git clone https://github.com/vllm-project/vllm.git && \
     cd vllm && \
-    git checkout v0.10.2rc1 && \
+    git checkout v0.10.2 && \
     git submodule sync && \
     git submodule update --init --recursive -j 8 && \
     python use_existing_torch.py && \
     uv pip install -r requirements/build.txt && \
     MAX_JOBS=16 uv build --wheel --no-build-isolation -o /wheels

 ###############################################################################
 FROM base AS vllm-openai
 COPY --from=build-vllm /wheels/* wheels/
 COPY --from=build-xformers /wheels/* wheels/

 RUN uv pip install -U build cmake ninja pybind11 setuptools==79.0.1 wheel

 RUN git clone https://github.com/flashinfer-ai/flashinfer.git --recursive && \
     cd flashinfer && \
     python -c "import torch; print(torch.__version__, torch.version.cuda)" && \
@@ -106,8 +106,6 @@ RUN uv clean

 RUN export PATH="$(dirname $(realpath .venv/bin/python)):$PATH"

-RUN uv pip install -U build cmake ninja pybind11 setuptools==79.0.1 wheel
-
 # Enable hf-transfer for faster downloads
 ENV HF_HUB_ENABLE_HF_TRANSFER=1
 RUN uv pip install datasets aiohttp
@@ -121,7 +119,6 @@ RUN wget ${NSYS_URL}${NSYS_PKG} && \
     rm $NSYS_PKG
 RUN apt install -y --no-install-recommends tmux cmake

-# Install required build tool
 RUN uv pip install ninja

 ARG PYTHON="python3"
6 changes: 3 additions & 3 deletions vllm/buildspec-arm64.yml
@@ -2,7 +2,7 @@ account_id: &ACCOUNT_ID <set-$ACCOUNT_ID-in-environment>
 prod_account_id: &PROD_ACCOUNT_ID 763104351884
 region: &REGION <set-$REGION-in-environment>
 framework: &FRAMEWORK vllm
-version: &VERSION "0.10.2rc1"
+version: &VERSION "0.10.2"
 short_version: &SHORT_VERSION "0.10"
 arch_type: &ARCH_TYPE arm64
 autopatch_build: "False"
@@ -33,9 +33,9 @@ images:
       <<: *BUILD_REPOSITORY
       context:
         <<: *BUILD_CONTEXT
-      image_size_baseline: 20000
+      image_size_baseline: 25000
       device_type: &DEVICE_TYPE gpu
-      cuda_version: &CUDA_VERSION cu128
+      cuda_version: &CUDA_VERSION cu129
       python_version: &DOCKER_PYTHON_VERSION py3
       tag_python_version: &TAG_PYTHON_VERSION py312
       os_version: &OS_VERSION ubuntu22.04