[ROCm][CI] Remove soft_fail from AMD Docker Image Build #38685

Draft
micah-wil wants to merge 1 commit into vllm-project:main from ROCm:micah/gate-amd-docker-build

Conversation

@micah-wil
Contributor

@micah-wil micah-wil commented Apr 1, 2026

Reverts #38505

Beyond the revert, the only change is automatic retries for the case where the agent is lost.

Signed-off-by: Micah Williamson <micah.williamson@amd.com>
@mergify mergify bot added the ci/build and rocm (Related to AMD ROCm) labels Apr 1, 2026
@github-project-automation github-project-automation bot moved this to Todo in AMD Apr 1, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request updates the Buildkite configuration for AMD hardware tests by removing the 'soft_fail' flag from the AMD CPU build step and introducing automatic retry logic for agent-related failures (exit statuses -1 and -10). I have no feedback to provide.
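
For readers unfamiliar with the Buildkite options involved, the change summarized above might look roughly like the fragment below. This is a minimal sketch and not the PR's actual diff: the step label, the placeholder command, and the `limit` values are assumptions made for illustration. Only the removal of `soft_fail` and the retried exit statuses (-1 and -10) come from the description.

```yaml
steps:
  # Hypothetical step label and command; the real step definition
  # lives in the repository's AMD Buildkite pipeline.
  - label: "AMD Docker Image Build"
    command: "echo 'build the image here'"  # placeholder, not the real build command
    # No soft_fail key: a failed build now fails the pipeline
    # instead of being reported as a passing soft failure.
    retry:
      automatic:
        # Retry only when the agent itself goes away,
        # not when the build command fails on its own.
        - exit_status: -1   # agent-related failure, per the PR summary
          limit: 1          # assumed retry count
        - exit_status: -10  # agent-related failure, per the PR summary
          limit: 1          # assumed retry count
```

Scoping the retries to these two exit statuses keeps genuine build failures visible on the first attempt while still absorbing infrastructure flakiness.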

Labels

ci/build, rocm (Related to AMD ROCm)

Projects

Status: Todo

1 participant