Update pr-sglang-g6-inference.yaml #5

Workflow file for this run

.github/workflows/pr-sglang-g6-inference.yaml at 3421fc3

	name: PR - SGLang G6 Inference

	on:
	workflow_dispatch:
	push:
	paths:
	- ".github/workflows/pr-sglang-g6-inference.yaml"

	env:
	SGLANG_IMAGE: "public.ecr.aws/deep-learning-containers/sglang:0.5.5-gpu-py312"

	jobs:
	sglang-heavy-inference:
	runs-on: g6-2gpu-runner
	steps:
	- name: Checkout
	uses: actions/checkout@v5

	- name: Pull image
	run: docker pull ${{ env.SGLANG_IMAGE }}

	- name: Start container (2 GPUs)
	run: \|
	CONTAINER_ID=$(docker run -d --rm --gpus=all \
	-p 30000:30000 \
	${{ env.SGLANG_IMAGE }} \
	python3 -m sglang.launch_server \
	--model-path Qwen/Qwen2.5-0.5B-Instruct \
	--host 0.0.0.0 --port 30000 \
	--tp 2)
	echo "CONTAINER_ID=${CONTAINER_ID}" >> ${GITHUB_ENV}
	sleep 60

	- name: Verify GPUs
	run: docker exec ${CONTAINER_ID} nvidia-smi

	- name: Test inference
	run: \|
	docker exec ${CONTAINER_ID} curl -X POST http://localhost:30000/generate \
	-H "Content-Type: application/json" \
	-d '{"text": "Hello, how are you?", "sampling_params": {"temperature": 0.7, "max_new_tokens": 50}}'

	- name: Show container logs
	if: always()
	run: docker logs ${CONTAINER_ID} \|\| true

	- name: Cleanup
	if: always()
	run: docker stop ${CONTAINER_ID} \|\| true

	sglang-light-inference:
	runs-on: g6-1gpu-runner
	steps:
	- name: Checkout
	uses: actions/checkout@v5

	- name: Pull image
	run: docker pull ${{ env.SGLANG_IMAGE }}

	- name: Start container (1 GPU)
	run: \|
	CONTAINER_ID=$(docker run -d --rm --gpus=all \
	-p 30000:30000 \
	${{ env.SGLANG_IMAGE }} \
	python3 -m sglang.launch_server \
	--model-path Qwen/Qwen2.5-0.5B-Instruct \
	--host 0.0.0.0 --port 30000)
	echo "CONTAINER_ID=${CONTAINER_ID}" >> ${GITHUB_ENV}
	sleep 45

	- name: Verify GPUs
	run: docker exec ${CONTAINER_ID} nvidia-smi

	- name: Test inference
	run: \|
	docker exec ${CONTAINER_ID} curl -X POST http://localhost:30000/generate \
	-H "Content-Type: application/json" \
	-d '{"text": "Hello, how are you?", "sampling_params": {"temperature": 0.7, "max_new_tokens": 50}}'

	- name: Show container logs
	if: always()
	run: docker logs ${CONTAINER_ID} \|\| true

	- name: Cleanup
	if: always()
	run: docker stop ${CONTAINER_ID} \|\| true

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update pr-sglang-g6-inference.yaml #5

Workflow file

Update pr-sglang-g6-inference.yaml #5

Uh oh!

Workflow file for this run