
Added a vllm quickstart playbook and scripts for curl and gradio web interactive client #47

Merged
danielholanda merged 7 commits into main from vllm-quick-start
Feb 11, 2026
Conversation

@hongxiayang
Collaborator

In this playbook, you will learn how to:

  • Set up and run vLLM with ROCm support using Docker for high-performance LLM inference on AMD GPUs
  • Download and configure language models from Hugging Face for use with vLLM
  • Start and configure a vLLM server with OpenAI-compatible API endpoints on port 8000
  • Test the server using curl commands and API requests
  • Launch and use the Gradio web interface (port 7860) for interactive chat with real-time streaming responses
  • Configure server parameters like GPU memory utilization, model length limits, and multi-GPU support
  • Make API calls to the vLLM server using both streaming and non-streaming requests
  • Troubleshoot common issues with server startup, memory, and client connections
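The API steps above can be sketched as a small client against the server's OpenAI-compatible endpoint on port 8000. This is a minimal illustration, not the playbook's actual scripts: the model name is a placeholder, and the helper names (`build_chat_payload`, `parse_sse_line`, `chat`) are hypothetical.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder; use your downloaded model

def build_chat_payload(prompt, stream=False, model=MODEL):
    """Build an OpenAI-compatible chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

def parse_sse_line(line):
    """Extract the content delta from one server-sent-events line, if any.

    Streaming responses arrive as lines like:
        data: {"choices": [{"delta": {"content": "Hel"}}]}
    terminated by a final 'data: [DONE]' line.
    """
    line = line.strip()
    if not line.startswith("data: ") or line == "data: [DONE]":
        return None
    chunk = json.loads(line[len("data: "):])
    return chunk["choices"][0]["delta"].get("content")

def chat(prompt, stream=False):
    """Send one request to a running vLLM server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_chat_payload(prompt, stream)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        if not stream:
            body = json.loads(resp.read())
            return body["choices"][0]["message"]["content"]
        # Streaming: concatenate the per-chunk deltas as they arrive.
        parts = []
        for raw in resp:
            piece = parse_sse_line(raw.decode())
            if piece:
                parts.append(piece)
        return "".join(parts)

# With the server up: chat("Hello") for a single response,
# chat("Hello", stream=True) to accumulate a streamed one.
```

The same non-streaming request can be issued from the shell with `curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '<payload>'`, which is how the playbook's curl script exercises the server.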

@danielholanda
Collaborator

@hongxiayang Please fix the CI failure by moving the additional scripts you added to an assets folder inside the vllm-inference repo.

@danielholanda
Collaborator

Next steps:

  • Daniel to investigate alternatives on preinstalling vLLM or facilitating installs without Docker.

@hongxiayang
Collaborator Author

> @hongxiayang Please fix the CI failure by moving the additional scripts you added to an assets folder inside the vllm-inference repo.

@danielholanda Done.

@danielholanda self-requested a review on February 11, 2026, 22:41
Collaborator

@danielholanda left a comment


Thanks for your contribution, @hongxiayang. Eddie will now take it over and adapt it to work with vLLM wheels instead of Docker.

danielholanda merged commit f8d5e34 into main on Feb 11, 2026
3 checks passed

2 participants