Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
58 changes: 57 additions & 1 deletion playbooks/supplemental/open-webui-chat/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -757,6 +757,7 @@ This demonstrates that Open WebUI can send multimodal requests (text + image) th

---

<!-- @os:windows -->
### Activity 3: Generate an Image from a Text Prompt (Stable Diffusion)

Stable Diffusion models don't support text generation, they only generate images through the Images API.
Expand All @@ -771,7 +772,60 @@ Stable Diffusion models don't support text generation, they only generate images
- **OpenAI API Base URL:** `http://localhost:13305/api/v1`
- **OpenAI API Key:** `-`
- **Model:** `SDXL-Turbo` or `SDXL-Base-1.0`
4. If you want to add more parameters, add them to the text field as JSON. For example: `{ "steps": 4, "cfg_scale": 1 }`. See available parameters at [Image Generation (Stable Diffusion CPP)](https://lemonade-server.ai/models.html).
4. If you want to add more parameters, add them to the text field as JSON. For example: `{ “steps”: 4, “cfg_scale”: 1 }`. See available parameters at [Image Generation (Stable Diffusion CPP)](https://lemonade-server.ai/models.html).

<p align=”center”>
<img src=”assets/images_settings.png” alt=”Open WebUI Image Generation settings” width=”600”/>
</p>

5. Save


#### Step 2: Allow Image Generation for the model
This step ensures that you enable Image Generation as a capability for your model.
1. Go to **Admin Settings → Models** (http://localhost:8080/admin/settings/models) and choose your model
2. Turn on `Image Generation`

<p align="center">
<img src="assets/model_settings.png" alt="Model Settings" width="45%"/>
<img src="assets/edit_model.png" alt="Edit Model" width="50%"/>
</p>

#### Step 3: Generate an image from the chat screen

1. Go back to chat at `http://localhost:8080`.
2. Select a **Text Generation LLM** in the model dropdown (example: Qwen, Llama). **Do not select a Stable Diffusion model** as this is a chat model selector.
3. In the message area, click on **Integrations**, and toggle **Image** ON.
4. Use a prompt like: `A cinematic photo of heavy traffic at sunset, ultra detailed`.
5. An image is generated and appears in the chat.

<p align="center">
<img src="assets/image_gen_prompt.png" alt="Image Generation" width="49%"/>
<img src="assets/image_gen_response.png" alt="Generated image response" width="32.5%"/>
</p>

This establishes that Open WebUI can coordinate a “two-part” workflow:
- The LLM helps refine the prompt
- The image is generated via Lemonade’s Images endpoint using Stable Diffusion
<!-- @os:end -->

<!-- @os:linux -->
<!-- @device:halo,stx,krk,rx7900xt,rx9070xt -->
Comment thread
adamlam2-amd marked this conversation as resolved.
### Activity 3: Generate an Image from a Text Prompt (Stable Diffusion)

Stable Diffusion models don’t support text generation, they only generate images through the Images API.

#### Step 1: Configure Image Generation in Open WebUI

1. In the Lemonade GUI (`http://localhost:13305`), search for `SDXL-Turbo` (fast) or `SDXL-Base-1.0` (higher quality) and download it.
2. Go to **Admin Settings → Images** (http://localhost:8080/admin/settings/images)
3. Set:
- **Image Generation:** ON
- **Image Generation Engine:** Default (OpenAI)
- **OpenAI API Base URL:** `http://localhost:13305/api/v1`
- **OpenAI API Key:** `-`
- **Model:** `SDXL-Turbo` or `SDXL-Base-1.0`
4. If you want to add more parameters, add them to the text field as JSON. For example: `{ “steps”: 4, “cfg_scale”: 1 }`. See available parameters at [Image Generation (Stable Diffusion CPP)](https://lemonade-server.ai/models.html).

<p align="center">
<img src="assets/images_settings.png" alt="Open WebUI Image Generation settings" width="600"/>
Expand Down Expand Up @@ -806,6 +860,8 @@ This step ensures that you enable Image Generation as a capability for your mode
This establishes that Open WebUI can coordinate a “two-part” workflow:
- The LLM helps refine the prompt
- The image is generated via Lemonade’s Images endpoint using Stable Diffusion
<!-- @device:end -->
<!-- @os:end -->

---

Expand Down
22 changes: 10 additions & 12 deletions playbooks/supplemental/vllm-inference/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,13 +44,13 @@ There is no host-side vLLM installation step. Start vLLM with:
vllm-launch
```

The launcher starts the container, targets the integrated GPU, and exposes a local OpenAI-compatible vLLM server.
The launcher starts the container, targets the integrated GPU, and exposes a local OpenAI-compatible vLLM server. Alternatively, click the vLLM icon in the taskbar.

## Quick Start

### 1. Confirm the vLLM Server Is Running

After `vllm-launch` starts, the server is available at `http://localhost:8001`. Keep the launch terminal open because the server runs in the foreground, then open a separate terminal for the remaining steps. The examples below use `Qwen/Qwen3-1.7B`; if your launcher is configured for a different model, substitute that model ID in the requests.
The `vllm-launch` may take a couple minutes to initialize everything. Once it starts, the server is available at `http://localhost:8001`. Keep the launch terminal open because the server runs in the foreground, then open a separate terminal for the remaining steps. The examples below use `Qwen/Qwen3-1.7B`; if your launcher is configured for a different model, substitute that model ID in the requests.

### 2. Send a Prompt

Expand Down Expand Up @@ -113,14 +113,6 @@ Make sure the server is running:
curl http://localhost:8001/health
```

## Requirements

### For vLLM Server
- Linux
- `vllm-launch` container launcher
- AMD system with a supported integrated GPU
- Sufficient memory for the selected model

## Summary

In this playbook, you learned how to:
Expand All @@ -133,7 +125,13 @@ In this playbook, you learned how to:

You now have a containerized vLLM deployment for serving large language models with optimized performance on the integrated GPU.

## Next Steps

- **Try different models** — Swap the model in the `vllm-launch` configuration to experiment with different LLMs and compare performance.
- **Build an application** — Use the OpenAI-compatible API to integrate vLLM into a Python app, chatbot, or automation workflow.
- **Fine-tune and serve** — Fine-tune a model using LoRA or QLoRA, then deploy it with vLLM for optimized inference.

## Additional Resources

- **[vLLM Official Documentation](https://docs.vllm.ai/)** - Comprehensive guides and API references
- **[vLLM GitHub Repository](https://github.com/vllm-project/vllm)** - Source code, issues, and community discussions
- **[vLLM Official Documentation](https://docs.vllm.ai/)** Comprehensive guides and API references
- **[vLLM GitHub Repository](https://github.com/vllm-project/vllm)** Source code, issues, and community discussions
2 changes: 1 addition & 1 deletion playbooks/supplemental/vllm-inference/playbook.json
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@
},
"difficulty": "beginner",
"isNew": false,
"isFeatured": true,
"isFeatured": false,
"developed": true,
"published": true,
"tags": [
Expand Down
4 changes: 2 additions & 2 deletions website/src/components/PlaybooksSection.tsx
Original file line number Diff line number Diff line change
Expand Up @@ -258,7 +258,7 @@ export default function PlaybooksSection({ activeDevice, selectedDevice, onSelec
New
</span>
)}
{isHaloSelected && featuredPlaybook.category === "core" && featuredPlaybook.id !== "lmstudio-rocm-llms" && (
{isHaloSelected && featuredPlaybook.category === "core" && (
<PreinstalledBadge />
)}
<DifficultyBadge difficulty={featuredPlaybook.difficulty} />
Expand Down Expand Up @@ -322,7 +322,7 @@ export default function PlaybooksSection({ activeDevice, selectedDevice, onSelec
</svg>
</div>
<div className="flex items-center gap-1.5">
{isHaloSelected && playbook.category === "core" && playbook.id !== "lmstudio-rocm-llms" && (
{isHaloSelected && playbook.category === "core" && !(platformFilter === "linux" && (playbook.id === "lmstudio-rocm-llms" || playbook.id === "vscode-qwen3-coder")) && (
<PreinstalledBadge />
)}
{playbook.isNew && (
Expand Down
Loading