Commit 895ddbb

Show prompt in info panel and clarify CLI vs Agent distinction
Info panel improvements:
- Now shows the actual prompt being used for generation
- Format: "Prompt: cute robot"
- Truncates long prompts (>80 chars) with "..."
- Helps users understand what SD is receiving

Documentation clarity:
- Added "CLI vs Agent Approach" section to user guide
- Explains CLI sends prompts directly (no enhancement)
- Explains agents use LLM to enhance prompts automatically
- Makes it clear this playbook teaches prompt-enhancing agents
- Updated playbook intro to emphasize LLM-powered enhancement

Examples of distinction:

CLI (gaia sd):
  User types: "a cat"
  SD receives: "a cat" (exactly as typed)
  Result: Basic image

Agent (from playbook):
  User says: "a cat"
  LLM enhances: "fluffy orange cat, soft lighting, detailed fur, photorealistic, 4k"
  SD receives: Enhanced prompt
  Result: Professional-quality image

This clarification helps users choose the right approach:
- CLI: Fast, direct control, write your own detailed prompts
- Agent: LLM enhancement, better results from simple inputs
1 parent 5beb5c2 commit 895ddbb

3 files changed

Lines changed: 46 additions & 14 deletions

docs/guides/sd.mdx

Lines changed: 21 additions & 1 deletion
@@ -9,6 +9,10 @@ description: "Generate images from text using the gaia sd command"
 
 The `gaia sd` command generates images from text descriptions using Stable Diffusion models running locally on Lemonade Server.
 
+<Note>
+**Direct generation (no LLM):** This CLI command sends your prompt directly to Stable Diffusion without enhancement. For **LLM-powered prompt enhancement**, see the [Image Generation Agent Playbook](/playbooks/sd-agent/index) which teaches you to build an agent that automatically improves prompts.
+</Note>
+
 ---
 
 ## Quick Start
@@ -37,15 +41,31 @@ The `gaia sd` command generates images from text descriptions using Stable Diffu
 
 ---
 
+## CLI vs Agent Approach
+
+**This CLI command (`gaia sd`):**
+- Sends your prompt **directly** to Stable Diffusion (no enhancement)
+- Fast and simple - just generates what you ask for
+- You write the detailed prompt yourself
+
+**Agent approach** ([see playbook](/playbooks/sd-agent/index)):
+- Uses an **LLM to enhance** your simple prompts automatically
+- "a cat" → "fluffy orange cat, soft lighting, detailed fur, photorealistic, 4k"
+- Better results from simple descriptions, but requires agent setup
+
+---
+
 ## Basic Usage
 
 Generate a single image:
 ```bash
 gaia sd "your prompt here"
 ```
 
+**Tip:** For best results with CLI, write detailed prompts. The command sends your prompt directly to SD without enhancement.
+
 The command will:
-1. Show generation settings (model, size, estimated time)
+1. Show generation settings (prompt, model, size, estimated time)
 2. Display a progress spinner with elapsed timer
 3. Show the image preview in terminal
 4. Prompt to open in default image viewer

docs/playbooks/sd-agent/index.mdx

Lines changed: 21 additions & 12 deletions
@@ -11,8 +11,8 @@ icon: "image"
 <Badge text="development" color="orange" />
 
 **Time to complete:** 20-30 minutes
-**What you'll build:** An interactive agent that generates images from natural language descriptions
-**What you'll learn:** SDToolsMixin pattern, Lemonade Server integration, tool registration
+**What you'll build:** An LLM-powered agent that enhances prompts and generates images
+**What you'll learn:** SDToolsMixin pattern, prompt enhancement with LLMs, tool registration
 **Platform:** Runs locally on AI PCs with Ryzen AI
 
 ---
@@ -23,25 +23,34 @@ icon: "image"
 **Privacy-First AI:** This agent runs entirely on your AI PC. Images are generated locally using Ryzen AI—prompts and images never leave your machine.
 </Info>
 
-When you need to generate images for presentations, prototypes, or creative projects, you typically use cloud services that require API keys, cost money per generation, and send your prompts to external servers.
+When you need to generate images for presentations, prototypes, or creative projects, you typically use cloud services that require API keys, cost money per generation, and send your prompts to external servers. Additionally, getting good results requires expertise in "prompt engineering"—knowing the right keywords, styles, and techniques.
 
-This agent solves that by:
+This agent solves both problems by:
 
-1. Running Stable Diffusion locally on Ryzen AI hardware
-2. Generating images from natural language descriptions
-3. Enhancing prompts automatically for better results
-4. Saving images to your local filesystem
-5. Operating completely offline with full privacy
+1. **LLM-powered prompt enhancement** - Automatically improves simple prompts ("a cat" → "fluffy orange cat, soft lighting, detailed fur, photorealistic, 4k")
+2. **Local SD generation** - Runs Stable Diffusion on Ryzen AI hardware
+3. **Natural language interface** - Just describe what you want in plain English
+4. **Automatic best practices** - Adds style, lighting, quality keywords
+5. **Complete privacy** - Everything runs offline on your machine
+
+<Note>
+**This is different from `gaia sd` CLI:** The CLI command sends your prompt directly to Stable Diffusion. This **agent-based approach** uses an LLM to enhance your prompt first, resulting in significantly better images from simple descriptions.
+</Note>
 
 **What you're building:**
 
-An image generation agent that combines:
-- **Agent reasoning** - LLM-based prompt enhancement
+A **prompt-enhancing image generation agent** that combines:
+- **LLM reasoning** - Analyzes user intent and enhances prompts with best practices
 - **SDToolsMixin** - Stable Diffusion image generation tools
 - **Lemonade Server** - Local SD inference on Ryzen AI
-- **Interactive CLI** - Natural language interface
+- **Natural language** - User describes in plain English, agent adds technical details
 - **Session tracking** - History of generated images
 
+**Example:**
+- **User says:** "a robot"
+- **Agent enhances:** "futuristic robot assistant, metallic chrome finish, studio lighting, sci-fi, detailed, 4k"
+- **Result:** Professional-quality image from simple input
+
 ---
 
 ## The Architecture (What You're Building)
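
Reviewer note: the distinction this commit documents (CLI passes the prompt through unchanged, agent rewrites it first) can be illustrated with a minimal sketch. Nothing here is GAIA code. `stub_enhance`, `prompt_sent_to_sd`, and `QUALITY_KEYWORDS` are hypothetical names, and the fixed keyword list only stands in for the LLM step, which in the real agent would produce a richer, context-aware rewrite.

```python
from typing import Callable, Optional

# Hypothetical keyword list (assumption): a real agent would ask an LLM instead.
QUALITY_KEYWORDS = ["soft lighting", "detailed", "photorealistic", "4k"]


def stub_enhance(prompt: str) -> str:
    """Stand-in for the LLM step: append fixed keywords not already present."""
    extras = [kw for kw in QUALITY_KEYWORDS if kw not in prompt.lower()]
    return ", ".join([prompt.strip(), *extras])


def prompt_sent_to_sd(
    user_prompt: str,
    enhancer: Optional[Callable[[str], str]] = None,
) -> str:
    """Return the text that would reach Stable Diffusion.

    enhancer=None models the CLI path (prompt passed through unchanged);
    passing a callable models the agent path.
    """
    if enhancer is None:
        return user_prompt
    return enhancer(user_prompt)


print(prompt_sent_to_sd("a cat"))                # CLI path
print(prompt_sent_to_sd("a cat", stub_enhance))  # agent path (stubbed)
```

Making the enhancer an optional callable keeps both paths behind one function, so the only difference between them is whether a rewrite step runs before generation.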

src/gaia/agents/sd/mixin.py

Lines changed: 4 additions & 1 deletion
@@ -297,8 +297,11 @@ def _generate_image(
 
         # Show generation info to user
         if console and hasattr(console, "print_info"):
+            # Truncate very long prompts for display
+            display_prompt = prompt if len(prompt) <= 80 else prompt[:77] + "..."
             console.print_info(
-                f"Generating {size} image with {model}\n"
+                f"Prompt: {display_prompt}\n"
+                f"Model: {model} • Size: {size}\n"
                 f"Settings: {steps} steps, CFG {cfg_scale}\n"
                 f"Estimated time: {self._estimate_generation_time(model, size)}"
             )
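
Reviewer note: the truncation rule in this hunk keeps prompts of up to 80 characters intact and cuts longer ones to 77 characters plus "...", so the displayed string never exceeds 80 characters. The standalone sketch below reproduces that rule and the new four-line panel layout for quick verification. The helper names `truncate_for_display` and `format_info`, and all sample values (model name, size, steps, CFG, estimate), are made up for illustration and do not appear in the repository.

```python
def truncate_for_display(prompt: str, max_len: int = 80) -> str:
    """Return the prompt unchanged if it fits, else cut it so the result
    is exactly max_len characters including the trailing '...'."""
    if len(prompt) <= max_len:
        return prompt
    return prompt[: max_len - 3] + "..."


def format_info(prompt: str, model: str, size: str,
                steps: int, cfg_scale: float, eta: str) -> str:
    """Build the info text in the same layout as the updated f-string."""
    return (
        f"Prompt: {truncate_for_display(prompt)}\n"
        f"Model: {model} • Size: {size}\n"
        f"Settings: {steps} steps, CFG {cfg_scale}\n"
        f"Estimated time: {eta}"
    )


# Sample values below are placeholders, not real GAIA defaults.
print(format_info("cute robot", "example-model", "512x512", 4, 1.0, "~10s"))
print(len(truncate_for_display("x" * 200)))
```

With the default `max_len` of 80, the slice `prompt[:77]` matches the expression in the diff, so an 80-character prompt is shown whole and an 81-character prompt is shortened.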
