Simplified vGPU RAG System

Overview

The vGPU RAG system has been simplified to run on a single, pre-loaded knowledge base:

NO collection selection - just one pre-loaded knowledge base
NO manual uploads - all PDFs in the vgpu_docs folder are ingested automatically
NO configuration - beyond setting an NGC API key, everything works out of the box

How It Works

Single Knowledge Base: All vGPU documentation is loaded into one collection called vgpu_knowledge_base
Automatic Loading: On startup, the bootstrap step creates the collection and ingests ALL PDFs from the vgpu_docs folder
Enhanced Validation: The system still validates vGPU profiles (profile checking, capacity calculations) before making recommendations

Setup Instructions

1. Add Your PDFs

Place ALL your NVIDIA vGPU documentation PDFs in the vgpu_docs folder:

vgpu_docs/
├── nvidia_vgpu_software_user_guide.pdf
├── vgpu_profile_specifications.pdf
├── a40_datasheet.pdf
├── l40s_specifications.pdf
├── esxi_vgpu_deployment_guide.pdf
└── [any other vGPU PDFs]
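
Before starting the system, it can help to confirm that the folder really contains readable PDFs. The following is a minimal sketch, not part of the project: check_pdfs is a hypothetical helper that relies only on the fact that a PDF file begins with the bytes %PDF-.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the project): count the files in a
# docs folder that start with the PDF magic bytes "%PDF-".
check_pdfs() {
  local dir="${1:-vgpu_docs}"
  local ok=0 bad=0 f
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue              # the glob matched nothing
    if [ "$(head -c 5 "$f")" = "%PDF-" ]; then
      ok=$((ok + 1))
    else
      echo "not a PDF: $f" >&2
      bad=$((bad + 1))
    fi
  done
  echo "$ok valid, $bad invalid"
  # Succeed only if at least one PDF was found and none were invalid.
  [ "$ok" -gt 0 ] && [ "$bad" -eq 0 ]
}
```

Running check_pdfs vgpu_docs from the repository root prints a one-line summary and returns a non-zero status if the folder is empty or contains a file that is not a PDF.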

2. Start the System

# Set your NGC API key
export NGC_API_KEY="nvapi-your-key-here"

# Start everything (cloud mode)
./scripts/start_vgpu_rag.sh --skip-nims
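
A missing or placeholder key is an easy mistake to make at this step. The sketch below shows one way to catch it before launching. check_ngc_key is a hypothetical helper, and the nvapi- prefix check is an assumption based only on the example key above; relax it if your key uses a different format.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of the project).
# Assumption: keys look like the example above, i.e. they start with "nvapi-".
check_ngc_key() {
  # Use the first argument if one is given, otherwise the environment variable.
  local key="${1-${NGC_API_KEY-}}"
  if [ -z "$key" ]; then
    echo "NGC_API_KEY is not set" >&2
    return 1
  fi
  case "$key" in
    nvapi-your-key-here)
      echo "NGC_API_KEY is still the placeholder value" >&2
      return 1 ;;
    nvapi-?*)
      return 0 ;;
    *)
      echo "NGC_API_KEY does not start with nvapi-" >&2
      return 1 ;;
  esac
}
```

A possible use is to chain it with the start script: check_ngc_key && ./scripts/start_vgpu_rag.sh --skip-nims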

3. Use the System

Open http://localhost:8090 and start asking questions!

Examples:

"What vGPU profiles are available for A40 GPUs?"
"I have 4x L40S GPUs, how many VMs can I run?"
"Compare vGPU vs passthrough for AI workloads"

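The capacity question in the examples reduces to simple arithmetic, which is useful as a sanity check on the system's answer. The sketch below is a rough upper bound only, assuming every VM gets the same vGPU profile and that a vGPU cannot span physical GPUs. The 48 GB frame buffer per L40S and the 8 GB profile size are illustrative assumptions, not values taken from this document, and max_vms is a hypothetical helper rather than the system's validation logic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: upper bound on VM count when each VM gets one vGPU
# of a fixed frame-buffer size and vGPUs do not span physical GPUs.
#   $1 = number of physical GPUs
#   $2 = frame buffer per GPU in GB      (assumed value, e.g. 48)
#   $3 = frame buffer per vGPU profile   (assumed value, e.g. 8)
max_vms() {
  local gpus="$1" gpu_fb_gb="$2" profile_fb_gb="$3"
  if [ "$profile_fb_gb" -le 0 ]; then
    echo "profile size must be positive" >&2
    return 1
  fi
  # Integer division per GPU first: leftover memory on one GPU cannot be
  # combined with leftover memory on another.
  echo $(( gpus * (gpu_fb_gb / profile_fb_gb) ))
}
```

Under these assumptions, max_vms 4 48 8 prints 24 (six 8 GB vGPUs on each of four GPUs). Real limits also depend on the profile type and software release, so treat the system's validated answer as the authority.
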
What Changed?

Before (Complex)

Multiple collections to manage
Manual collection selection in UI
Complex document organization
Users had to know which collection to search

After (Simple)

One collection with everything
No UI clutter
Drop PDFs in one folder
System automatically finds relevant content

Benefits

Easier Setup: Just drop PDFs and go
Better User Experience: No confusion about collections
Same Intelligence: Still validates profiles and provides enhanced recommendations
Faster Onboarding: New users can start immediately

Technical Details

Collection Name: vgpu_knowledge_base
Bootstrap automatically creates the collection and ingests all PDFs from vgpu_docs
Enhanced validation still works (profile checking, capacity calculations)
Every query searches the whole collection, so all ingested PDFs are in scope

Troubleshooting

PDFs Not Loading?

# Check if PDFs exist
ls -la vgpu_docs/*.pdf

# Re-run bootstrap
docker compose -f deploy/compose/docker-compose-bootstrap.yaml up

Want to Update PDFs?

Add/remove PDFs in vgpu_docs
Re-run bootstrap to update the knowledge base
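
Because bootstrap has to be re-run by hand, a small change detector can tell you when it is needed. The sketch below is one possible approach, not something the project provides: docs_fingerprint, docs_changed, record_docs_fingerprint and the stamp file name .vgpu_docs.sha256 are all hypothetical, and the script assumes a Linux host with sha256sum available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the project). Assumes GNU coreutils.

# Print one hash that summarizes the names and contents of all PDFs in a folder.
docs_fingerprint() {
  local dir="${1:-vgpu_docs}"
  ( cd "$dir" &&
    find . -maxdepth 1 -type f -name '*.pdf' -exec sha256sum {} + |
      LC_ALL=C sort -k 2 | sha256sum | cut -d ' ' -f 1 )
}

# Return 0 if the folder differs from the recorded fingerprint (or none is
# recorded yet), 1 if it is unchanged, 2 if the folder cannot be read.
docs_changed() {
  local dir="${1:-vgpu_docs}" stamp="${2:-.vgpu_docs.sha256}" current
  current="$(docs_fingerprint "$dir")" || return 2
  if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
    return 1
  fi
  return 0
}

# Record the current fingerprint; call this only after bootstrap succeeds.
record_docs_fingerprint() {
  local dir="${1:-vgpu_docs}" stamp="${2:-.vgpu_docs.sha256}"
  docs_fingerprint "$dir" > "$stamp"
}
```

A possible workflow: if docs_changed reports a change, re-run the bootstrap command from the "PDFs Not Loading?" section, then call record_docs_fingerprint so the next check starts from the new state.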

Need to Reset?

# Stop everything
./scripts/stop_vgpu_rag.sh

# Remove the Milvus data volume (careful - this deletes all ingested data!)
# Run `docker volume ls` first if you are unsure of the volume name
docker volume rm nvidia-rag_milvus-data

# Start fresh
./scripts/start_vgpu_rag.sh --skip-nims
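
Since the reset is destructive, it may be worth putting a confirmation prompt in front of it. The following is a minimal sketch; confirm_reset is a hypothetical helper, and the default volume name is simply the one used in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical guard (not part of the project): ask for an explicit "yes"
# before a destructive reset. Returns 0 only if the user typed exactly "yes".
confirm_reset() {
  local volume="${1:-nvidia-rag_milvus-data}" answer
  printf 'Delete Docker volume %s and all ingested data? Type "yes" to continue: ' \
    "$volume" >&2
  read -r answer || return 1           # no input (e.g. closed stdin): refuse
  [ "$answer" = "yes" ]
}
```

It can then gate the reset sequence, for example: confirm_reset && ./scripts/stop_vgpu_rag.sh && docker volume rm nvidia-rag_milvus-data && ./scripts/start_vgpu_rag.sh --skip-nims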
