The vGPU RAG system has been simplified to provide a seamless experience:
- NO collection selection - just one pre-loaded knowledge base
- NO manual uploads - all PDFs are automatically ingested
- NO configuration - everything works out of the box
- Single Knowledge Base: All vGPU documentation is loaded into one collection called vgpu_knowledge_base
- Automatic Loading: When the system starts, it automatically ingests ALL PDFs from the vgpu_docs folder
- Enhanced Validation: The system still validates vGPU profiles and provides accurate recommendations
Place ALL your NVIDIA vGPU documentation PDFs in the vgpu_docs folder:
vgpu_docs/
├── nvidia_vgpu_software_user_guide.pdf
├── vgpu_profile_specifications.pdf
├── a40_datasheet.pdf
├── l40s_specifications.pdf
├── esxi_vgpu_deployment_guide.pdf
└── [any other vGPU PDFs]
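One way to populate the folder is a small helper that copies every PDF from wherever the documents were downloaded. This is a minimal sketch, not part of the repository scripts: the function name `stage_pdfs` and the source directory in the usage note are assumptions for illustration.

```shell
# Hypothetical helper (not part of the repository scripts):
# copy every *.pdf from a source directory into the docs folder.
# Usage: stage_pdfs SOURCE_DIR [DEST_DIR]
stage_pdfs() (
    src=$1
    dest=${2:-vgpu_docs}
    mkdir -p "$dest" || exit 1
    copied=0
    for f in "$src"/*.pdf; do
        # When nothing matches, the glob stays literal, so skip non-files.
        [ -f "$f" ] || continue
        cp "$f" "$dest"/ && copied=$((copied + 1))
    done
    # Report how many files were copied.
    echo "$copied"
    # Fail when nothing was copied, so callers can stop early.
    [ "$copied" -gt 0 ]
)
```

For example, `stage_pdfs ~/Downloads/vgpu` (an example path) would copy the PDFs found there into vgpu_docs and print how many were copied.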
# Set your NGC API key
export NGC_API_KEY="nvapi-your-key-here"
# Start everything (cloud mode)
./scripts/start_vgpu_rag.sh --skip-nims
Open http://localhost:8090 and start asking questions!
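The start-up steps can be wrapped with two small checks: one before launching (is the API key set?) and one after (does the UI answer yet?). This is a sketch under stated assumptions: both function names are hypothetical and not part of the repository scripts, and the `nvapi-` prefix test only mirrors the example key shown above, so relax it if your key has a different form.

```shell
# Hypothetical pre-flight and readiness helpers (not part of the repo scripts).

# Succeeds only when NGC_API_KEY is set and starts with "nvapi-",
# the prefix shown in the example key (an assumption about key format).
check_ngc_key() {
    case "${NGC_API_KEY:-}" in
        nvapi-?*) return 0 ;;
        *) echo "NGC_API_KEY is missing or does not start with nvapi-" >&2
           return 1 ;;
    esac
}

# Polls a URL until it answers or the attempts run out.
# Usage: wait_for_ui URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ui() (
    url=$1
    attempts=${2:-30}
    delay=${3:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -s: silent, -f: fail on HTTP errors, -o: discard the body
        if curl -sf -o /dev/null --max-time 5 "$url"; then
            echo "UI is reachable at $url"
            exit 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "UI did not respond at $url" >&2
    exit 1
)
```

A typical sequence would be `check_ngc_key && ./scripts/start_vgpu_rag.sh --skip-nims && wait_for_ui http://localhost:8090`.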
Examples:
- "What vGPU profiles are available for A40 GPUs?"
- "I have 4x L40S GPUs, how many VMs can I run?"
- "Compare vGPU vs passthrough for AI workloads"
Before:
- Multiple collections to manage
- Manual collection selection in UI
- Complex document organization
- Users had to know which collection to search
After:
- One collection with everything
- No UI clutter
- Drop PDFs in one folder
- System automatically finds relevant content
- Easier Setup: Just drop PDFs and go
- Better User Experience: No confusion about collections
- Same Intelligence: Still validates profiles and provides enhanced recommendations
- Faster Onboarding: New users can start immediately
- Collection Name: vgpu_knowledge_base
- Bootstrap automatically creates collection and ingests all PDFs
- Enhanced validation still works (profile checking, capacity calculations)
- All PDFs are searched for every query
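For a sense of what a capacity calculation involves, the simplest case is integer division of GPU framebuffer by profile framebuffer. The sketch below is illustrative only and is not the system's validation logic: `max_vms` is a hypothetical helper, it assumes every VM on a GPU uses the same profile, and it ignores other limits (per-GPU vGPU caps, licensing, host CPU and RAM), so the result is an upper bound. The 48 GB figure is the L40S memory size; the 8 GB profile is an example choice.

```shell
# Hypothetical helper: upper bound on VMs when all VMs share one profile.
# Usage: max_vms GPU_COUNT GPU_FRAMEBUFFER_GB PROFILE_FRAMEBUFFER_GB
max_vms() (
    gpus=$1
    gpu_fb=$2
    profile_fb=$3
    if [ "$profile_fb" -le 0 ] || [ "$profile_fb" -gt "$gpu_fb" ]; then
        echo "profile framebuffer must be between 1 and $gpu_fb GB" >&2
        exit 1
    fi
    # Integer division: leftover framebuffer cannot host a partial vGPU.
    per_gpu=$((gpu_fb / profile_fb))
    echo $((gpus * per_gpu))
)

# 4x L40S (48 GB each) with an 8 GB profile: 6 per GPU
max_vms 4 48 8    # prints 24
```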
# Check if PDFs exist
ls -la vgpu_docs/*.pdf
# Re-run bootstrap
docker compose -f deploy/compose/docker-compose-bootstrap.yaml up
- Add/remove PDFs in vgpu_docs
- Re-run bootstrap to update the knowledge base
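The check and the re-run can be combined so that bootstrap is never started against an empty folder. This is a minimal sketch: `count_pdfs` and `refresh_knowledge_base` are hypothetical names, not part of the repository scripts, and the compose file path is the one used above.

```shell
# Hypothetical wrapper (not part of the repo scripts): refuse to re-run
# bootstrap when the docs folder holds no PDFs.

# Prints the number of *.pdf files in a directory (default: vgpu_docs).
count_pdfs() (
    dir=${1:-vgpu_docs}
    n=0
    for f in "$dir"/*.pdf; do
        # An unmatched glob stays literal, so only count real files.
        [ -f "$f" ] && n=$((n + 1))
    done
    echo "$n"
)

refresh_knowledge_base() {
    docs_dir=${1:-vgpu_docs}
    pdf_total=$(count_pdfs "$docs_dir")
    if [ "$pdf_total" -eq 0 ]; then
        echo "No PDFs found in $docs_dir; nothing to ingest" >&2
        return 1
    fi
    echo "Re-running bootstrap for $pdf_total PDF(s)"
    docker compose -f deploy/compose/docker-compose-bootstrap.yaml up
}
```

Run `refresh_knowledge_base` from the repository root after adding or removing PDFs.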
# Stop everything
./scripts/stop_vgpu_rag.sh
# Remove volumes (careful - this deletes all data!)
docker volume rm nvidia-rag_milvus-data
# Start fresh
./scripts/start_vgpu_rag.sh --skip-nims
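Because removing the volume is irreversible, it may be worth putting the reset behind an explicit confirmation. The following is a sketch only: `confirm` and `reset_vgpu_rag` are hypothetical helpers, not part of the repository scripts, and they simply chain the three commands above.

```shell
# Hypothetical guard (not part of the repo scripts): ask before deleting data.

# Returns success only when the user types exactly "yes".
confirm() {
    printf '%s [type "yes" to continue]: ' "$1" >&2
    read -r answer || return 1
    [ "$answer" = "yes" ]
}

reset_vgpu_rag() {
    confirm "This deletes all ingested data in nvidia-rag_milvus-data." || {
        echo "Aborted; nothing was removed" >&2
        return 1
    }
    # Each step runs only if the previous one succeeded.
    ./scripts/stop_vgpu_rag.sh &&
        docker volume rm nvidia-rag_milvus-data &&
        ./scripts/start_vgpu_rag.sh --skip-nims
}
```

Calling `reset_vgpu_rag` from the repository root prompts once and stops without touching anything unless the answer is "yes".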