Skip to content

NVIDIA/workbench-example-agentic-rag

Repository files navigation

Agentic RAG - Web Search with Accuracy and Hallucination Controls

Boost RAG with an Agentic Layer

  • Route: Checks the RAG context for relevance to the query and adds live web search if the context is thin
  • Evaluate: Checks responses for relevance and accuracy, flags hallucinations
  • Iterate: Goes through multiple evaluation and generation cycles

Modify Agentic RAG

  • Edit Prompts: Customize results through your own prompts
  • Change Parameters: Adjust agent behavior through parameters and runtime variables
  • Look and Feel: Change the agent and UI by editing the code yourself

Inference Your Way

  • Free Endpoints: use free endpoints on build.nvidia.com
  • Self-Hosted: Point to Ollama or NIM on your own GPUs

Get Started

This README has three modes:

  • Easy Mode: Use the application
  • Intermediate Mode: Modify the application
  • Advanced Mode: Self-host gpus for inference

Prerequisites - AI Workbench and an Internet Connection

You can run Agentic RAG without Workbench, but this README requires NVIDIA AI Workbench installed. See how to install it here.

You need internet because Agentic RAG uses an NVIDIA endpoint for document embedding.

Easy Mode (< 5 minutes if Workbench installed)

  1. Get NVIDIA and Tavily API keys:
  2. Clone this repo with AI Workbench > configure the keys when prompted.
  3. Click Open Chat > Go to the Document tab in the web app > Click Add to Context.
  4. Type in your question > Hit enter - answers come from free cloud endpoints.

Details for the README Modes

Click to Expand Easy Mode Agentic RAG Web App Screenshot

Clone Project > Start Chat > Create Context > Ask Questions

Steps What can go wrong Screen shot
1. Open the Desktop App > Select local. Probably a Docker Desktop issue (if selected on install). Fix: See troubleshooting here

Desktop App Icon

2. Click Clone Project > Paste repository URL > Clone Incorrect URL. Fix: use the correct URL. Clone Button
3. Click Resolve Now > Enter NVIDIA and Tavily API keys. You don't see the banner. Fix: go to Project Container > Variables > Configure for API keys. See docs here Resolve Now Warning
4. Click Open Chat. Very little can go wrong here Open Chat Button
5. Click Documents > Create Context. Incorrect API key. Fix per Step 3 above. Add to Context Button
6. Type question > Hit enter. Incorrect API key. Fix per Step 3 above. Chat Text

Clear Context > Change URLs > Create Context > Ask Questions

Use these steps when you want to work with your own documents and your own prompts.

Steps What can go wrong Screen shot
1. Click Documents > Clear Context. Very little. Vector DB reset.
2. Delete the URLs > Add your own > Click Add to Context. URLs that can't be resolved. Fix: Enter appropriate URLs New context.
3. Type question > Hit enter. Incorrect API key. Fix: Fix per Step 3 in table above. Triggers the agent.
Click to Expand Intermediate Mode

Intermediate Mode

Diagram of Agentic Framework

This application is a quick prototype and not a robust piece of software. So there are many opportunities to improve it.

  1. Fork this project to your own GitHub account. Then clone it in Workbench
  2. Add VS Code to the project
  3. Create an experiment branch to protect main
  4. Open VS Code from the Desktop App and edit the application code
    • Change recursion limit, number of web sites returned by Tavily, whether previous searches are saved
    • Add new endpoints from build.nvidia.com
    • Change the look and feel of the Gradio app or add new features
    • Modify the agent
    • Fix any bugs you find
Click to Expand Advanced Mode

Advanced Mode

Use these details if you want to modify the application, e.g. by configuring prompts, adding your own endpoints, changing the Gradio app or whatever else occurs to you.

  1. Set up a Linux box with an NVIDIA GPU and Docker.
  2. Deploy an Ollama container or an NVIDIA NIM on that host.
  3. Configure the chat app to use the self-hosted endpoint.

License

This NVIDIA AI Workbench example project is under the Apache 2.0 License

This project may utilize additional third-party open source software projects. Review the license terms of these open source projects before use. Third party components used as part of this project are subject to their separate legal notices or terms that accompany the components. You are responsible for confirming compliance with third-party component license terms and requirements.

❓ Have Questions?
Please direct any issues, fixes, suggestions, and discussion on this project to the DevZone Members Only Forum thread here

Other Resources

⬇️ Download AI Workbench | 📖 User Guide |📂 Other Projects | 🚨 User Forum

About

An NVIDIA AI Workbench example project for an Agentic Retrieval Augmented Generation (RAG)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •