- Overview
- Features
- Setup Instructions
- Environment Setup
- Getting API Access
- Set the API Key in Environment (Optional)
- Run the Application
This demo application shows how to integrate a Large Language Model (LLM) with Retrieval-Augmented Generation (RAG), using ChromaDB as the vector store. Users can query documents in natural language and receive contextually enriched responses.
- Modular RAG pipeline with pluggable components.
- ChromaDB integration for fast and scalable vector search.
- Support for multiple LLM backends (e.g., Cloud AI Playground, Allam Playground, Custom).
- Selectable models for each backend (e.g., Llama-3.3-70B, Llama-3.1-8B).
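To make the pipeline concrete, here is a minimal sketch of the retrieval step in a RAG flow. The in-memory store and bag-of-words "embedding" below are toy stand-ins for ChromaDB and a real embedding model; they are illustrative only, not the application's actual code.

```python
# Toy RAG retrieval: embed documents, rank them against a query,
# and return the best matches as context for the LLM.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class InMemoryStore:
    """Stand-in for a ChromaDB collection: add documents, query by similarity."""
    def __init__(self):
        self.docs = []  # list of (doc_id, text, vector)

    def add(self, doc_id: str, text: str):
        self.docs.append((doc_id, text, embed(text)))

    def query(self, question: str, n_results: int = 1):
        qv = embed(question)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[2]), reverse=True)
        return [text for _, text, _ in ranked[:n_results]]

store = InMemoryStore()
store.add("d1", "ChromaDB is a vector store for embeddings")
store.add("d2", "Streamlit builds interactive web apps in Python")

# Retrieve the most relevant document as context for the LLM prompt.
context = store.query("which component stores the embeddings", n_results=1)
```

In the real application, the retrieved context is prepended to the user's question before it is sent to the selected LLM backend.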
Before proceeding, complete all the setup steps below in the order given. They configure the tools required to run the application successfully.
Each section links to internal documentation for detailed guidance. Please follow it carefully to avoid setup issues later in the process.
Miniconda is required to manage the application's Python environment and dependencies. Please follow the setup instructions carefully to ensure a consistent and reproducible environment.
For detailed steps, refer to the internal documentation:
Set up Miniconda
Git is required for version control and collaboration. Proper configuration ensures seamless integration with repositories and development workflows.
For detailed steps, refer to the internal documentation:
Setup Git
To set up the Python environment required for running the application, follow the steps below. This ensures all dependencies are installed in an isolated and reproducible environment.
- Create your working directory:

  ```bash
  mkdir my_working_directory
  cd my_working_directory
  ```

- Download the application:

  ```bash
  git clone -n --depth=1 --filter=tree:0 https://github.com/qualcomm/Startup-Demos.git
  cd Startup-Demos
  git sparse-checkout set --no-cone /GenAI/CloudAI-Playground/ChatBot_LLM_RAG
  git checkout
  ```

- Navigate to the application directory:

  ```bash
  cd ./GenAI/CloudAI-Playground/ChatBot_LLM_RAG
  ```

- Create a new Conda environment with Python 3.12:

  ```bash
  conda create -n myenv python=3.12
  ```

- Activate the environment:

  ```bash
  conda activate myenv
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```
💡 Make sure you have Miniconda or Anaconda installed before running these commands.
To use the application features that rely on external AI services, you need to obtain API keys from the respective playgrounds. Follow the instructions below for each platform.
To retrieve your API key for the Cloud AI Playground, refer to the documentation: API Key for Cloud AI Playground
To retrieve your API key for the Allam Playground, refer to the documentation: API Key for Allam Playground
Once you have obtained the keys, make sure to securely store them in your environment configuration or secrets manager as per your project’s security guidelines.
To securely manage your API keys, it is recommended to store them as environment variables. This approach helps avoid hardcoding sensitive information in your codebase.
For instructions on setting environment variables on your system, refer to the documentation:
Setup Environment Variables
💡 Once set, your application can access the API key using standard environment-variable access methods in your programming language (e.g., `os.environ` in Python).
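As a sketch, the key can be read at startup like this. The variable name `CLOUD_AI_API_KEY` is an assumption for illustration; use whatever name your deployment defines.

```python
import os

def load_api_key(var_name: str = "CLOUD_AI_API_KEY") -> str:
    """Fetch an API key from the environment, failing loudly if it is missing.

    The variable name is illustrative, not one the application mandates.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(
            f"{var_name} is not set; see 'Setup Environment Variables'."
        )
    return key
```

Failing at startup with a clear message is preferable to hardcoding the key or letting an unauthenticated request fail later with a cryptic backend error.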
To launch the application, execute the following command from the application directory:
```bash
streamlit run .\src\ChatApp.py
```

This will start the interactive web interface where you can configure and use the application.
Choose the Large Language Model (LLM) provider you want to use (e.g., Cloud AI Playground, Allam Playground). This determines the backend service that will process your queries.
Enter the API key associated with the selected LLM instance. This is required to authenticate and access the model's capabilities.
After selecting the provider, choose the specific model (e.g., Llama-3.3-70B, Llama-3.1-8B, etc.) that best suits your use case.
If your workflow involves document embeddings (e.g., for semantic search), select the embedding service provider and configure its API key.
Choose the embedding model (e.g., BAAI/bge-base-en-v1.5) to convert your documents into vector representations.
Upload one or more PDF documents that you want the application to process. These documents will be parsed and embedded for querying.
Decide how you want to manage the document data in the vector database. The following options are available:
- Use Existing – Query the existing database without modifying it.
- Add to Existing Database – Append new documents to the current database.
- Generate New Database – Clear existing data and create a new database from uploaded files.
- Clear Database – Remove all data from the current database.
- No Database – Skip database usage and query the LLM directly.
- Clear Chat History – Reset the current chat session.
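The database options above can be sketched as a simple dispatch over the selected mode. The function and names below are hypothetical illustrations of the behavior described, not the application's actual code; "Clear Chat History" affects the chat session rather than the vector store, so it is omitted here.

```python
# Hypothetical mapping from the UI's database options to the document set
# the chatbot should query. Lists of document texts stand in for the
# vector database contents.

def handle_db_mode(mode: str, existing: list, uploads: list) -> list:
    """Return the document set to query for the chosen database mode."""
    if mode == "Use Existing":
        return list(existing)           # query as-is, no modification
    if mode == "Add to Existing Database":
        return existing + uploads       # append new documents
    if mode == "Generate New Database":
        return list(uploads)            # discard old data, start fresh
    if mode == "Clear Database":
        return []                       # remove all data
    if mode == "No Database":
        return []                       # skip retrieval, query the LLM directly
    raise ValueError(f"Unknown mode: {mode}")

# Example: appending newly uploaded files to the current database.
docs = handle_db_mode("Add to Existing Database", ["old.pdf"], ["new.pdf"])
```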
✅ Once all configurations are complete, you can begin interacting with the application through the chat interface.