A Gradio app for generating and feeding structured common knowledge to the Inworld AI Studio API using data from product catalogs in JSON format. The app is hosted on Hugging Face Spaces and can also be run locally.
This project automates the extraction, processing, and structured output of product information from PDF-based catalogs, leveraging NVIDIA's powerful AI tools and libraries. The pipeline is designed to handle complex data structures, including tables and embedded text, and formats the output into JSON. Here's a step-by-step breakdown of the process:
- Libraries used: `fitz` (PyMuPDF) and `llama-index`. The `PyMuPDFReader` extracts raw text and tables from each page of the PDF. The extracted text is split into smaller, manageable chunks to stay within token limits for downstream processing.
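The chunking step can be sketched in plain Python. The helper name `split_into_chunks` and the 512-word budget are illustrative placeholders, not the project's actual helper or limit (real token counts depend on the model's tokenizer):

```python
# Minimal sketch of the chunking step: pack whitespace-separated words
# into chunks so each stays under an approximate token budget.
# `split_into_chunks` and 512 are illustrative, not the project's names.

def split_into_chunks(text: str, max_tokens: int = 512) -> list[str]:
    """Greedily pack words into chunks of at most `max_tokens` words
    (a rough proxy for model tokens)."""
    words = text.split()
    chunks, current = [], []
    for word in words:
        current.append(word)
        if len(current) >= max_tokens:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

page_text = "spec " * 1200  # stand-in for text extracted from one PDF page
chunks = split_into_chunks(page_text, max_tokens=512)
```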
- Tool: `nemo`. Extracted text chunks are normalized using NVIDIA's NeMo text processing capabilities, which ensure proper case formatting and language-specific corrections.
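As a rough illustration of what this step does, here is a plain-Python stand-in for the kind of casing cleanup applied to each chunk. The real pipeline presumably uses `nemo_text_processing`; this function is only an example, not the project's code:

```python
# Illustrative stand-in for the NeMo normalization step: capitalize
# sentence starts without touching the rest of the sentence, so
# identifiers like "X500" keep their casing.

def fix_sentence_case(text: str) -> str:
    """Uppercase the first letter of each period-separated sentence."""
    sentences = [s.strip() for s in text.split(". ") if s.strip()]
    fixed = [s[:1].upper() + s[1:] for s in sentences]
    return ". ".join(fixed)

print(fix_sentence_case("the X500 drill weighs 2.3 kg. battery lasts 4 h."))
```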
- Embedding model: NVIDIA's `NV-Embed-QA` for generating embeddings. Language model: NVIDIA's `meta/llama3-70b-instruct` via `llama-index`. Chunks are embedded using NVIDIA's embedding model, and a vector index is built. The indexed text is queried with the LLM to extract structured product information, following a predefined format.
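The indexing and querying step can be sketched with llama-index's NVIDIA integrations (the `llama-index-embeddings-nvidia` and `llama-index-llms-nvidia` packages). The class and model names follow the public llama-index API, but the extraction prompt is an invented example, and the function is not run here because it needs an `NVIDIA_API_KEY`:

```python
# Hedged sketch of the embed-index-query step using llama-index's
# NVIDIA integrations. The prompt text is an invented example.

EXTRACTION_PROMPT = (
    "List every product in the catalog. For each product return: "
    "name, model number, key specifications, and any table data. "
    "Answer as a JSON array of objects."
)

def build_query_engine(chunks: list[str]):
    """Embed chunks with NV-Embed-QA, build a vector index, and return
    a query engine backed by llama3-70b-instruct. Requires an
    NVIDIA_API_KEY and the llama-index NVIDIA extras; not executed here."""
    from llama_index.core import Document, VectorStoreIndex
    from llama_index.embeddings.nvidia import NVIDIAEmbedding
    from llama_index.llms.nvidia import NVIDIA

    embed_model = NVIDIAEmbedding(model="NV-Embed-QA")
    llm = NVIDIA(model="meta/llama3-70b-instruct")
    docs = [Document(text=c) for c in chunks]
    index = VectorStoreIndex.from_documents(docs, embed_model=embed_model)
    return index.as_query_engine(llm=llm)

# Usage (needs API access):
#   engine = build_query_engine(chunks)
#   response = engine.query(EXTRACTION_PROMPT)
```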
- Library: `nemoguardrails`. Guardrails are configured to validate the LLM's output and ensure it adheres to specific quality and safety standards. Checks include fact validation, hallucination prevention, and format enforcement.
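For illustration, a `nemoguardrails` setup is typically driven by a `config.yml`. The engine name and rail flow names below are indicative only and should be checked against the NeMo Guardrails documentation for the installed version:

```yaml
# Illustrative only: exact engine and flow names must match the
# installed NeMo Guardrails version and its built-in rails.
models:
  - type: main
    engine: nvidia_ai_endpoints
    model: meta/llama3-70b-instruct

rails:
  output:
    flows:
      - check facts               # fact validation
      - self check hallucination  # hallucination prevention
```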
- Custom classes: `ProductInfo` and `ProductCatalog`. Parsed product information is structured into dataclasses and merged to eliminate duplicates. The result is a consolidated list of product details with specifications and associated tables.
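A minimal sketch of this data model and the duplicate-merging step; the actual `ProductInfo`/`ProductCatalog` fields are not shown in this README, so the field names below are assumed for illustration:

```python
# Assumed field names: the real dataclasses may differ.
from dataclasses import dataclass, field

@dataclass
class ProductInfo:
    name: str
    specs: dict[str, str] = field(default_factory=dict)
    tables: list[list[str]] = field(default_factory=list)

@dataclass
class ProductCatalog:
    products: list[ProductInfo] = field(default_factory=list)

    def merge(self, item: ProductInfo) -> None:
        """Merge by product name: update specs and append tables on an
        existing entry instead of adding a duplicate."""
        for existing in self.products:
            if existing.name == item.name:
                existing.specs.update(item.specs)
                existing.tables.extend(item.tables)
                return
        self.products.append(item)

catalog = ProductCatalog()
catalog.merge(ProductInfo("X500", {"weight": "2.3 kg"}))
catalog.merge(ProductInfo("X500", {"power": "500 W"}))  # merged, not duplicated
```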
- The final product information is saved as a JSON file, ready for further integration or submission to external APIs like ConvAI for knowledge enhancement.
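The export step amounts to serializing the merged dataclasses to JSON; the file name and fields here are illustrative:

```python
# Sketch of the JSON export step. File name and fields are placeholders.
import json
from dataclasses import asdict, dataclass, field

@dataclass
class ProductInfo:
    name: str
    specs: dict = field(default_factory=dict)

products = [ProductInfo("X500", {"weight": "2.3 kg"})]
payload = [asdict(p) for p in products]  # dataclasses -> plain dicts

with open("product_catalog.json", "w", encoding="utf-8") as f:
    json.dump(payload, f, indent=2, ensure_ascii=False)
```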
- The structured data is optionally fed to a common knowledge API for further use in conversational AI systems.
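A hedged sketch of that hand-off. The endpoint URL, header name, and payload shape below are placeholders, not the real ConvAI/Inworld Studio contract, so consult the official API documentation before use:

```python
# Placeholder endpoint, header name, and payload shape: check the
# ConvAI / Inworld Studio API docs for the real contract.
import json

def build_knowledge_request(api_key: str, character_id: str,
                            facts: list[str]) -> tuple[dict, bytes]:
    """Assemble headers and a JSON body for a knowledge-upload call."""
    headers = {
        "CONVAI-API-KEY": api_key,  # placeholder header name
        "Content-Type": "application/json",
    }
    body = json.dumps({"character_id": character_id,
                       "knowledge": facts}).encode("utf-8")
    return headers, body

def send_knowledge(api_key: str, character_id: str, facts: list[str]) -> None:
    """Performs the actual HTTP POST; not executed here."""
    from urllib import request
    headers, body = build_knowledge_request(api_key, character_id, facts)
    req = request.Request(
        "https://api.convai.example/character/knowledge",  # placeholder URL
        data=body, headers=headers, method="POST")
    with request.urlopen(req) as resp:
        print(resp.status)
```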
Check out the online demo on Hugging Face Spaces.
- `app.py`: the main Gradio app for running the interface.
- `main.py`: script to process product catalogs and manage communication with the Inworld AI API.
- `common_knowledge_fill.py`: script that handles creating and feeding structured common knowledge to the API.
- `requirements.txt`: a list of dependencies required for the project.
To set up the project on your local machine, follow these steps:
Ensure you have Conda installed, and create a new environment:
```
conda create -n nvidia_llama python=3.10
conda activate nvidia_llama
```

Use pip to install the dependencies listed in `requirements.txt`:

```
pip install -r requirements.txt
```

To run the Gradio app locally, prepare NVIDIA, LLAMA_CLOUD, and CONVAI API keys and a ConvAI character. Then put the API keys in the scripts accordingly and run:

```
python app.py
```

This will launch the Gradio app in your default browser, where you can interact with the interface.
1. Input a product catalog: the app takes a `.pdf` file containing product details.
2. Generate common knowledge: the app extracts and formats the product information.
3. Feed to the ConvAI API: the processed data is sent through the ConvAI Studio API to enrich the ConvAI character/avatar's knowledge.

Then you can chat with the ConvAI character about the products from the catalog. Cheers!