NVIDIA-AI-Blueprints · antoniomtz · Mar 25, 2026 · Mar 20, 2026
diff --git a/README.md b/README.md
@@ -49,13 +49,15 @@ A GenAI-powered catalog enrichment system that transforms basic product images i
 **AI Models:**
 - NVIDIA Nemotron VLM (vision-language model)
 - NVIDIA Nemotron LLM (prompt planning)
+- NVIDIA Embeddings (policy compliance)
 - FLUX models (image generation)
 - Microsoft TRELLIS (3D generation)
 
 **Infrastructure:**
 - Docker & Docker Compose
 - NVIDIA NIM containers
 - HuggingFace model hosting
+- Milvus vector database for policy PDF retrieval
 
 ## Minimum System Requirements
 
@@ -67,10 +69,11 @@ For self-hosting the NIM microservices locally, the following GPU requirements a
 |-------|---------|-------------|-----------------|
 | Nemotron-Nano-12B-V2-VL | Vision-Language Analysis | 1× A100 | 1× H100 |
 | Nemotron-Nano-V3 | Prompt Planning (LLM) | 1× A100 | 1× H100 |
+| nv-embedqa-e5-v5 | Embeddings (Policy Compliance) | 1× A100 | 1× H100 |
 | FLUX Kontext Dev | Image Generation | 1× H100 | 1× H100 |
-| Microsoft TRELLIS | 3D Asset Generation | 1× L40S | 1× L40S |
+| Microsoft TRELLIS | 3D Asset Generation | 1× L40S | 1× H100 |
 
-**Total recommended setup**: 3× H100 + 1× L40S (or 4× H100 for uniform configuration)
+**Total recommended setup**: 3× H100 + 1× L40S (or 4× H100 for uniform configuration). The embeddings model can be deployed on the same GPU as the FLUX or TRELLIS models.
 
 ### Deployment Options
 
@@ -146,6 +149,10 @@ Make sure you have accepted [https://huggingface.co/black-forest-labs/FLUX.1-Kon
 
    trellis:
      url: "http://localhost:8004/v1/infer"  # Your TRELLIS NIM endpoint
+
+   embeddings:
+     url: "http://localhost:8005/v1"  # Your Embeddings NIM endpoint
+     model: "nvidia/nv-embedqa-e5-v5"
    ```
 
    See the **[Docker Deployment Guide](docs/DOCKER.md)** for instructions on deploying these NIMs.
@@ -166,7 +173,7 @@ The frontend at `http://localhost:3000`.
 
 ### Docker Deployment (Self-Hosted NIMs)
 
-The Docker deployment includes all required self-hosted NVIDIA NIM containers (Nemotron VLM, Nemotron LLM, FLUX, and TRELLIS). The `shared/config/config.yaml` is pre-configured with the correct service URLs for Docker networking.
+The Docker deployment includes all required self-hosted NVIDIA NIM containers (Nemotron VLM, Nemotron LLM, nv-embedqa embeddings, FLUX, and TRELLIS). To use uploaded policy PDFs in the UI, also start the companion Milvus stack from `docker-compose.rag.yml`. The `shared/config/config.yaml` is pre-configured with the correct service URLs for Docker networking.
 
 For complete Docker deployment instructions, see the **[Docker Deployment Guide](docs/DOCKER.md)**.
 
@@ -185,15 +192,27 @@ For complete Docker deployment instructions, see the **[Docker Deployment Guide]
    chmod a+w "$LOCAL_NIM_CACHE"
    ```
 
-3. **Start all services**:
+3. **Create the shared Docker network**:
+   ```bash
+   docker network create catalog-network || true
+   ```
+
+4. **Start the policy RAG stack**:
+   ```bash
+   docker compose -f docker-compose.rag.yml up -d
+   ```
+
+5. **Start the application stack**:
    ```bash
-   docker-compose up -d
+   docker compose up -d
    ```
 
-4. **Access the application**:
+6. **Access the application**:
    - Frontend: `http://localhost:3000`
    - Backend API: `http://localhost:8000`
    - Health Check: `http://localhost:8000/health`
+   - Milvus: `localhost:19530`
+   - MinIO Console: `http://localhost:9001`
 
 ## API Endpoints
 
@@ -211,12 +230,12 @@ For detailed API documentation with request/response examples, see **[API Docume
 
 ## License
 
-GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).
+GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [nv-embedqa-e5-v5](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nv-embedqa-e5-v5?version=latest), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).
 
 ADDITIONAL INFORMATION: 
 FLUX.1-Kontext-Dev license: [https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md).
 
 Third-Party Community Consideration:
 The FLUX Kontext model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to: black-forest-labs/FLUX.1-Kontext-dev Model Card - [https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev).
 
-This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use. 
+This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use. 
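
Note on the `embeddings` config block added above: the README gives only the base URL and model name, so the following is a minimal sketch of how a client might call that endpoint. It assumes the NIM exposes an OpenAI-style `/embeddings` route under the base URL and accepts an `input_type` field; both are assumptions, not confirmed by this diff, and the helper names are hypothetical.

```python
import json
import urllib.request

# Values taken from the config.yaml snippet in this diff.
EMBEDDINGS_URL = "http://localhost:8005/v1"
EMBEDDINGS_MODEL = "nvidia/nv-embedqa-e5-v5"


def build_embeddings_request(texts, input_type="query"):
    """Build (but do not send) a POST request for the embeddings endpoint.

    Assumption: the route is `<base>/embeddings` and the body follows the
    OpenAI-style schema with an extra `input_type` field.
    """
    payload = {
        "model": EMBEDDINGS_MODEL,
        "input": list(texts),
        "input_type": input_type,
    }
    return urllib.request.Request(
        url=f"{EMBEDDINGS_URL}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(texts, input_type="query", timeout=30):
    """Send the request and return one vector per input text.

    Assumption: the response body has the shape {"data": [{"embedding": [...]}]}.
    """
    request = build_embeddings_request(texts, input_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return [item["embedding"] for item in body["data"]]
```

Splitting request construction from sending keeps the payload easy to inspect without a running NIM container.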
diff --git a/deploy/1_Deploy_Catalog_Enrichment.ipynb b/deploy/1_Deploy_Catalog_Enrichment.ipynb
@@ -357,7 +357,7 @@
       "source": [
         "<a id=\"spin-up-blueprint\"></a>\n",
         "## Spin Up Blueprint\n",
-        "Docker compose scripts are provided which spin up the microservices on a single node.  This docker-compose yaml file will start the agents as well as dependant microservices.  This may take up to **15 minutes** to complete.\n"
+        "Docker compose scripts are provided which spin up the microservices on a single node. Start by creating the shared Docker network, then launch the Milvus policy RAG stack from `docker-compose.rag.yml`, and finally bring up the main application stack. This may take up to **15 minutes** to complete.\n"
       ]
     },
     {
@@ -369,6 +369,8 @@
       },
       "outputs": [],
       "source": [
+        "!docker network create catalog-network || true\n",
+        "!docker compose -f docker-compose.rag.yml up -d > /dev/null 2>&1\n",
         "!docker compose up -d > /dev/null 2>&1"
       ]
     },
@@ -413,7 +415,7 @@
       "id": "7d90c358-f0e9-4607-8b88-32a44ffce74e",
       "metadata": {},
       "source": [
-        "This command should produce similiar output in the following format:"
+        "These commands should produce similar output in the following format:"
       ]
     },
     {
@@ -430,6 +432,10 @@
         "nim-llm                       2025-12-16 18:30:24 +0000 UTC   Up 1 minutes\n",
         "nim-trellis                   2025-12-16 18:30:24 +0000 UTC   Up 1 minutes\n",
         "nim-flux                      2025-12-16 18:30:24 +0000 UTC   Up 1 minutes\n",
+        "embedqa                       2025-12-16 18:30:24 +0000 UTC   Up 1 minutes\n",
+        "milvus-etcd                   2025-12-16 18:30:24 +0000 UTC   Up 1 minutes (healthy)\n",
+        "milvus-minio                  2025-12-16 18:30:24 +0000 UTC   Up 1 minutes (healthy)\n",
+        "milvus-standalone             2025-12-16 18:30:24 +0000 UTC   Up 1 minutes (healthy)\n",
         "```"
       ]
     },
@@ -529,7 +535,7 @@
         "<a id=\"stopping-services-and-cleaning-up\"></a>\n",
         "## Stopping Services and Cleaning Up\n",
         "\n",
-        "To shut down the microservices, run the following command"
+        "To shut down the microservices, run the following commands:"
       ]
     },
     {
@@ -539,7 +545,8 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "!docker compose down > /dev/null 2>&1"
+        "!docker compose down > /dev/null 2>&1\n",
+        "!docker compose -f docker-compose.rag.yml down > /dev/null 2>&1"
       ]
     },
     {
@@ -577,7 +584,8 @@
         "\n",
         "**Explanation:** When running the blueprint for the first time, all models need to be downloaded from their respective sources. Depending on your internet connection speed, this process can take 20-30 minutes or longer. The models include:\n",
         "- NVIDIA Nemotron VLM\n",
-        "- NVIDIA Nemotron LLM  \n",
+        "- NVIDIA Nemotron LLM \n",
+        "- NVIDIA Embeddings \n",
         "- FLUX image generation model\n",
         "- TRELLIS 3D asset generation model\n",
         "\n",
@@ -596,7 +604,7 @@
       "source": [
         "## LICENSE\n",
         "\n",
-        "GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).\n",
+        "GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [nv-embedqa-e5-v5](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nv-embedqa-e5-v5?version=latest), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).\n",
         "\n",
         "ADDITIONAL INFORMATION: \n",
         "FLUX.1-Kontext-Dev license: [https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md).\n",

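Note on ordering in the notebook changes above: the network is created first, then the RAG stack, then the application stack, and shutdown runs in the opposite order. A minimal sketch of that sequence as a scripted helper follows; the function names are hypothetical, and only the commands themselves come from this diff.

```python
import subprocess

NETWORK = "catalog-network"
RAG_COMPOSE_FILE = "docker-compose.rag.yml"


def startup_commands():
    """Commands in the order the notebook runs them."""
    return [
        ["docker", "network", "create", NETWORK],
        ["docker", "compose", "-f", RAG_COMPOSE_FILE, "up", "-d"],
        ["docker", "compose", "up", "-d"],
    ]


def shutdown_commands():
    """Application stack goes down first, then the RAG stack."""
    return [
        ["docker", "compose", "down"],
        ["docker", "compose", "-f", RAG_COMPOSE_FILE, "down"],
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order.

    The network-create step is allowed to fail (the equivalent of `|| true`),
    since the network may already exist. Every other step raises on failure.
    """
    for command in commands:
        tolerate_failure = command[:3] == ["docker", "network", "create"]
        runner(command, check=not tolerate_failure)
```

Unlike the notebook cells, which redirect output to `/dev/null`, this sketch surfaces failures from the compose steps instead of hiding them.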
diff --git a/docker-compose.rag.yml b/docker-compose.rag.yml
@@ -0,0 +1,73 @@
+services:
+  milvus-etcd:
+    image: quay.io/coreos/etcd:v3.5.5
+    container_name: milvus-etcd
+    environment:
+      - ETCD_AUTO_COMPACTION_MODE=revision
+      - ETCD_AUTO_COMPACTION_RETENTION=1000
+      - ETCD_QUOTA_BACKEND_BYTES=4294967296
+      - ETCD_SNAPSHOT_COUNT=50000
+    volumes:
+      - etcd_data:/etcd
+    command: etcd -listen-client-urls=http://0.0.0.0:2379 -advertise-client-urls=http://milvus-etcd:2379 --data-dir /etcd
+    healthcheck:
+      test: ["CMD", "etcdctl", "endpoint", "health"]
+      interval: 30s
+      timeout: 20s
+      retries: 3
+    networks:
+      - catalog-network
+
+  milvus-minio:
+    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
+    container_name: milvus-minio
+    environment:
+      MINIO_ACCESS_KEY: minioadmin
+      MINIO_SECRET_KEY: minioadmin
+    volumes:
+      - minio_data:/minio_data
+    command: minio server /minio_data --console-address ":9001"
+    ports:
+      - "9001:9001"
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
+      interval: 30s
+      timeout: 20s
+      retries: 3
+    networks:
+      - catalog-network
+
+  milvus-standalone:
+    image: milvusdb/milvus:v2.4.0
+    container_name: milvus-standalone
+    command: ["milvus", "run", "standalone"]
+    ports:
+      - "19530:19530"
+      - "9091:9091"
+    environment:
+      ETCD_ENDPOINTS: milvus-etcd:2379
+      MINIO_ADDRESS: milvus-minio:9000
+    volumes:
+      - milvus_data:/var/lib/milvus
+    depends_on:
+      milvus-etcd:
+        condition: service_healthy
+      milvus-minio:
+        condition: service_healthy
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
+      interval: 30s
+      timeout: 20s
+      retries: 3
+    networks:
+      - catalog-network
+
+networks:
+  catalog-network:
+    external: true
+    name: catalog-network
+
+volumes:
+  etcd_data:
+  minio_data:
+  milvus_data:
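
Note on the healthchecks in `docker-compose.rag.yml`: each service is probed over HTTP (or `etcdctl`) with `interval: 30s` and `retries: 3`. A minimal sketch of an equivalent client-side readiness wait is below, for example to poll `http://localhost:9091/healthz` before uploading policy PDFs. This is not Docker's healthcheck implementation; the function names are hypothetical and the defaults only mirror the values in the compose file.

```python
import time
import urllib.request


def http_ok(url, timeout=5):
    """Return True if the URL answers with a 2xx status, False on any network error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def wait_until_healthy(url, probe=http_ok, retries=3, interval=30.0, sleep=time.sleep):
    """Probe `url` up to `retries` times, sleeping `interval` seconds between tries.

    Returns the 1-based attempt number that succeeded, or None if all failed.
    """
    for attempt in range(1, retries + 1):
        if probe(url):
            return attempt
        if attempt < retries:
            sleep(interval)
    return None
```

The `probe` and `sleep` parameters are injectable so the loop can be exercised without a running Milvus container.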
diff --git a/docker-compose.yml b/docker-compose.yml
@@ -22,9 +22,14 @@ services:
     container_name: catalog-enrichment-backend
     environment:
       - NGC_API_KEY=${NGC_API_KEY}
+      - NVIDIA_API_KEY=${NVIDIA_API_KEY:-}
+      - NVIDIA_API_BASE_URL=${NVIDIA_API_BASE_URL:-https://integrate.api.nvidia.com/v1}
       - HF_TOKEN=${HF_TOKEN}
+      - MILVUS_HOST=${MILVUS_HOST:-milvus-standalone}
+      - MILVUS_PORT=${MILVUS_PORT:-19530}
     volumes:
       - ./data/outputs:/app/data/outputs
+      - ./data/policies:/app/data/policies
       - ./shared/config:/app/shared/config:ro
     depends_on:
       - vlm-nim
@@ -105,6 +110,33 @@ services:
     networks:
       - catalog-network
 
+  # NVIDIA NIM - Embedding Model
+  embedqa:
+    image: nvcr.io/nim/nvidia/nv-embedqa-e5-v5:1.6
+    container_name: embedqa
+    ports:
+      - "8005:8000"
+    environment:
+      - NGC_API_KEY=${NGC_API_KEY}
+    volumes:
+      - ${LOCAL_NIM_CACHE:-~/.cache/nim}:/opt/nim/.cache
+    user: "${UID:-1000}"
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              device_ids: ['2']
+              capabilities: [gpu]
+    restart: "no"
+    healthcheck:
+      test: ["CMD", "curl", "-f", "http://localhost:8000/v1/health/ready"]
+      interval: 30s
+      timeout: 10s
+      retries: 5
+    networks:
+      - catalog-network
+
   # Trellis Model - 3D Asset Generation
   trellis-nim:
     image: nvcr.io/nim/microsoft/trellis:1.0.1
@@ -157,9 +189,9 @@ services:
 
 networks:
   catalog-network:
-    driver: bridge
+    external: true
+    name: catalog-network
 
 volumes:
   nim-cache:
     driver: local
-
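
Note on the `${MILVUS_HOST:-milvus-standalone}` and `${MILVUS_PORT:-19530}` defaults added to the backend service: the `:-` form falls back to the default when the variable is unset or empty. The diff does not show how the backend reads these values, so the following is only a sketch of the same fallback semantics in Python, with hypothetical helper names.

```python
import os


def env_default(name, default, environ=None):
    """Return the variable's value, or `default` if it is unset or empty.

    This mirrors the `${NAME:-default}` form used in docker-compose.yml,
    which treats an empty string the same as an unset variable.
    """
    env = os.environ if environ is None else environ
    value = env.get(name)
    return value if value else default


def milvus_address(environ=None):
    """Resolve the Milvus host and port using the defaults from this diff."""
    host = env_default("MILVUS_HOST", "milvus-standalone", environ)
    port = int(env_default("MILVUS_PORT", "19530", environ))
    return host, port
```

The default host `milvus-standalone` resolves only inside `catalog-network`; a backend running outside Docker would need `MILVUS_HOST=localhost`, relying on the published `19530` port.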