35 changes: 27 additions & 8 deletions README.md
@@ -49,13 +49,15 @@ A GenAI-powered catalog enrichment system that transforms basic product images i
**AI Models:**
- NVIDIA Nemotron VLM (vision-language model)
- NVIDIA Nemotron LLM (prompt planning)
- NVIDIA Embeddings (Policy Compliance)
- FLUX models (image generation)
- Microsoft TRELLIS (3D generation)

**Infrastructure:**
- Docker & Docker Compose
- NVIDIA NIM containers
- HuggingFace model hosting
- Milvus vector database for policy PDF retrieval

## Minimum System Requirements

@@ -67,10 +69,11 @@ For self-hosting the NIM microservices locally, the following GPU requirements a
|-------|---------|-------------|-----------------|
| Nemotron-Nano-12B-V2-VL | Vision-Language Analysis | 1× A100 | 1× H100 |
| Nemotron-Nano-V3 | Prompt Planning (LLM) | 1× A100 | 1× H100 |
| nv-embedqa | Embeddings (Policy Compliance) | 1× A100 | 1× H100 |
| FLUX Kontext Dev | Image Generation | 1× H100 | 1× H100 |
| Microsoft TRELLIS | 3D Asset Generation | 1× L40S | 1× L40S |
| Microsoft TRELLIS | 3D Asset Generation | 1× L40S | 1× H100 |

**Total recommended setup**: 3× H100 + 1× L40S (or 4× H100 for uniform configuration)
**Total recommended setup**: 3× H100 + 1× L40S (or 4× H100 for uniform configuration). The embeddings model can be deployed on the same GPU as the FLUX or TRELLIS model.

### Deployment Options

@@ -146,6 +149,10 @@ Make sure you have accepted [https://huggingface.co/black-forest-labs/FLUX.1-Kon

trellis:
  url: "http://localhost:8004/v1/infer" # Your TRELLIS NIM endpoint

embeddings:
  url: "http://localhost:8005/v1" # Your Embeddings NIM endpoint
  model: "nvidia/nv-embedqa-e5-v5"
```
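
The `embeddings` entry points at a `/v1` base URL plus a model name. As a rough sketch of how a client might turn those two values into a request, the snippet below builds the URL and JSON body without sending anything. The helper name `build_embedding_request` is hypothetical, the sample sentence is made up, and the `/embeddings` path and `input_type` field are assumptions about an OpenAI-style embeddings API rather than something this README confirms.

```python
import json
from urllib import request


def build_embedding_request(base_url, model, texts, input_type="passage"):
    """Build (but do not send) a POST request for an embeddings endpoint.

    Assumption: the service accepts POST {base_url}/embeddings with an
    `input_type` field set to "query" or "passage".
    """
    url = base_url.rstrip("/") + "/embeddings"
    body = {"model": model, "input": list(texts), "input_type": input_type}
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Values taken from the config snippet; the policy sentence is illustrative.
req = build_embedding_request(
    "http://localhost:8005/v1",
    "nvidia/nv-embedqa-e5-v5",
    ["Returns are accepted within 30 days."],
)
print(req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the payload easy to inspect before any NIM is running.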

See the **[Docker Deployment Guide](docs/DOCKER.md)** for instructions on deploying these NIMs.
@@ -166,7 +173,7 @@ The frontend at `http://localhost:3000`.

### Docker Deployment (Self-Hosted NIMs)

The Docker deployment includes all required self-hosted NVIDIA NIM containers (Nemotron VLM, Nemotron LLM, FLUX, and TRELLIS). The `shared/config/config.yaml` is pre-configured with the correct service URLs for Docker networking.
The Docker deployment includes all required self-hosted NVIDIA NIM containers (Nemotron VLM, Nemotron LLM, nv-embedqa embeddings, FLUX, and TRELLIS). To use uploaded policy PDFs in the UI, also start the companion Milvus stack from `docker-compose.rag.yml`. The `shared/config/config.yaml` is pre-configured with the correct service URLs for Docker networking.

For complete Docker deployment instructions, see the **[Docker Deployment Guide](docs/DOCKER.md)**.

@@ -185,15 +192,27 @@ For complete Docker deployment instructions, see the **[Docker Deployment Guide]
chmod a+w "$LOCAL_NIM_CACHE"
```

3. **Start all services**:
3. **Create the shared Docker network**:
```bash
docker network create catalog-network || true
```

4. **Start the policy RAG stack**:
```bash
docker compose -f docker-compose.rag.yml up -d
```

5. **Start the application stack**:
```bash
docker-compose up -d
docker compose up -d
```

4. **Access the application**:
6. **Access the application**:
- Frontend: `http://localhost:3000`
- Backend API: `http://localhost:8000`
- Health Check: `http://localhost:8000/health`
- Milvus: `localhost:19530`
- MinIO Console: `http://localhost:9001`
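
Because the NIM containers can take several minutes to come up, it may help to poll the health endpoint listed above before opening the frontend. The following is a minimal sketch using only the Python standard library; the function names, retry count, and delay are illustrative assumptions, not part of the project.

```python
import time
from urllib import error, request


def is_ready(url, timeout=5.0):
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_ready(url, attempts=30, delay=10.0, probe=is_ready, sleep=time.sleep):
    """Poll `url` until `probe` succeeds or `attempts` run out."""
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next try
    return False
```

Calling `wait_until_ready("http://localhost:8000/health")` would block until the backend answers or the attempts are exhausted. The `probe` and `sleep` parameters exist only so the loop can be exercised without a running stack.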

## API Endpoints

Expand All @@ -211,12 +230,12 @@ For detailed API documentation with request/response examples, see **[API Docume

## License

GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).
GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [nv-embedqa-e5-v5](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nv-embedqa-e5-v5?version=latest), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).

ADDITIONAL INFORMATION:
FLUX.1-Kontext-Dev license: [https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md).

Third-Party Community Consideration:
The FLUX Kontext model is not owned or developed by NVIDIA. This model has been developed and built to a third-party’s requirements for this application and use case; see link to: black-forest-labs/FLUX.1-Kontext-dev Model Card - [https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev).

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
20 changes: 14 additions & 6 deletions deploy/1_Deploy_Catalog_Enrichment.ipynb
@@ -357,7 +357,7 @@
"source": [
"<a id=\"spin-up-blueprint\"></a>\n",
"## Spin Up Blueprint\n",
"Docker compose scripts are provided which spin up the microservices on a single node. This docker-compose yaml file will start the agents as well as dependant microservices. This may take up to **15 minutes** to complete.\n"
"Docker compose scripts are provided which spin up the microservices on a single node. Start by creating the shared Docker network, then launch the Milvus policy RAG stack from `docker-compose.rag.yml`, and finally bring up the main application stack. This may take up to **15 minutes** to complete.\n"
]
},
{
@@ -369,6 +369,8 @@
},
"outputs": [],
"source": [
"!docker network create catalog-network || true\n",
"!docker compose -f docker-compose.rag.yml up -d > /dev/null 2>&1\n",
"!docker compose up -d > /dev/null 2>&1"
]
},
@@ -413,7 +415,7 @@
"id": "7d90c358-f0e9-4607-8b88-32a44ffce74e",
"metadata": {},
"source": [
"This command should produce similiar output in the following format:"
"These commands should produce similar output in the following format:"
]
},
{
@@ -430,6 +432,10 @@
"nim-llm 2025-12-16 18:30:24 +0000 UTC Up 1 minutes\n",
"nim-trellis 2025-12-16 18:30:24 +0000 UTC Up 1 minutes\n",
"nim-flux 2025-12-16 18:30:24 +0000 UTC Up 1 minutes\n",
"embedqa 2025-12-16 18:30:24 +0000 UTC Up 1 minutes\n",
"milvus-etcd 2025-12-16 18:30:24 +0000 UTC Up 1 minutes (healthy)\n",
"milvus-minio 2025-12-16 18:30:24 +0000 UTC Up 1 minutes (healthy)\n",
"milvus-standalone 2025-12-16 18:30:24 +0000 UTC Up 1 minutes (healthy)\n",
"```"
]
},
@@ -529,7 +535,7 @@
"<a id=\"stopping-services-and-cleaning-up\"></a>\n",
"## Stopping Services and Cleaning Up\n",
"\n",
"To shut down the microservices, run the following command"
"To shut down the microservices, run the following commands:"
]
},
{
@@ -539,7 +545,8 @@
"metadata": {},
"outputs": [],
"source": [
"!docker compose down > /dev/null 2>&1"
"!docker compose down > /dev/null 2>&1\n",
"!docker compose -f docker-compose.rag.yml down > /dev/null 2>&1"
]
},
{
@@ -577,7 +584,8 @@
"\n",
"**Explanation:** When running the blueprint for the first time, all models need to be downloaded from their respective sources. Depending on your internet connection speed, this process can take 20-30 minutes or longer. The models include:\n",
"- NVIDIA Nemotron VLM\n",
"- NVIDIA Nemotron LLM \n",
"- NVIDIA Nemotron LLM \n",
"- NVIDIA Embeddings \n",
"- FLUX image generation model\n",
"- TRELLIS 3D asset generation model\n",
"\n",
@@ -596,7 +604,7 @@
"source": [
"## LICENSE\n",
"\n",
"GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).\n",
"GOVERNING TERMS: The Blueprint scripts are governed by Apache License, Version 2.0, and enables use of separate open source and proprietary software governed by their respective licenses: [NVIDIA-Nemotron-Nano-12B-v2-VL](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-nano-12b-v2-vl?version=1), [Nemotron-Nano-V3](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nemotron-3-nano?version=1.7.0), [nv-embedqa-e5-v5](https://catalog.ngc.nvidia.com/orgs/nim/teams/nvidia/containers/nv-embedqa-e5-v5?version=latest), [FLUX.1-Kontext-Dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md), and [Microsoft TRELLIS](https://catalog.ngc.nvidia.com/orgs/nim/teams/microsoft/containers/trellis?version=1).\n",
"\n",
"ADDITIONAL INFORMATION: \n",
"FLUX.1-Kontext-Dev license: [https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md).\n",
73 changes: 73 additions & 0 deletions docker-compose.rag.yml
@@ -0,0 +1,73 @@
services:
  milvus-etcd:
    image: quay.io/coreos/etcd:v3.5.5
    container_name: milvus-etcd
    environment:
      - ETCD_AUTO_COMPACTION_MODE=revision
      - ETCD_AUTO_COMPACTION_RETENTION=1000
      - ETCD_QUOTA_BACKEND_BYTES=4294967296
      - ETCD_SNAPSHOT_COUNT=50000
    volumes:
      - etcd_data:/etcd
    command: etcd -listen-client-urls=http://0.0.0.0:2379 -advertise-client-urls=http://milvus-etcd:2379 --data-dir /etcd
    healthcheck:
      test: ["CMD", "etcdctl", "endpoint", "health"]
      interval: 30s
      timeout: 20s
      retries: 3
    networks:
      - catalog-network

  milvus-minio:
    image: minio/minio:RELEASE.2023-03-20T20-16-18Z
    container_name: milvus-minio
    environment:
      MINIO_ACCESS_KEY: minioadmin
      MINIO_SECRET_KEY: minioadmin
    volumes:
      - minio_data:/minio_data
    command: minio server /minio_data --console-address ":9001"
    ports:
      - "9001:9001"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9000/minio/health/live"]
      interval: 30s
      timeout: 20s
      retries: 3
    networks:
      - catalog-network

  milvus-standalone:
    image: milvusdb/milvus:v2.4.0
    container_name: milvus-standalone
    command: ["milvus", "run", "standalone"]
    ports:
      - "19530:19530"
      - "9091:9091"
    environment:
      ETCD_ENDPOINTS: milvus-etcd:2379
      MINIO_ADDRESS: milvus-minio:9000
    volumes:
      - milvus_data:/var/lib/milvus
    depends_on:
      milvus-etcd:
        condition: service_healthy
      milvus-minio:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9091/healthz"]
      interval: 30s
      timeout: 20s
      retries: 3
    networks:
      - catalog-network

networks:
  catalog-network:
    external: true
    name: catalog-network

volumes:
  etcd_data:
  minio_data:
  milvus_data:
36 changes: 34 additions & 2 deletions docker-compose.yml
@@ -22,9 +22,14 @@ services:
    container_name: catalog-enrichment-backend
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
      - NVIDIA_API_KEY=${NVIDIA_API_KEY:-}
      - NVIDIA_API_BASE_URL=${NVIDIA_API_BASE_URL:-https://integrate.api.nvidia.com/v1}
      - HF_TOKEN=${HF_TOKEN}
      - MILVUS_HOST=${MILVUS_HOST:-milvus-standalone}
      - MILVUS_PORT=${MILVUS_PORT:-19530}
    volumes:
      - ./data/outputs:/app/data/outputs
      - ./data/policies:/app/data/policies
      - ./shared/config:/app/shared/config:ro
    depends_on:
      - vlm-nim
@@ -105,6 +110,33 @@ services:
    networks:
      - catalog-network

  # NVIDIA NIM - Embedding Model
  embedqa:
    image: nvcr.io/nim/nvidia/nv-embedqa-e5-v5:1.6
    container_name: embedqa
    ports:
      - "8005:8000"
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
    volumes:
      - ${LOCAL_NIM_CACHE:-~/.cache/nim}:/opt/nim/.cache
    user: "${UID:-1000}"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['2']
              capabilities: [gpu]
    restart: "no"
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/v1/health/ready"]
      interval: 30s
      timeout: 10s
      retries: 5
    networks:
      - catalog-network

  # Trellis Model - 3D Asset Generation
  trellis-nim:
    image: nvcr.io/nim/microsoft/trellis:1.0.1
@@ -157,9 +189,9 @@ services:

networks:
  catalog-network:
    driver: bridge
    external: true
    name: catalog-network

volumes:
  nim-cache:
    driver: local
