This document provides detailed information about the Catalog Enrichment System API endpoints.
- Local Development: http://localhost:8000
- Docker Deployment: http://localhost:8000
Returns a plaintext greeting message.
Response:
Catalog Enrichment Backend
Health check endpoint for monitoring service status.
Response:
{
"status": "ok"
}

The API provides a modular approach for optimal performance and flexibility:
1) Fast VLM Analysis (POST /vlm/analyze) - Get product fields quickly
2) FAQ Generation (POST /vlm/faqs) - Generate product FAQs from enriched data
3) Image Generation (POST /generate/variation) - Generate 2D variations on demand
4) 3D Asset Generation (POST /generate/3d) - Generate 3D models on demand
Benefits of this approach:
- Display product information immediately to users
- Generate images and 3D assets in the background or on-demand
- Cache VLM results and generate multiple variations
- Better error handling for each step
- Parallel generation of multiple asset types
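The flow described above can be sketched as a small orchestration helper. This is a minimal sketch under stated assumptions, not the project's own client: `enrich_product`, `analyze`, and `asset_steps` are hypothetical names, and the injected callables stand in for HTTP calls to /vlm/analyze and the generation endpoints.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict

Step = Callable[[Dict[str, Any]], Any]


def enrich_product(analyze: Callable[[], Dict[str, Any]],
                   asset_steps: Dict[str, Step]) -> Dict[str, Any]:
    """Run analysis first, then fan out the follow-up steps in parallel.

    `analyze` and every callable in `asset_steps` are hypothetical wrappers
    around the HTTP calls; they are injected so the flow can be exercised
    without a running server.
    """
    fields = analyze()  # step 1: product fields are available immediately
    results: Dict[str, Any] = {"fields": fields, "assets": {}, "errors": {}}
    if not asset_steps:
        return results
    with ThreadPoolExecutor(max_workers=len(asset_steps)) as pool:
        futures = {name: pool.submit(step, fields)
                   for name, step in asset_steps.items()}
        for name, future in futures.items():
            try:
                results["assets"][name] = future.result()
            except Exception as exc:
                # A failing step is recorded without discarding the others.
                results["errors"][name] = str(exc)
    return results
```

Because each step reports its own error, a failed 3D generation leaves the analyzed fields and any other finished assets usable.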
Manage the persistent PDF policy library used during analysis.
Policy documents are handled as a persistent single-user RAG library:
- uploaded PDFs are parsed and normalized into structured policy summaries
- normalized policy records are embedded and stored in Milvus
- /vlm/analyze automatically performs semantic retrieval against the loaded policy library
- the compliance classifier receives the analyzed product plus the retrieved policy records
Returns metadata for the currently loaded policy library.
{
"documents": [
{
"document_hash": "string",
"filename": "string",
"file_size": 12345,
"chunk_count": 10,
"created_at": 1735689600,
"updated_at": 1735689600
}
]
}

chunk_count is the number of indexed policy records generated from the normalized PDF, not the raw page count.
Content-Type: multipart/form-data
| Parameter | Type | Required | Description |
|---|---|---|---|
| files | file[] | Yes | One or more PDF files to add to the persistent policy library |
| locale | string | No | Locale used when normalizing newly uploaded policies (default: en-US) |
curl -X POST \
-F "locale=en-US" \
-F "files=@policy-a.pdf;type=application/pdf" \
-F "files=@policy-b.pdf;type=application/pdf" \
http://localhost:8000/policies

{
"documents": [
{
"document_hash": "string",
"filename": "string",
"file_size": 12345,
"chunk_count": 10,
"created_at": 1735689600,
"updated_at": 1735689600
}
],
"results": [
{
"document_hash": "string",
"filename": "string",
"chunk_count": 10,
"already_loaded": false,
"processed": true
}
]
}

Notes:
- repeated uploads of the same PDF are deduplicated by content hash
- already_loaded=true means the document was already present in the library
- processed=true means the upload was newly parsed, normalized, embedded, and indexed
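A client may want to report which uploads were newly indexed and which were deduplicated. The helper below is a minimal sketch that reads only the documented results fields; the function name `split_upload_results` and the "other" bucket are assumptions made for illustration.

```python
from typing import Any, Dict, List


def split_upload_results(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group uploaded filenames by the flags in the upload response."""
    summary: Dict[str, List[str]] = {"indexed": [], "already_loaded": [], "other": []}
    for item in response.get("results", []):
        name = item.get("filename", "<unknown>")
        if item.get("already_loaded"):
            summary["already_loaded"].append(name)  # deduplicated by content hash
        elif item.get("processed"):
            summary["indexed"].append(name)  # newly parsed, embedded, and indexed
        else:
            summary["other"].append(name)  # neither flag set; not described in the docs
    return summary
```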
Clears the persistent policy library, including stored PDF artifacts and vector embeddings.
curl -X DELETE http://localhost:8000/policies

{
"status": "ok"
}

Extract product fields using NVIDIA Nemotron VLM and, when policies are loaded, run policy retrieval plus compliance classification.
Endpoint: POST /vlm/analyze
Content-Type: multipart/form-data
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | file | Yes | Product image file (JPEG, PNG) |
| locale | string | No | Regional locale code (default: "en-US") |
| product_data | JSON string | No | Existing product data to augment |
| brand_instructions | string | No | Custom brand voice, tone, style, and taxonomy guidelines |
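Since product_data travels as a JSON string inside a multipart form, it helps to serialize it in one place. The following is a minimal sketch of building the non-file form fields; `build_analyze_fields` is a hypothetical helper, and the image itself would be attached separately as the file part.

```python
import json
from typing import Any, Dict, Optional


def build_analyze_fields(locale: str = "en-US",
                         product_data: Optional[Dict[str, Any]] = None,
                         brand_instructions: Optional[str] = None) -> Dict[str, str]:
    """Return the text form fields for POST /vlm/analyze (image not included)."""
    fields = {"locale": locale}
    if product_data is not None:
        # The endpoint expects product_data as a JSON string, not nested form fields.
        fields["product_data"] = json.dumps(product_data, ensure_ascii=False)
    if brand_instructions:
        fields["brand_instructions"] = brand_instructions
    return fields
```

With an HTTP client such as requests (assumed to be installed), these fields could be passed as `data=` alongside `files={"image": ...}`.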
When one or more policy PDFs have been loaded through /policies, this endpoint also:
- retrieves semantically relevant normalized policy records from Milvus, using the title, description, categories, tags, and colors produced by the VLM
- runs a compliance classifier against the analyzed product and the retrieved policy records
{
"title": "string",
"description": "string",
"price": "number",
"categories": ["string"],
"tags": ["string"]
}

{
"title": "string",
"description": "string",
"categories": ["string"],
"tags": ["string"],
"colors": ["string"],
"locale": "string",
"policy_decision": {
"status": "pass | fail",
"label": "string",
"summary": "string",
"matched_policies": [
{
"document_name": "string",
"policy_title": "string",
"rule_title": "string",
"reason": "string",
"evidence": ["string"]
}
],
"warnings": ["string"],
"evidence_note": "string"
}
}

policy_decision is included only when the policy library contains at least one loaded document.
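Because policy_decision may be absent, client code should not index into it unconditionally. The sketch below shows one way to read it defensively; the function names and the "not_evaluated" and "unknown" labels are assumptions for illustration, not values returned by the API.

```python
from typing import Any, Dict, List


def policy_status(analysis: Dict[str, Any]) -> str:
    """Return the policy status, or 'not_evaluated' when no policies are loaded."""
    decision = analysis.get("policy_decision")
    if decision is None:
        return "not_evaluated"  # hypothetical client-side label
    return decision.get("status", "unknown")


def policy_reasons(analysis: Dict[str, Any]) -> List[str]:
    """List 'document: reason' strings for every matched policy, if any."""
    decision = analysis.get("policy_decision") or {}
    return [
        f'{match.get("document_name", "?")}: {match.get("reason", "")}'
        for match in decision.get("matched_policies", [])
    ]
```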
curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
-F "locale=en-US" \
http://localhost:8000/vlm/analyze

curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
-F 'product_data={"title":"Classic Black Patent purse","description":"Elegant bag","price":15.99,"categories":["accessories"],"tags":["bag","purse"]}' \
-F "locale=en-US" \
http://localhost:8000/vlm/analyze

curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
-F 'product_data={"title":"Black Purse","description":"Elegant bag"}' \
-F "locale=es-ES" \
http://localhost:8000/vlm/analyze

curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
-F 'product_data={"title":"Beauty Product","description":"Nice cream"}' \
-F "locale=en-US" \
-F 'brand_instructions=Write the catalog as a professional expert in Sephora Beauty. Strictly use this tone and style when writing the product document. Use this example as guidance for fragrance products: Title: Good Girl Blush Eau de Parfum with Floral Vanilla Description: A fresh, floral explosion of femininity, this radiant reinvention of the iconic Good Girl scent reveals the multifaceted nature of modern womanhood with a double dose of sensual vanilla and exotic ylang-ylang.' \
http://localhost:8000/vlm/analyze

{
"title": "Glamorous Black Evening Handbag with Gold Accents",
"description": "This exquisite handbag exudes sophistication and elegance. Crafted from high-quality, glossy leather...",
"categories": ["accessories"],
"tags": ["black leather", "gold accents", "evening bag", "rectangular shape"],
"colors": ["black", "gold"],
"locale": "en-US",
"policy_decision": {
"status": "pass",
"label": "Policy Check Passed",
"summary": "No loaded policy appears applicable to this product.",
"matched_policies": [],
"warnings": [],
"evidence_note": "Policy retrieval did not return any candidate matches for this product."
}
}

Generate 3-5 frequently asked questions and answers for a product based on its enriched catalog data. Designed to be called after /vlm/analyze completes, using the enriched result as input.
Endpoint: POST /vlm/faqs
Content-Type: multipart/form-data
| Parameter | Type | Required | Description |
|---|---|---|---|
| title | string | No | Product title from VLM analysis |
| description | string | No | Product description from VLM analysis |
| categories | JSON string | No | Categories array (default: []) |
| tags | JSON string | No | Tags array (default: []) |
| colors | JSON string | No | Colors array (default: []) |
| locale | string | No | Regional locale code (default: en-US) |
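Since this endpoint takes the output of /vlm/analyze as its input, the conversion can be sketched as a small mapping function. This is a minimal sketch under the assumption that the analysis result is already parsed into a dict; `analysis_to_faq_fields` is a hypothetical name.

```python
import json
from typing import Any, Dict

ARRAY_FIELDS = ("categories", "tags", "colors")


def analysis_to_faq_fields(analysis: Dict[str, Any]) -> Dict[str, str]:
    """Map a parsed /vlm/analyze result onto the /vlm/faqs form fields."""
    fields: Dict[str, str] = {}
    for key in ("title", "description", "locale"):
        value = analysis.get(key)
        if value:
            fields[key] = value
    for key in ARRAY_FIELDS:
        # Array parameters are sent as JSON strings; a missing key becomes "[]",
        # which matches the documented default.
        fields[key] = json.dumps(analysis.get(key, []), ensure_ascii=False)
    return fields
```

Keys such as policy_decision are deliberately dropped, since the parameter table above does not list them.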
{
"faqs": [
{
"question": "string",
"answer": "string"
}
]
}

# Call after /vlm/analyze to generate FAQs from enriched data
curl -X POST \
-F "title=Craftsman 20V Cordless Lawn Mower" \
-F "description=A cordless lawn mower featuring a black and red design..." \
-F 'categories=["electronics"]' \
-F 'tags=["cordless","lawn mower","Craftsman"]' \
-F 'colors=["black","red"]' \
-F "locale=en-US" \
http://localhost:8000/vlm/faqs

{
"faqs": [
{
"question": "What type of battery does this mower use?",
"answer": "This mower operates on a 20V cordless battery system, providing the flexibility to mow without a power cord."
},
{
"question": "Does this mower come with a grass collection bag?",
"answer": "Yes, it includes a rear-mounted grass collection bag for convenient clippings management."
},
{
"question": "What are the main colors of this mower?",
"answer": "The mower features a black and red color scheme with prominent Craftsman branding."
}
]
}

Generate culturally appropriate product variations using FLUX models based on VLM analysis results.
Endpoint: POST /generate/variation
Content-Type: multipart/form-data
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | file | Yes | Product image file (JPEG, PNG) |
| title | string | Yes | Product title from VLM analysis |
| description | string | Yes | Product description from VLM analysis |
| categories | JSON string | Yes | Categories array from VLM analysis |
| locale | string | No | Regional locale code (default: "en-US") |
| tags | JSON string | No | Tags array from VLM analysis |
| colors | JSON string | No | Colors array from VLM analysis |
| enhanced_product | JSON string | No | Enhanced product data |
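The response to this request carries the generated image as base64 in generated_image_b64. A minimal sketch of writing it to disk on the client side follows; `save_variation` is a hypothetical helper, and the .png extension is an assumption based on the example image_path in this document.

```python
import base64
from pathlib import Path
from typing import Any, Dict, Union


def save_variation(response: Dict[str, Any], out_dir: Union[str, Path]) -> Path:
    """Decode generated_image_b64 and write it to <out_dir>/<artifact_id>.png."""
    # validate=True rejects characters outside the base64 alphabet.
    data = base64.b64decode(response["generated_image_b64"], validate=True)
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f'{response["artifact_id"]}.png'  # extension is assumed
    target.write_bytes(data)
    return target
```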
{
"generated_image_b64": "string (base64)",
"artifact_id": "string",
"image_path": "string",
"metadata_path": "string",
"locale": "string"
}

# First, run VLM analysis to get the fields, then:
curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
-F "locale=en-US" \
-F "title=Glamorous Black Evening Handbag with Gold Accents" \
-F "description=This exquisite handbag exudes sophistication..." \
-F 'categories=["accessories"]' \
-F 'tags=["black leather","gold accents","evening bag"]' \
-F 'colors=["black","gold"]' \
http://localhost:8000/generate/variation

{
"generated_image_b64": "iVBORw0KGgoAAAANS...",
"artifact_id": "a4511bbed05242078f9e3f7ead3b2247",
"image_path": "data/outputs/a4511bbed05242078f9e3f7ead3b2247.png",
"metadata_path": "data/outputs/a4511bbed05242078f9e3f7ead3b2247.json",
"locale": "en-US"
}

Generate interactive 3D GLB models from 2D product images using Microsoft's TRELLIS model.
Endpoint: POST /generate/3d
Content-Type: multipart/form-data
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| image | file | Yes | - | Product image file (JPEG, PNG) |
| slat_cfg_scale | float | No | 5.0 | SLAT configuration scale |
| ss_cfg_scale | float | No | 10.0 | SS configuration scale |
| slat_sampling_steps | int | No | 50 | SLAT sampling steps |
| ss_sampling_steps | int | No | 50 | SS sampling steps |
| seed | int | No | 0 | Random seed for reproducibility |
| return_json | bool | No | false | Return JSON with a base64-encoded GLB instead of the binary GLB |
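Multipart form values are sent as text, so typed parameters need to be converted before the request is made. The sketch below mirrors the documented defaults; `build_3d_fields` is a hypothetical helper, and the positive-steps check is a client-side assumption rather than a documented server rule.

```python
from typing import Dict


def build_3d_fields(slat_cfg_scale: float = 5.0,
                    ss_cfg_scale: float = 10.0,
                    slat_sampling_steps: int = 50,
                    ss_sampling_steps: int = 50,
                    seed: int = 0,
                    return_json: bool = False) -> Dict[str, str]:
    """Return the text form fields for POST /generate/3d (image not included)."""
    if slat_sampling_steps < 1 or ss_sampling_steps < 1:
        # Assumed sanity check; the server may apply different limits.
        raise ValueError("sampling steps must be positive")
    return {
        "slat_cfg_scale": str(float(slat_cfg_scale)),
        "ss_cfg_scale": str(float(ss_cfg_scale)),
        "slat_sampling_steps": str(int(slat_sampling_steps)),
        "ss_sampling_steps": str(int(ss_sampling_steps)),
        "seed": str(int(seed)),
        "return_json": "true" if return_json else "false",
    }
```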
By default (return_json=false), returns a binary GLB file (model/gltf-binary) ready for download. With return_json=true, returns JSON instead:
{
"glb_base64": "string (base64)",
"artifact_id": "string",
"metadata": {
"slat_cfg_scale": 5.0,
"ss_cfg_scale": 10.0,
"slat_sampling_steps": 50,
"ss_sampling_steps": 50,
"seed": 42,
"size_bytes": 1234567
}
}

curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
http://localhost:8000/generate/3d \
--output product.glb

curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
-F "slat_cfg_scale=5.0" \
-F "ss_cfg_scale=10.0" \
-F "slat_sampling_steps=50" \
-F "ss_sampling_steps=50" \
-F "seed=42" \
http://localhost:8000/generate/3d \
--output product.glb

curl -X POST \
-F "image=@bag.jpg;type=image/jpeg" \
-F "return_json=true" \
http://localhost:8000/generate/3d

The API supports 10 regional locales for language and cultural context:
- en-US - American English (default)
- en-GB - British English
- en-AU - Australian English
- en-CA - Canadian English
- es-ES - Spain Spanish (uses "ordenador")
- es-MX - Mexican Spanish (uses "computadora")
- es-AR - Argentinian Spanish
- es-CO - Colombian Spanish
- fr-FR - Metropolitan French
- fr-CA - Quebec French (Canadian)
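Checking a locale on the client before sending a request avoids a round trip for an unsupported value. This is a minimal sketch using the ten codes listed above; `normalize_locale` is a hypothetical helper, and accepting underscores and mixed case is a convenience assumption, not documented server behavior.

```python
from typing import Optional

SUPPORTED_LOCALES = (
    "en-US", "en-GB", "en-AU", "en-CA",
    "es-ES", "es-MX", "es-AR", "es-CO",
    "fr-FR", "fr-CA",
)
DEFAULT_LOCALE = "en-US"


def normalize_locale(value: Optional[str]) -> str:
    """Return a supported locale code, falling back to the default when empty."""
    if not value:
        return DEFAULT_LOCALE
    lang, sep, region = value.strip().replace("_", "-").partition("-")
    candidate = f"{lang.lower()}-{region.upper()}" if sep else lang.lower()
    if candidate not in SUPPORTED_LOCALES:
        raise ValueError(f"unsupported locale: {value!r}")
    return candidate
```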
All endpoints return standard HTTP status codes:
- 200: Success
- 400: Bad Request (invalid parameters)
- 422: Unprocessable Entity (validation error)
- 500: Internal Server Error
Error response format:
{
"detail": "Error message description"
}
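A client can turn the status code and the detail field into a single readable message. The following is a minimal sketch, assuming the body is available as text; `describe_error` is a hypothetical helper, and the handling of a non-string detail is a defensive assumption rather than documented behavior.

```python
import json

STATUS_LABELS = {
    400: "Bad Request",
    422: "Unprocessable Entity",
    500: "Internal Server Error",
}


def describe_error(status_code: int, body: str) -> str:
    """Format an error response as '<code> <label>: <detail>'."""
    label = STATUS_LABELS.get(status_code, "Unexpected status")
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, AttributeError):
        detail = None  # body was not JSON, or not a JSON object
    if detail is None:
        detail = body.strip() or "no detail provided"
    elif not isinstance(detail, str):
        detail = json.dumps(detail)  # keep structured details readable
    return f"{status_code} {label}: {detail}"
```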