To ingest text-only files, developers do not need to deploy the complete pipeline with all NIMs connected. If your use case requires only text extraction from files, follow the steps below to deploy just the necessary components.
- Follow the steps outlined in the quickstart guide through step 3.

- While deploying the NIMs in step 4, selectively deploy just the NIMs necessary for the rag-server, plus the page-elements NIM for ingestion.

  ```bash
  USERID=$(id -u) docker compose --profile rag -f deploy/compose/nims.yaml up -d
  USERID=$(id -u) docker compose -f deploy/compose/nims.yaml up page-elements -d
  ```

  Confirm that all of the NIMs listed below are running, and that the ones shown as healthy are in a healthy state, before proceeding further. Make sure to allocate GPUs according to your hardware (2xH100 or 4xA100 to nim-llm-ms based on your deployment GPU profile) as stated in the quickstart guide.

  ```bash
  watch -n 2 'docker ps --format "table {{.Names}}\t{{.Status}}"'
  ```

  ```
  NAMES                        STATUS
  nemoretriever-ranking-ms     Up 14 minutes (healthy)
  nemoretriever-embedding-ms   Up 14 minutes (healthy)
  nim-llm-ms                   Up 14 minutes (healthy)
  compose-page-elements-1      Up 14 minutes
  ```

- Continue following the rest of the steps in the quickstart guide to deploy the ingestion-server and rag-server containers.

- Once the ingestion and rag servers are deployed, open the ingestion notebook and follow the steps. While trying out the Upload Document Endpoint, set the payload as shown below. We are setting `extract_tables` and `extract_charts` to `False`.

  ```python
  data = {
      "vdb_endpoint": "http://milvus:19530",
      "collection_name": collection_name,
      "split_options": {
          "chunk_size": 1024,
          "chunk_overlap": 150
      }
  }
  ```

- After ingestion completes, you can try out queries relevant to the text in the documents using the retrieval notebook.
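The upload payload from the steps above can also be assembled programmatically. The sketch below is illustrative only: the helper name `build_text_only_payload` is not part of the blueprint API, and the commented-out POST is one hypothetical way to submit the payload (fill in your ingestor-server host, port, and upload route from the ingestion notebook before using it).

```python
import json

def build_text_only_payload(collection_name: str,
                            vdb_endpoint: str = "http://milvus:19530") -> dict:
    """Assemble the text-only upload payload shown in the step above.

    NOTE: this helper is illustrative, not part of the blueprint.
    Table/chart extraction stays disabled simply because those
    options are not requested in the payload.
    """
    return {
        "vdb_endpoint": vdb_endpoint,
        "collection_name": collection_name,
        "split_options": {
            "chunk_size": 1024,    # characters per chunk
            "chunk_overlap": 150,  # overlap between consecutive chunks
        },
    }

payload = build_text_only_payload("text_only_docs")
print(json.dumps(payload, indent=2))

# One way to submit it (illustrative -- substitute your ingestor-server
# host/port and the upload route from the ingestion notebook):
# requests.post("http://<ingestor-host>:<port>/documents",
#               data={"data": json.dumps(payload)}, files=files)
```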
📝 Note: If you are interacting with cloud-hosted models and want to enable text-only mode, then in step 2, export just these specific environment variables as shown below:

```bash
export APP_EMBEDDINGS_SERVERURL=""
export APP_LLM_SERVERURL=""
export APP_RANKING_SERVERURL=""
export EMBEDDING_NIM_ENDPOINT="https://integrate.api.nvidia.com/v1"
export YOLOX_HTTP_ENDPOINT="https://ai.api.nvidia.com/v1/cv/nvidia/nemoretriever-page-elements-v2"
export YOLOX_INFER_PROTOCOL="http"
```

To ingest text-only files, you do not need to deploy the complete pipeline with all NIMs connected. If your scenario requires only text extraction from files, use the following steps to deploy only the necessary components using Helm.
When you install the Helm chart, enable only the following services that are required for text ingestion:
- rag-server
- ingestor-server
- nv-ingest
- nvidia-nim-llama-32-nv-embedqa-1b-v2
- text-reranking-nim
- nemoretriever-page-elements-v2
- nim-llm
- milvus
- minio
Additionally, ensure that table extraction, chart extraction, and image extraction are disabled.
Example Helm install command:
```bash
helm upgrade --install rag -n rag https://helm.ngc.nvidia.com/nvidia/blueprint/charts/nvidia-blueprint-rag-v2.1.0.tgz \
  --username '$oauthtoken' \
  --password "${NGC_API_KEY}" \
  --set nim-llm.enabled=true \
  --set nvidia-nim-llama-32-nv-embedqa-1b-v2.enabled=true \
  --set text-reranking-nim.enabled=true \
  --set ingestor-server.enabled=true \
  --set ingestor-server.nv-ingest.nemoretriever-page-elements-v2.deployed=true \
  --set ingestor-server.nv-ingest.nemoretriever-graphic-elements-v1.deployed=false \
  --set ingestor-server.nv-ingest.nemoretriever-table-structure-v1.deployed=false \
  --set ingestor-server.nv-ingest.paddleocr-nim.deployed=false \
  --set imagePullSecret.password=$NGC_API_KEY \
  --set ngcApiSecret.password=$NGC_API_KEY
```
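If you script this install, the `--set` overrides can be generated from a plain mapping. The sketch below is a hypothetical helper, not part of the blueprint tooling; the keys simply mirror the text-only install command above, so verify them against your chart version before use.

```python
def helm_set_flags(overrides: dict) -> list:
    """Turn a {key: value} mapping into helm `--set key=value` arguments.

    Booleans are lowercased to `true`/`false`, which is the form Helm
    expects on the command line.
    """
    flags = []
    for key, value in overrides.items():
        if isinstance(value, bool):
            value = str(value).lower()
        flags += ["--set", f"{key}={value}"]
    return flags

# The overrides that select text-only ingestion, mirroring the command above.
text_only = {
    "nim-llm.enabled": True,
    "nvidia-nim-llama-32-nv-embedqa-1b-v2.enabled": True,
    "text-reranking-nim.enabled": True,
    "ingestor-server.enabled": True,
    "ingestor-server.nv-ingest.nemoretriever-page-elements-v2.deployed": True,
    "ingestor-server.nv-ingest.nemoretriever-graphic-elements-v1.deployed": False,
    "ingestor-server.nv-ingest.nemoretriever-table-structure-v1.deployed": False,
    "ingestor-server.nv-ingest.paddleocr-nim.deployed": False,
}

print(" ".join(helm_set_flags(text_only)))
```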