# Changelog

All notable changes to this project will be documented in this file. The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

## [2.0.0] - 2025-03-18

This release adds support for multimodal documents using Nvidia Ingest, including parsing of PDF, Word, and PowerPoint documents. It also significantly improves accuracy and performance by refactoring the APIs and architecture, and adds a new developer-friendly UI.

### Added

### Changed

- In RAG v1.0.0, a single server managed both the ingestion and retrieval/generation APIs. In RAG v2.0.0, the architecture has evolved to use two separate microservices (see the sketch after this list).
- Helm charts are now modularized; a separate Helm chart is provided for each distinct microservice.
- Default settings are configured to balance accuracy and performance.
  - The default flow uses on-prem models, with the option to switch to API catalog endpoints for the Docker-based flow.
  - Query rewriting uses a smaller llama3.1-8b-instruct model and is turned off by default.
  - Conversation history can be used during retrieval for low-latency multi-turn support.
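To make the microservice split concrete, here is a minimal client sketch that talks to the two services independently. The base URLs, ports, and routes below are illustrative assumptions, not the published API; take the real values from your deployment's docker-compose/Helm configuration and the API schema.

```python
import requests

# Hypothetical base URLs and routes, for illustration only.
INGESTOR_URL = "http://localhost:8082"    # ingestion microservice (assumed port)
RAG_SERVER_URL = "http://localhost:8081"  # retrieval/generation microservice (assumed port)

# 1) In v2.0.0, document ingestion is handled by its own microservice.
with open("quarterly-report.pdf", "rb") as f:
    resp = requests.post(
        f"{INGESTOR_URL}/documents",  # assumed route
        files={"documents": ("quarterly-report.pdf", f, "application/pdf")},
    )
resp.raise_for_status()

# 2) Retrieval and generation are served by a separate microservice.
resp = requests.post(
    f"{RAG_SERVER_URL}/generate",  # assumed route
    json={"messages": [{"role": "user", "content": "Summarize the report."}]},
)
resp.raise_for_status()
print(resp.json())
```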

### Known Issues

- The rag-playground container needs to be rebuilt if the APP_LLM_MODELNAME, APP_EMBEDDINGS_MODELNAME, or APP_RANKING_MODELNAME environment variable values are changed.
- The optional features reflection, nemoguardrails, and image captioning are not available in Helm-based deployments.
- Uploading large files with a .txt extension may fail during ingestion. We recommend splitting such files into smaller parts to avoid this issue (see the sketch after this list).
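As a workaround for the large .txt issue, a file can be pre-split on line boundaries before upload. This is a minimal sketch only; the size threshold is an arbitrary illustrative value, not a documented limit of the ingestion service.

```python
from pathlib import Path

def split_text_file(path: str, max_bytes: int = 1_000_000) -> list[Path]:
    """Split a large .txt file into smaller parts along line boundaries.

    max_bytes is an illustrative threshold, not a documented ingestion limit.
    """
    src = Path(path)
    parts, buf, size = [], [], 0

    def flush() -> None:
        # Write the buffered lines to the next numbered part file.
        part = src.with_name(f"{src.stem}_part{len(parts) + 1}.txt")
        part.write_text("".join(buf), encoding="utf-8")
        parts.append(part)

    with src.open("r", encoding="utf-8") as f:
        for line in f:
            encoded = line.encode("utf-8")
            if size + len(encoded) > max_bytes and buf:
                flush()
                buf, size = [], 0
            buf.append(line)
            size += len(encoded)
    if buf:
        flush()
    return parts

# Example: upload the resulting smaller parts instead of the original file.
# parts = split_text_file("large-corpus.txt")
```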

A detailed guide is available here to ease the developer experience when migrating from older versions.

## [1.0.0] - 2025-01-15

### Added

- First release.