Intel® AI for Enterprise RAG

Intel® AI for Enterprise RAG simplifies transforming your enterprise data into actionable insights. Powered by Intel® Xeon® processors and Intel® Gaudi® AI accelerators, it integrates components from industry partners to offer a streamlined approach to deploying enterprise solutions.

Why use Intel® AI for Enterprise RAG?

Enable intelligent AI experiences that understand your business context:

Domain-Specific Intelligence - Enrich conversations and document processing with your organizational knowledge without training or fine-tuning models
Multiple Use Cases - Supports ChatQnA for conversational AI and Document Summarization for extracting key insights from documents
Rapid Deployment - Transform enterprise documents into AI-powered experiences in minutes, not months
Enterprise-Ready Scale - Deploy secure, compliant AI solutions that grow with your business needs

Core Features

One-Click Enterprise Deployment - Fully automated Kubernetes cluster provisioning with Ansible playbooks, supporting both single-node and multi-node configurations with comprehensive infrastructure setup.
Optimized AI Hardware Support - Native support for Intel® Xeon® processors and Intel® Gaudi® AI accelerators with Horizontal Pod Autoscaling (HPA), balloons policy for CPU pinning on NUMA architectures, and performance-tuned configurations.
Enterprise-Grade Security & Compliance - Integrated Identity and Access Management (IAM) with Keycloak, programmable guardrails for fine-grained control, Pod Security Standards (PSS) enforcement for secure enterprise operations, role-based access control for vector databases, and Intel® Trust Domain Extensions (TDX) support for confidential computing.
Comprehensive Monitoring & Observability - Integrated telemetry stack with Prometheus, Grafana dashboards, distributed tracing with Tempo, and centralized logging with Loki for full pipeline visibility.

To get a glimpse of how Intel® AI for Enterprise RAG works, check out the following demo.

Note

The video below showcases the beta release of the project. Later releases bring an improved UI design, an improved installation process, and other enhancements.

Pipeline Architecture

Our system consists of two primary processing pipelines, each built on top of a shared microservices architecture. However, only one pipeline can be deployed at a time.

ChatQnA – enabling retrieval-augmented question answering through conversational interaction.
Document Summarization (DocSum) – responsible for generating concise summaries from input documents.

The pipeline architecture for ChatQnA is shown below. For the detailed microservices architecture, refer here.

Document Summarization's pipeline architecture is available here.

Requirements

System Requirements

Category	Details
Operating System	Ubuntu 22.04/24.04
Hardware Platforms	4th Gen Intel® Xeon® Scalable processors; 5th Gen Intel® Xeon® Scalable processors; 6th Gen Intel® Xeon® Scalable processors; 3rd Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator; 4th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 2 AI Accelerator; 6th Gen Intel® Xeon® Scalable processors and Intel® Gaudi® 3 AI Accelerator
Kubernetes Version	1.29.5, 1.29.12, 1.30.8, 1.31.4
Python	3.11
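As a quick preflight, the Kubernetes versions in the table can be checked with a small shell helper. This is only a sketch: the function name `is_supported_k8s` is hypothetical, and treating the table as an exact allow-list (rather than a list of tested versions) is an assumption.

```shell
# Kubernetes versions listed in the System Requirements table.
SUPPORTED_K8S="1.29.5 1.29.12 1.30.8 1.31.4"

# Succeeds (exit status 0) when the given version is in the list.
# Accepts "1.30.8" as well as the kubectl-style "v1.30.8".
is_supported_k8s() {
  version="${1#v}"   # drop a leading "v" if present
  for candidate in $SUPPORTED_K8S; do
    [ "$candidate" = "$version" ] && return 0
  done
  return 1
}
```

Pass it the server version reported by `kubectl version`, for example `is_supported_k8s v1.30.8 && echo "listed version"`.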

Software Prerequisites

Hugging Face Model Access: Ensure you have the necessary access to download and use the chosen Hugging Face model. Default models can be inspected in config.yaml.
For multi-node clusters, a CSI driver with a StorageClass that supports the ReadWriteMany (RWX) access mode is required. An NFS server with a CSI driver that supports RWX can be installed by following the deployment guide.
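One way to probe whether a StorageClass accepts RWX claims is to create a small throwaway PersistentVolumeClaim. The sketch below only prints such a manifest; the helper name `rwx_pvc_manifest`, the PVC name `rwx-check`, and the 1Gi size are illustrative assumptions, not part of this project.

```shell
# Print a throwaway PVC manifest that requests ReadWriteMany from the
# StorageClass named in $1. The PVC name and size are arbitrary.
rwx_pvc_manifest() {
  storage_class="$1"
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: rwx-check
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ${storage_class}
  resources:
    requests:
      storage: 1Gi
EOF
}
```

To try it, pipe the output to `kubectl apply -f -`, inspect the claim with `kubectl get pvc rwx-check`, and remove it with `kubectl delete pvc rwx-check`. A claim that stays Pending is not necessarily a failure: a StorageClass with the WaitForFirstConsumer binding mode binds only once a pod uses the claim.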

Hardware Requirements

These are the minimum requirements for running Intel® AI for Enterprise RAG with default settings. If you have more (or fewer) resources available, adjust the parameters in the resource configuration files for your chosen pipeline:

ChatQnA: resources-reference-cpu.yaml or resources-reference-hpu.yaml
DocSum: resources-reference-cpu.yaml or resources-reference-hpu.yaml

Deployment on Xeon only

To deploy the solution on Xeon only, you need access to a platform with an Intel® Xeon® Scalable processor that meets the following requirements:

Logical cores: a minimum of 88 logical cores
RAM: a minimum of 250 GB
Disk space: 200 GB is generally recommended, though this depends heavily on the model size
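The minimums above can be spot-checked on a single Linux node with a short script. This is a rough local sketch, not a replacement for the validate playbook described under Validate Hardware Requirements. The function names are hypothetical, and two choices are assumptions: GB is treated as GiB, and free space is measured on `/` unless another path is given.

```shell
# Minimums for a Xeon-only deployment, taken from the list above.
MIN_CORES=88
MIN_RAM_GB=250
MIN_DISK_GB=200

# Print OK/FAIL for one resource and return non-zero on FAIL.
# $1: label, $2: measured value, $3: required minimum (integers).
check_min() {
  label="$1"; have="$2"; need="$3"
  if [ "$have" -ge "$need" ]; then
    echo "OK   $label: $have (minimum $need)"
  else
    echo "FAIL $label: $have (minimum $need)"
    return 1
  fi
}

# Measure the local node and check each value. $1: path whose free
# space is checked (defaults to /).
preflight_xeon() {
  path="${1:-/}"
  status=0
  # nproc counts the logical CPUs available to this process.
  check_min "logical cores" "$(nproc)" "$MIN_CORES" || status=1
  # MemTotal is reported in KiB; convert to whole GiB.
  check_min "RAM (GiB)" \
    "$(awk '/^MemTotal:/ {print int($2/1024/1024)}' /proc/meminfo)" \
    "$MIN_RAM_GB" || status=1
  # df -Pk prints available space in KiB in column 4 of the second line.
  check_min "free disk on $path (GiB)" \
    "$(df -Pk "$path" | awk 'NR==2 {print int($4/1024/1024)}')" \
    "$MIN_DISK_GB" || status=1
  return "$status"
}
```

Run `preflight_xeon` (optionally with the path that will hold models and data) and review the OK/FAIL lines. For a Gaudi platform, the same helper could be reused with the thresholds from the Gaudi section, for example `MIN_CORES=56` and `MIN_DISK_GB=500`.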

Note

By default, Intel® AI for Enterprise RAG uses the NRI plugin for performance optimization. For more info: NRI plugin

Deployment on Xeon + Gaudi Accelerator

To deploy the solution on a platform with Intel® Gaudi® AI accelerators, you need access to an instance that meets the following minimum requirements:

Logical cores: a minimum of 56 logical cores
RAM: a minimum of 250 GB, though this depends heavily on the database size
Disk space: 500 GB is generally recommended, though this depends heavily on the model size and database size
Gaudi cards: 8
Gaudi driver: 1.22.1

Getting Started

Install the prerequisites.

cd deployment/
sudo apt-get install python3-venv
python3 -m venv erag-venv
source erag-venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
ansible-galaxy collection install -r requirements.yaml --upgrade

Validate Hardware Requirements

Before proceeding with the deployment, it's recommended to validate that your hardware meets the requirements for Intel® AI for Enterprise RAG. To perform hardware validation, you need to create an inventory.ini file first.

An example inventory.ini file structure and detailed instructions are provided in the Cluster Deployment Guide.

Once you have created the inventory.ini file, you can validate your hardware resources using the validate playbook located at playbooks/validate.yaml:

ansible-playbook playbooks/validate.yaml --tags hardware -i inventory/test-cluster/inventory.ini

Note

For a Gaudi deployment, add the flag -e is_gaudi_platform=true to the command above.
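The two variants of the validation command can be assembled with a small helper that appends the Gaudi flag only when requested. The helper name `validate_cmd` is hypothetical; the playbook path, tag, and flag are the ones given above. The sketch prints the command so it can be reviewed before it is run.

```shell
# Print the hardware-validation command for a given inventory file.
# $1: path to inventory.ini
# $2: "true" for a Gaudi deployment (defaults to "false")
validate_cmd() {
  inventory="$1"
  is_gaudi="${2:-false}"
  cmd="ansible-playbook playbooks/validate.yaml --tags hardware -i $inventory"
  if [ "$is_gaudi" = "true" ]; then
    cmd="$cmd -e is_gaudi_platform=true"
  fi
  echo "$cmd"
}
```

For example, `validate_cmd inventory/test-cluster/inventory.ini true` prints the command with the Gaudi flag appended.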

Install a Kubernetes cluster (optional - if you don't have one)

Intel® AI for Enterprise RAG provides Ansible automation for creating a K8s cluster. To set up a K8s cluster, follow the Cluster Deployment Guide.

Install infrastructure components (storage, operators, backup tools)

The Intel® AI for Enterprise RAG repository offers installation of additional infrastructure components on the deployed K8s cluster:

Gaudi_operator - for K8s clusters with nodes that use Gaudi AI accelerators
CSI drivers - needed to dynamically provision storage for pods
Velero - the Velero backup tool

If your K8s cluster requires installing any of these tools, please follow the Infrastructure Components Guide.

Deploy the Intel® AI for Enterprise RAG application on top of the prepared infrastructure

Once you have a K8s cluster with all infrastructure components installed, you can install the Intel® AI for Enterprise RAG application on top of it. Please follow the Application Deployment Guide.

Documentation

Refer to deployment/README.md or docs for a more detailed deployment guide and in-depth instructions on Intel® AI for Enterprise RAG components.

Support

Submit questions, feature requests, and bug reports on the GitHub Issues page.

Publications

Feel free to check out articles about Intel® AI for Enterprise RAG:

License

Intel® AI for Enterprise RAG is licensed under the Apache License Version 2.0. Refer to the "LICENSE" file for the full license text and copyright notice.

This distribution includes third-party software governed by separate license terms. This third-party software, even if included with the distribution of the Intel software, may be governed by separate license terms, including without limitation, third-party license terms, other Intel software license terms, and open-source software license terms. These separate license terms govern your use of the third-party programs as set forth in the "THIRD-PARTY-PROGRAMS" file.

Please note: component(s) depend on software subject to non-open source licenses. If you use or redistribute this software, it is your sole responsibility to ensure compliance with such licenses.

Security

The Security Policy outlines our guidelines and procedures for ensuring the highest level of security and trust for our users who consume Intel® AI for Enterprise RAG.

Intel's Human Rights Principles

Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See Intel's Global Human Rights Principles. Intel's products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.

Model Card Guidance

You, not Intel, are responsible for determining model suitability for your use case. For information regarding model limitations, safety considerations, biases, or other information consult the model cards (if any) for models you use, typically found in the repository where the model is available for download. Contact the model provider with questions. Intel does not provide model cards for third party models.

Contributing

If you want to contribute to the project, please refer to the guide in the CONTRIBUTING.md file.

Trademark Information

Intel, the Intel logo, OpenVINO, the OpenVINO logo, Pentium, Xeon, and Gaudi are trademarks of Intel Corporation or its subsidiaries.

Other names and brands may be claimed as the property of others.
