RAG Server Deployment Guide

Video Tutorial (Persian)

A detailed implementation walkthrough is available as a video in Persian (Farsi). An English tutorial is coming soon.

How to Run the Server

Install Models

Prerequisites

Ensure the ninja build tool is installed and available on your PATH.
For example, on Ubuntu:
```
sudo apt install ninja-build
```
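
To confirm the prerequisite before running the installer, a small check like the one below can help. It is a minimal sketch that only looks up the executable name on PATH; the helper name `tool_on_path` is hypothetical and not part of this repository.

```python
import shutil


def tool_on_path(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    if tool_on_path("ninja"):
        print("ninja found at", shutil.which("ninja"))
    else:
        # The Ubuntu package name comes from the example above.
        print("ninja not found; on Ubuntu try: sudo apt install ninja-build")
```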

Installation Steps

Run the installation script:
```
./install-models.sh
```

Important Notes:

Edit install-models.sh to choose which models are installed.
For gated models that require a Hugging Face access token, supply the token in one of two ways:
- Environment variable method:
```
HUGGINGFACE_TOKEN=hf_tokenblahblahblah ./install-models.sh
```
- CLI argument method (takes precedence over the environment variable when both are set):
```
./install-models.sh --hf-token hf_tokenblahblahblah
```
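
The precedence rule above (the CLI argument wins over the environment variable) can be illustrated with a short sketch. This is not the logic inside install-models.sh, which is a shell script. The function name `resolve_hf_token` is hypothetical and only models the documented behaviour.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(
    cli_token: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Pick the token to use: the CLI value first, then HUGGINGFACE_TOKEN."""
    if cli_token:
        return cli_token
    # Treat an empty variable the same as an unset one.
    return env.get("HUGGINGFACE_TOKEN") or None


if __name__ == "__main__":
    token = resolve_hf_token(None)
    print("token configured" if token else "no token configured")
```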

Run RAG Server via Docker

docker compose up --build -d --wait rag
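
Once the container reports healthy, a quick reachability check can confirm the server answers HTTP requests. The sketch below assumes the server is exposed on localhost:8000 and serves its OpenAPI document at /v1/rag.swagger.json, as the client generation step later in this guide implies. The helper name `server_is_up` is hypothetical.

```python
import urllib.error
import urllib.request

# Assumed address, taken from the client generation command in this guide.
SPEC_URL = "http://localhost:8000/v1/rag.swagger.json"


def server_is_up(url: str = SPEC_URL, timeout: float = 3.0) -> bool:
    """Return True if `url` answers with HTTP 200, False on any failure."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error status, or malformed URL.
        return False


if __name__ == "__main__":
    print("RAG server reachable:", server_is_up())
```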

Populate Vectorstore

Follow the vectorstore population guide to load online article content.

Python Environment Setup

python3 -m venv ./examples/populate-vectorstore/.venv
source ./examples/populate-vectorstore/.venv/bin/activate
pip install --requirement=./examples/populate-vectorstore/requirements.txt

Content Population Examples

Artificial Intelligence:

python3 ./examples/populate-vectorstore/populate-vectorstore.py \
  --max-chunk-bytes 2000 \
  https://en.wikipedia.org/wiki/Artificial_intelligence

Cyrus the Great:

python3 ./examples/populate-vectorstore/populate-vectorstore.py \
  --max-chunk-bytes 2000 \
  https://en.wikipedia.org/wiki/Cyrus_the_Great
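
To give a feel for what a byte limit such as --max-chunk-bytes 2000 means, here is a minimal sketch that splits text into pieces whose UTF-8 encoding never exceeds a given size. It is an illustration only: the actual splitting strategy in populate-vectorstore.py may differ (for example, it may respect sentence or paragraph boundaries), and `chunk_by_bytes` is a hypothetical name.

```python
from typing import List


def chunk_by_bytes(text: str, max_bytes: int) -> List[str]:
    """Split `text` into chunks of at most `max_bytes` UTF-8 bytes each.

    Characters are never split, so multi-byte text (e.g. Persian) stays valid.
    """
    if max_bytes < 4:
        raise ValueError("max_bytes must be at least 4 to fit any UTF-8 character")
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for ch in text:
        n = len(ch.encode("utf-8"))
        if current and size + n > max_bytes:
            # Adding this character would overflow: close the current chunk.
            chunks.append("".join(current))
            current, size = [], 0
        current.append(ch)
        size += n
    if current:
        chunks.append("".join(current))
    return chunks


if __name__ == "__main__":
    parts = chunk_by_bytes("Artificial intelligence " * 300, 2000)
    print(len(parts), "chunks; largest is",
          max(len(p.encode("utf-8")) for p in parts), "bytes")
```

Counting bytes rather than characters matters for non-ASCII content, where one character can occupy up to four bytes.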

Test the RAG Server

Follow the chat client guide to deploy a test client.

Client Setup

Generate JS client stubs:

docker run --network=host --rm -v "${PWD}/examples/rag-chat/:/local" \
  openapitools/openapi-generator-cli generate \
  -i http://localhost:8000/v1/rag.swagger.json \
  -g javascript \
  -o /local/js-client \
  --additional-properties=usePromises=true,useES6=true

Verify directory structure:

./examples/rag-chat/
├── index.html
└── js-client
    ├── ...
    └── ...
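
The layout above can also be checked programmatically before starting the web server. The following is a minimal sketch that only tests for index.html and the js-client directory. It does not validate the generated stubs, and `missing_client_paths` is a hypothetical helper name.

```python
from pathlib import Path
from typing import List


def missing_client_paths(root: str = "./examples/rag-chat") -> List[str]:
    """Return the expected entries that are absent under `root`."""
    base = Path(root)
    missing: List[str] = []
    if not (base / "index.html").is_file():
        missing.append("index.html")
    if not (base / "js-client").is_dir():
        missing.append("js-client")
    return missing


if __name__ == "__main__":
    missing = missing_client_paths()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Chat client directory looks complete.")
```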

Install serve tool:
```
npm install -g serve
```
Start web server:
```
serve -l 3000 ./examples/rag-chat/
```
Access chat client:
```
http://localhost:3000
```

Repository: aria3ppp/rag-server

Folders and files

Latest commit

History

Repository files navigation

RAG Server Deployment Guide

Table of Contents

Video Tutorial (Persian)

How to Run the Server

Install Models

Prerequisites

Installation Steps

Run RAG Server via Docker

Populate Vectorstore

Python Environment Setup

Content Population Examples

Test the RAG Server

Client Setup

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages