Commit f1355df

add openGauss support for dataprep
Signed-off-by: sunshuang1866 <[email protected]>
1 parent 9c6cd5a commit f1355df

9 files changed, +657 -1 lines changed

comps/dataprep/deployment/docker_compose/compose.yaml

Lines changed: 23 additions & 0 deletions

@@ -16,6 +16,7 @@ include:
   - ../../../third_parties/vllm/deployment/docker_compose/compose.yaml
   - ../../../third_parties/arangodb/deployment/docker_compose/compose.yaml
   - ../../../third_parties/mariadb/deployment/docker_compose/compose.yaml
+  - ../../../third_parties/opengauss/deployment/docker_compose/compose.yaml

 services:

@@ -191,6 +192,28 @@ services:
     security_opt:
       - no-new-privileges:true

+  dataprep-opengauss:
+    image: ${REGISTRY:-opea}/dataprep:${TAG:-latest}
+    container_name: dataprep-opengauss-server
+    ports:
+      - "${DATAPREP_PORT:-5000}:5000"
+    depends_on:
+      opengauss-db:
+        condition: service_healthy
+    ipc: host
+    environment:
+      no_proxy: ${no_proxy}
+      http_proxy: ${http_proxy}
+      https_proxy: ${https_proxy}
+      DATAPREP_COMPONENT_NAME: "OPEA_DATAPREP_OPENGAUSS"
+      GS_CONNECTION_STRING: ${GS_CONNECTION_STRING}
+    healthcheck:
+      test: ["CMD-SHELL", "curl -f http://localhost:5000/v1/health_check || exit 1"]
+      interval: 10s
+      timeout: 5s
+      retries: 10
+    restart: unless-stopped
+
   dataprep-pgvector:
     image: ${REGISTRY:-opea}/dataprep:${TAG:-latest}
     container_name: dataprep-pgvector-server

Lines changed: 109 additions & 0 deletions

@@ -0,0 +1,109 @@
# Dataprep Microservice with openGauss

## Table of contents

1. [🚀1. Start Microservice with Docker](#1-start-microservice-with-docker)
2. [🚀2. Consume Microservice](#2-consume-microservice)

## 🚀1. Start Microservice with Docker

### 1.1 Start openGauss

Please refer to this [readme](../../third_parties/opengauss/src/README.md).

### 1.2 Setup Environment Variables

```bash
export GS_CONNECTION_STRING=opengauss+psycopg2://gaussdb:openGauss@123@${your_ip}:5432/postgres
export INDEX_NAME=${your_index_name}
export TEI_EMBEDDING_ENDPOINT=${your_tei_embedding_endpoint}
export HF_TOKEN=${your_hf_api_token}
```
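
The password in the connection string above contains an `@`, which is also the character that separates the credentials from the host in a URL. Whether the dataprep service needs the password percent-encoded is not stated here, so the following is only a sketch under that assumption. `build_gs_connection_string` is a hypothetical helper (not part of the commit) that assembles the same URL shape with the credentials encoded by the standard library.

```python
from urllib.parse import quote_plus


def build_gs_connection_string(user, password, host, port=5432, database="postgres"):
    """Assemble a URL in the same shape as GS_CONNECTION_STRING.

    Percent-encoding the credentials is an assumption about what the
    consumer of the string expects; it is standard for URLs but is not
    confirmed by the README.
    """
    return (
        f"opengauss+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only placeholder for ${your_ip}.
    print(build_gs_connection_string("gaussdb", "openGauss@123", "192.0.2.10"))
```

If the service turns out to accept the unencoded form, the plain `export` from the README can be used unchanged.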

### 1.3 Build Docker Image

```bash
cd GenAIComps
docker build -t opea/dataprep:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/src/Dockerfile .
```

### 1.4 Run Docker with CLI (Option A)

```bash
docker run --name="dataprep-opengauss" -p 6007:5000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e GS_CONNECTION_STRING=$GS_CONNECTION_STRING -e INDEX_NAME=$INDEX_NAME -e EMBED_MODEL=${EMBED_MODEL} -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT -e HF_TOKEN=${HF_TOKEN} -e DATAPREP_COMPONENT_NAME="OPEA_DATAPREP_OPENGAUSS" opea/dataprep:latest
```

### 1.5 Run with Docker Compose (Option B)

```bash
cd comps/dataprep/deployment/docker_compose
docker compose -f compose.yaml up dataprep-opengauss -d
```
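
The compose definition of `dataprep-opengauss` declares a health check against `/v1/health_check`, retried 10 times at 10 s intervals with a 5 s timeout. A client-side equivalent might look like the sketch below. `wait_until_healthy` is a hypothetical helper written for illustration, and treating any 2xx response as healthy is an assumption that mirrors `curl -f`.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_until_healthy(url, retries=10, interval=10.0, timeout=5.0):
    """Poll `url` until it answers with HTTP 2xx; return True on success.

    The defaults copy the retries/interval/timeout values of the compose
    health check. Returns False if every attempt fails.
    """
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if 200 <= resp.status < 300:
                    return True
        except (urllib.error.URLError, OSError, http.client.HTTPException):
            # Not reachable yet, timed out, or answered with an error status.
            pass
        if attempt < retries - 1:
            time.sleep(interval)
    return False
```

With the compose file in this commit the service is published on host port `${DATAPREP_PORT:-5000}`, so a call would be `wait_until_healthy("http://localhost:5000/v1/health_check")` unless `DATAPREP_PORT` is set to something else.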

## 🚀2. Consume Microservice

### 2.1 Consume Upload API

Once the dataprep microservice for openGauss is running, use the following command to convert a document to embeddings and save them to the database.

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"path":"/path/to/document"}' \
  http://localhost:6007/v1/dataprep/ingest
```
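
For callers that prefer Python over `curl`, the same request can be sketched with the standard library. This mirrors the command above exactly as written (JSON body with a `path` key); `build_ingest_request` and `send` are hypothetical helper names, and the base URL assumes the host port 6007 used in the README examples.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6007"  # assumption: host port from the README examples


def build_ingest_request(path, base_url=BASE_URL):
    """Build (but do not send) the POST request issued by the curl command."""
    body = json.dumps({"path": path}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/dataprep/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60.0):
    """Send a prepared request and return the decoded response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `send(build_ingest_request("/path/to/document"))` would then perform the upload against a running service.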

### 2.2 Consume get API

To get uploaded file structures, use the following command:

```bash
curl -X POST \
  -H "Content-Type: application/json" \
  http://localhost:6007/v1/dataprep/get
```

Then you will get the response JSON like this:

```json
[
  {
    "name": "uploaded_file_1.txt",
    "id": "uploaded_file_1.txt",
    "type": "File",
    "parent": ""
  },
  {
    "name": "uploaded_file_2.txt",
    "id": "uploaded_file_2.txt",
    "type": "File",
    "parent": ""
  }
]
```
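
The `id` values in this response are what the delete API expects, so a client will usually want to extract them. A minimal sketch, assuming the response has exactly the shape shown above (`file_ids` is a hypothetical helper, and the sample text is copied from the README rather than from a live service):

```python
import json

# Sample response body, copied from the README example above.
SAMPLE_RESPONSE = """
[
  {"name": "uploaded_file_1.txt", "id": "uploaded_file_1.txt", "type": "File", "parent": ""},
  {"name": "uploaded_file_2.txt", "id": "uploaded_file_2.txt", "type": "File", "parent": ""}
]
"""


def file_ids(response_text):
    """Return the `id` of every entry in a /v1/dataprep/get response."""
    return [entry["id"] for entry in json.loads(response_text)]


if __name__ == "__main__":
    print(file_ids(SAMPLE_RESPONSE))
```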

### 2.3 Consume delete API

To delete an uploaded file or link, use one of the following commands.

The `file_path` value should be the `id` returned by the `/v1/dataprep/get` API.

```bash
# delete link
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"file_path": "https://www.ces.tech/.txt"}' \
  http://localhost:6007/v1/dataprep/delete

# delete file
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"file_path": "uploaded_file_1.txt"}' \
  http://localhost:6007/v1/dataprep/delete

# delete all files and links
curl -X POST \
  -H "Content-Type: application/json" \
  -d '{"file_path": "all"}' \
  http://localhost:6007/v1/dataprep/delete
```
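
The three delete variants differ only in the `file_path` value, which a small builder makes explicit. As before, this is a sketch that mirrors the curl commands rather than a confirmed client API: `build_delete_request` is a hypothetical name, and the default base URL assumes host port 6007.

```python
import json
import urllib.request


def build_delete_request(file_path, base_url="http://localhost:6007"):
    """Build (but do not send) a POST request to /v1/dataprep/delete.

    `file_path` is an `id` from /v1/dataprep/get, a link id, or the
    literal string "all", as in the curl examples.
    """
    body = json.dumps({"file_path": file_path}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/dataprep/delete",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The three cases from the README: a link, a single file, and everything.
delete_requests = [
    build_delete_request(target)
    for target in ("https://www.ces.tech/.txt", "uploaded_file_1.txt", "all")
]
```

Each prepared request can be sent with `urllib.request.urlopen(request)` once the service is reachable.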
