Commit e231574

add text2cypher component (#1319)
Signed-off-by: jeanyu-habana <[email protected]>
1 parent 838d16d commit e231574

15 files changed

+1657
-0
lines changed
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# this file should be run in the root of the repo
services:
  text2cypher-gaudi:
    build:
      dockerfile: comps/text2cypher/src/Dockerfile.intel_hpu
    image: ${REGISTRY:-opea}/text2cypher-gaudi:${TAG:-latest}

comps/cores/mega/constants.py

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,7 @@ class ServiceType(Enum):
     IMAGE2IMAGE = 18
     TEXT2SQL = 19
     TEXT2GRAPH = 20
+    TEXT2CYPHER = 21


 class MegaServiceEndpoint(Enum):
@@ -54,6 +55,7 @@ class MegaServiceEndpoint(Enum):
     RETRIEVALTOOL = "/v1/retrievaltool"
     FAQ_GEN = "/v1/faqgen"
     GRAPH_RAG = "/v1/graphrag"
+    HYBRID_RAG = "/v1/hybridrag"
     # Follow OPENAI
     EMBEDDINGS = "/v1/embeddings"
     TTS = "/v1/audio/speech"
Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
# Deploy text2cypher microservice using docker-compose

## Deploy on Intel Gaudi

```
unset http_proxy
service_name="neo4j-apoc text2cypher-gaudi"
export ip_address=$(hostname -I | awk '{print $1}')
export host_ip=${ip_address}
export TAG="comps"
export NEO4J_AUTH="neo4j/neo4jtest"
export NEO4J_URL="bolt://${ip_address}:7687"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD="neo4jtest"
export NEO4J_apoc_export_file_enabled=true
export NEO4J_apoc_import_file_use__neo4j__config=true
export NEO4J_PLUGINS=\[\"apoc\"\]

cd $WORKPATH/comps/text2cypher/deployment/docker_compose/
docker compose up ${service_name} -d

```
Lines changed: 50 additions & 0 deletions
@@ -0,0 +1,50 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

include:
  - ../../../third_parties/neo4j/deployment/docker_compose/compose.yaml

services:
  text2cypher:
    image: opea/text2cypher:${TAG:-latest}
    container_name: text2cypher-container
    ports:
      - ${TEXT2CYPHER_PORT:-9097}:9097
    depends_on:
      neo4j-apoc:
        condition: service_healthy

  text2cypher-gaudi:
    image: opea/text2cypher-gaudi:${TAG:-latest}
    #pull_policy: never
    container_name: text2cypher-gaudi-container
    ports:
      - ${TEXT2CYPHER_PORT:-9097}:9097
    depends_on:
      neo4j-apoc:
        condition: service_healthy
    ipc: host
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      INDEX_NAME: ${INDEX_NAME}
      HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN}
      HF_TOKEN: ${HF_TOKEN}
      LOGFLAG: ${LOGFLAG:-False}
      HABANA_VISIBLE_DEVICES: all
      OMPI_MCA_btl_vader_single_copy_mechanism: none
      TOKENIZERS_PARALLELISM: False
      NEO4J_URI: ${NEO4J_URI}
      NEO4J_URL: ${NEO4J_URI}
      NEO4J_USERNAME: ${NEO4J_USERNAME}
      NEO4J_PASSWORD: ${NEO4J_PASSWORD}
      host_ip: ${host_ip}
    runtime: habana
    cap_add:
      - SYS_NICE
    restart: unless-stopped

networks:
  default:
    driver: bridge
Lines changed: 48 additions & 0 deletions
@@ -0,0 +1,48 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

# HABANA environment
FROM vault.habana.ai/gaudi-docker/1.19.0/ubuntu22.04/habanalabs/pytorch-installer-2.5.1 AS hpu

ENV LANG=en_US.UTF-8
ARG REPO=https://github.com/huggingface/optimum-habana.git
ARG REPO_VER=v1.15.0

RUN apt-get update && apt-get install -y --no-install-recommends --fix-missing \
    git-lfs \
    libgl1-mesa-glx \
    libjemalloc-dev

#RUN useradd -m -s /bin/bash user && \
#    mkdir -p /home/user && \
#    chown -R user /home/user/

RUN git lfs install

COPY comps /root/comps
#RUN chown -R user /home/user/comps/text2cypher

#RUN rm -rf /etc/ssh/ssh_host*

RUN pip install --no-cache-dir --upgrade pip setuptools && \
    pip install --no-cache-dir --upgrade-strategy eager optimum[habana] && \
    pip install --no-cache-dir git+https://github.com/HabanaAI/[email protected]

RUN git clone ${REPO} /root/optimum-habana && \
    cd /root/optimum-habana && git checkout ${REPO_VER} && \
    cd /root/comps/text2cypher/src && pip install --no-cache-dir -r requirements.txt && \
    pip install --no-cache-dir --upgrade --force-reinstall pydantic numpy==1.26.3

# Set environment variables
ENV PYTHONPATH=/root:/usr/lib/habanalabs/:/root/optimum-habana
ENV HABANA_VISIBLE_DEVICES=all
ENV OMPI_MCA_btl_vader_single_copy_mechanism=none
ENV DEBIAN_FRONTEND="noninteractive" TZ=Etc/UTC

#USER user
WORKDIR /root/comps/text2cypher/src

RUN echo python opea_text2cypher_microservice.py --device hpu --use_hpu_graphs --bf16 >> run.sh

CMD bash run.sh

comps/text2cypher/src/README.md

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
# 🛢 Text-to-Cypher Microservice

The Text-to-Cypher microservice translates natural-language questions into Cypher queries and runs them against a graph database, so users can retrieve data without writing Cypher by hand. This service executes locally on Intel Gaudi.

---

## 🛠️ Features

**Generate Cypher queries from input text**: Transforms user-provided natural language into Cypher queries, then executes them to retrieve data from graph databases.

---

## ⚙️ Implementation

The Text-to-Cypher microservice can be implemented with various frameworks and can support various types of graph databases.

### 🔗 Utilizing Text-to-Cypher with the LangChain framework

The following guide provides setup instructions and details for the Text-to-Cypher microservice built with LangChain. This configuration uses Neo4j as the example database.

---

### Start Neo4j Service

### 🚀 Start Text2Cypher Microservice with Python (Option 1)

#### Install Requirements

```bash
pip install -r requirements.txt
```

#### Start Text-to-Cypher Microservice with Python Script

Start the Text-to-Cypher microservice with the command below.

```bash
python3 opea_text2cypher_microservice.py
```

---

### 🚀 Start Microservice with Docker (Option 2)

#### Build Docker Image

```bash
cd GenAIComps/
docker build -t opea/text2cypher:latest -f comps/text2cypher/src/Dockerfile.intel_hpu .
```

#### Run Docker with CLI (Option A)

```bash
# The compose file maps the host port to container port 9097, so the same container port is used here.
docker run --name="comps-langchain-text2cypher" -p 9097:9097 --ipc=host opea/text2cypher:latest
```

#### Run via docker compose (Option B)

##### Setup Environment Variables.

```bash
ip_address=$(hostname -I | awk '{print $1}')
export HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
export NEO4J_USERNAME=neo4j
export NEO4J_PASSWORD=neo4jtest
export NEO4J_URL="bolt://${ip_address}:7687"
export TEXT2CYPHER_PORT=11801
```
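
Before starting the services, it can help to confirm that the variables above are actually set. The following is a minimal sketch, not part of the microservice: the helper name `missing_env_vars` and the `REQUIRED_VARS` list are assumptions made for illustration, with variable names taken from the compose file and the curl examples in this README (`NEO4J_USERNAME` rather than `NEO4J_USER`).

```python
import os

# Assumed list of variables, based on the names used in compose.yaml and the curl examples.
REQUIRED_VARS = (
    "HF_TOKEN",
    "NEO4J_USERNAME",
    "NEO4J_PASSWORD",
    "NEO4J_URL",
    "TEXT2CYPHER_PORT",
)


def missing_env_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
partial_env = {"HF_TOKEN": "hf_example", "NEO4J_USERNAME": "neo4j"}
print(missing_env_vars(partial_env))
# ['NEO4J_PASSWORD', 'NEO4J_URL', 'TEXT2CYPHER_PORT']
```

Calling `missing_env_vars()` with no arguments checks the current process environment instead of the sample mapping.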

##### Start the services.

- Gaudi2 HPU

```bash
cd comps/text2cypher/deployment/docker_compose
docker compose -f compose.yaml up text2cypher-gaudi -d
```

---

### ✅ Invoke the microservice.

The Text-to-Cypher microservice exposes the `/v1/text2cypher` endpoint, which can be called in two ways:

- Execute Cypher Query with Pre-seeded Data and Schema:

```bash
curl http://${ip_address}:${TEXT2CYPHER_PORT}/v1/text2cypher \
  -X POST \
  -d '{"input_text": "what are the symptoms for Diabetes?","conn_str": {"user": "'${NEO4J_USERNAME}'","password": "'${NEO4J_PASSWORD}'","url": "'${NEO4J_URL}'" }}' \
  -H 'Content-Type: application/json'
```
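
The same request can be issued from Python. This is a minimal sketch using only the standard library; the helper name `build_request`, the host `localhost`, and the credentials are illustrative assumptions that mirror the example values in this README. Only the request is constructed here; sending it requires a running service.

```python
import json
import urllib.request


def build_request(base_url, question, user, password, neo4j_url):
    """Build a POST request for /v1/text2cypher with the JSON body shown in the curl example."""
    payload = {
        "input_text": question,
        "conn_str": {"user": user, "password": password, "url": neo4j_url},
    }
    return urllib.request.Request(
        f"{base_url}/v1/text2cypher",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed example values; replace them with your own host, port, and credentials.
req = build_request(
    "http://localhost:11801",
    "what are the symptoms for Diabetes?",
    "neo4j",
    "neo4jtest",
    "bolt://localhost:7687",
)

# To send the request against a running service:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the body with `json.dumps` avoids the shell quoting needed to splice variables into a single-quoted JSON string.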

- Execute Cypher Query with User Data and Schema:

Define customized cypher_insert statements:

```bash
export cypher_insert='
LOAD CSV WITH HEADERS FROM "https://docs.google.com/spreadsheets/d/e/2PACX-1vQCEUxVlMZwwI2sn2T1aulBrRzJYVpsM9no8AEsYOOklCDTljoUIBHItGnqmAez62wwLpbvKMr7YoHI/pub?gid=0&single=true&output=csv" AS rows
MERGE (d:disease {name:rows.Disease})
MERGE (dt:diet {name:rows.Diet})
MERGE (d)-[:HOME_REMEDY]->(dt)

MERGE (m:medication {name:rows.Medication})
MERGE (d)-[:TREATMENT]->(m)

MERGE (s:symptoms {name:rows.Symptom})
MERGE (d)-[:MANIFESTATION]->(s)

MERGE (p:precaution {name:rows.Precaution})
MERGE (d)-[:PREVENTION]->(p)
'
```

Pass the cypher_insert to the text2cypher service. The user can also specify whether to refresh the Neo4j database using the refresh_db option.

```bash
curl http://${ip_address}:${TEXT2CYPHER_PORT}/v1/text2cypher \
  -X POST \
  -d '{"input_text": "what are the symptoms for Diabetes?",
       "conn_str": {"user": "'${NEO4J_USERNAME}'","password": "'${NEO4J_PASSWORD}'","url": "'${NEO4J_URL}'" },
       "seeding": {"cypher_insert": "'"${cypher_insert}"'","refresh_db": "True" }}' \
  -H 'Content-Type: application/json'
```
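
The `cypher_insert` statement spans several lines and contains double quotes, both of which must be escaped before the text can sit inside a JSON string. One way to avoid manual escaping is to let `json.dumps` serialize the body. The sketch below assumes Python is available on the client; the helper name `build_seeding_payload`, the shortened Cypher statement, and its placeholder CSV URL are illustrative and not part of the microservice.

```python
import json

# Shortened, hypothetical statement; the CSV URL is a placeholder.
CYPHER_INSERT = """
LOAD CSV WITH HEADERS FROM "https://example.invalid/diseases.csv" AS rows
MERGE (d:disease {name:rows.Disease})
MERGE (s:symptoms {name:rows.Symptom})
MERGE (d)-[:MANIFESTATION]->(s)
"""


def build_seeding_payload(question, conn_str, cypher_insert, refresh_db=True):
    """Serialize a request body with a seeding section; json.dumps escapes newlines and quotes."""
    return json.dumps(
        {
            "input_text": question,
            "conn_str": conn_str,
            # The README passes refresh_db as the string "True", so the flag is stringified here.
            "seeding": {"cypher_insert": cypher_insert, "refresh_db": str(refresh_db)},
        }
    )


body = build_seeding_payload(
    "what are the symptoms for Diabetes?",
    {"user": "neo4j", "password": "neo4jtest", "url": "bolt://localhost:7687"},
    CYPHER_INSERT,
)
print(body)
```

The resulting string can be written to a file and passed to curl with `-d @file`, or sent directly with `urllib.request`.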

comps/text2cypher/src/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
