The LAC Analysis plugin integrates the Lucene LAC analyzer into Elasticsearch and OpenSearch. It supports the major versions of both.
The plugin provides an analyzer named lac and a tokenizer named lac.
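The plugin is configured with the URL of a LAC service (see service_url in the config file at the end of this document). See PaddleHub for how to serve the lac module. The commands below install PaddleHub, start the service on port 8866, and send a test request.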
conda create -n paddlehub python=3.8
conda activate paddlehub
pip install paddlepaddle paddlehub
hub serving start --modules lac --port 8866 --use_multiprocess --workers 8
curl -X 'POST' 'http://127.0.0.1:8866/predict/lac' \
-H 'accept: */*' \
-H 'Content-Type: application/json' \
-d '{
"text": [
"今天是个好日子", "天气预报说:今天要下雨"
]
}'
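The same test request can be sent from Python. The following is a minimal sketch using only the standard library. The helper names build_lac_request and call_lac are hypothetical, and the URL is assumed to match the port passed to hub serving start above. The response is returned as decoded JSON without interpreting its fields, since its exact shape depends on the PaddleHub version.

```python
import json
import urllib.request

# Assumed default; matches the port passed to `hub serving start` above.
LAC_URL = "http://127.0.0.1:8866/predict/lac"


def build_lac_request(texts, url=LAC_URL):
    """Build the same POST request as the curl command, without sending it."""
    body = json.dumps({"text": list(texts)}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"accept": "*/*", "Content-Type": "application/json"},
        method="POST",
    )


def call_lac(texts, url=LAC_URL, timeout=10):
    """Send the request and return the decoded JSON response as-is."""
    request = build_lac_request(texts, url)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the service running, call_lac(["今天是个好日子", "天气预报说:今天要下雨"]) should return the service's JSON reply for both sentences.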
bin/elasticsearch-plugin install https://github.com/thundax-lyp/analysis-lac/releases/download/8.12.2/elasticsearch-analysis-lac-8.12.2.jar
bin/opensearch-plugin install https://github.com/thundax-lyp/analysis-lac/releases/download/8.12.2/elasticsearch-analysis-lac-8.12.2.jar
Make sure to replace the version number with the one that matches your Elasticsearch or OpenSearch version.
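One way to pick the version is to read it from the running node, whose root endpoint (GET /) reports it under version.number. The sketch below only builds the download URL. The URL pattern is inferred from the install commands above and is an assumption, the helper names are hypothetical, and nothing here checks that a release actually exists for that version, so confirm it on the project's releases page.

```python
# Release URL pattern inferred from the install commands above (an assumption).
RELEASE_URL = (
    "https://github.com/thundax-lyp/analysis-lac/releases/download/"
    "{version}/elasticsearch-analysis-lac-{version}.jar"
)


def version_from_root_response(info):
    """Extract the version string from the JSON body returned by GET /."""
    return info["version"]["number"]


def plugin_url(version):
    """Fill the version into the release URL pattern."""
    return RELEASE_URL.format(version=version)
```

The resulting URL can then be passed to bin/elasticsearch-plugin install or bin/opensearch-plugin install.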
Step 1. Create an index
curl -XPUT http://localhost:9200/index
Step 2. Create a mapping
curl -XPOST http://localhost:9200/index/_mapping -H 'Content-Type:application/json' -d'
{
"properties": {
"content": {
"type": "text",
"analyzer": "lac",
"search_analyzer": "lac"
}
}
}'
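To check that the lac analyzer is installed and reachable, the standard _analyze API can be called with "analyzer": "lac". The following is a minimal sketch, assuming the index name and host used in the steps above. The helper names are hypothetical, and the tokens returned depend on the LAC service, so no expected output is shown.

```python
import json
import urllib.request


def build_analyze_request(text, index="index", host="http://localhost:9200"):
    """Build a POST /<index>/_analyze request that uses the lac analyzer."""
    body = json.dumps({"analyzer": "lac", "text": text}, ensure_ascii=False)
    return urllib.request.Request(
        f"{host}/{index}/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(analyze_response):
    """Return the token strings from a decoded _analyze response."""
    return [t["token"] for t in analyze_response.get("tokens", [])]


def analyze(text, **kwargs):
    """Send the request and return the list of tokens."""
    with urllib.request.urlopen(build_analyze_request(text, **kwargs)) as resp:
        return extract_tokens(json.loads(resp.read().decode("utf-8")))
```

An error response from analyze("上海的弄堂是壮观的景象") usually means the plugin is not installed or the LAC service is not reachable.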
Step 3. Index some docs
curl -XPOST http://localhost:9200/index/_create/1 \
-H 'Content-Type:application/json' \
-d'{
"content":"站一个制高点看上海,上海的弄堂是壮观的景象。"
}'
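Once a document is indexed, a match query on the content field exercises the lac search analyzer. The sketch below assumes the index name, field name, and host from the steps above; the helper names are hypothetical. A newly indexed document becomes searchable only after the index refreshes, which by default takes about a second.

```python
import json
import urllib.request


def build_search_request(query_text, index="index", host="http://localhost:9200"):
    """Build a POST /<index>/_search request with a match query on content."""
    body = json.dumps(
        {"query": {"match": {"content": query_text}}}, ensure_ascii=False
    )
    return urllib.request.Request(
        f"{host}/{index}/_search",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def hit_contents(search_response):
    """Return the content field of every hit in a decoded search response."""
    return [h["_source"]["content"] for h in search_response["hits"]["hits"]]


def search(query_text, **kwargs):
    """Send the query and return the matching content strings."""
    with urllib.request.urlopen(build_search_request(query_text, **kwargs)) as resp:
        return hit_contents(json.loads(resp.read().decode("utf-8")))
```

For example, search("上海") should return the document indexed above if the analyzer emits 上海 as a token.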
The config file LacAnalyzer.cfg.xml can be placed in either of two locations:
{conf}/analysis-lac/config/LacAnalyzer.cfg.xml
{plugins}/elasticsearch-analysis-lac-*/config/LacAnalyzer.cfg.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<entry key="service_url">http://127.0.0.1:8866/predict/lac/</entry>
</properties>
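The file uses the Java XML properties format, so it can be read with any XML parser. As a quick sanity check before restarting the node, the following sketch extracts service_url with the Python standard library. The function name read_service_url is hypothetical, and this is not how the plugin itself loads the file.

```python
import xml.etree.ElementTree as ET


def read_service_url(xml_text):
    """Return the service_url entry from a LacAnalyzer.cfg.xml string, or None."""
    # Parse from bytes so the encoding declaration in the XML header is honoured.
    root = ET.fromstring(xml_text.encode("utf-8"))
    for entry in root.iter("entry"):
        if entry.get("key") == "service_url":
            return (entry.text or "").strip()
    return None
```

The returned URL should point at the host and port where hub serving start is running.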