jieba plugin emits whitespace-only tokens when tokenizing strings that contain spaces #17
Open
Description
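When the jieba plugin tokenizes a string that contains spaces, the output includes a token whose value is a single space. This happens in both search and index mode. For example:
curl 'http://localhost:9200/test/_analyze?text=你好%20北京&analyzer=jieba_search&pretty'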
{
"tokens": [
{
"token": "你好",
"start_offset": 0,
"end_offset": 2,
"type": "word",
"position": 0
},
{
"token": " ",
"start_offset": 2,
"end_offset": 3,
"type": "word",
"position": 1
},
{
"token": "北京",
"start_offset": 3,
"end_offset": 5,
"type": "word",
"position": 2
}
]
}
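As a result, a user query that contains spaces may return the wrong search results: the query is tokenized into tokens that include the space, but the content indexed in ES may not contain a matching space token.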
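To make the expected behaviour concrete, here is a minimal sketch of dropping whitespace-only tokens from the response above on the client side. It is not a fix for the plugin itself, and the helper name `drop_whitespace_tokens` is hypothetical; it only assumes the `_analyze` response has the shape shown in this issue.

```python
import json

def drop_whitespace_tokens(analyze_response):
    """Return a copy of an _analyze response without whitespace-only tokens.

    Hypothetical helper for illustration; offsets and positions of the
    remaining tokens are left untouched.
    """
    kept = [t for t in analyze_response["tokens"] if t["token"].strip()]
    return {"tokens": kept}

# The response reported in this issue, reproduced as a literal.
response = json.loads("""
{
  "tokens": [
    {"token": "你好", "start_offset": 0, "end_offset": 2, "type": "word", "position": 0},
    {"token": " ",    "start_offset": 2, "end_offset": 3, "type": "word", "position": 1},
    {"token": "北京", "start_offset": 3, "end_offset": 5, "type": "word", "position": 2}
  ]
}
""")

cleaned = drop_whitespace_tokens(response)
print([t["token"] for t in cleaned["tokens"]])  # ['你好', '北京']
```

Ideally the analyzer would do this filtering itself, so that index-time and search-time tokens stay consistent without any client-side step.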