"HuixiangDou" is a domain-specific knowledge assistant based on the LLM. Features:
1. Deals with complex scenarios such as group chats, answering user questions without causing message flooding.
2. Proposes an algorithm pipeline for answering technical questions.
3. Low deployment cost: the LLM model only needs to satisfy 4 traits to answer most user questions; see the [technical report](./resource/HuixiangDou.pdf).
After running, HuixiangDou can distinguish which user topics should be handled and which chitchat should be rejected. Please edit [good_questions](./resource/good_questions.json) and [bad_questions](./resource/bad_questions.json), and try your own domain knowledge (medical, finance, electricity, etc.).
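After editing the two lists, rebuild the feature store so the new examples take effect; a minimal sketch of that step is below (see [feature_store.py](./service/feature_store.py) for the actual arguments):

```shell
# rebuild the thresholds and feature store after editing the question lists
# (exact arguments may differ -- check the script itself)
python3 service/feature_store.py
```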
```shell
# Accept technical topics
process query: Does mmdeploy support mmtrack model conversion now?
process query: Are there any Chinese text to speech models?
```
HuixiangDou uses a search engine. Click on the [serper official website](https://serper.dev/api-key) to obtain a quota-limited WEB_SEARCH_TOKEN and fill it in `config.ini`.
```shell
# config.ini
..
[web_search]
x_api_key = "${YOUR-X-API-KEY}"
..
```
**Test Q&A Effect**
Please ensure the GPU memory is over 20GB (e.g., a 3090 or above); if it is lower, modify the setup according to the FAQ.
The first run will automatically download the internlm2-7B and text2vec-large-chinese models; please ensure network connectivity.
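As an optional sanity check (assuming an NVIDIA GPU with `nvidia-smi` available), you can confirm the available GPU memory before the first run:

```shell
# print GPU model plus total and free memory
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv
```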
* **Non-docker users**. If you do **not** use a docker environment, you can start all services at once.
```shell
# standalone
python3 main.py workdir --standalone
..
ErrorCode.SUCCESS, Could you please advise whether there is a good way to optimize the flickering of detection boxes caused by frame skipping in video stream detection?
1. Frame rate control and frame-skipping strategies are key to optimizing video stream detection performance, but pay attention to the impact of frame skipping on detection results.
2. Multithreaded processing and caching mechanisms can improve detection efficiency, but pay attention to the stability of detection results.
3. Using a sliding window can reduce the impact of frame skipping and caching on detection results.
```
* **Docker users**. If you are using docker, HuixiangDou's Hybrid LLM Service needs to be deployed separately.
```shell
# Start LLM service
python3 service/llm_server_hybrid.py
```
Open a new terminal, configure the host IP in `config.ini`, and run the pipeline.
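For illustration, a hedged sketch of this step; the `client_url` key and port here are assumptions, so follow the comments in your own `config.ini` for the real names:

```shell
# config.ini -- on the machine that runs the pipeline
..
[llm]
# point the client at the host running llm_server_hybrid.py
# (key name and port are illustrative)
client_url = "http://${LLM-SERVER-IP}:8888/inference"
..
```

Then start the pipeline with `python3 main.py workdir` (the same entry point as the standalone example, without `--standalone`).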
Click [Create a Feishu Custom Robot](https://open.feishu.cn/document/client-docs/bot-v3/add-custom-bot) to get the WEBHOOK_URL callback, and fill it into `config.ini`.
```shell
# config.ini
..
[frontend]
type = "lark"
webhook_url = "${YOUR-LARK-WEBHOOK-URL}"
```
Run it. When it finishes, the technical assistant's reply will be sent to the Feishu group chat.
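A minimal sketch of the run command (the same entry point as the standalone example above; drop `--standalone` if the Hybrid LLM Service is deployed separately):

```shell
python3 main.py workdir --standalone
```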
If you still need to read Feishu group messages, see [Feishu Developer Square - Add Application Capabilities - Robots](https://open.feishu.cn/app?lang=zh-CN).
## STEP4. High Accuracy Method [Optional]
To further improve the assistant's responses, the more of the following features you enable, the better.
1. Use a higher-accuracy local LLM
Adjust the `llm.local` model in config.ini to `internlm2-20B`.
This option has a significant effect, but requires more GPU memory.
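A sketch of what that change might look like; the `local_llm_path` key is an assumption based on the structure of the Kimi example below, so follow the comments in your own `config.ini`:

```shell
# config.ini
[llm.server]
..
# switch the local model to the 20B variant (key name and identifier are illustrative)
local_llm_path = "internlm2-20B"
..
```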
2. Hybrid LLM Service
For LLM services that support the OpenAI interface, HuixiangDou can make use of their Long Context capability.
Taking Kimi as an example, here is a sample `config.ini` configuration:
```shell
# config.ini
[llm.server]
..
remote_llm_max_text_length = 128000
remote_llm_model = "moonshot-v1-128k"
```
We also support the GPT API. Note that this feature will increase response time and operating costs.
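For a GPT-style endpoint, a hypothetical variant of the Kimi block above could look like this (the model name and context length are placeholders; the remaining keys are assumed unchanged):

```shell
# config.ini
[llm.server]
..
remote_llm_max_text_length = 32000
remote_llm_model = "gpt-4-32k"
..
```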
3. Repo search enhancement
This feature is suitable for handling difficult questions and requires basic development skills to adjust the prompt.
Run `main.py`, and HuixiangDou will enable search enhancement when appropriate.
4. Tune Parameters
It is often unavoidable to adjust parameters for your business scenario.
* Refer to [data.json](./tests/data.json) to add real data, run [test_intention_prompt.py](./tests/test_intention_prompt.py) to get suitable prompts and thresholds, and update them into [worker](./service/worker.py).
* Adjust the [number of search results](./service/worker.py) based on the maximum length supported by the model.
* WeChat. For Enterprise WeChat, see the [Enterprise WeChat Application Development Guide](https://developer.work.weixin.qq.com/document/path/90594); for personal WeChat, we have confirmed with the WeChat team that there is currently no API, so you will need to research it yourself.
* DingTalk. Refer to [DingTalk Open Platform - Custom Robot Access](https://open.dingtalk.com/document/robots/custom-robot-access).
* Fill in the questions that should be answered in the real scenario into `resource/good_questions.json`, and fill the ones that should be rejected into `resource/bad_questions.json`.
* Adjust the theme content in `repodir` to ensure that the markdown documents in the main library do not contain irrelevant content.
Re-run `service/feature_store.py` to update the thresholds and the feature store.
3. Launch is normal, but GPU memory runs out (OOM) during runtime?
LLM long-text inference based on the transformers structure requires more GPU memory. In this case, apply kv cache quantization to the model, for example following the [lmdeploy quantization description](https://github.com/InternLM/lmdeploy/blob/main/docs/en/kv_int8.md). Then use docker to deploy the Hybrid LLM Service independently.
4. How to integrate other local LLMs / what to do if the results are not ideal after integration?
* Open [hybrid llm service](./service/llm_server_hybrid.py) and add a new LLM inference implementation.
* Refer to [test_intention_prompt and test data](./tests/test_intention_prompt.py), adjust prompt and threshold for the new model, and update them into [worker.py](./service/worker.py).
5. What if the response is too slow or requests always fail?
* Refer to [hybrid llm service](./service/llm_server_hybrid.py) to add exponential backoff and retries.
* Replace local LLM with an inference framework such as [lmdeploy](https://github.com/internlm/lmdeploy), instead of the native huggingface/transformers.
6. What if the GPU memory is too low?
In this case, the local LLM cannot run; only the remote LLM can be used together with text2vec to execute the pipeline. Please make sure that `config.ini` only uses the remote LLM and that the local LLM is turned off.
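A hedged sketch of that configuration; the `enable_local` / `enable_remote` key names are assumptions, so check the comments in your `config.ini`:

```shell
# config.ini
[llm]
..
# run the pipeline with the remote LLM only (key names are illustrative)
enable_local = 0
enable_remote = 1
..
```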
# 📝 Citation
```shell
@misc{2023HuixiangDou,
title={HuixiangDou: Overcoming Group Chat Scenarios with LLM-based Technical Assistance},