
Commit fb9e149

Merge pull request #659 from ymcui/doc_fix
Miscellaneous doc fix
2 parents 7745723 + 3e2ff96 commit fb9e149

2 files changed (+9, -4 lines)


README.md (+4, -1)
@@ -10,7 +10,7 @@
 <img alt="GitHub release (latest by date)" src="https://img.shields.io/github/v/release/ymcui/Chinese-LLaMA-Alpaca">
 <img alt="GitHub top language" src="https://img.shields.io/github/languages/top/ymcui/Chinese-LLaMA-Alpaca">
 <img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/ymcui/Chinese-LLaMA-Alpaca">
-<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.0-green"></a>
+<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.1-green"></a>
 </p>

@@ -245,6 +245,8 @@ chinese_llama_lora_7b/
 
 It should be noted that comprehensively evaluating the capabilities of large models remains an important open problem; viewing the various evaluation results of large models in a rational and balanced way helps the healthy development of large-model technology. Users are advised to run tests on the tasks they care about and to choose models suited to those tasks.
 
+For C-Eval inference code, please refer to this project's >>> [📚 GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/C-Eval评测结果与脚本)
+
 ## Training Details
 
 The entire training process consists of three parts: vocabulary expansion, pre-training, and instruction fine-tuning.
@@ -273,6 +275,7 @@ The FAQ gives answers to common questions; please check the FAQ before filing an issue.
 Q8: Chinese-Alpaca-Plus performs very poorly
 Q9: The model performs poorly on NLU tasks (e.g., text classification)
 Q10: Why is it called 33B; shouldn't it be 30B?
+Q11: SHA256 mismatch after model merging
 ```
 
 For specific questions and answers, please refer to this project's >>> [📚 GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/常见问题)

README_EN.md (+5, -3)
@@ -10,12 +10,11 @@
 <img alt="GitHub release (latest by date)" src="https://img.shields.io/github/v/release/ymcui/Chinese-LLaMA-Alpaca">
 <img alt="GitHub top language" src="https://img.shields.io/github/languages/top/ymcui/Chinese-LLaMA-Alpaca">
 <img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/ymcui/Chinese-LLaMA-Alpaca">
-<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.0-green"></a>
+<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.1-green"></a>
 </p>
 
-
 Large Language Models (LLM), represented by ChatGPT and GPT-4, have sparked a new wave of research in the field of natural language processing, demonstrating capabilities of Artificial General Intelligence (AGI) and attracting widespread attention from the industry. However, the expensive training and deployment of large language models have posed certain obstacles to building transparent and open academic research.
 
 To promote open research of large models in the Chinese NLP community, this project has open-sourced the **Chinese LLaMA model and the Alpaca large model with instruction fine-tuning**. These models expand the Chinese vocabulary based on the original LLaMA and use Chinese data for secondary pre-training, further enhancing Chinese basic semantic understanding. Additionally, the project uses Chinese instruction data for fine-tuning on the basis of the Chinese LLaMA, significantly improving the model's understanding and execution of instructions. Please refer to our technical report for further details [(Cui, Yang, and Yao, 2023)](https://arxiv.org/abs/2304.08177).
@@ -25,7 +24,7 @@ To promote open research of large models in the Chinese NLP community, this proj
 - 🚀 Extended Chinese vocabulary on top of original LLaMA with significant encode/decode efficiency
 - 🚀 Open-sourced the Chinese LLaMA (general purpose) and Alpaca (instruction-tuned)
 - 🚀 Open-sourced the pre-training and instruction finetuning (SFT) scripts for further tuning on user's data
-- 🚀 Quickly deploy and experience the quantized version of the large model on CPU/GPU of your laptop (personal PC)
+- 🚀 Quickly deploy and experience the quantized version of the large model on CPU/GPU of your laptop (personal PC)
 - 🚀 Support [🤗transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LlamaChat](https://github.com/alexrozanski/LlamaChat), [LangChain](https://github.com/hwchase17/langchain), [privateGPT](https://github.com/imartinez/privateGPT), etc.
 - Released versions: 7B (basic, **Plus**), 13B (basic, **Plus**), 33B (basic)

@@ -250,6 +249,8 @@ This project also conducted tests on relevant models using the "NLU" objective e
 
 It is important to note that the comprehensive assessment of the capabilities of large models is still an urgent and significant topic to address. It is beneficial to approach the various evaluation results of large models in a rational and balanced manner to promote the healthy development of large-scale model technology. It is recommended for users to conduct tests on their own tasks and choose models that are suitable for the relevant tasks.
 
+For C-Eval inference code, please refer to >>> [📚GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/C-Eval-performance-and-script).
+
 ## Training Details
 
 The entire training process includes three parts: vocabulary expansion, pre-training, and instruction fine-tuning. Please refer to [merge_tokenizers.py](scripts/merge_tokenizer/merge_tokenizers.py) for vocabulary expansion; refer to [run_clm.py](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py) in 🤗transformers and the relevant parts of dataset processing in the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) project for pre-training and self-instruct fine-tuning.
@@ -278,6 +279,7 @@ Q7: Chinese-LLaMA 13B model cannot be launched with llama.cpp, reporting inconsi
 Q8: Chinese-Alpaca-Plus does not show better performance than the others.
 Q9: The model does not perform well on NLU tasks, such as text classification.
 Q10: Why 33B not 30B?
+Q11: Inconsistent SHA256
 ```
 
 Please refer to our >>> [📚GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/FAQ).
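For readers following the Training Details passages touched above, the vocabulary-expansion step can be illustrated with a minimal sketch that merges the pieces of a Chinese SentencePiece model into the original LLaMA tokenizer. The paths used below (`original_llama_tokenizer/`, `chinese_sp.model`, `merged_tokenizer_sp/`) are placeholders, and the snippet only sketches the idea; the project's actual implementation is the `merge_tokenizers.py` script referenced in the README.

```python
import os
import sentencepiece as spm
from sentencepiece import sentencepiece_model_pb2 as sp_pb2_model
from transformers import LlamaTokenizer

# Placeholder paths (assumptions, not part of this commit):
llama_tokenizer_dir = "original_llama_tokenizer"   # directory with the original LLaMA tokenizer
chinese_sp_model_file = "chinese_sp.model"         # SentencePiece model trained on Chinese text
output_dir = "merged_tokenizer_sp"

# Load both tokenizers and parse their underlying SentencePiece model protos.
llama_tokenizer = LlamaTokenizer.from_pretrained(llama_tokenizer_dir)
chinese_sp = spm.SentencePieceProcessor()
chinese_sp.Load(chinese_sp_model_file)

llama_spm = sp_pb2_model.ModelProto()
llama_spm.ParseFromString(llama_tokenizer.sp_model.serialized_model_proto())
chinese_spm = sp_pb2_model.ModelProto()
chinese_spm.ParseFromString(chinese_sp.serialized_model_proto())

# Append every Chinese piece that is not already in the LLaMA vocabulary.
existing_pieces = {p.piece for p in llama_spm.pieces}
for p in chinese_spm.pieces:
    if p.piece not in existing_pieces:
        new_piece = sp_pb2_model.ModelProto.SentencePiece()
        new_piece.piece = p.piece
        new_piece.score = 0
        llama_spm.pieces.append(new_piece)

# Save the merged SentencePiece model; the expanded vocabulary means the base
# model's embedding matrix must be resized before pre-training continues.
os.makedirs(output_dir, exist_ok=True)
with open(os.path.join(output_dir, "chinese_llama.model"), "wb") as f:
    f.write(llama_spm.SerializeToString())
print(f"merged vocabulary size: {len(llama_spm.pieces)}")
```

After merging, the expanded tokenizer would be loaded in place of the original one, and the base model's embeddings resized to match (e.g. with `model.resize_token_embeddings(len(tokenizer))` in 🤗transformers) before secondary pre-training and instruction fine-tuning.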
