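Experiment Log: Fine-Tuning InternLM2_1.8b_chat with QLoRA

I had limited resources but wanted hands-on experience with fine-tuning an LLM, so I ran this experiment on a small InternLM2 model, which needs only one or two GPUs. This run used no model quantization (so in practice it is plain LoRA rather than QLoRA), with the batch size set to 6 and DeepSpeed ZeRO-2 on two 24 GB 3090 GPUs at full memory. Training for 3 epochs took 10 minutes.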

I. Step-by-step experiment procedure

Step 1: Set up the environment

1. Create a conda environment:

conda create -n xtuner0121 python=3.10 -y

2. Install the PyTorch libraries:

conda install pytorch==2.7.1 torchvision==0.22.1 pytorch-cuda=12.1 -c pytorch -c nvidia -y

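You can also install these with pip, optionally through the Tsinghua mirror for speed (-i https://pypi.tuna.tsinghua.edu.cn/simple). What matters is that the versions are correct.

3. Install transformers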

pip install transformers==4.43.0

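A note on versions: I used this version for the QLoRA fine-tuning. For inference later, however, it was too new and triggered a bug, so I switched to 4.39.3 for inference. Pick a version that is neither too new nor too old.

4. Install streamlit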

pip install streamlit==1.36.0

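Streamlit provides the chat visualization.

5. Download our fine-tuning tool, XTuner

# Create a directory to hold the source code
mkdir -p /home/finetune_qlora/
git clone -b v0.1.21 https://github.com/InternLM/XTuner /home/

6. Enter the xtuner directory and install its dependencies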

pip install -e '.[deepspeed]'

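You can also replace deepspeed with all, which installs every related package.

7. Downgrade numpy

Installing the packages above can silently pull in numpy 2.x or later, so we downgrade it: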

pip install numpy==1.23.5

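8. Check the xtuner version

xtuner version

You should see the installed XTuner version printed; it should match the v0.1.21 tag cloned above.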
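Because several steps above depend on exact version pins, a quick check can catch a silently upgraded package before training. The following is a minimal sketch, not part of the original setup: the pins are copied from the install commands above, while the helper names `installed_version` and `check_pins` are hypothetical.

```python
from importlib import metadata

# Pins taken from the install commands above.
PINS = {
    "torch": "2.7.1",
    "transformers": "4.43.0",
    "streamlit": "1.36.0",
    "numpy": "1.23.5",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every package that misses its pin."""
    mismatches = {}
    for package, expected in pins.items():
        found = lookup(package)
        # A local build tag such as "2.7.1+cu121" still counts as a match.
        if found is None or found.split("+")[0] != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in check_pins(PINS).items():
        print(f"{package}: expected {expected}, found {found}")
```

Running the script inside the conda environment prints one line per mismatched package and nothing when every pin matches.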

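Step 2: Download the model and check how the original model performs

Pick a suitable directory and download the model there. The model is internlm2_1_8b_chat; you can download it with the following commands: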

export HF_ENDPOINT=https://hf-mirror.com 
huggingface-cli download --resume-download internlm/internlm2-chat-1_8b --local-dir (model directory)

Use the following command to see how the original model performs (see the references for where xtuner_streamlit_demo comes from):

streamlit run /home/finetune_qlora/xtuner_streamlit_demo.py

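The chat interface opens in your browser, where you can try the original model.

Step 3: Prepare the data

This part is up to you. Anything you can think of works, but it helps to choose data with distinctive characteristics. If the dataset is too large, or is logically complex and knowledge-heavy, this 1.8B model may not cope and you may not see any effect. A quick approach is to have ChatGPT's strongest model generate a large amount of data directly in your required format; the data I fine-tuned on was generated by ChatGPT. I originally wanted to generate the samples one at a time with the DeepSeek API, but that was too slow. Finally, save your data in a JSON file with the following format: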

[
	{
		"conversation": [
			{
				"system": "xxxx",
				"input": "xxxxxxxx",
				"output": "xxxxxxxxxxxxxxxxx"
			},
			{
				"input": "xxxx",
				"output": "xxxxxxxxxxxxxxxxxx"
			},
			{
				"input": "xxxx",
				"output": "xxxxxxxxxxxxxxxxxxx"
			}
		]
	},
	{
		"conversation": [
			{
				"system": "xxxx",
				"input": "xxxxxxxx",
				"output": "xxxxxxxxxxxxxxxxx"
			},
			{
				"input": "xxxx",
				"output": "xxxxxxxxxxxxxxxxxx"
			},
			{
				"input": "xxxx",
				"output": "xxxxxxxxxxxxxxxxxxx"
			}
		]
	}
]
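When the samples come from a generator such as ChatGPT, it is worth checking that every record really follows this layout before training. The sketch below builds one sample and validates a dataset against the format shown above. The function names `make_sample`, `validate`, and `save` are hypothetical, and the checks cover only what the format above shows, not everything XTuner may require.

```python
import json


def make_sample(system, turns):
    """Build one sample from a system prompt and a list of (input, output) pairs."""
    conversation = []
    for index, (user_input, output) in enumerate(turns):
        turn = {"input": user_input, "output": output}
        if index == 0:
            # Only the first turn carries the system prompt, as in the format above.
            turn = {"system": system, **turn}
        conversation.append(turn)
    return {"conversation": conversation}


def validate(dataset):
    """Raise ValueError if the dataset deviates from the format above."""
    if not isinstance(dataset, list):
        raise ValueError("top level must be a list")
    for i, sample in enumerate(dataset):
        turns = sample.get("conversation") if isinstance(sample, dict) else None
        if not isinstance(turns, list) or not turns:
            raise ValueError(f"sample {i}: 'conversation' must be a non-empty list")
        for j, turn in enumerate(turns):
            if not isinstance(turn, dict):
                raise ValueError(f"sample {i}, turn {j}: must be an object")
            for key in ("input", "output"):
                if not isinstance(turn.get(key), str):
                    raise ValueError(f"sample {i}, turn {j}: missing string '{key}'")


def save(dataset, path):
    """Validate the dataset, then write it as UTF-8 JSON."""
    validate(dataset)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(dataset, handle, ensure_ascii=False, indent=2)


dataset = [
    make_sample("xxxx", [("xxxxxxxx", "xxxx"), ("xxxx", "xxxx")]),
]
validate(dataset)
text = json.dumps(dataset, ensure_ascii=False, indent=2)
```

Calling `save(dataset, 'data/assistant.json')` would write the file at the path used in the config below; `ensure_ascii=False` keeps Chinese text readable in the file.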

Step 4: Start fine-tuning

1. First, list the available fine-tuning configs

xtuner list-cfg -p internlm2

2. Modify the fine-tuning config file

Copy the config into your project root (. means the current directory):

xtuner copy-cfg internlm2_chat_1_8b_qlora_alpaca_e3 .

Following the references at the end, make the changes below:

4.2.1 Specify the pretrained model path

pretrained_model_name_or_path = 'internlm/internlm2-chat-1_8b'

Change this to the path of your own model.

4.2.2 Specify the fine-tuning data

alpaca_en_path = 'data/assistant.json'

4.2.3 Change the model parameters

Quantization kept raising errors, so I simply commented all of it out.
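To make "commenting out quantization" concrete, here is a rough sketch of the effect on the model section of the config. It is a stand-in, not the real file: strings replace the actual classes so that it runs without XTuner installed, and the key names (in particular `quantization_config` nested under the `llm` entry) are assumptions about the copied v0.1.21 config that should be checked against your own copy. The helper `without_quantization` is hypothetical.

```python
# Stand-in for the `llm` part of the copied config (assumed layout).
# Strings replace the real classes so the sketch runs on its own.
llm_with_quantization = {
    "type": "AutoModelForCausalLM.from_pretrained",
    "pretrained_model_name_or_path": "internlm/internlm2-chat-1_8b",
    "trust_remote_code": True,
    "torch_dtype": "torch.float16",
    # This is the block that gets commented out in the real config.
    "quantization_config": {
        "type": "BitsAndBytesConfig",
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
    },
}


def without_quantization(llm_cfg):
    """Return a copy of the config with the quantization block removed."""
    return {key: value for key, value in llm_cfg.items() if key != "quantization_config"}


llm_plain_lora = without_quantization(llm_with_quantization)
```

With the block removed, the base model is loaded at its configured precision instead of 4-bit, so the run becomes ordinary LoRA and needs more GPU memory than a quantized run would.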

4.2.4 All other changes

See the blog post in the references.

3. Launch fine-tuning

xtuner train /home/finetune_qlora/internlm2_chat_1_8b_qlora_alpaca_e3_copy.py

To use DeepSpeed, run the following command instead:

 xtuner train ./internlm2_chat_1_8b_qlora_alpaca_e3_copy.py  --deepspeed deepspeed_zero2

Step 5: Generate the HF model weights and verify the result

1. Merge the model weights

If you did not use DeepSpeed, follow the blog post in the references. With DeepSpeed, I ran into the problems below.

First, the torch.load problem:

In PyTorch 2.6, we changed the default value of the weights_only argument in torch.load from False to True. Re-running torch.load with weights_only set to False will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.

I went into the XTuner source and added weights_only=False to the torch.load call:

state_dict = torch.load(pth_model, map_location='cpu', weights_only=False)

Next, the problem converting to HF format:

(xtuner0121) root@qK6o3J:/home/finetune_qlora# xtuner convert pth_to_hf --fp32 ./internlm2_chat_1_8b_qlora_alpaca_e3_copy.py /home/finetune_qlora/work_dirs/internlm2_chat_1_8b_qlora_alpaca_e3_copy/iter_96.pth/mp_rank_00_model_states.pt ./hf
[2025-06-07 16:02:42,793] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2025-06-07 16:02:48,283] [INFO] [real_accelerator.py:254:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 4.75it/s]
Load State Dict: 0%| | 0/22 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/xtuner0121/xtuner/tools/model_converters/pth_to_hf.py", line 139, in <module>
    main()
  File "/home/xtuner0121/xtuner/tools/model_converters/pth_to_hf.py", line 115, in main
    set_module_tensor_to_device(model, name, 'cpu', param)
  File "/root/miniconda3/envs/xtuner0121/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 260, in set_module_tensor_to_device
    raise ValueError(f"{module} does not have a parameter or a buffer named {tensor_name}.")
ValueError: SupervisedFinetune(
  (data_preprocessor): BaseDataPreprocessor()
  (llm): PeftModelForCausalLM(
    (base_model): LoraModel(
      (model): InternLM2ForCausalLM(
        (model): InternLM2Model(
          (tok_embeddings): Embedding(92544, 2048, padding_idx=2)
          (layers): ModuleList(
            (0-23): 24 x InternLM2DecoderLayer(
              (attention): InternLM2Attention(
                (wqkv): lora.Linear(
                  (base_layer): Linear(in_features=2048, out_features=4096, bias=False)
                  (lora_dropout): ModuleDict(
                    (default): Dropout(p=0.1, inplace=False)
                  )
                  (lora_A): ModuleDict(
                    (default): Linear(in_features=2048, out_features=64, bias=False)
                  )
                  (lora_B): ModuleDict(
                    (default): Linear(in_features=64, out_features=4096, bias=False)
                  )
                  (lora_embedding_A): ParameterDict()
                  (lora_embedding_B): ParameterDict()
                  (lora_magnitude_vector): ModuleDict()
                )
                (wo): lora.Linear(
                  (base_layer): Linear(in_features=2048, out_features=2048, bias=False)
                  (lora_dropout): ModuleDict(
                    (default): Dropout(p=0.1, inplace=False)
                  )
                  (lora_A): ModuleDict(
                    (default): Linear(in_features=2048, out_features=64, bias=False)
                  )
                  (lora_B): ModuleDict(
                    (default): Linear(in_features=64, out_features=2048, bias=False)
                  )
                  (lora_embedding_A): ParameterDict()
                  (lora_embedding_B): ParameterDict()
                  (lora_magnitude_vector): ModuleDict()
                )
                (rotary_emb): InternLM2DynamicNTKScalingRotaryEmbedding()
              )
              (feed_forward): InternLM2MLP(
                (w1): lora.Linear(
                  (base_layer): Linear(in_features=2048, out_features=8192, bias=False)
                  (lora_dropout): ModuleDict(
                    (default): Dropout(p=0.1, inplace=False)
                  )
                  (lora_A): ModuleDict(
                    (default): Linear(in_features=2048, out_features=64, bias=False)
                  )
                  (lora_B): ModuleDict(
                    (default): Linear(in_features=64, out_features=8192, bias=False)
                  )
                  (lora_embedding_A): ParameterDict()
                  (lora_embedding_B): ParameterDict()
                  (lora_magnitude_vector): ModuleDict()
                )
                (w3): lora.Linear(
                  (base_layer): Linear(in_features=2048, out_features=8192, bias=False)
                  (lora_dropout): ModuleDict(
                    (default): Dropout(p=0.1, inplace=False)
                  )
                  (lora_A): ModuleDict(
                    (default): Linear(in_features=2048, out_features=64, bias=False)
                  )
                  (lora_B): ModuleDict(
                    (default): Linear(in_features=64, out_features=8192, bias=False)
                  )
                  (lora_embedding_A): ParameterDict()
                  (lora_embedding_B): ParameterDict()
                  (lora_magnitude_vector): ModuleDict()
                )
                (w2): lora.Linear(
                  (base_layer): Linear(in_features=8192, out_features=2048, bias=False)
                  (lora_dropout): ModuleDict(
                    (default): Dropout(p=0.1, inplace=False)
                  )
                  (lora_A): ModuleDict(
                    (default): Linear(in_features=8192, out_features=64, bias=False)
                  )
                  (lora_B): ModuleDict(
                    (default): Linear(in_features=64, out_features=2048, bias=False)
                  )
                  (lora_embedding_A): ParameterDict()
                  (lora_embedding_B): ParameterDict()
                  (lora_magnitude_vector): ModuleDict()
                )
                (act_fn): SiLU()
              )
              (attention_norm): InternLM2RMSNorm()
              (ffn_norm): InternLM2RMSNorm()
            )
          )
          (norm): InternLM2RMSNorm()
        )
        (output): lora.Linear(
          (base_layer): Linear(in_features=2048, out_features=92544, bias=False)
          (lora_dropout): ModuleDict(
            (default): Dropout(p=0.1, inplace=False)
          )
          (lora_A): ModuleDict(
            (default): Linear(in_features=2048, out_features=64, bias=False)
          )
          (lora_B): ModuleDict(
            (default): Linear(in_features=64, out_features=92544, bias=False)
          )
          (lora_embedding_A): ParameterDict()
          (lora_embedding_B): ParameterDict()
          (lora_magnitude_vector): ModuleDict()
        )
      )
    )
  )
) does not have a parameter or a buffer named module.

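Look at the last sentence: does not have a parameter or a buffer named module. DeepSpeed wraps the original model's weights inside a module entry in the checkpoint, so the conversion loop treats module itself as a parameter name. I changed the code to: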

# add by nju-niu: one line added here. I fine-tuned with DeepSpeed's
# deepspeed_zero2, which wraps the model in an extra 'module' layer.
# We only need to pull out the parameters for conversion, so this will do for now.
state_dict = state_dict['module']
for name, param in tqdm(state_dict.items(), desc='Load State Dict'):
    # print(f"name {name}\n" )
    set_module_tensor_to_device(model, name, 'cpu', param)
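The one-line patch assumes the checkpoint always has a module entry, so it would break on a checkpoint saved without DeepSpeed. A slightly more defensive variant is sketched below. The helper name `unwrap_state_dict` is hypothetical, and the toy checkpoints use plain floats and made-up parameter names in place of real tensors.

```python
def unwrap_state_dict(checkpoint, wrapper_key="module"):
    """Return the parameter dict, unwrapping a DeepSpeed-style wrapper if present."""
    inner = checkpoint.get(wrapper_key)
    if isinstance(inner, dict):
        # DeepSpeed-style file: the weights live one level down.
        return inner
    # Plain checkpoint: the top level already maps names to weights.
    return checkpoint


# Toy checkpoints for illustration only.
deepspeed_style = {
    "module": {"llm.lora_A.weight": 0.1, "llm.lora_B.weight": 0.2},
    "global_steps": 96,
}
plain_style = {"llm.lora_A.weight": 0.1, "llm.lora_B.weight": 0.2}

params_from_deepspeed = unwrap_state_dict(deepspeed_style)
params_from_plain = unwrap_state_dict(plain_style)
```

In the converter, this would replace the added line with `state_dict = unwrap_state_dict(state_dict)`, leaving the loop that follows unchanged.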

Finally, run the conversion with the following command:

 xtuner convert pth_to_hf --fp32 ./internlm2_chat_1_8b_qlora_alpaca_e3_copy.py  /home/finetune_qlora/work_dirs/internlm2_chat_1_8b_qlora_alpaca_e3_copy/iter_96.pth/mp_rank_00_model_states.pt  ./hf

After converting to Hugging Face format, we merge the original model with the LoRA adapter to produce the final model:

xtuner convert merge /home/ckpts/internlm2_1_8b_chat ./hf ./merged --max-shard-size 2GB
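Conceptually, merging folds each adapter into its base weight using the standard LoRA update W' = W + (alpha / r) * B A. The toy example below works through that arithmetic with plain Python lists. It is an illustration of the formula only, not XTuner's merge code, and the `alpha` value is an assumption because the scaling factor does not appear in this write-up.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 layer with rank 1. In the log above, wqkv maps 2048 features
# to 4096 with rank-64 adapters, but the arithmetic is the same.
weight = [[1.0, 0.0], [0.0, 1.0]]  # base weight, shape (out, in)
lora_a = [[1.0, 2.0]]              # adapter A, shape (rank, in)
lora_b = [[0.5], [1.0]]            # adapter B, shape (out, rank)

merged = merge_lora(weight, lora_a, lora_b, alpha=2, rank=1)
print(merged)  # [[2.0, 2.0], [2.0, 5.0]]
```

After merging, the adapter matrices are no longer needed at inference time, which is why the demo script only has to point at the merged directory.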

Step 6: Verify the final result

Change the model path in xtuner_streamlit_demo.py to the path of your merged model, then run:

streamlit run ./xtuner_streamlit_demo.py

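You should see the fine-tuned model's replies in the chat interface.

The results improve, but going further would still require full-parameter fine-tuning and a larger model.

II. References

Blog post: 书生大模型实战营-L1-XTuner微调个人小助手认知 - JunyaoHu (胡钧耀)

XTuner demo script: Tutorial/tools/L1_XTuner_code/xtuner_streamlit_demo.py at camp4 · InternLM/Tutorial. This script provides the chat visualization.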
