How to do DPO fine-tuning (QLoRA) with unsloth #7324

Answered by SnowFox4004
SnowFox4004 asked this question in Q&A

After printing the output of generate in the code, I found this:

all_logits: CausalLMOutputWithPast(loss=tensor(0.4934, device='cuda:0', grad_fn=<LinearCrossEntropyFunctionBackward>), logits=Unsloth: Logits are empty from 2024.11 onwards. To get raw logits again, please set the environment variable `UNSLOTH_RETURN_LOGITS` to `"1" BEFORE starting to train ie before `trainer.train()`. For example:
 \```
import os
os.environ['UNSLOTH_RETURN_LOGITS'] = '1'
trainer.train()
 \```
No need to restart your console - just add `os.environ['UNSLOTH_RETURN_LOGITS'] = '1'` before trainer.train() and re-run the cell!, past_key_values=None, hidden_states=None, attentions=None)

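This is caused by a change in unsloth: as the message above says, logits are no longer returned from 2024.11 onwards. Setting the environment variable `UNSLOTH_RETURN_LOGITS` to `"1"` before calling `trainer.train()` restores them.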
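A minimal sketch of the fix, based only on what the unsloth message above says. The helper name `enable_unsloth_logits` is hypothetical and exists just for illustration, and `trainer` is assumed to be a DPO trainer you have already configured elsewhere, so that call is left commented out.

```python
import os


def enable_unsloth_logits() -> None:
    """Ask unsloth to return raw logits (hypothetical helper, not part of unsloth)."""
    # The variable name and value come from the unsloth message quoted above.
    os.environ["UNSLOTH_RETURN_LOGITS"] = "1"


# Per the message, this must run BEFORE training starts.
enable_unsloth_logits()

# `trainer` is assumed to exist already (e.g. your configured DPO trainer):
# trainer.train()
```

According to the message, the console does not need a restart; re-running the cell with the variable set before `trainer.train()` should be enough.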

Answer selected by SnowFox4004