Pretraining data is constructed incorrectly #72

Open
hy-struggle wants to merge 1 commit into brightmart:master from hy-struggle:master

@hy-struggle

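In the original code, during the whole word masking (wwm) stage, the labels for the MLM task are constructed as Chinese characters prefixed with '##', but tokens of this form are meaningless in downstream tasks.
As a result, wwm provides no benefit in downstream tasks and actually hurts the model.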
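
The diff itself is not shown on this page, so the following is only a minimal sketch of the idea, not the actual change in this PR. It assumes the segmenter marks non-initial characters of a Chinese word with '##' purely to record word boundaries, and that the marker should be stripped from both the model inputs and the MLM labels. The names `strip_cjk_marker` and `mask_whole_words` are hypothetical.

```python
import re

# A single Chinese character carrying the '##' continuation marker.
# English word pieces such as '##ing' do not match and are left untouched.
_CJK_CONTINUATION = re.compile(r"^##[\u4e00-\u9fa5]$")


def strip_cjk_marker(token):
    """Return the plain character for a '##'-marked Chinese token."""
    return token[2:] if _CJK_CONTINUATION.match(token) else token


def mask_whole_words(tokens, word_starts, mask_token="[MASK]"):
    """Mask every token of each word whose first token index is in word_starts.

    The '##' prefix is used only to find where a word ends. Labels and
    unmasked inputs hold the plain character, which is the form a
    character-level tokenizer produces for Chinese text downstream.
    """
    output = [strip_cjk_marker(t) for t in tokens]
    labels = {}
    for start in sorted(word_starts):
        end = start + 1
        # Extend the span over all continuation tokens of the same word.
        while end < len(tokens) and tokens[end].startswith("##"):
            end += 1
        for i in range(start, end):
            labels[i] = strip_cjk_marker(tokens[i])
            output[i] = mask_token
    return output, labels


tokens = ["使", "##用", "语", "##言", "模", "##型"]
inputs, labels = mask_whole_words(tokens, {2})
print(inputs)
print(labels)
```

With this approach the two characters of the masked word are predicted as plain characters, while genuine English word pieces keep their '##' prefix.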
