Dataloader for HuggingFace gpt/gpt-2 and our Chinese gpt #364

Open
lemon234071 wants to merge 6 commits into thu-coai:master from lemon234071:yida_Chinese-gpt

Conversation

@lemon234071
Member

Description:
Added a dataloader for the Chinese GPT model implemented with pytorch-transformers.

Reference Issues: #XX (XX is the issue number you work on)
Dataloader for huggingface transformers #1300
1. Added two classes, HGFSingleTurnDialog and HGFCleanWB, which only add formatted inputs for pytorch-transformers. Everything else is the same as in BERTSingleTurnDialog and BERTOpenSubtitles.
2. The tokenizer is hard to change to fit the model; we may need a general base class for pytorch-transformers.
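
One possible shape for the general base class mentioned in point 2, as a rough sketch only. The names `PretrainedInputFormatter`, `ConcatFormatter`, `format_pair`, and `ToyWhitespaceTokenizer` are hypothetical and are not part of this PR or of cotk. The only assumption about the tokenizer is that it exposes `tokenize` and `convert_tokens_to_ids`, as pytorch-transformers tokenizers do.

```python
from typing import Dict, List, Protocol


class TokenizerLike(Protocol):
    """Minimal tokenizer interface assumed by this sketch."""

    def tokenize(self, text: str) -> List[str]: ...

    def convert_tokens_to_ids(self, tokens: List[str]) -> List[int]: ...


class PretrainedInputFormatter:
    """Hypothetical base class: owns the tokenizer, leaves input layout to subclasses."""

    def __init__(self, tokenizer: TokenizerLike, max_len: int = 32) -> None:
        self.tokenizer = tokenizer
        self.max_len = max_len

    def encode(self, text: str) -> List[int]:
        # Tokenize, truncate to max_len tokens, then map tokens to ids.
        tokens = self.tokenizer.tokenize(text)[: self.max_len]
        return self.tokenizer.convert_tokens_to_ids(tokens)

    def format_pair(self, post: str, resp: str) -> Dict[str, List[int]]:
        # Each model-specific subclass decides how a post/response pair is laid out.
        raise NotImplementedError


class ConcatFormatter(PretrainedInputFormatter):
    """Example subclass: post followed by response, with segment ids 0 and 1."""

    def format_pair(self, post: str, resp: str) -> Dict[str, List[int]]:
        post_ids = self.encode(post)
        resp_ids = self.encode(resp)
        return {
            "input_ids": post_ids + resp_ids,
            "token_type_ids": [0] * len(post_ids) + [1] * len(resp_ids),
        }


class ToyWhitespaceTokenizer:
    """Stand-in tokenizer so the sketch runs without any model files."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def tokenize(self, text: str) -> List[str]:
        return text.split()

    def convert_tokens_to_ids(self, tokens: List[str]) -> List[int]:
        # Assign ids in order of first appearance.
        return [self.vocab.setdefault(tok, len(self.vocab)) for tok in tokens]


formatter = ConcatFormatter(ToyWhitespaceTokenizer())
batch = formatter.format_pair("how are you", "fine thanks")
print(batch)
```

With this split, swapping in a different pretrained tokenizer would only mean passing another object to the constructor, while a model with a different input layout would get its own subclass.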

