
How to fine-tune T5 with a Causal Language Modeling objective? #1097

Open
@nanbeitk

Description

Dear all,
I am new to NLP and have some questions that may sound strange; I will try to explain them clearly.

My goal is to use a specific corpus to fine-tune the t5-base model with a causal language modeling (CLM) objective. I found this document, and it uses AutoModelForCausalLM, but that class simply does not cover the T5 family of models.
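
For reference, this is roughly what I ran (error message paraphrased from memory, not copied verbatim):

```python
from transformers import AutoModelForCausalLM

# As far as I can tell, T5Config has no entry in the AutoModelForCausalLM
# mapping, so this raises a ValueError along the lines of
# "Unrecognized configuration class ... for this kind of AutoModel".
model = AutoModelForCausalLM.from_pretrained("t5-base")
```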

So my questions are:

  1. How should I fine-tune the t5 model for a CLM objective? In my understanding, CLM is the process of predicting token_2 from token_1, then token_3 from token_1 and token_2, and so on until the end of the input sequence, so I am confused about how to implement this process myself.

  2. I tried to split one of my training examples into something like this (t_i == token_i, 1 == eos_token):

     input_ids                          labels
     [t1, 1, 1, 1, 1, 1, ...]           [t1, t2, 1, 1, 1, 1, ...]
     [t1, t2, 1, 1, 1, 1, ...]          [t1, t2, t3, 1, 1, 1, ...]
     [t1, t2, t3, 1, 1, 1, ...]         [t1, t2, t3, t4, 1, 1, ...]
     [t1, t2, t3, t4, 1, 1, ...]        [t1, t2, t3, t4, t5, 1, ...]
    The first problem is obvious: the expanded dataset is far larger and takes much longer to fine-tune on. The second problem is that this setup seems strange, and I don't know whether it actually fulfills the CLM objective. It is the only idea I could come up with; does it work? (A sketch of the single-pass alternative I am asking about follows this list.)
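
To make the question concrete, here is a minimal sketch (my own guess from the transformers docs, not a confirmed recipe) of the single-pass alternative I am wondering about. T5ForConditionalGeneration shifts `labels` right internally to build the decoder inputs, so one forward pass with teacher forcing scores every next-token prediction at once, with no need to expand the dataset row by row as in the table above. Note the encoder also sees the full sequence here, so it may not be a strictly causal setup; perhaps only a short prefix should be fed to the encoder instead.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

text = "one sentence from my corpus"  # placeholder training example
enc = tokenizer(text, return_tensors="pt")  # appends eos (id 1) automatically

# Passing the sequence as `labels` makes the model shift it right
# internally to build decoder_input_ids, so the token at position i is
# predicted only from decoder tokens at positions < i -- the same
# "t1 -> t2, (t1, t2) -> t3, ..." process described above, in one pass.
outputs = model(input_ids=enc.input_ids, labels=enc.input_ids)
outputs.loss.backward()  # cross-entropy averaged over all target positions
```

Would a loop of such forward passes over my corpus count as CLM fine-tuning for T5?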

Thanks!!

