
How to fine-tune T5 with a Causal Language Modeling objective? #1097

Open
@nanbeitk

Description

Dear all,
I am new to NLP and have some questions that may sound strange; I will try to explain them clearly.

My goal is to use a specific corpus to fine-tune the t5-base model with a causal language modeling (CLM) objective. I found this document, and it uses AutoModelForCausalLM, but that class simply does not cover the T5 family of models.
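
For reference, this is roughly what I ran (error message paraphrased from memory, not copied verbatim):

```python
from transformers import AutoModelForCausalLM

# As far as I can tell, T5Config has no entry in the AutoModelForCausalLM
# mapping, so this raises a ValueError along the lines of
# "Unrecognized configuration class ... for this kind of AutoModel".
model = AutoModelForCausalLM.from_pretrained("t5-base")
```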

So my questions are:

  1. How should I fine-tune the t5 model for a CLM objective? In my understanding, CLM is the process of predicting token_2 from token_1, then token_3 from token_1 and token_2, and so on until the end of the input sequence, so I am confused about how to implement this process myself.

  2. I tried to split one of my training examples into something like this (t_i == token_i, 1 == eos_token):

     input_ids                          labels
     [t1, 1, 1, 1, 1, 1, ...]           [t1, t2, 1, 1, 1, 1, ...]
     [t1, t2, 1, 1, 1, 1, ...]          [t1, t2, t3, 1, 1, 1, ...]
     [t1, t2, t3, 1, 1, 1, ...]         [t1, t2, t3, t4, 1, 1, ...]
     [t1, t2, t3, t4, 1, 1, ...]        [t1, t2, t3, t4, t5, 1, ...]
    The first problem is obvious: the expanded dataset is far larger and takes much longer to fine-tune on. The second problem is that this setup seems strange, and I don't know whether it actually fulfills the CLM objective. It is the only idea I could come up with; does it work? (A sketch of the single-pass alternative I am asking about follows this list.)
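
To make the question concrete, here is a minimal sketch (my own guess from the transformers docs, not a confirmed recipe) of the single-pass alternative I am wondering about. T5ForConditionalGeneration shifts `labels` right internally to build the decoder inputs, so one forward pass with teacher forcing scores every next-token prediction at once, with no need to expand the dataset row by row as in the table above. Note the encoder also sees the full sequence here, so it may not be a strictly causal setup; perhaps only a short prefix should be fed to the encoder instead.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

text = "one sentence from my corpus"  # placeholder training example
enc = tokenizer(text, return_tensors="pt")  # appends eos (id 1) automatically

# Passing the sequence as `labels` makes the model shift it right
# internally to build decoder_input_ids, so the token at position i is
# predicted only from decoder tokens at positions < i -- the same
# "t1 -> t2, (t1, t2) -> t3, ..." process described above, in one pass.
outputs = model(input_ids=enc.input_ids, labels=enc.input_ids)
outputs.loss.backward()  # cross-entropy averaged over all target positions
```

Would a loop of such forward passes over my corpus count as CLM fine-tuning for T5?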

Thanks!!

