Clarification on Teacher Checkpoint and Dataset Details for Student Model Training #246

Open
@dargma

Hi,

I'm trying to reproduce the Sana-Sprint experiment and have two questions regarding the configuration and dataset used for training the student model:

  1. Teacher Checkpoint for Student Initialization:
    As I understand from the Sana-Sprint paper, the student model should be initialized from a fine-tuned teacher model. However, in the configuration file (SanaSprint_1600M_1024px_allqknorm_bf16_scm_ladd.yaml), model.load_from is set as follows:

    model.load_from: hf://Efficient-Large-Model/Sana_Sprint_1.6B_1024px/checkpoints/Sana_Sprint_1.6B_1024px.pth

    Should this instead point to the teacher checkpoint (for example, something like Sana_Sprint_1.6B_1024px_teacher.pth), which is then used to initialize the student model? Please confirm whether the current setting is correct.

  2. Dataset Details for Student Training:
    I could not find enough information about the dataset used to train the student model. Could you provide details on:

    • The number of image-caption pairs used.
    • The sources of the images (e.g., LAION, SA-1B, internal databases, etc.).
    • How the captions were generated (e.g., using models like GPT-4 or other methods).
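For context on question 1, here is a minimal sketch of how the `hf://` path in `model.load_from` could be split into a Hub repository id and a file path, so that one can list the `.pth` files the repository contains and see whether a separate teacher checkpoint is published. It assumes the path has the form `hf://<org>/<repo>/<path in repo>`; the Sana loader may resolve it differently. The helper names `split_hf_uri` and `list_checkpoints` are hypothetical and are not part of the Sana codebase.

```python
from typing import List, Tuple

HF_PREFIX = "hf://"


def split_hf_uri(uri: str) -> Tuple[str, str]:
    """Split 'hf://<org>/<repo>/<path>' into (repo_id, path_in_repo).

    Assumption: the first two path components form the Hub repo id.
    """
    if not uri.startswith(HF_PREFIX):
        raise ValueError(f"not an hf:// URI: {uri!r}")
    # Split into at most three parts: org, repo, and the remaining file path.
    parts = uri[len(HF_PREFIX):].split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected hf://<org>/<repo>/<path>, got {uri!r}")
    org, repo, path_in_repo = parts
    return f"{org}/{repo}", path_in_repo


def list_checkpoints(uri: str) -> List[str]:
    """List the .pth files in the repository that the URI points at.

    Requires network access and the huggingface_hub package.
    """
    # Imported lazily so that split_hf_uri works without huggingface_hub.
    from huggingface_hub import list_repo_files

    repo_id, _ = split_hf_uri(uri)
    return [name for name in list_repo_files(repo_id) if name.endswith(".pth")]


if __name__ == "__main__":
    uri = (
        "hf://Efficient-Large-Model/Sana_Sprint_1.6B_1024px/"
        "checkpoints/Sana_Sprint_1.6B_1024px.pth"
    )
    repo_id, path_in_repo = split_hf_uri(uri)
    print(repo_id)
    print(path_in_repo)
```

Calling `list_checkpoints(uri)` on the configured path would show which checkpoint files are actually available, though it cannot tell by itself which one the authors intended as the student initialization.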

Any clarification or additional documentation on these points would be greatly appreciated.

Thank you!

Labels: Answered
