Description
Hi,
I'm trying to reproduce the Sana-Sprint experiment and have two questions regarding the configuration and dataset used for training the student model:
- **Teacher Checkpoint for Student Initialization:**
  As I understand from the Sana-Sprint paper, the student model should be initialized from a fine-tuned teacher model. However, in the configuration file (`SanaSprint_1600M_1024px_allqknorm_bf16_scm_ladd.yaml`), the parameter is set as follows:
  `model.load_from: hf://Efficient-Large-Model/Sana_Sprint_1.6B_1024px/checkpoints/Sana_Sprint_1.6B_1024px.pth`
  Should this checkpoint actually be the teacher checkpoint (for example, something like `Sana_Sprint_1.6B_1024px_teacher.pth`), which would then be used to initialize the student model? Please confirm whether the current setting is correct (see the sketch after this list for what I would have expected).
- **Dataset Details for Student Training:**
  There isn’t enough information available about the dataset used for training the student model. Could you provide details on:
  - The number of image-caption pairs used.
  - The sources of the images (e.g., LAION, SA-1B, internal databases, etc.).
  - How the captions were generated (e.g., using models like GPT-4 or other methods).
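For reference, here is a minimal sketch of what I would have expected the config to contain if the student were initialized from a dedicated teacher checkpoint. This is only my assumption: the `Sana_Sprint_1.6B_1024px_teacher.pth` filename is hypothetical (taken from my example above), and I'm reading the dotted path `model.load_from` as a nested YAML key, which may not match the actual config layout.

```yaml
model:
  # Hypothetical teacher checkpoint used to initialize the student; the
  # *_teacher.pth filename is my guess, not a confirmed file on the Hub.
  load_from: hf://Efficient-Large-Model/Sana_Sprint_1.6B_1024px/checkpoints/Sana_Sprint_1.6B_1024px_teacher.pth
```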
Any clarification or additional documentation on these points would be greatly appreciated.
Thank you!