
FEAT: Generate a REAP-recipe.yaml and REAP-scores.yaml #2

@johnr14

Description


Hi, I was wondering whether the method could be applied to the base model to obtain a smaller starting point for full instruction tuning.

Would it be possible to generate a sort of REAP recipe of pruning steps that could be used to re-generate the REAP model?

Applying the same recipe to the BASE model should then yield a strong base model, roughly reproducing a similar starting point for SFT.

I would therefore propose adding a REAP-recipe.yaml file to the generated models; the REAP codebase should generate that file automatically when uploading REAP models to HF.
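As a purely illustrative sketch (every field name below is hypothetical and not part of any existing REAP output format), such a recipe could look like:

```yaml
# REAP-recipe.yaml -- hypothetical sketch; all field names are illustrative
base_model: org/base-moe-model     # checkpoint the recipe was computed on
pruning:
  method: prune                    # prune | merge, per the paper's comparison
  fraction: 0.25                   # share of experts removed per MoE layer
calibration:
  datasets:
    - some-org/public-dataset      # placeholder; list only public datasets
  num_samples: 1024
  max_seq_len: 4096
pruned_experts:                    # exact expert indices removed, per layer
  layer_0: [3, 17, 41]
  layer_1: [8, 22]
comments:
  runtime: "~N GPU-hours"          # optional: rough compute involved
  hardware: "8x H100"              # optional: hardware configuration
```

Replaying such a file against the BASE checkpoint would presumably just mean dropping the listed expert indices from each MoE layer, which is exactly what makes the result reproducible.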

It would notably make it possible to compare REAP models generated with different datasets, different pruning percentages, and even different merging methods.

The recipe could also be used to compare final models when changing the dataset composition, notably across different languages (Chinese, English, French, German...).

It would also be possible to publish a detailed version of the expert-activation REAP scores, which could be used to try different pruning percentages without recomputing the expert activations for a given dataset. Sharing such REAP-scores.yaml files could enable additional research into expert merging, pruning, or novel approaches (see the sketch after the quote below). The REAP score would be a summary of:

By taking into consideration both the router gate-values and expert activation norms, REAP prunes the experts which contribute the least to each layer's output on a per-token average, regardless of usage frequency.
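Concretely, the stored score per expert could be the per-token average described above, i.e. roughly S_j = mean_t( g_j(x_t) * ||f_j(x_t)|| ), where g_j is the router gate-value and f_j the expert output. A minimal sketch of the file, with purely hypothetical keys:

```yaml
# REAP-scores.yaml -- hypothetical sketch; keys are illustrative only
calibration:
  datasets:
    - some-org/public-dataset
  num_samples: 1024
  max_seq_len: 4096
scores:                    # per-layer, per-expert saliency
  layer_0:
    expert_0: 0.0132       # mean over tokens of gate-value * expert output norm
    expert_1: 0.0458
    expert_2: 0.0067       # low score -> pruning candidate at higher %
  layer_1:
    expert_0: 0.0291
```

With such a file, trying a different pruning percentage would reduce to sorting experts by score within each layer and cutting the bottom fraction, with no recomputation on the calibration dataset.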

Publishing the REAP-recipe.yaml and REAP-scores.yaml could enable quick prototyping of custom REAP-pruned models for evaluation in different scenarios. These files should detail the number of samples, their lengths, and the datasets used (if public). Comments on the run time and hardware configuration used would also be nice, to give an idea of the work involved.

This could enable domain-specific evaluations using small targeted datasets, comparing the results both between them and against the reference REAP model generated from a more comprehensive dataset.

Ideally, REAP models could be recreated with a tool like mergekit.

I believe that this recipe creation would be the next step in evaluating REAP performance in various scenarios. It is definitely a step in the right direction:

Finally, this work highlights the importance of comprehensive downstream evaluations and the significant challenges involved with evaluating LLMs.

Thanks
