Conversation

@Kostis-S-Z commented Feb 18, 2025

What's changing

This improvement makes it possible to finetune whisper large-v3(!) locally or on Colab with an 8 GB VRAM GPU! The previous code could only finetune whisper small on such a machine.
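For context, a minimal sketch of the standard recipe that makes this possible: load the frozen base model in 8-bit and train only LoRA adapters via peft. The rank, alpha, and target modules below are illustrative assumptions, not necessarily what this PR uses:

```python
from transformers import BitsAndBytesConfig, WhisperForConditionalGeneration
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the frozen base weights in int8 so whisper large-v3 fits in ~8GB VRAM
model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-large-v3",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Train only small low-rank adapters on the attention projections
lora_config = LoraConfig(
    r=32,                                 # illustrative rank
    lora_alpha=64,
    target_modules=["q_proj", "v_proj"],  # illustrative choice of modules
    lora_dropout=0.05,
    bias="none",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of params train
```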

Closes #2

How to test it

Steps to test the changes:

Additional notes for reviewers

The issue is that the model produced by training is a PEFT model, meaning you can't load and use it the same way as a standard model from the Trainer. This means we need to add extra custom code for the Transcription app and for evaluating the dataset.
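A minimal sketch of what that extra loading step looks like, assuming the adapter was saved in the usual peft format; `ADAPTER_DIR` is a hypothetical path to wherever the Trainer wrote the adapter:

```python
from transformers import WhisperForConditionalGeneration, WhisperProcessor
from peft import PeftModel

BASE_MODEL = "openai/whisper-large-v3"
ADAPTER_DIR = "path/to/trainer-output"  # hypothetical adapter location

# The Trainer output contains only the LoRA adapter weights, so the base
# checkpoint has to be loaded first and the adapter applied on top of it.
base = WhisperForConditionalGeneration.from_pretrained(BASE_MODEL)
model = PeftModel.from_pretrained(base, ADAPTER_DIR)

# Optionally fold the adapters back into the base weights, after which the
# model behaves like a plain WhisperForConditionalGeneration again.
model = model.merge_and_unload()

processor = WhisperProcessor.from_pretrained(BASE_MODEL)
```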

I already...

  • Tested the changes in a working environment to ensure they work as expected
  • Added some tests for any new functionality
  • Updated the documentation (both comments in code and under /docs)

@Kostis-S-Z linked an issue Feb 18, 2025 that may be closed by this pull request
@Kostis-S-Z added the enhancement (New feature or request), help wanted (Extra attention is needed), and paused (External contributions to this PR are welcomed. Internally, work on this is paused.) labels Feb 18, 2025
