update whisperx to 3.8.1 #1776

Open
bgruening wants to merge 1 commit into master from whisperx381
Conversation

@bgruening
Owner

No description provided.

@bgruening
Owner Author

@arash77 any idea why the test fails?

Or better, how could the test pass in your first version? :)

@arash77
Contributor

arash77 commented Feb 16, 2026

@bgruening I can't remember how the tests passed before! But should we include the models in the Dockerfile so that the CI can access them?

@bgruening
Owner Author

No, they are too large, I think. Our models are here: https://github.com/usegalaxy-eu/infrastructure-playbook/blob/master/files/galaxy/tpv/tools.yml#L523

@arash77
Contributor

arash77 commented Feb 16, 2026

Previously, the test for this tool might not have even run because we didn't pass WHISPERX_MODEL_DIR to it. How can we test the tool in CI if we don't have access to the models? Should we use the expect_failure approach?

@bgruening
Owner Author

I don't know, that was my question. The CI definitely ran ... and it turned green. Could it be that Hugging Face downloaded the model on the fly?

@arash77
Contributor

arash77 commented Feb 16, 2026

But that needs an hf_token from an account that has accepted the terms and conditions to download the models. Since we are using --model_cache_only True, it shouldn't be downloading them anyway.
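
One way to check whether a run could have found the models locally is to look for them in the cache directory. A minimal sketch, assuming the models sit in the standard Hugging Face hub cache layout (`models--<org>--<name>/snapshots/<revision>/`) under `WHISPERX_MODEL_DIR`; the helper name `is_model_cached` is hypothetical, and the layout is an assumption rather than something confirmed for this tool:

```python
import os
from pathlib import Path


def is_model_cached(cache_dir: str, repo_id: str) -> bool:
    """Return True if repo_id has at least one snapshot in a hub-style cache.

    Assumes the layout <cache_dir>/models--<org>--<name>/snapshots/<revision>/.
    """
    repo_folder = Path(cache_dir) / ("models--" + repo_id.replace("/", "--"))
    snapshots = repo_folder / "snapshots"
    return snapshots.is_dir() and any(snapshots.iterdir())


if __name__ == "__main__":
    # WHISPERX_MODEL_DIR is the variable mentioned in this thread.
    model_dir = os.environ.get("WHISPERX_MODEL_DIR", "")
    for repo in ("pyannote/speaker-diarization-community-1",):
        found = bool(model_dir) and is_model_cached(model_dir, repo)
        print(repo, "cached" if found else "missing")
```

If this reports "missing" in CI while the test still turns green, the test is probably not exercising the model at all.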

@arash77
Contributor

arash77 commented Feb 16, 2026

Also, after the upgrade, WhisperX now requires another model: pyannote/speaker-diarization-community-1.

@bgruening
Owner Author

Ok, then I have no clue how this ever worked on CI :)

Do you have time to test it locally, and then we merge? Or should we YOLO?

@arash77
Contributor

arash77 commented Feb 16, 2026

I'm trying to test it, but it requires a GPU! I'm also not sure if a GPU is available on GitHub Actions, but I see we fall back to CPU here.
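
For local testing without a GPU, the fallback could look roughly like this. It is only a sketch: the helper names are hypothetical, the wrapper's actual fallback logic may differ, and `int8` is used because the WhisperX README suggests that compute type for CPU-only runs.

```python
def pick_device(cuda_available: bool) -> dict:
    """Choose WhisperX CLI options for the available hardware."""
    if cuda_available:
        return {"device": "cuda", "compute_type": "float16"}
    # CPU-only run: float16 is not usable here, so use int8 instead.
    return {"device": "cpu", "compute_type": "int8"}


def cuda_is_available() -> bool:
    """Probe for CUDA without failing when PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


if __name__ == "__main__":
    opts = pick_device(cuda_is_available())
    print(f"--device {opts['device']} --compute_type {opts['compute_type']}")
```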

@arash77
Contributor

arash77 commented Feb 16, 2026

Based on the WhisperX documentation, some models need to be added; then we can run it on Galaxy to test whether it works.
- This should definitely be done: pyannote/speaker-diarization-3.1 is no longer needed and has been replaced by pyannote/speaker-diarization-community-1, so we have to add this model to the infrastructure.
- Not sure about this one, but the NLTK punkt_tab tokenizer seems to be required for alignment.

@arash77
Copy link
Copy Markdown
Contributor

arash77 commented Feb 17, 2026

I have tested it, and the Docker container is safe to work with. However, the models should be updated as I mentioned before, perhaps using this script. Also, make sure to accept the terms for pyannote/speaker-diarization-community-1 and provide the HF_AUTH_TOKEN to it.
Then add export NLTK_DATA=\${WHISPERX_MODEL_DIR}/nltk_data && to the tool.
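
To make the two steps concrete, here is a rough sketch of what preparing the model directory might involve. It is not the script linked above: the function names are hypothetical, and storing the pyannote model directly under `WHISPERX_MODEL_DIR` via `cache_dir` is an assumption about the layout the tool expects. `prefetch` needs network access, `nltk`, `huggingface_hub`, and a token whose account has accepted the model terms.

```python
import os


def nltk_data_dir(model_dir: str) -> str:
    """Location of the NLTK data inside the shared model directory."""
    return os.path.join(model_dir, "nltk_data")


def tool_env(model_dir: str, base_env=None) -> dict:
    """Build the environment the tool command would run with."""
    env = dict(os.environ if base_env is None else base_env)
    env["WHISPERX_MODEL_DIR"] = model_dir
    # Mirrors: export NLTK_DATA=${WHISPERX_MODEL_DIR}/nltk_data
    env["NLTK_DATA"] = nltk_data_dir(model_dir)
    return env


def prefetch(model_dir: str, hf_token: str) -> None:
    """Download the punkt_tab tokenizer and the new diarization model."""
    import nltk
    from huggingface_hub import snapshot_download

    nltk.download("punkt_tab", download_dir=nltk_data_dir(model_dir))
    snapshot_download(
        repo_id="pyannote/speaker-diarization-community-1",
        cache_dir=model_dir,  # assumed layout, see note above
        token=hf_token,
    )
```

Calling `prefetch(model_dir, token)` once on the infrastructure side and running the tool with `tool_env(model_dir)` would keep `--model_cache_only True` usable, since nothing has to be downloaded at job time.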
