Replies: 2 comments
Hi @Giak1234. I'm currently traveling on vacation, but I'll try to give you some tips.
AFAIK this won't work offline out of the box. Try to set up and run the transcription on a machine with Internet access; the model will be downloaded on the first run. Then copy the entire contents of the C:\Users\your_user\.huggingface folder to the machine without Internet access and see if that works.
To transcribe the audio tracks of videos, add video; to the mimesToProcess option in the same config file. I also suggest you try to run the transcription on the CPU first. The IPED Python plugins package you downloaded works only on the CPU, as stated by its name. Once that works, try running on the GPU. For that, you will need to install the required Python libraries with GPU support; please check our manual in the wiki for further details.
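The cache-copy step above can be sketched with Python's standard library. The source path is the default Hugging Face cache location mentioned above; the destination is just an example staging folder to carry over to the offline machine (both paths are assumptions you should adjust):

```python
import shutil
from pathlib import Path

def copy_hf_cache(src: Path, dst: Path) -> Path:
    """Copy the whole Hugging Face cache folder (downloaded models) to dst."""
    return Path(shutil.copytree(src, dst, dirs_exist_ok=True))

# Assumed default cache path on Windows; D:\transfer is a hypothetical
# staging folder you would carry to the offline machine.
src = Path.home() / ".huggingface"
dst = Path(r"D:\transfer\.huggingface")

# Only meaningful on the machine that already ran a transcription once.
if src.exists():
    copy_hf_cache(src, dst)
```

On the offline machine, the copied folder then goes back under C:\Users\your_user\ so the libraries find the model without trying to download it.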
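For the video formats, the change would look something like the excerpt below in conf/AudioTranscriptTaskConfig.txt. This is only an illustration of appending video; to the existing list; check the exact key and current value in your own config file:

```
# AudioTranscriptTaskConfig.txt (excerpt, illustrative)
# append "video;" so video files are also sent to transcription
mimesToProcess = audio;video;
```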
I've tried everything (I think)... installed all kinds of Python libraries required during testing: CUDA, PyTorch, faster-whisper, CTranslate2, Hugging Face... but nothing works!
Good evening,
I’m looking for a solution to make audio/video file transcription (in various formats) work with Whisper.
Is anyone able to share the correct setup for Windows 11 (with a GPU featuring 16GB dedicated VRAM and 128GB RAM) to use a model larger than “medium”, for example “large-v2”?
Here is what I have already done:
I’m working offline, so as indicated in the file iped-4-2-2/conf/AudioTranscriptTaskConfig.txt, I downloaded the model from the faster-whisper-large-v2 repository on Hugging Face, already converted for use with CTranslate2 (the backend faster-whisper uses): https://huggingface.co/guillaumekln/faster-whisper-large-v2
I placed the downloaded files in the folders I created (fastwhisper and large-v2) under the path:
iped-4-2-2/models/fastwhisper/large-v2
(containing: config.json, model.bin, tokenizer.json, vocabulary.txt)
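Since a missing or misplaced file is a common cause of model-loading errors, here is a quick standard-library check that the local model folder contains the files listed above (the folder path is the one created in the previous step; adjust it to your install):

```python
from pathlib import Path

# File names taken from the model folder contents described above.
REQUIRED = {"config.json", "model.bin", "tokenizer.json", "vocabulary.txt"}

def missing_model_files(model_dir: Path) -> set:
    """Return the names of required model files absent from model_dir."""
    present = {p.name for p in model_dir.iterdir()} if model_dir.is_dir() else set()
    return REQUIRED - present

# Example path from the setup described above.
model_dir = Path(r"iped-4-2-2/models/fastwhisper/large-v2")
missing = missing_model_files(model_dir)
if missing:
    print("Missing model files:", sorted(missing))
```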
I replaced the Python folder (iped-4-2-2/python) with the one included in IPED-4.2.x_plugins_Whisper_FaceRecognition_Win64_CPU.zip
I configured the parameters in AudioTranscriptTaskConfig.txt as follows:
whisperModel = large-v2
device = gpu
precision = int8
batchSize = 1
I installed CUDA Toolkit > 12 with all required dependencies and added the environment variables
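One thing worth sanity-checking after that install is whether the CUDA environment variables are actually visible to the process running IPED. A minimal standard-library sketch, assuming a Windows setup where the CUDA Toolkit installer sets CUDA_PATH and its bin folder must appear on PATH for the runtime DLLs to be found:

```python
import os

def cuda_env_report(env: dict) -> dict:
    """Report whether CUDA_PATH is set and whether it appears on PATH.

    Windows is assumed (the setup above is Windows 11), so the ';'
    PATH separator is hardcoded.
    """
    cuda_path = env.get("CUDA_PATH", "")
    on_path = bool(cuda_path) and any(
        cuda_path.lower() in part.lower() for part in env.get("PATH", "").split(";")
    )
    return {"CUDA_PATH set": bool(cuda_path), "CUDA on PATH": on_path}

# Inspect the real environment of the current process:
print(cuda_env_report(dict(os.environ)))
```

If either entry comes back False, the transcription process may not be able to locate the CUDA runtime even though the Toolkit is installed.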
When I run the command
iped.exe -d image.dd -o output
I get the following error:
“Processing Error: Error loading 'large-v2' transcription model.”
Any suggestions?