As the title says: on Windows I want to use FunASR to label my audio. I installed funasr and modelscope with pip, but running the fap transcribe command fails with an error:
(FishAudio) PS D:\DLWorkshop\audio-preprocess> fap transcribe .\all_data\sliced\ --model-type funasr --recursive
2024-07-13 16:23:40.875 | INFO | fish_audio_preprocess.cli.transcribe:transcribe:72 - Using paraformer-zh model for funasr as default
2024-07-13 16:23:40.892 | INFO | fish_audio_preprocess.cli.transcribe:transcribe:80 - Using 2 workers for processing
2024-07-13 16:23:40.892 | INFO | fish_audio_preprocess.cli.transcribe:transcribe:81 - Transcribing audio files in .\all_data\sliced\
2024-07-13 16:23:46.168 | INFO | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
2024-07-13 16:23:46.168 | INFO | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
You are using the latest version of funasr-1.1.0
You are using the latest version of funasr-1.1.0
2024-07-13 16:23:46,993 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:46,993 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:47,045 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:47,045 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:52,031 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:52,031 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:23:52,199 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:23:52,199 - modelscope - INFO - Use user-specified model revision: master
Downloading: 100%|█████████████████████| 10.6k/10.6k [00:00<00:00, 29.9kB/s]
Download: iic/speech_fsmn_vad_zh-cn-16k-common-pytorch failed!: [WinError 32] 另一个程序正在使用此文件,进程无法访问。: 'C:\\****\\._____temp\\iic\\speech_fsmn_vad_zh-cn-16k-common-pytorch\\README.md'
Downloading: 100%|█████████████████████| 10.6k/10.6k [00:00<00:00, 30.3kB/s]
2024-07-13 16:23:52,760 - modelscope - ERROR - File C****\._____temp\iic\speech_fsmn_vad_zh-cn-16k-common-pytorch\README.md integrity check failed, expected sha256 signature is 991885cf850e1629de7ce0624d83916e45791cba8049f0e4899477e7837f4f5e, actual is
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\****\.conda\envs\FishAudio\lib\concurrent\futures\process.py", line 246, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "D:\DLWorkshop\audio-preprocess\fish_audio_preprocess\utils\transcribe.py", line 41, in batch_transcribe
model = AutoModel(
File "C:\Users\****\.conda\envs\FishAudio\lib\site-packages\funasr\auto\auto_model.py", line 134, in __init__
vad_model, vad_kwargs = self.build_model(**vad_kwargs)
File "C:\Users\****\.conda\envs\FishAudio\lib\site-packages\funasr\auto\auto_model.py", line 218, in build_model
assert model_class is not None, f'{kwargs["model"]} is not registered'
AssertionError: fsmn-vad is not registered
A quick investigation showed that fap transcribe starts 2 workers by default:
# cli\transcribe.py line 19
@click.option(
    "--num-workers",
    help="Number of workers to use for processing, defaults to 2",
    default=2,
    show_default=True,
    type=int,
)
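As a result, if the user has not already downloaded the model locally, the FunASR processes started by the workers all try to download the model files into the same modelscope ._____temp folder at the same time, and Windows raises [WinError 32] (the file is being used by another process).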
After adding the --num-workers 1 option, the problem no longer occurs.
(FishAudio) PS D:\DLWorkshop\audio-preprocess> fap transcribe .\all_data\sliced\ --model-type funasr --recursive --num-workers 1
2024-07-13 16:25:13.418 | INFO | fish_audio_preprocess.cli.transcribe:transcribe:72 - Using paraformer-zh model for funasr as default
2024-07-13 16:25:13.433 | INFO | fish_audio_preprocess.cli.transcribe:transcribe:80 - Using 1 workers for processing
2024-07-13 16:25:13.433 | INFO | fish_audio_preprocess.cli.transcribe:transcribe:81 - Transcribing audio files in .\all_data\sliced\
2024-07-13 16:25:17.453 | INFO | fish_audio_preprocess.utils.transcribe:batch_transcribe:40 - Loading paraformer-zh model for zh transcription
You are using the latest version of funasr-1.1.0
2024-07-13 16:25:18,440 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:25:18,440 - modelscope - INFO - Use user-specified model revision: master
2024-07-13 16:25:22,084 - modelscope - WARNING - Using the master branch is fragile, please use it with caution!
2024-07-13 16:25:22,084 - modelscope - INFO - Use user-specified model revision: master
Downloading: 100%|████████████████| 10.6k/10.6k [00:00<00:00, 29.2kB/s]
Suggestion: change the default --num-workers of the fap transcribe command to 1, or add code that resolves the download conflict between workers.
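
For the second option, one possible approach is to trigger the model download once in the parent process before any worker starts, so the workers only read from the already populated cache. The sketch below is not taken from the fish_audio_preprocess code base: run_after_prefetch, prefetch, worker and executor_cls are hypothetical names used for illustration, and what prefetch actually has to do (for example, constructing the same funasr AutoModel once with the same arguments the workers use) is an assumption that would need to be checked against the real code.

```python
from concurrent.futures import ProcessPoolExecutor
from typing import Any, Callable, List, Sequence


def run_after_prefetch(
    prefetch: Callable[[], Any],
    worker: Callable[[Sequence[str]], Any],
    chunks: Sequence[Sequence[str]],
    num_workers: int = 2,
    executor_cls: type = ProcessPoolExecutor,
) -> List[Any]:
    """Run `prefetch` once in the parent, then fan `chunks` out to workers.

    All names here are hypothetical. `prefetch` stands in for whatever
    call makes the model files land in the local cache (assumption: for
    FunASR this could be building the model once in the parent process).
    """
    # Serial step: only this process touches the download/temp folder,
    # so concurrent writers cannot collide on the same file.
    prefetch()

    # Parallel step: each worker receives one chunk of file paths and is
    # expected to find the model already cached.
    with executor_cls(max_workers=num_workers) as pool:
        return list(pool.map(worker, chunks))
```

With the default ProcessPoolExecutor, worker must be a picklable module-level function, and on Windows the call should sit under an if __name__ == "__main__": guard. The trade-off is that the parent loads (or at least downloads) the model once more than strictly necessary; changing the default to 1 avoids that but gives up parallel transcription.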