it has some problems on README #1979

@Cyl147369991

Tested versions

[...] syntax.
FFmpeg version 5: Could not find module 'D:\soft\miniconda3\envs\speaker-split\Lib\site-packages\torchcodec\libtorchcodec_core5.dll' (or one of its dependencies). Try using the full path with constructor syntax.
FFmpeg version 4: Could not find module 'D:\soft\miniconda3\envs\speaker-split\Lib\site-packages\torchcodec\libtorchcodec_core4.dll' (or one of its dependencies). Try using the full path with constructor syntax.
[end of libtorchcodec loading traceback].
  warnings.warn(
D:\soft\miniconda3\envs\speaker-split\lib\site-packages\speechbrain\utils\torch_audio_backend.py:57: UserWarning: torchaudio._backend.list_audio_backends has been deprecated. This deprecation is part of a large refactoring effort to transition TorchAudio into a maintenance phase. The decoding and encoding capabilities of PyTorch for both audio and video are being consolidated into TorchCodec. Please see pytorch/audio#3902 for more information. It will be removed from the 2.9 release.
  available_backends = torchaudio.list_audio_backends()
Traceback (most recent call last):
  File "e:\publicproject\CosyVoice\split-audio.py", line 14, in <module>
    diarization = pipeline("three.wav")
  File "D:\soft\miniconda3\envs\speaker-split\lib\site-packages\pyannote\audio\core\pipeline.py", line 476, in __call__
    track_pipeline_apply(self, file, **kwargs)
  File "D:\soft\miniconda3\envs\speaker-split\lib\site-packages\pyannote\audio\telemetry\metrics.py", line 152, in track_pipeline_apply
    duration: float = Audio().get_duration(file)
  File "D:\soft\miniconda3\envs\speaker-split\lib\site-packages\pyannote\audio\core\io.py", line 273, in get_duration
    metadata: AudioStreamMetadata = get_audio_metadata(file)
  File "D:\soft\miniconda3\envs\speaker-split\lib\site-packages\pyannote\audio\core\io.py", line 86, in get_audio_metadata
    metadata = AudioDecoder(file["audio"]).metadata
NameError: name 'AudioDecoder' is not defined

System information

Windows 11, pyannote.audio

Issue description

Same output as under "Tested versions" above. torchcodec fails to load its FFmpeg libraries (libtorchcodec_core5.dll and libtorchcodec_core4.dll are not found), and the run then ends in get_audio_metadata (pyannote\audio\core\io.py, line 86) with:
NameError: name 'AudioDecoder' is not defined

Minimal reproduction example (MRE)

# instantiate the pipeline
from pyannote.audio import Pipeline
import torch
import os
import sys

sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))

pipeline = Pipeline.from_pretrained(
    "pyannote/speaker-diarization-3.1",
    token="xxx"
)
# pipeline.to(torch.device("cuda"))

# run the pipeline on an audio file
diarization = pipeline("three.wav")

# dump the diarization output to disk using RTTM format
with open("three.rttm", "w") as rttm:
    diarization.write_rttm(rttm)
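
The NameError in the traceback appears to follow from the earlier torchcodec loading failure: if the import of AudioDecoder fails and is only reported as a warning, the name is never defined. A minimal diagnostic sketch to confirm that, independent of pyannote.audio, is shown below. The helper name check_torchcodec is hypothetical and uses only the standard library. Finding ffmpeg on PATH is only a hint, since torchcodec needs the FFmpeg shared libraries and not just the executable.

```python
import importlib
import shutil


def check_torchcodec():
    """Report whether torchcodec's decoders can be imported in this environment.

    Hypothetical helper for narrowing down the issue; not part of pyannote.audio.
    """
    # Only a hint: an ffmpeg executable on PATH does not guarantee that the
    # FFmpeg shared libraries torchcodec needs can be located.
    report = {"ffmpeg_on_path": shutil.which("ffmpeg") is not None}
    try:
        # The traceback shows pyannote.audio using AudioDecoder from torchcodec.
        importlib.import_module("torchcodec.decoders")
    except Exception as exc:  # not installed, or native libraries failed to load
        report["torchcodec_ok"] = False
        report["error"] = f"{type(exc).__name__}: {exc}"
    else:
        report["torchcodec_ok"] = True
        report["error"] = None
    return report


if __name__ == "__main__":
    for key, value in check_torchcodec().items():
        print(f"{key}: {value}")
```

If torchcodec_ok comes back False inside the speaker-split environment, the problem is in the torchcodec/FFmpeg installation and not in the pipeline call itself.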
