MiniCPM-o-4_5 Transformers inference: in "video_url": {"url": video_path, "use_audio": True}, is use_audio always True and impossible to change? #1072

@KivenJoo

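While integrating MiniCPM-o-4_5, I run video inference with OpenAI-style message content:

content.append({
    "type": "video_url",
    "video_url": {"url": video_path, "use_audio": False}
})

Setting "use_audio": False has no effect. When the video has no audio track, inference fails with the following error: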

    answer = self.model.chat(msgs=messages, max_new_tokens=32768, omni_mode=True, use_tts_template=False,
  File "/usr/local/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/hadoop-aipnlp/.cache/huggingface/modules/transformers_modules/main/modeling_minicpmo.py", line 1146, in chat
    content = normalize_content(content)
  File "/home/hadoop-aipnlp/.cache/huggingface/modules/transformers_modules/main/utils.py", line 2400, in normalize_content
    normalized = normalize_content_item(item)
  File "/home/hadoop-aipnlp/.cache/huggingface/modules/transformers_modules/main/utils.py", line 2338, in normalize_content_item
    video_frames, audio_segments, stacked_frames = get_video_frame_audio_segments(
  File "/home/hadoop-aipnlp/.local/lib/python3.10/site-packages/minicpmo/utils.py", line 414, in get_video_frame_audio_segments
    audio_segments = get_audio_segments(
  File "/home/hadoop-aipnlp/.local/lib/python3.10/site-packages/minicpmo/utils.py", line 175, in get_audio_segments
    video_clip.audio.write_audiofile(temp_audio_file_path, codec="pcm_s16le", fps=sr)
AttributeError: 'NoneType' object has no attribute 'write_audiofile'

Looking at https://huggingface.co/openbmb/MiniCPM-o-4_5/blob/main/utils.py#L2338, it seems the use_audio parameter is not passed through at that point. Is there currently any other way to work around this?
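
One possible stopgap, sketched here under several assumptions, is to make sure every video has an audio track before it reaches model.chat, so that video_clip.audio in the traceback above is never None. The sketch assumes ffmpeg and ffprobe are installed and on PATH. The helper names (has_audio_stream, silent_audio_command, ensure_audio_track) are hypothetical and not part of MiniCPM-o, and the 16 kHz mono sample rate is a guess, not a value confirmed from the model code. It has not been tested against MiniCPM-o-4_5 itself.

```python
import subprocess
from pathlib import Path


def has_audio_stream(video_path: str) -> bool:
    """Return True if ffprobe lists at least one audio stream in the file."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "a",
         "-show_entries", "stream=index", "-of", "csv=p=0", video_path],
        capture_output=True, text=True, check=True,
    )
    return bool(result.stdout.strip())


def silent_audio_command(src: str, dst: str, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that copies the video and adds a silent audio track.

    sample_rate is an assumption; adjust it to whatever the pipeline expects.
    """
    return [
        "ffmpeg", "-y",
        "-i", src,
        # anullsrc is ffmpeg's built-in silent audio source
        "-f", "lavfi", "-i", f"anullsrc=channel_layout=mono:sample_rate={sample_rate}",
        "-map", "0:v", "-map", "1:a",
        "-c:v", "copy", "-c:a", "aac",
        "-shortest",  # stop when the (finite) video stream ends
        dst,
    ]


def ensure_audio_track(video_path: str, has_audio=has_audio_stream, run=subprocess.run) -> str:
    """Return a path to a video that is guaranteed to carry an audio track.

    If the input already has audio it is returned unchanged; otherwise a copy
    with silent audio is written next to it and that new path is returned.
    """
    if has_audio(video_path):
        return video_path
    src = Path(video_path)
    dst = str(src.with_name(f"{src.stem}_with_audio{src.suffix}"))
    run(silent_audio_command(video_path, dst), check=True)
    return dst
```

With this in place, the content entry would use ensure_audio_track(video_path) as the "url" value instead of video_path. This only avoids the AttributeError; the model still receives (silent) audio, so it does not truly disable audio the way "use_audio": False was meant to.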
