-
Notifications
You must be signed in to change notification settings - Fork 107
Description
(m3_agent_env_311) (base) ubuntu@VM-0-13-ubuntu:~/m3-agent-master$ python m3_agent/memorization_intermediate_outputs.py \
--data_file data/data.jsonl
/home/ubuntu/m3-agent-master/m3_agent_env_311/lib/python3.11/site-packages/albumentations/check_version.py:147: UserWarning: Error fetching version info The read operation timed out
data = fetch_version_info()
/home/ubuntu/m3-agent-master/m3_agent_env_311/lib/python3.11/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py:121: UserWarning: Specified provider 'CUDAExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
warnings.warn(
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/ubuntu/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/ubuntu/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/ubuntu/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/ubuntu/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /home/ubuntu/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
/home/ubuntu/m3-agent-master/mmagent/voice_processing.py:35: FutureWarning: You are usingtorch.loadwithweights_only=False(the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value forweights_onlywill be flipped toTrue. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user viatorch.serialization.add_safe_globals. We recommend you start settingweights_only=Truefor any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
pretrained_state = torch.load("models/pretrained_eres2netv2.ckpt", map_location='cpu')
{'video_found': True, 'audio_found': True, 'metadata': {'major_brand': 'isom', 'minor_version': '512', 'compatible_brands': 'isomiso2mp41', 'encoder': 'Lavf58.29.100'}, 'inputs': [{'streams': [{'input_number': 0, 'stream_number': 0, 'stream_type': 'video', 'language': None, 'default': True, 'size': [1578, 720], 'bitrate': 7249, 'fps': 30.0, 'codec_name': 'hevc', 'profile': '(Main)', 'metadata': {'Metadata': '', 'handler_name': 'VideoHandler', 'vendor_id': '[0][0][0][0]'}}, {'input_number': 0, 'stream_number': 1, 'stream_type': 'audio', 'language': None, 'default': True, 'fps': 44100, 'bitrate': 129, 'metadata': {'Metadata': '', 'handler_name': 'SoundHandler', 'vendor_id': '[0][0][0][0]'}}], 'input_number': 0}], 'duration': 30.17, 'bitrate': 7397, 'start': 0.0, 'default_video_input_number': 0, 'default_video_stream_number': 0, 'video_codec_name': 'hevc', 'video_profile': '(Main)', 'video_size': [1578, 720], 'video_bitrate': 7249, 'video_fps': 30.0, 'default_audio_input_number': 0, 'default_audio_stream_number': 1, 'audio_fps': 44100, 'audio_bitrate': 129, 'video_duration': 30.17, 'video_n_frames': 905}
/home/ubuntu/m3-agent-master/m3_agent_env_311/lib/python3.11/site-packages/imageio_ffmpeg/binaries/ffmpeg-linux-x86_64-v7.0.2 -i data/clips/robot/bedroom_01/48.mp4 -loglevel error -f image2pipe -vf scale=1578:720 -sws_flags bicubic -pix_fmt rgb24 -vcodec rawvideo -
MoviePy - Writing audio in /tmp/tmp_1wfsi4j.wav
MoviePy - Done.
Unrecognized keys inrope_scalingfor 'rope_type'='default': {'mrope_section'}
You are attempting to use Flash Attention 2 without specifying a torch dtype. This might lead to unexpected behaviour
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:03<00:00, 1.27it/s]
The image processor of typeQwen2VLImageProcessoris now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class withuse_fast=False. Note that this behavior will be extended to all models in a future release.
You have video processor config saved inpreprocessor.jsonfile which is deprecated. Video processor configs should be saved in their ownvideo_preprocessor.jsonfile. You can rename the file or load and save the processor back which renames it automatically. Loading frompreprocessor.jsonwill be removed in v5.0.
Traceback (most recent call last):
File "/home/ubuntu/m3-agent-master/mmagent/voice_processing.py", line 234, in process_voices
with open(save_path, "r") as f:
^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'data/intermediate_outputs/robot/bedroom_01/clip_8_voices.json'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/ubuntu/m3-agent-master/m3_agent/memorization_intermediate_outputs.py", line 93, in
streaming_process_video(json.loads(line))
File "/home/ubuntu/m3-agent-master/m3_agent/memorization_intermediate_outputs.py", line 77, in streaming_process_video
process_segment(
File "/home/ubuntu/m3-agent-master/m3_agent/memorization_intermediate_outputs.py", line 40, in process_segment
process_voices(
File "/home/ubuntu/m3-agent-master/mmagent/voice_processing.py", line 239, in process_voices
asrs = diarize_audio(base64_video, filter=filter_duration_based)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/m3-agent-master/mmagent/voice_processing.py", line 158, in diarize_audio
response, _ = qwen_get_response(messages)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/m3-agent-master/mmagent/utils/chat_qwen.py", line 48, in get_response
text = processor.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ubuntu/m3-agent-master/m3_agent_env_311/lib/python3.11/site-packages/transformers/models/qwen2_5_omni/processing_qwen2_5_omni.py", line 343, in apply_chat_template
or conversation[0]["content"][0]["text"]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^
TypeError: string indices must be integers, not 'str'
(m3_agent_env_311) (base) ubuntu@VM-0-13-ubuntu:~/m3-agent-master$