Kcz/add mp4 spliting into frames#3

Closed
krzyczar wants to merge 135 commits into xipingyan:master from krzyczar:kcz/add_mp4_disabling_into_frames

@krzyczar krzyczar commented Oct 8, 2025

Description

Ticket:

Fixes #(issue)

Checklist:

  • Tests have been updated or added to cover the new code
  • This patch fully addresses the ticket.
  • I have made corresponding changes to the documentation

xipingyan and others added 30 commits August 11, 2025 09:31
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Only calc once for video process.

Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
2: add ov::Properity::video

Signed-off-by: xipingya <xiping.yan@intel.com>
Co-authored-by: Wanglei Shen <wanglei.shen@intel.com>
# Conflicts:
#	src/cpp/src/continuous_batching/pipeline_base.cpp
#	src/cpp/src/visual_language/inputs_embedder.cpp
#	src/cpp/src/visual_language/inputs_embedder.hpp
#	src/cpp/src/visual_language/qwen2vl/classes.cpp
#	src/cpp/src/visual_language/qwen2vl/classes.hpp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Add "video" to continues batching.

Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
xipingyan and others added 24 commits October 16, 2025 09:54
2:Add: mix video+image inputs.

Signed-off-by: xiping.yan <xiping.yan@intel.com>
<video_pad> + <image_pad>,

so put video to ahead of image. keep align with genai.

Signed-off-by: xiping.yan <xiping.yan@intel.com>
2: Split video and image process.
3: Fix copy embed feature bug.

Signed-off-by: xipingya <xiping.yan@intel.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
wrapper calculate_product

Signed-off-by: xiping.yan <xiping.yan@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
Signed-off-by: xipingya <xiping.yan@intel.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: xiping.yan <xiping.yan@intel.com>
@sbalandi

fyi @Wovchena

if input_data.get("video", None):
    entry = Path(input_data["video"])
    ordered_frames = pu.split_video_into_frames(entry, required_frames)
    images.extend(ordered_frames)

That indeed interprets a video as images. Just like the title says. But GenAI started supporting video input as a separate entity. What's the reason to split the video into frames?

Author

Ok I will fix it

gen_fn = run_visual_language_generation_optimum
else:
gen_fn = run_visual_language_generation_genai
if use_genai: gen_fn = run_visual_language_generation_genai

Please use a ternary operator or keep the previous variant; multiline code is more readable.

Author

ok
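
A minimal sketch of the ternary form the reviewer asks for. The two `run_visual_language_generation_*` functions are stand-in stubs here (their real signatures are not shown in this thread), and `select_gen_fn` is a hypothetical helper name used only for illustration:

```python
# Placeholder stubs: the real functions live in the benchmark code and
# take arguments that are not shown in this conversation.
def run_visual_language_generation_optimum(*args, **kwargs):
    return "optimum"


def run_visual_language_generation_genai(*args, **kwargs):
    return "genai"


def select_gen_fn(use_genai: bool):
    """Pick the generation function with a single conditional expression."""
    return (
        run_visual_language_generation_genai
        if use_genai
        else run_visual_language_generation_optimum
    )


gen_fn = select_gen_fn(use_genai=True)
```

Both branches stay visible in one expression, which avoids the one-line `if use_genai: ...` statement that the comment objects to.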

else:
raise RuntimeError('== key word "prompt" does not exist ==')
prompt_data = create_base_prompt(json_data)
assert ("media" in json_data) ^ ("video" in json_data)

Please add a message to the assert, but it's better to raise a RuntimeError.

if args['prompt_file'] is not None and len(args['prompt_file']) > 0:
vlm_file['media'] = model_utils.resolve_media_file_path(vlm_file.get("media"), args['prompt_file'][0])
if args['prompt_file'] is not None and len(args['prompt_file']) > 0 and 'media' in vlm_file:
if 'video' in vlm_file: log.warning('media and video cannot be specify in a single prompt file')

We have a warning here but an assert above; let's raise an exception in both cases.
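
One possible way to address both comments is a single check that raises `RuntimeError` and is called from both places. This is only a sketch: `validate_media_keys` is a hypothetical helper name, the error wording is illustrative, and it assumes (as the `^` in the assert suggests) that exactly one of `"media"` or `"video"` must be present:

```python
def validate_media_keys(prompt_entry: dict) -> None:
    """Raise RuntimeError unless exactly one of "media" / "video" is set.

    Hypothetical helper, not part of the PR; it mirrors the XOR condition
    of the assert under review but fails with an explicit message.
    """
    has_media = "media" in prompt_entry
    has_video = "video" in prompt_entry
    if has_media and has_video:
        raise RuntimeError(
            '== keys "media" and "video" cannot both be specified in one prompt entry =='
        )
    if not (has_media or has_video):
        raise RuntimeError(
            '== one of the keys "media" or "video" must be specified =='
        )


# Passes silently: exactly one of the two keys is present.
validate_media_keys({"prompt": "Describe the clip", "video": "clip.mp4"})
```

Unlike `assert`, the check still runs when Python is started with `-O`, and the same failure mode applies wherever the helper is called.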

@krzyczar krzyczar changed the base branch from xp/enable_qwen_vl_video_preprocess to master November 6, 2025 09:26
@krzyczar krzyczar closed this by deleting the head repository Nov 6, 2025
5 participants