docs: add TwelveLabs Pegasus video understanding example#2355

Draft
mohit-twelvelabs wants to merge 3 commits into roboflow:develop from mohit-twelvelabs:feat/twelvelabs-integration
@mohit-twelvelabs

Hi! I'm Mohit, and I work at TwelveLabs (@mohit-twelvelabs).

Description

This adds a new examples/twelvelabs_video_understanding/ example that pairs per-frame Supervision detections with a whole-video understanding generated by TwelveLabs Pegasus.

Supervision answers "what object is where, in each frame" with precise, box-level detections. Pegasus answers "what is this video about" with a natural-language summary of the whole clip. Running both on the same video combines fine-grained detection with a high-level narrative — useful for captioning, search, moderation, or generating a quick description alongside annotated output.
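As a rough illustration of how the two outputs could sit side by side, here is a minimal, stdlib-only sketch. The names `FrameDetections`, `VideoReport`, and `class_counts` are hypothetical and are not taken from the example script, Supervision, or the TwelveLabs SDK; the summary string is a stand-in for whatever Pegasus would return.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class FrameDetections:
    """Class labels detected in one frame (hypothetical, not a Supervision type)."""

    frame_index: int
    class_names: list[str]


@dataclass
class VideoReport:
    """Pairs a whole-video summary with per-frame detections (hypothetical)."""

    summary: str
    frames: list[FrameDetections] = field(default_factory=list)

    def class_counts(self) -> Counter[str]:
        """Count detections per class across all frames."""
        counts: Counter[str] = Counter()
        for frame in self.frames:
            counts.update(frame.class_names)
        return counts


# Stand-in data: in a real run the summary would come from the video-level
# model and the labels from the per-frame detector.
report = VideoReport(
    summary="A car passes two pedestrians at a crossing.",
    frames=[
        FrameDetections(0, ["car", "person"]),
        FrameDetections(1, ["car", "person", "person"]),
    ],
)

print(report.summary)
print(dict(report.class_counts()))
```

Keeping the narrative and the box-level data in one record is one possible design; it makes it straightforward to emit a caption next to aggregate detection statistics.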

Type of Change

  • 📝 Documentation update (new self-contained example; no changes to the supervision package or any existing behavior)

Motivation and Context

Supervision is model-agnostic and already ships examples that plug in detectors (Ultralytics, Inference, RF-DETR). This example shows how to enrich those detections with video-level understanding, an angle none of the existing examples cover. It's entirely opt-in and additive — it lives only under examples/, imports nothing into the library, and changes no defaults.

Changes Made

  • Add examples/twelvelabs_video_understanding/ with twelvelabs_example.py, README.md, requirements.txt, and .gitignore.
  • Follow the existing example conventions exactly: jsonargparse auto_cli entrypoint, from __future__ import annotations, Google-style docstrings, and the same README layout/sections as the other examples.
  • Register the example in examples/README.md.

Testing

  • I have tested this code locally
  • Ran ruff check and ruff format --check on the new script — both pass with the repo config.
  • Exercised the real analyze_video() path end-to-end against the live TwelveLabs Pegasus API (pegasus1.5) with a short generated clip; it returned a correct natural-language description of the video content.
  • Verified the missing-API-key guard raises a clear error.

The example needs a TwelveLabs API key to run the Pegasus call (read from --api_key or the TWELVELABS_API_KEY env var). You can grab a free API key at https://twelvelabs.io — there's a generous free tier.
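The key lookup and missing-key guard described above might look roughly like the sketch below. This is an assumption about the shape of the logic, not a copy of the script: `resolve_api_key` is a hypothetical helper name, and the exact exception type and message in the example may differ.

```python
from __future__ import annotations

import os


def resolve_api_key(api_key: str | None = None) -> str:
    """Return the explicit key if given, else the TWELVELABS_API_KEY env var.

    Hypothetical helper: raises ValueError when neither source provides a key.
    """
    key = api_key or os.environ.get("TWELVELABS_API_KEY")
    if not key:
        raise ValueError(
            "No TwelveLabs API key found. "
            "Pass --api_key or set the TWELVELABS_API_KEY environment variable."
        )
    return key
```

Giving the explicit argument precedence over the environment variable lets a one-off run override a key that is already exported in the shell.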

Additional Notes

Happy to adjust naming, placement, or the README wording to match maintainer preferences.

@CLAassistant

CLAassistant commented Jun 25, 2026

CLA assistant check
All committers have signed the CLA.

@Borda Borda requested a review from Copilot June 26, 2026 09:12
@Borda Borda marked this pull request as draft June 26, 2026 09:14

Copilot AI left a comment

Contributor

Pull request overview

Adds a new self-contained example under examples/ demonstrating how to combine frame-level Supervision detections (YOLO/Ultralytics) with a whole-video natural-language summary from TwelveLabs Pegasus, and registers it in the examples index.

Changes:

  • Introduces twelvelabs_example.py with a jsonargparse CLI that runs Pegasus analysis + per-frame detection/annotation.
  • Adds accompanying example documentation (README.md), dependencies (requirements.txt), and local ignore rules (.gitignore).
  • Adds the new example entry to examples/README.md.

Assessment (per guidelines):

  • Code quality: 4/5
  • Testing: 3/5 (example-only; no repo tests expected)
  • Documentation: 4/5

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| examples/twelvelabs_video_understanding/twelvelabs_example.py | New runnable example script combining Pegasus video summary with per-frame Supervision detections. |
| examples/twelvelabs_video_understanding/requirements.txt | Declares dependencies needed to run the example. |
| examples/twelvelabs_video_understanding/README.md | Provides install/run instructions and argument documentation for the example. |
| examples/twelvelabs_video_understanding/.gitignore | Prevents committing local data, weights, and generated media for this example. |
| examples/README.md | Registers the new example in the examples list. |

Comment thread examples/twelvelabs_video_understanding/twelvelabs_example.py
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
3 participants