127 repositories

Olaf-World
Public
Orienting Latent Actions for Video World Modeling
genie video-generation interactive-video world-models latent-action
MIT License • 0 • 84 • 1 • 0 • Updated Apr 16, 2026
UENR-600K
Public
Other • 0 • 2 • 0 • 0 • Updated Apr 9, 2026
Awesome-Video-Diffusion
Public
A curated list of recent diffusion models for video generation, editing, and various other applications.
awesome video-editing video-generation diffusion-models motion-customization video-generation-evaluation
354 • 5.6k • 0 • 0 • Updated Apr 3, 2026
showui-pi
Public
[CVPR 2026] ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands
agent gui
Python • Apache License 2.0 • 15 • 112 • 3 • 0 • Updated Mar 27, 2026
P-Flow
Public
P-Flow: Prompting Visual Effects Generation
0 • 2 • 1 • 0 • Updated Mar 24, 2026
Kiwi-Edit
Public
A unified and fully open-source framework for instruction-guided and reference-guided video editing using natural language.
video-editing video-generation
Python • MIT License • 23 • 255 • 12 • 0 • Updated Mar 11, 2026
SMS
Public
[ICCV 2025] Balanced Image Stylization with Style Matching Score
style-transfer diffusion score-distillation iccv2025
Python • MIT License • 2 • 70 • 0 • 0 • Updated Mar 9, 2026
DoraCycle
Public
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles
Python • Apache License 2.0 • 1 • 32 • 2 • 0 • Updated Mar 8, 2026
Paper2Video
Public
Automatic Video Generation from Scientific Papers
Python • MIT License • 320 • 2.2k • 4 • 0 • Updated Mar 5, 2026
Adv-GRPO
Public
[CVPR 2026] An official implementation of Adv-GRPO. The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation.
Python • MIT License • 1 • 80 • 1 • 0 • Updated Feb 26, 2026
World-VLA-Loop
Public
GitHub repository for World-VLA-Loop.
JavaScript • 2 • 20 • 2 • 0 • Updated Feb 25, 2026
videogui
Public
[NeurIPS 2024 D&B] VideoGUI: A Benchmark for GUI Automation from Instructional Videos
gui video-language llm-agent
JavaScript • 3 • 52 • 2 • 0 • Updated Feb 22, 2026
Edit2Perceive
Public
[CVPR 2026] Official Implementation of Edit2Perceive
Python • 2 • 35 • 1 • 0 • Updated Feb 21, 2026
FocusUI
Public
[CVPR 2026] FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection
Python • 1 • 31 • 0 • 0 • Updated Feb 10, 2026
World-VLA-Loop-Pages
Public
World-VLA-Loop project GitHub Pages site
JavaScript • 0 • 2 • 0 • 0 • Updated Feb 9, 2026
Q2A
Public
[ECCV 2022] AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant
Python • 5 • 23 • 1 • 0 • Updated Jan 30, 2026
D-AR
Public
The official repo for "D-AR: Diffusion via Autoregressive Models"
diffusion-models autoregressive-models llms
Python • MIT License • 2 • 137 • 2 • 0 • Updated Jan 29, 2026
macosworld
Public
Python • Other • 2 • 32 • 0 • 0 • Updated Jan 28, 2026
DIM
Public
[ICLR 2026] Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing
Python • Other • 0 • 27 • 0 • 0 • Updated Jan 27, 2026
T2F-Bench
Public
A comprehensive benchmark for evaluating text-to-film generation performance.
0 • 6 • 0 • 0 • Updated Jan 22, 2026
whisperVideo
Public
Find out who said what in the video.
video speech-recognition face-detection speech-to-text whisper asr
Jupyter Notebook • 18 • 141 • 1 • 0 • Updated Jan 22, 2026
ShowUI
Public
[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
agent vision-language-model vision-language-action computer-use gui-agent
Python • Apache License 2.0 • 133 • 1.8k • 15 • 0 • Updated Jan 20, 2026
ShowUI-Aloha
Public
Human-taught Computer-use Agent Designed for Real Windows and MacOS Desktops.
Python • Apache License 2.0 • 34 • 278 • 5 • 0 • Updated Jan 20, 2026
Mitty
Public
Official code implementation of "Mitty: Diffusion-based Human-to-Robot Video Generation"
Python • 2 • 15 • 2 • 0 • Updated Jan 14, 2026
Aloha_Page
Public
The introductory website for Aloha
HTML • 0 • 0 • 0 • 0 • Updated Jan 13, 2026
Show-o
Public
[ICLR & NeurIPS 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
multimodal diffusion-models large-language-models
Python • Apache License 2.0 • 90 • 1.9k • 67 • 3 • Updated Jan 8, 2026
SAM-I2VPP
Public
[TPAMI 2026] SAM-I2V++
Jupyter Notebook • Apache License 2.0 • 0 • 4 • 0 • 0 • Updated Jan 7, 2026
SAM-I2V
Public
[CVPR 2025] SAM-I2V
Jupyter Notebook • Apache License 2.0 • 1 • 37 • 0 • 0 • Updated Jan 2, 2026
X-Humanoid
Public
Other • 2 • 41 • 2 • 0 • Updated Dec 20, 2025
RobotSeg
Public
Apache License 2.0 • 0 • 45 • 2 • 0 • Updated Dec 18, 2025


Show Lab
