Dingxi Zhang<sup>1</sup> · Fangjinhua Wang<sup>1</sup> · Marc Pollefeys<sup>1,2</sup> · Haofei Xu<sup>1,3</sup>

<sup>1</sup> ETH Zurich · <sup>2</sup> Microsoft · <sup>3</sup> University of Tübingen, Tübingen AI Center
MegaFlow is a simple, powerful, and unified model for zero-shot large displacement optical flow and point tracking.
MegaFlow leverages pre-trained Vision Transformer features to naturally capture extreme motion, followed by lightweight iterative refinement for sub-pixel accuracy. This approach achieves state-of-the-art zero-shot performance on major optical flow benchmarks (Sintel, KITTI, Spring) while delivering highly competitive zero-shot generalization on long-range point tracking benchmarks.
- 🏆 Strong zero-shot performance across Sintel, Spring, and KITTI
- 🎯 Excels in large displacement optical flow estimation
- 📹 Flexible temporal window: seamlessly processes any number of frames
- 🔄 General motion backbone: naturally extends to point tracking
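The two-stage design described above (global matching on transformer features, then iterative residual refinement) can be sketched in a toy form. This is an illustrative assumption, not MegaFlow's actual code: real ViT feature maps, correlation volumes, and a learned update operator replace the placeholders below.

```python
import numpy as np

def coarse_flow_via_matching(feat0, feat1):
    # Toy global matching: cosine similarity between all pixel pairs;
    # the argmax gives the matched location, and the displacement between
    # source and matched pixel is the coarse flow. Because matching is
    # global, arbitrarily large displacements can be recovered.
    H, W, C = feat0.shape
    f0 = feat0.reshape(H * W, C)
    f1 = feat1.reshape(H * W, C)
    f0 = f0 / np.linalg.norm(f0, axis=1, keepdims=True)
    f1 = f1 / np.linalg.norm(f1, axis=1, keepdims=True)
    match = (f0 @ f1.T).argmax(axis=1)           # best target index per source pixel
    ys, xs = np.divmod(np.arange(H * W), W)
    my, mx = np.divmod(match, W)
    return np.stack([mx - xs, my - ys], axis=-1).reshape(H, W, 2).astype(np.float32)

def refine(flow, num_steps=8):
    # Stand-in for the lightweight iterative refinement: each step adds a
    # predicted residual (a real model regresses it from local correlation).
    for _ in range(num_steps):
        flow = flow + np.zeros_like(flow)        # placeholder residual
    return flow

rng = np.random.default_rng(0)
feat0 = rng.normal(size=(8, 8, 16))
feat1 = np.roll(feat0, shift=(1, 2), axis=(0, 1))  # simulate motion of (dx=2, dy=1)
flow = refine(coarse_flow_via_matching(feat0, feat1))
```

With perfectly shifted features the coarse stage already recovers the (2, 1) displacement for interior pixels; in practice the refinement stage is what sharpens the matching-based estimate to sub-pixel accuracy.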
```shell
# Clone the repository
git clone https://github.com/cvg/megaflow.git
cd megaflow

# Create local conda environment
conda create -n megaflow python=3.12 -y
conda activate megaflow

# Install dependencies
pip install -e .

# (Optional) Install FlashAttention-3 for faster inference on Hopper GPUs
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper
python setup.py install
cd ../..
```

Or install directly:

```shell
pip install git+https://github.com/cvg/megaflow.git
```

Requirements: Python ≥ 3.12, PyTorch ≥ 2.7, CUDA recommended.
Pretrained checkpoints are available on 🤗 HuggingFace and are auto-downloaded:
| Model Name | Description |
|---|---|
| `megaflow-flow` | Optical flow (default) |
| `megaflow-chairs-things` | Optical flow trained on FlyingChairs and FlyingThings |
| `megaflow-track` | Point tracking (Kubric fine-tuned) |
```python
import torch
from megaflow import MegaFlow
from megaflow.utils.basic import gridcloud2d

device = "cuda" if torch.cuda.is_available() else "cpu"

# Prepare video tensor [1, T, 3, H, W] in float32, range [0, 255]
video = ...

with torch.inference_mode():
    with torch.autocast(device_type=device, dtype=torch.bfloat16, enabled=True):
        # --- Task 1: Optical Flow ---
        flow_model = MegaFlow.from_pretrained("megaflow-flow").eval().to(device)
        # Returns flow predictions for consecutive frame pairs (0->1, 1->2, ...)
        flow_predictions = flow_model(video, num_reg_refine=8)["flow_preds"][-1]

        # --- Task 2: Point Tracking ---
        track_model = MegaFlow.from_pretrained("megaflow-track").eval().to(device)
        # Returns tracking offsets between the first frame and each query frame (0->t)
        flows_e = track_model.forward_track(video, num_reg_refine=8)["flow_final"]
        # Add absolute grid coordinates to get final point tracks
        _, _, _, H, W = video.shape
        grid_xy = gridcloud2d(1, H, W, norm=False, device=device).float()
        grid_xy = grid_xy.permute(0, 2, 1).reshape(1, 1, 2, H, W)
        tracking_predictions = flows_e + grid_xy
```

```shell
# Processes the video and auto-downloads the megaflow-flow model
python demo_flow.py --input assets/longboard.mp4 --output output/longboard_flow.mp4

# Tracks points and auto-downloads the megaflow-track model
python demo_track.py --input assets/apple.mp4 --grid_size 8
```

You can also run `python demo_gradio.py` to launch a local web UI, try our HuggingFace demo, or open the Colab notebook for an interactive online demo directly in the browser.
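The quick-start snippet expects a video tensor of shape `[1, T, 3, H, W]` in float32 with values in `[0, 255]`. As a minimal sketch of that layout (the zero-filled frames stand in for frames decoded with any library of your choice, e.g. imageio or OpenCV):

```python
import numpy as np

# Placeholder for T decoded RGB frames, each H x W x 3 uint8.
T, H, W = 4, 64, 96
frames = [np.zeros((H, W, 3), dtype=np.uint8) for _ in range(T)]

# Stack to (T, H, W, 3), move channels first, add a batch dim -> (1, T, 3, H, W),
# and keep the raw [0, 255] range as float32 (no normalization).
video_np = np.stack(frames).transpose(0, 3, 1, 2)[None].astype(np.float32)
# video = torch.from_numpy(video_np)  # tensor to pass to MegaFlow
```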
To train and evaluate MegaFlow, you will need to download the required datasets: FlyingChairs, FlyingThings3D, Sintel, KITTI, HD1K, TartanAir, and Spring.
For tracking, you will need to download processed Kubric from AllTracker and TAP-Vid:
- Kubric: Download the 24-frame data (kubric_au.tar.gz) and the 64-frame data parts (part1, part2, part3).
- TAP-Vid: Download the TAP-Vid-DAVIS, TAP-Vid-RGB-stacking and TAP-Vid-Kinetics datasets from here for evaluation.
Merge the point tracking splits by concatenating:
```shell
cat ce64_kub_aa ce64_kub_ab ce64_kub_ac > ce64_kub.tar.gz
```

By default `datasets.py` will search for the datasets in these locations. You can create symbolic links from the `datasets` folder to wherever the datasets were downloaded:
```
datasets
├── FlyingChairs_release
├── FlyingThings3D
├── Sintel
├── KITTI
├── HD1K
├── spring
├── TartanAir
├── kubric_au/
└── TAP_Vid/
    ├── tapvid_davis/
    ├── tapvid_kinetics/
    └── tapvid_rgb_stacking/
```

MegaFlow was trained with a multi-stage curriculum, where each stage loads the checkpoint from the previous stage via the `restore_ckpt` field in the config JSON.
Please refer to train.sh for the complete training curriculum.
Note: Adjust `--nproc_per_node` based on the number of available GPUs. The `effective_batch_size` in the config will be split across all GPUs and nodes automatically. Update `restore_ckpt` in each config to point to the checkpoint from the previous stage.
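As a concrete illustration of the automatic split (hypothetical numbers; the trainer handles this for you):

```python
# effective_batch_size is divided evenly over all workers
# (nproc_per_node GPUs per node x num_nodes nodes).
effective_batch_size = 64
nproc_per_node = 4        # matches --nproc_per_node
num_nodes = 2
per_gpu_batch = effective_batch_size // (nproc_per_node * num_nodes)
print(per_gpu_batch)      # 8 samples per GPU per iteration
```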
```shell
# Zero-shot evaluation (Sintel + KITTI)
python -m scripts.evaluate --cfg config/eval/zero-shot.json

# Point tracking (TAP-Vid)
python -m scripts.evaluate --cfg config/eval/tapvid.json
```

Note: Update the `restore_ckpt` field in each eval config to point to your trained checkpoints.
If you find MegaFlow useful in your research, please cite:
```bibtex
@article{zhang2026megaflow,
  title   = {MegaFlow: Zero-Shot Large Displacement Optical Flow},
  author  = {Zhang, Dingxi and Wang, Fangjinhua and Pollefeys, Marc and Xu, Haofei},
  journal = {arXiv preprint arXiv:2603.25739},
  year    = {2026}
}
```

We thank the original authors of the following projects for their excellent open-source work: Unimatch, GMFlow, VGGT, AllTracker, SEA-RAFT, and MEMFOF.
This project is released under the Apache 2.0 License.
