Conversation

@xiaoya27

Summary

Adds an optional parallel multi-GPU mode for VGGT inference in demo_colmap.py using torch.multiprocessing. This enables processing large image sequences that would otherwise exceed single-GPU memory.

Motivation

VGGT inference on 221 images requires ~77GB VRAM, which exceeds the capacity of many GPUs. This PR allows distributing frames across multiple GPUs to:

  • Reduce memory per GPU: Each GPU loads only its shard (e.g., ~42GB for 110 frames on 2 GPUs)
  • Speed up inference: Near-linear speedup with parallel processing

Changes

  • Add --multi_gpu flag to enable parallel multi-GPU mode
  • Add --gpu_ids to specify which GPUs to use (default: all available)
  • Add _worker_process() for multiprocessing workers
  • Add run_VGGT_multi_gpu() for parallel inference orchestration
  • Use shared memory for efficient tensor transfer between processes
  • Set 'spawn' multiprocessing start method for CUDA compatibility
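The sharding and worker orchestration described above can be sketched roughly as follows. This is a simplified stand-in, not the PR's actual code: `split_frames`, `_worker`, and `run_parallel` are illustrative names, the worker's "inference" is a placeholder, and a plain `multiprocessing.Queue` stands in for the shared-memory tensor transfer (real code would call `tensor.share_memory_()` and load the model on each worker's device):

```python
import multiprocessing as mp

def split_frames(num_frames, num_gpus):
    """Split frame indices into contiguous per-GPU shards of near-equal size."""
    base, extra = divmod(num_frames, num_gpus)
    shards, start = [], 0
    for rank in range(num_gpus):
        size = base + (1 if rank < extra else 0)
        shards.append(range(start, start + size))
        start += size
    return shards

def _worker(rank, gpu_id, frame_ids, queue):
    """One process per GPU: run inference on this shard only.

    Real code would move the model to `cuda:{gpu_id}` and write predictions
    into shared-memory tensors instead of pickling them through a queue.
    """
    result = [f * 2 for f in frame_ids]  # placeholder for per-frame predictions
    queue.put((rank, result))

def run_parallel(num_frames, gpu_ids):
    """Spawn one worker per GPU, then reassemble shard results in frame order."""
    ctx = mp.get_context("spawn")  # 'spawn' is required when workers use CUDA
    queue = ctx.Queue()
    shards = split_frames(num_frames, len(gpu_ids))
    procs = [ctx.Process(target=_worker, args=(rank, gpu, list(shard), queue))
             for rank, (gpu, shard) in enumerate(zip(gpu_ids, shards))]
    for p in procs:
        p.start()
    parts = dict(queue.get() for _ in procs)  # collect before join to avoid deadlock
    for p in procs:
        p.join()
    return [x for rank in range(len(procs)) for x in parts[rank]]
```

Contiguous sharding keeps each worker's frames adjacent, which matters for sequence models like VGGT; with 221 frames the split is 111/110 on 2 GPUs and 56/55/55/55 on 4.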

Usage

# Use all available GPUs
python demo_colmap.py --scene_dir=/path/to/scene --multi_gpu

# Specify GPUs
python demo_colmap.py --scene_dir=/path/to/scene --multi_gpu --gpu_ids=0,1
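The two new flags might be wired up along these lines; a hedged sketch only, since the actual argparse setup in demo_colmap.py may differ, and `parse_gpu_ids` is an illustrative helper:

```python
import argparse

def parse_gpu_ids(text):
    """Turn a comma-separated string like '0,1' into [0, 1]."""
    return [int(part) for part in text.split(",") if part.strip()]

def build_parser():
    parser = argparse.ArgumentParser(description="demo_colmap options (sketch)")
    parser.add_argument("--scene_dir", required=True,
                        help="Path to the scene directory")
    parser.add_argument("--multi_gpu", action="store_true",
                        help="Enable parallel multi-GPU inference")
    parser.add_argument("--gpu_ids", type=parse_gpu_ids, default=None,
                        help="Comma-separated GPU ids (default: all available)")
    return parser
```

With `--gpu_ids` omitted, the orchestration would presumably fall back to every visible device (e.g. via `torch.cuda.device_count()`).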

@meta-cla bot added the CLA Signed label on Dec 12, 2025
Note: bundle adjustment (BA) and tracking still run on a single GPU after the multi-GPU inference stage.
@xiaoya27 force-pushed the feat/multi-gpu-inference branch from dbf9299 to d00d1d3 on December 12, 2025 at 08:17