
ETH3D evaluation: support for camera distortion parameters or undistorted images #7

@xjiangan

Description

Hi,
Thank you for maintaining this benchmark and framework. I would like to raise a concern regarding the ETH3D evaluation setup in robustmvd.

Summary

The ETH3D high-resolution benchmark provides DSLR images with noticeable lens distortion, and the supplied ground-truth calibrations use the THIN_PRISM_FISHEYE camera model to account for it. However, the current robustmvd evaluation pipeline appears to use the distorted images and ground truth without exposing or applying the distortion parameters in the input adapter or evaluation logic. As a result, the evaluation effectively assumes a pinhole camera model, which is geometrically inaccurate for these images.
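For reference, the ETH3D calibrations are distributed in COLMAP's text format, where a THIN_PRISM_FISHEYE camera carries 12 parameters (fx, fy, cx, cy, k1, k2, p1, p2, k3, k4, sx1, sy1, in that order, to my understanding). A minimal sketch of a parser that an input adapter could use to expose these parameters (the sample line below is illustrative, not taken from an actual ETH3D file):

```python
# Sketch of a COLMAP cameras.txt line parser (not robustmvd code).
# Assumption: THIN_PRISM_FISHEYE carries 12 parameters in this order.
THIN_PRISM_FISHEYE_PARAMS = (
    "fx", "fy", "cx", "cy", "k1", "k2", "p1", "p2", "k3", "k4", "sx1", "sy1"
)

def parse_colmap_camera(line):
    """Parse one non-comment line of a COLMAP cameras.txt file."""
    fields = line.split()
    cam = {
        "camera_id": int(fields[0]),
        "model": fields[1],
        "width": int(fields[2]),
        "height": int(fields[3]),
    }
    params = [float(x) for x in fields[4:]]
    if cam["model"] == "THIN_PRISM_FISHEYE":
        assert len(params) == 12, "THIN_PRISM_FISHEYE expects 12 parameters"
        cam["params"] = dict(zip(THIN_PRISM_FISHEYE_PARAMS, params))
    else:
        cam["params"] = params
    return cam

# Illustrative line (values made up for demonstration):
line = ("1 THIN_PRISM_FISHEYE 6048 4032 "
        "3408.5 3408.5 3024 2016 0.01 -0.002 0.0001 0.0002 0.0 0.0 0.0 0.0")
cam = parse_colmap_camera(line)
```

Returning the parameters in a named dict like this would let downstream methods decide for themselves whether and how to handle the distortion.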

This issue is particularly relevant for methods that explicitly rely on geometric reasoning. Supplying inaccurate or incomplete geometric information without notice can systematically degrade their performance on ETH3D. Conversely, this setup may unintentionally bias the benchmark toward methods that either do not rely heavily on provided camera parameters or estimate the camera parameters themselves.

Recent work (e.g., https://arxiv.org/abs/2503.22430) demonstrates that evaluating on undistorted images leads to substantially improved performance on ETH3D, further suggesting that distortion handling plays a significant role in evaluation outcomes.

Request

To improve the validity and fairness of ETH3D evaluation results, I would like to ask whether one of the following options could be considered:

  1. Add explicit support for camera distortion, by exposing the THIN_PRISM_FISHEYE parameters from the ETH3D calibration files and allowing methods to account for the distortion themselves during inference or preprocessing.

    This would enable models to operate on the original images while respecting the correct camera model.

  2. Switch to using undistorted ETH3D images and corresponding pinhole intrinsics and ground truth for evaluation instead of the raw distorted data.

    This would provide a geometrically consistent, pinhole-based evaluation setup aligned with the assumptions of most current depth estimation methods.

Either approach would help reduce evaluation artifacts caused by camera model mismatch and lead to more interpretable and comparable ETH3D benchmark results.
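To sketch why the camera model matters: based on my reading of COLMAP's THIN_PRISM_FISHEYE model (the exact coefficient layout below is an assumption that should be checked against the COLMAP source), projection first applies the fisheye atan mapping to normalized coordinates and then radial, tangential, and thin-prism distortion terms. Undistortion therefore requires a numerical inversion, for example a simple fixed-point iteration:

```python
import math

def thin_prism_fisheye_distort(u, v, k1, k2, p1, p2, k3, k4, sx1, sy1):
    """Map undistorted normalized coords (u, v) to distorted normalized coords.
    Assumed model: fisheye atan mapping, then radial + tangential + thin-prism."""
    r = math.hypot(u, v)
    if r > 1e-12:
        theta = math.atan(r)
        u, v = u * theta / r, v * theta / r
    u2, v2, uv = u * u, v * v, u * v
    r2 = u2 + v2
    radial = k1 * r2 + k2 * r2**2 + k3 * r2**3 + k4 * r2**4
    du = u * radial + 2 * p1 * uv + p2 * (r2 + 2 * u2) + sx1 * r2
    dv = v * radial + p1 * (r2 + 2 * v2) + 2 * p2 * uv + sy1 * r2
    return u + du, v + dv

def undistort(ud, vd, coeffs, iters=100):
    """Invert the distortion with a fixed-point iteration; the mapping is
    close to the identity for typical DSLR distortions, so this converges."""
    u, v = ud, vd
    for _ in range(iters):
        fu, fv = thin_prism_fisheye_distort(u, v, *coeffs)
        u, v = u + (ud - fu), v + (vd - fv)
    return u, v

# Round trip with illustrative coefficients:
coeffs = (0.05, -0.002, 1e-4, 2e-4, 0.0, 0.0, 1e-5, 1e-5)
du_, dv_ = thin_prism_fisheye_distort(0.1, 0.2, *coeffs)
u_, v_ = undistort(du_, dv_, coeffs)
```

In practice, option 2 would amount to applying this inversion over a dense grid of pixel coordinates and resampling the image, which is essentially what tools such as COLMAP's image_undistorter do; either that, or option 1 would hand the coefficients to the method so it can do the equivalent itself.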

Thank you for your time and consideration.
