Clarification on datasets used to train the released FoundationStereo model (zero-shot model in Table 2?) #87

Open
@ZYX-MLer

Hi, thank you for your great work on FoundationStereo — the results are very impressive, especially the zero-shot generalization across challenging real-world datasets.

While reading the paper and using the released model, I had a few clarifying questions:

  1. Training datasets used for the zero-shot foundation model

In Section 4.1, it is mentioned that the foundation model was trained on:

“a mixed dataset consisting of our proposed FSD, together with Scene Flow, Sintel, CREStereo, FallingThings, InStereo2K, and Virtual KITTI 2.”

However, in Table 1, additional datasets like TartanAir and IRS are summarized and compared. Could you please confirm:
• Were TartanAir or IRS used in any version of the model training?
• If not, is there a reason for excluding them (e.g., limited benefit, domain mismatch, or data quality concerns)?

  2. Released model: is it the zero-shot foundation model in the paper?

You have kindly released a pretrained FoundationStereo model. Could you please clarify:
• Does this released checkpoint correspond to the zero-shot foundation model described in the paper?
• Was it trained only on the datasets listed in Section 4.1, without using any of the evaluation/test sets (Middlebury, ETH3D, KITTI 2012/2015)?

  3. Table 2 results: which training data was used?

In Table 2, zero-shot generalization results are reported across four datasets (Middlebury, ETH3D, KITTI-12, KITTI-15).
• Can you confirm that the results in the second block of Table 2 (i.e., your strongest results) are based only on training with the datasets listed in Section 4.1, excluding any test-domain-specific data?

This clarification would help ensure reproducibility and give confidence to others using or fine-tuning the released model.

        Clarification on datasets used to train the released FoundationStereo model (zero-shot model in Table 2?) · Issue #87 · NVlabs/FoundationStereo