A question about input geometric parameters #150

Dear Sirs,
I have a question about the expected behavior when passing geometric parameters as input to the model.
In my use case, I know the exact extrinsic (Rt) and intrinsic (K) parameters of each camera in my setup.

I would have expected the model to use these inputs as "fixed ground truth references" when predicting the other outputs (such as depth maps) during inference.

However, these input parameters appear to be treated as an "optimization initialization" instead: the values returned by inference differ from the ones I passed in.

Am I missing something obvious here?
Thank you in advance for your kind help, and for your amazing work!

Here is the relevant snippet from my code:

```python
from mapanything.utils.image import load_images

# path1..path3 are image file paths; `model` is an already-loaded model (not shown)
images = [path1, path2, path3]
views = load_images(images)

# Attach the known per-camera extrinsics and intrinsics (torch tensors) to each view
views[0].update({"camera_poses": extrinsics_0_torch})
views[1].update({"camera_poses": extrinsics_1_torch})
views[2].update({"camera_poses": extrinsics_2_torch})

views[0].update({"intrinsics": intrinsics_0_torch})
views[1].update({"intrinsics": intrinsics_1_torch})
views[2].update({"intrinsics": intrinsics_2_torch})

predictions = model.infer(
    views,                            # Input views
    memory_efficient_inference=True,  # Trades off speed for more views (up to 2000 views on 140 GB). Trade off is negligible - see profiling section
    minibatch_size=None,              # Minibatch size for memory-efficient inference (use 1 for smallest GPU memory consumption). Default is dynamic computation based on available GPU memory.
    use_amp=True,                     # Use mixed precision inference (recommended)
    amp_dtype="bf16",                 # bf16 inference (recommended; falls back to fp16 if bf16 not supported)
    apply_mask=True,                  # Apply masking to dense geometry outputs
    mask_edges=True,                  # Remove edge artifacts by using normals and depth
    apply_confidence_mask=True,       # Filter low-confidence regions
    confidence_percentile=10,         # Remove bottom 10 percentile confidence pixels
    use_multiview_confidence=False,   # Enable multi-view depth consistency based confidence in place of learning-based one
)

all_predicted_camera_poses = []
all_predicted_intrinsics = []

for i, pred in enumerate(predictions):
    intrinsics = pred["intrinsics"]           # Recovered pinhole camera intrinsics (B, 3, 3)
    camera_poses = pred["camera_poses"]       # OpenCV (+X - Right, +Y - Down, +Z - Forward) cam2world poses in world frame (B, 4, 4)
    all_predicted_camera_poses.append(camera_poses.detach().cpu().numpy())
    all_predicted_intrinsics.append(intrinsics.detach().cpu().numpy())

import pdb; pdb.set_trace()
```
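
To put numbers on how far the returned parameters drift from the ones passed in, a check along the following lines could be used. This is only a sketch in plain NumPy: the helper names (`world2cam_to_cam2world`, `relative_to_first`, `pose_drift`, `intrinsics_drift`) are hypothetical and not part of the library. It assumes that both sets of poses are 4x4 cam2world matrices, as the comment in the snippet describes the outputs. If the input Rt is world-to-camera, it would need inverting first. Whether the model re-expresses poses relative to the first view is also an assumption I have not verified, so the sketch aligns both sets to their first camera before comparing.

```python
import numpy as np


def world2cam_to_cam2world(Rt):
    """Invert a 3x4 or 4x4 world-to-camera [R|t] into a 4x4 cam2world pose."""
    Rt = np.asarray(Rt, dtype=np.float64)
    R, t = Rt[:3, :3], Rt[:3, 3]
    pose = np.eye(4)
    pose[:3, :3] = R.T          # inverse of a rotation is its transpose
    pose[:3, 3] = -R.T @ t      # camera centre in world coordinates
    return pose


def relative_to_first(poses):
    """Re-express a list of 4x4 cam2world poses in the frame of the first camera."""
    poses = [np.asarray(p, dtype=np.float64) for p in poses]
    first_inv = np.linalg.inv(poses[0])
    return [first_inv @ p for p in poses]


def pose_drift(pose_in, pose_out):
    """Return (rotation error in degrees, translation error) between two 4x4 poses."""
    pose_in = np.asarray(pose_in, dtype=np.float64)
    pose_out = np.asarray(pose_out, dtype=np.float64)
    R_rel = pose_in[:3, :3].T @ pose_out[:3, :3]
    # Angle of the relative rotation, clipped for numerical safety
    cos_angle = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    rot_deg = float(np.degrees(np.arccos(cos_angle)))
    trans = float(np.linalg.norm(pose_in[:3, 3] - pose_out[:3, 3]))
    return rot_deg, trans


def intrinsics_drift(K_in, K_out):
    """Return absolute differences in (fx, fy, cx, cy) between two 3x3 matrices."""
    K_in = np.asarray(K_in, dtype=np.float64)
    K_out = np.asarray(K_out, dtype=np.float64)
    idx = ([0, 1, 0, 1], [0, 1, 2, 2])  # positions of fx, fy, cx, cy
    return np.abs(K_in[idx] - K_out[idx])
```

With the snippet above, each entry of `all_predicted_camera_poses` has a leading batch dimension, so the comparison would use `all_predicted_camera_poses[i][0]` against the corresponding input pose. The intrinsics are only directly comparable if the images are not resized during loading; otherwise the input K would have to be rescaled to the processed resolution first.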
