lightgluestick train pipeline & small bugs & README

aubingazhib · aubingazhib · commit 43c41bf76aec · 2025-09-01T20:27:03.000Z
diff --git a/README.md b/README.md
@@ -104,6 +104,30 @@ Since we use points and lines to solve for the homography, we use a different ro
 
 </details>
 
+<details>
+<summary>[Evaluating LightGlueStick]</summary>
+
+To evaluate LightGlueStick on HPatches, run:
+```bash
+python -m gluefactory.eval.hpatches --conf superpoint+lsd+lightgluestick-official --overwrite
+```
+You should expect the following results:
+```
+{'H_error_dlt@1px': 0.3725,
+ 'H_error_dlt@3px': 0.6803,
+ 'H_error_dlt@5px': 0.7806,
+ 'H_error_ransac@1px': 0.3907,
+ 'H_error_ransac@3px': 0.6973,
+ 'H_error_ransac@5px': 0.7947,
+ 'H_error_ransac_mAA': 0.6275666666666667,
+ 'mH_error_dlt': nan,
+ 'mH_error_ransac': 0.606,
+ 'mnum_keypoints': 2287.25,
+ 'mnum_matches': 1117.0,
+ 'mprec@1px': 0.281,
+ 'mprec@3px': 0.936}
+```
+</details>
 
 #### MegaDepth-1500
 
@@ -153,6 +177,19 @@ python -m gluefactory.eval.megadepth1500 --conf gluefactory/configs/superpoint+l
 
 </details>
 
+<details>
+<summary>[Evaluating LightGlueStick]</summary>
+
+To evaluate the pre-trained SuperPoint+LightGlueStick model on MegaDepth-1500, run:
+```bash
+python -m gluefactory.eval.megadepth1500 --conf superpoint+lsd+lightgluestick-official
+# or the adaptive variant
+python -m gluefactory.eval.megadepth1500 --conf superpoint+lsd+lightgluestick-official \
+    model.matcher.depth_confidence=0.95
+```
+
+</details>
+
 <details>
 
 Here are the results as Area Under the Curve (AUC) of the pose error at  5/10/20 degrees:
@@ -214,6 +251,21 @@ AP_lines: 69.22
 
 </details>
 
+<details>
+<summary>[Evaluating LightGlueStick]</summary>
+
+To evaluate LightGlueStick on ETH3D, run:
+```bash
+python -m gluefactory.eval.eth3d --conf superpoint+lsd+lightgluestick-official
+```
+You should expect the following results:
+```
+AP: 78.13
+AP_lines: 74.62
+```
+
+</details>
+
 #### Image Matching Challenge 2021
 Coming soon!
 
@@ -308,16 +360,46 @@ We then fine-tune the model on the MegaDepth dataset:
 ```bash
 python -m gluefactory.train gluestick_MD --conf gluefactory/configs/superpoint+lsd+gluestick-megadepth.yaml --distributed
 ```
+
 Note that we used the training splits `train_scenes.txt` and `valid_scenes.txt` to train the original model, which contains some overlap with the IMC challenge. The new default splits are now `train_scenes_clean.txt` and `valid_scenes_clean.txt`, without this overlap.
 
 </details>
 
+<details>
+<summary>[Training LightGlueStick]</summary>
+
+We first pre-train LightGlueStick on the homography dataset:
+```bash
+python -m gluefactory.train lightgluestick_H --conf gluefactory/configs/superpoint+lsd+lightgluestick_homography.yaml --distributed
+```
+Feel free to use any other experiment name. Configurations are managed by [OmegaConf](https://omegaconf.readthedocs.io/) so any entry can be overridden from the command line.
+
+We then fine-tune the model on the MegaDepth dataset:
+```bash
+python -m gluefactory.train lightgluestick_MD --conf gluefactory/configs/superpoint+lsd+lightgluestick_megadepth.yaml --distributed
+```
+
+To speed up training on MegaDepth, we suggest caching the local features before training:
+
+```bash
+# extract features
+python -m gluefactory.scripts.export_megadepth --method sp_lsd_wireframe --num_workers 8
+# run training with cached features (adjust data.load_features.path to match the export parameters); we cache 1500 keypoints and 512 lines
+python -m gluefactory.train lightgluestick_MD \
+    --conf gluefactory/configs/superpoint+lsd+lightgluestick_megadepth.yaml \
+    train.load_experiment=lightgluestick_H \
+    data.load_features.do=True
+```
+
+</details>
+
 ### Available models
 Glue Factory supports training and evaluating the following deep matchers:
 | Model     | Training? | Evaluation? |
 | --------- | --------- | ----------- |
 | [LightGlue](https://github.com/cvg/LightGlue) | ✅         | ✅           |
 | [GlueStick](https://github.com/cvg/GlueStick) | ✅         | ✅           |
+| [LightGlueStick](https://github.com/aubingazhib/LightGlueStick) | ✅         | ✅           |
 | [SuperGlue](https://github.com/magicleap/SuperGluePretrainedNetwork) | ✅         | ✅           |
 | [LoFTR](https://github.com/zju3dv/LoFTR)     | ❌         | ✅           |
 
diff --git a/gluefactory/configs/superpoint+lsd+lightgluestick-official.yaml b/gluefactory/configs/superpoint+lsd+lightgluestick-official.yaml
@@ -1,5 +1,5 @@
 model:
-    name: gluefactory.models.two_view_pipeline
+    name: two_view_pipeline
     extractor:
         name: gluefactory.models.lines.wireframe
         point_extractor:
@@ -18,30 +18,20 @@ model:
             merge_line_endpoints: True
             nms_radius: 3
     matcher:
-      name: gluefactory.models.matchers.lightgluestick
+      name: gluefactory.models.matchers.lightgluestick_pretrained
       depth_confidence: -1
       width_confidence: -1
       filter_threshold: 0.1
-      line_threshold: 3
-      tau: 3
-      method: "mean"
-      weights: superpoint # This will download weights from internet
 
     # ground_truth:    # for ETH3D, comment otherwise
     #     name: gluefactory.models.matchers.depth_matcher
     #     use_lines: True
-
 benchmarks:
     hpatches:
         eval:
             use_lines: True
             estimator: homography_est
             ransac_th: -1    # [1., 1.5, 2., 2.5, 3.]
-    scannet:
-        eval:
-            use_lines: True
-            estimator: homography_est
-            ransac_th: -1
     megadepth1500:
         data:
             preprocessing:
@@ -50,19 +40,6 @@ benchmarks:
         eval:
             estimator: poselib
             ransac_th: -1
-    megadepth1500_match_eval:
-        data:
-            preprocessing:
-                side: long
-                resize: 1600
-        model:
-            ground_truth:
-                name: gluefactory.models.matchers.depth_matcher
-                use_lines: True
-        eval:
-            eval_lines: True
-            estimator: poselib
-            ransac_th: -1
     eth3d:
         model:
             ground_truth:
diff --git a/gluefactory/configs/superpoint+lsd+lightgluestick_homography.yaml b/gluefactory/configs/superpoint+lsd+lightgluestick_homography.yaml
@@ -41,7 +41,6 @@ model:
     th_negative: 5
   matcher:
     name: gluefactory.models.matchers.lightgluestick
-    weights: superpoint
     input_dim: 256
     descriptor_dim: 256
     flash: false
diff --git a/gluefactory/eval/eth3d.py b/gluefactory/eval/eth3d.py
@@ -97,12 +97,10 @@ def get_predictions(self, experiment_dir, model=None, overwrite=False):
         return pred_file
 
     def run_eval(self, loader, pred_file):
-        eval_conf = self.conf.eval
         r = eval_dataset(loader, pred_file)
         if self.conf.eval.eval_lines:
-            r.update(eval_dataset(loader, pred_file, conf=eval_conf, suffix="_lines"))
+            r.update(eval_dataset(loader, pred_file, suffix="_lines"))
         s = {}
-
         return s, {}, r
 
 
@@ -199,4 +197,4 @@ def plot_pr_curve(
                 results,
                 dst_file="eth3d_pr_curve_lines.pdf",
                 suffix="_lines",
-            )
+            )
diff --git a/gluefactory/models/cache_loader.py b/gluefactory/models/cache_loader.py
@@ -121,7 +121,7 @@ def _forward(self, data):
             pred = batch_to_device(pred, device)
             for k, v in pred.items():
                 for pattern in self.conf.scale:
-                    if k.startswith(pattern):
+                    if k.startswith(pattern) and not k.startswith("lines_junc"):
                         view_idx = k.replace(pattern, "")
                         scales = (
                             data["scales"]
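The `cache_loader.py` change above excludes keys starting with `lines_junc` from rescaling. A minimal sketch of that selection follows, in plain Python. The key names and the `patterns` list are hypothetical stand-ins for the prediction dict and `self.conf.scale`, and the reading that junction entries are indices rather than pixel coordinates is an assumption drawn from the key name:

```python
def keys_to_rescale(pred_keys, patterns):
    """Return the keys that would be rescaled: those matching a scale
    pattern, except keys starting with "lines_junc"."""
    selected = []
    for k in pred_keys:
        for pattern in patterns:
            if k.startswith(pattern) and not k.startswith("lines_junc"):
                selected.append(k)
                break  # one match is enough for this sketch
    return selected


# Hypothetical cached prediction keys and scale patterns.
pred_keys = ["keypoints0", "lines0", "lines_junc_idx0", "descriptors0"]
patterns = ["keypoints", "lines"]
rescaled = keys_to_rescale(pred_keys, patterns)
```

Without the extra condition, a pattern such as `lines` would also match `lines_junc_idx0`.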
diff --git a/gluefactory/models/matchers/lightgluestick.py b/gluefactory/models/matchers/lightgluestick.py
@@ -44,9 +44,9 @@ def rotate_half(x: torch.Tensor) -> torch.Tensor:
 def apply_cached_rotary_emb(freqs: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
     return (t * freqs[0]) + (rotate_half(t) * freqs[1])
 
-def create_mask(lines_junc_idx, num_nodes):
+def create_mask(lines_junc_idx):
     # Get batch size and number of connections
-    bs = lines_junc_idx.shape[0]
+    bs, num_nodes = lines_junc_idx.shape
     # Create an empty mask
     mask = torch.eye(num_nodes, dtype=torch.float32).unsqueeze(0).repeat(bs, 1, 1)
 
@@ -196,6 +196,7 @@ def forward(
             self,
             x: torch.Tensor,
             encoding: torch.Tensor,
+            mask_ffn: torch.Tensor,
             mask: Optional[torch.Tensor] = None,
 
     ) -> torch.Tensor:
@@ -207,7 +208,7 @@ def forward(
         context = self.inner_attn(q, k, v, mask=mask)
         message = self.out_proj(context.transpose(1, 2).flatten(start_dim=-2))
 
-        return x + self.ffn(torch.cat([x, message], -1))
+        return x + self.ffn(torch.cat([x, message], -1)) * mask_ffn.unsqueeze(-1)
 
 class CrossBlock(nn.Module):
     def __init__(
@@ -280,6 +281,8 @@ def forward(
             desc1,
             encoding0,
             encoding1,
+            mask_ffn0,
+            mask_ffn1,
             mask0: Optional[torch.Tensor] = None,
             mask1: Optional[torch.Tensor] = None,
     ):
@@ -290,10 +293,9 @@ def forward(
             n_endpoints1 = mask1.shape[-1]
 
             desc0[:, : n_endpoints0, :] = self.line_layer(desc0[:, : n_endpoints0, :], \
-                                                       encoding0[:, :, :, : n_endpoints0, :], mask0)
+                                                       encoding0[:, :, :, : n_endpoints0, :], mask_ffn0, mask0)
             desc1[:, : n_endpoints1, :] = self.line_layer(desc1[:, : n_endpoints1, :], \
-                                    encoding1[:, :, :, : n_endpoints1, :], mask1)
-
+                                    encoding1[:, :, :, : n_endpoints1, :], mask_ffn1, mask1)
             return self.cross_attn(desc0, desc1)
 
 
@@ -427,7 +429,7 @@ class LightGlueStick(BaseModel):
         "mp": False,  # enable mixed precision
         "depth_confidence": -1,  # early stopping, disable with -1
         "width_confidence": -1,  # point pruning, disable with -1
-        "filter_threshold": 0.1,  # match threshold
+        "filter_threshold": 0.0,  # match threshold
         "checkpointed": False,
         "weights": None,  # either a path or the name of pretrained weights (disk, ...)
         "keypoint_encoder": [32, 64, 128, 256],
@@ -483,10 +485,10 @@ def _init(self, conf) -> None:
         )
 
         self.loss_fn = NLLLoss(conf.loss)
-        self.i = 0
 
         state_dict = None
         if conf.weights is not None:
+            # weights can be either a path or an existing file from official LG
             if Path(conf.weights).exists():
                 state_dict = torch.load(conf.weights, map_location="cpu")
             elif (Path(DATA_PATH) / conf.weights).exists():
@@ -629,6 +631,8 @@ def _forward(self, data: dict) -> dict:
         do_early_stop = self.conf.depth_confidence > 0 and not self.training
         do_point_pruning = self.conf.width_confidence > 0 and not self.training
 
+        all_desc0, all_desc1 = [], []
+
         if do_point_pruning:
             ind0 = torch.arange(0, m, device=device)[None]
             ind1 = torch.arange(0, n, device=device)[None]
@@ -637,18 +641,30 @@ def _forward(self, data: dict) -> dict:
             prune1 = torch.ones_like(ind1)
         token0, token1 = None, None
 
-        n_endpoints0 = lines_junc_idx0.max() + 1
-        n_endpoints1 = lines_junc_idx1.max() + 1
-
         # pre-compute masks for LG-LMP
-        mask0 = create_mask(lines_junc_idx0, n_endpoints0).unsqueeze(1).bool().to(lines_junc_idx0.device)
-        mask1 = create_mask(lines_junc_idx1, n_endpoints1).unsqueeze(1).bool().to(lines_junc_idx1.device)
+        mask0 = create_mask(lines_junc_idx0).unsqueeze(1).bool().to(lines_junc_idx0.device)
+        mask1 = create_mask(lines_junc_idx1).unsqueeze(1).bool().to(lines_junc_idx1.device)
+
+        max_indices0 = lines_junc_idx0.max(1).values
+        max_indices1 = lines_junc_idx1.max(1).values
+
+        mask_ffn0 = torch.arange(mask0.shape[-1], device=mask0.device).unsqueeze(0) <= max_indices0.unsqueeze(1)
+        mask_ffn1 = torch.arange(mask1.shape[-1], device=mask1.device).unsqueeze(0) <= max_indices1.unsqueeze(1)
 
         for i in range(self.conf.n_layers):
-            torch.cuda.synchronize()  # Synchronize before starting the timer
+            if self.conf.checkpointed and self.training:
+                desc0, desc1 = checkpoint(
+                    self.transformers[i], desc0, desc1, encoding0, encoding1,
+                    mask_ffn0, mask_ffn1, mask0, mask1, use_reentrant=True
+                )
+            else:
+                desc0, desc1 = self.transformers[i](desc0, desc1, encoding0, encoding1,
+                                                    mask_ffn0, mask_ffn1, mask0, mask1)
 
-            desc0, desc1 = self.transformers[i](desc0, desc1, encoding0, encoding1, \
-                                                    mask0, mask1)
+            if self.training or i == self.conf.n_layers - 1:
+                all_desc0.append(desc0)
+                all_desc1.append(desc1)
+                continue  # no early stopping or adaptive width at last layer
 
             # only for eval
             if do_early_stop:
@@ -659,17 +675,13 @@ def _forward(self, data: dict) -> dict:
             if do_point_pruning:
                 assert b == 1
                 scores0 = self.log_assignment[i].get_matchability(desc0)
-
-                scores0[0, : n_endpoints0] = 1.0
                 prunemask0 = self.get_pruning_mask(token0, scores0, i)
                 keep0 = torch.where(prunemask0)[1]
                 ind0 = ind0.index_select(1, keep0)
                 desc0 = desc0.index_select(1, keep0)
                 encoding0 = encoding0.index_select(-2, keep0)
                 prune0[:, ind0] += 1
                 scores1 = self.log_assignment[i].get_matchability(desc1)
-
-                scores1[0, : n_endpoints1] = 1.0
                 prunemask1 = self.get_pruning_mask(token1, scores1, i)
                 keep1 = torch.where(prunemask1)[1]
                 ind1 = ind1.index_select(1, keep1)
@@ -703,12 +715,12 @@ def _forward(self, data: dict) -> dict:
             "log_assignment": scores,
             "prune0": prune0,
             "prune1": prune1,
-            "early_exit_layer_idx": i + 1
+            "ref_descriptors0": torch.stack(all_desc0, 1),
+            "ref_descriptors1": torch.stack(all_desc1, 1)
         }
 
         if n_lines0 > 0 and n_lines1 > 0:
             m0_lines, m1_lines, mscores0_lines, mscores1_lines = filter_matches(line_scores, self.conf.filter_threshold)
-
             pred["line_log_assignment"] = line_scores
             pred["line_matches0"] = m0_lines
             pred["line_matches1"] = m1_lines
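The `mask_ffn` logic added in `lightgluestick.py` can be sketched without PyTorch. This is a rough illustration on nested lists, not the model code. It assumes that padded slots in `lines_junc_idx` do not raise the row maximum, and the helper names `ffn_mask` and `gated_residual` are hypothetical:

```python
def ffn_mask(lines_junc_idx):
    """Per-sample mask over node slots: True up to the largest junction index.

    Mirrors `arange(num_nodes) <= lines_junc_idx.max(1).values`, with
    num_nodes taken from the second dimension of lines_junc_idx.
    """
    return [[j <= max(row) for j in range(len(row))] for row in lines_junc_idx]


def gated_residual(x, update, mask):
    """Compute x + update * mask per node, as in `x + ffn(...) * mask_ffn.unsqueeze(-1)`."""
    return [
        [
            [xv + uv * float(keep) for xv, uv in zip(x_node, u_node)]
            for x_node, u_node, keep in zip(x_b, u_b, m_b)
        ]
        for x_b, u_b, m_b in zip(x, update, mask)
    ]


# Two samples padded to 6 node slots; the second only uses junctions 0..2.
lines_junc_idx = [[0, 1, 2, 3, 4, 5], [0, 1, 1, 2, 0, 0]]
mask = ffn_mask(lines_junc_idx)

x = [[[1.0, 1.0] for _ in range(6)] for _ in range(2)]        # (batch, nodes, dim)
update = [[[0.5, -0.5] for _ in range(6)] for _ in range(2)]  # stand-in for the FFN output
out = gated_residual(x, update, mask)
```

Slots beyond a sample's largest junction index keep their input value, so in a batch the padded endpoints are not modified by the line layer's feed-forward update.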
diff --git a/gluefactory/models/matchers/lightgluestick_pretrained.py b/gluefactory/models/matchers/lightgluestick_pretrained.py
@@ -0,0 +1,19 @@
+from lightgluestick import LightGlueStick as LightGlueStick_
+from omegaconf import OmegaConf
+
+from ..base_model import BaseModel
+
+
+class LightGlueStick(BaseModel):
+    default_conf = {"features": "superpoint", **LightGlueStick_.default_conf}
+
+    def _init(self, conf):
+        dconf = OmegaConf.to_container(conf)
+        self.net = LightGlueStick_(dconf)
+        self.set_initialized()
+
+    def _forward(self, data):
+        return self.net(data)
+
+    def loss(self, pred, data):
+        raise NotImplementedError
diff --git a/pyproject.toml b/pyproject.toml
@@ -38,6 +38,7 @@ urls = {Repository = "https://github.com/cvg/glue-factory"}
 
 [project.optional-dependencies]
 extra = [
+    "lightgluestick @ git+https://github.com/aubingazhib/LightGlueStick.git",
     "pycolmap",
     "poselib",
     "pytlsd @ git+https://github.com/iago-suarez/pytlsd.git@4180ab8990ae68cc9c8797c63aa1dc47b2c714da",