We currently have 25 issues open on co-registration 🙀, and a lot of notes and plans for the way forward dispersed everywhere. I initially tried to organize this in a project: https://github.com/orgs/GlacioHack/projects/3/views/2, but I think that the level of detail of the issues is too inconsistent for this, so I'm writing this summary instead!
Architecture:
We already did a lot in #158 and #329 to re-organize the coding structure to be more consistent and robust, but some points remain:
- A detailed list in Re-organization of `coreg.py` #327, especially about cleaning + having consistent subfunctions of class methods + side-features (see next point),
- Additional cleaning mentioned in Clean `coreg.py` #266,
- The discussion of Revamp and simplify the raster-point/P-R/P-P functionality #402 about merging point methods into a single `fit` and `apply`, and calling methods depending on Point-Point, Point-Raster or Raster-Raster support, which would also completely solve Allow point clouds for `Coreg` #134.
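As a rough illustration of the dispatch idea from #402, a single `fit` entry point could branch on whether each input is a point cloud or a raster. This is only a sketch: all names (`fit`, `_is_point`) and the "(N, 3) array = point cloud" convention are hypothetical, not xdem's actual API.

```python
# Hypothetical sketch: one fit() dispatching on Point-Point, Point-Raster
# or Raster-Raster inputs. Names and conventions are illustrative only.
import numpy as np


def _is_point(arr) -> bool:
    """Treat an (N, 3) array of XYZ coordinates as a point cloud."""
    return arr.ndim == 2 and arr.shape[1] == 3


def fit(reference, to_be_aligned) -> str:
    """Return which of the three support modes applies to the inputs."""
    ref_is_pts = _is_point(reference)
    tba_is_pts = _is_point(to_be_aligned)
    if ref_is_pts and tba_is_pts:
        return "point-point"
    if ref_is_pts or tba_is_pts:
        return "point-raster"
    return "raster-raster"


# Example: a 2D raster grid vs. a small point cloud
raster = np.zeros((10, 10))
points = np.array([[0.0, 0.0, 1.0], [1.0, 2.0, 3.0]])
print(fit(points, raster))  # point-raster
```

The appeal of a single entry point is that each `Coreg` subclass only needs to implement the modes it supports, and the dispatcher can raise a clear error for unsupported combinations.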
Features:
For a while now, we've needed consistent:
- Plotting: Enable optional plots for coregistration functions #139 (also in Re-organization of `coreg.py` #327),
- Statistics (in Re-organization of `coreg.py` #327),
- Weight support: Add weights in coreg functions #59,
- Subsampling: Add `random_state` argument to `Coreg` class for `subsample` cases #243, Allow custom subsampling for each `CoregPipeline` step separately #137 and mainly the long discussion in Argument `subsample` does not work for a `CoregPipeline` #428.
Most of them are not too far off, since we introduced a consistent structure for optimizers and binning in #158. That makes weights almost directly supported (though we'll need to raise proper warnings for methods that currently ignore them, such as ICP), and will make plotting easier by consistently treating the type of method (binning, fit, or both) and the dimensionality of the variables (1D, 2D, ND), which can be re-used for Tilt or NuthKaab fits in the affine functions.
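To make the subsampling point above concrete, here is a minimal sketch of a shared helper taking a `random_state` for reproducibility, interpreting `subsample` as either a fraction or an absolute count. The function name and the fraction/count convention are assumptions for illustration, not xdem's actual implementation.

```python
# Hypothetical sketch of a shared, reproducible subsampling helper
# that every Coreg/CoregPipeline step could reuse. Illustrative only.
import numpy as np


def subsample_indices(n_valid, subsample, random_state=None):
    """Draw indices among n_valid valid samples, without replacement.

    subsample <= 1 is read as a fraction of n_valid, > 1 as an
    absolute count (capped at n_valid).
    """
    rng = np.random.default_rng(random_state)
    if subsample <= 1:
        count = int(subsample * n_valid)
    else:
        count = min(int(subsample), n_valid)
    return rng.choice(n_valid, size=count, replace=False)


idx = subsample_indices(1000, 0.1, random_state=42)
print(len(idx))  # 100
```

Passing the same `random_state` yields the same subsample, which is exactly what #243 asks for, and centralizing the helper would let each `CoregPipeline` step take its own `subsample` value (#137, #428).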
Tests:
Some tests are still a bit old or slow; several related issues could be solved all at once:
- Consistent data fixtures in test modules: Use `pytest.fixture` to provide consistent test data across all test modules? #427,
- More varied test data: Add DEMs with NaNs (or synthesise them) for tests #64, Improve tests for Coreg, in particular with NaNs #140 and Add example data set with different projections geoutils#310, with synthetic data examples from GeoUtils here: Disable `numba` during CI tests to get adequate coverage stats #358 (comment).
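Combining the fixture idea (#427) with synthetic NaN data (#64, #140) could look something like the sketch below. The fixture name, the generator signature, and the plane-plus-noise model are all made up for illustration, not existing xdem test code.

```python
# Hypothetical sketch: a shared pytest fixture serving a small synthetic
# DEM with NaN holes to all test modules. Names are illustrative only.
import numpy as np
import pytest


def make_synthetic_dem(shape=(100, 100), nan_fraction=0.05, seed=0):
    """A tilted plane plus Gaussian noise, with random NaN holes."""
    rng = np.random.default_rng(seed)
    x, y = np.meshgrid(np.arange(shape[1]), np.arange(shape[0]))
    dem = 500 + 0.1 * x + 0.05 * y + rng.normal(0, 0.5, shape)
    dem[rng.random(shape) < nan_fraction] = np.nan  # ~5% NaN holes
    return dem


@pytest.fixture
def synthetic_dem():
    """Deterministic synthetic DEM, shared across test modules."""
    return make_synthetic_dem()
```

A small deterministic array like this keeps tests fast, and varying `seed` or `nan_fraction` per test covers the NaN-robustness cases without shipping extra data files.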
Performance:
- For reading/writing with georeferencing, this will depend directly on `rioxarray`, and Add Xarray accessor `gu` geoutils#383 and Add Xarray accessor `dem` #392, so not too much to think about!
- For `Coreg.fit`, as a regression requires all samples at once, it cannot be combined from solutions of different chunks (except in a blockwise way using `BlockwiseCoreg`). So it's all about the `subsample`, which is fairly easy to deal with (read a subsample per chunk + return the concatenated vector). There's also the computation of derivatives needed, which is also straightforward (slope using overlapping chunks, `bias_vars` using normal chunks), see the thoughts in Argument `subsample` does not work for a `CoregPipeline` #428 (comment). Most other methods (`residuals`, `plot`) can be based on the same logic, as they use `subsample`.
- That leaves the logic for `Coreg.apply`, which might not be trivial. For bias corrections, the solution from the optimized function is applicable independently to every pixel given its `bias_vars` (coordinates, angle, etc.), so it is very easy to apply per chunk. However, for affine methods, applying a 3D affine matrix in 4x4 format lazily to independent chunks won't work directly... it would also require a definition of the rotation center of the matrix, and maybe other things... Any thoughts on how to address this @erikmannerfelt? Or maybe @friedrichknuth has some insights?
- We could also improve affine pipeline performance with Improve `CoregPipeline.apply()` and `.fit()` performance by merging matrices #79,
- And improve memory performance by adding a default resampling logic, see Provide default downsampling logic in `Coreg` pre-processing for performance #437.
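The "subsample per chunk + concatenate" idea for `Coreg.fit` can be sketched in a few lines. Everything here (function names, a plain list of arrays standing in for lazy chunks) is a simplified assumption; a real version would operate on dask/xarray chunks.

```python
# Hypothetical sketch of chunked subsampling: draw a subsample from each
# chunk, then concatenate, so fit() never loads the full raster at once.
# Names and the list-of-arrays stand-in for lazy chunks are illustrative.
import numpy as np


def subsample_chunk(chunk, fraction, rng):
    """Return a 1D subsample of one chunk's valid (non-NaN) values."""
    valid = chunk[np.isfinite(chunk)]
    n = int(fraction * valid.size)
    return rng.choice(valid, size=n, replace=False)


def chunked_subsample(chunks, fraction=0.1, random_state=None):
    """Concatenate per-chunk subsamples into one vector for regression."""
    rng = np.random.default_rng(random_state)
    return np.concatenate([subsample_chunk(c, fraction, rng) for c in chunks])


# Four 50x50 chunks, 10% subsample each -> 4 * 250 = 1000 samples
chunks = [np.random.default_rng(i).normal(size=(50, 50)) for i in range(4)]
sample = chunked_subsample(chunks, fraction=0.1, random_state=42)
print(sample.size)  # 1000
```

The per-chunk draws are embarrassingly parallel, which is why `fit` stays cheap; it is the affine `apply` on independent chunks (rotation center, 4x4 matrix) that has no equally obvious decomposition, as noted above.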
Bugs:
There are a lot here, but they might solve themselves (or become irrelevant) after changing the architecture + tests:
#423, #422, #404, #326, #232, #193
Idea of plan moving forward:
1. Think ahead on the logic for Performance: already done a bit in Argument `subsample` does not work for a `CoregPipeline` #428 (comment). I think we can use the structure proposed there, which should work eventually 🤞! For `apply`, this should be adaptable down the line...
2. Focus first on everything related to Architecture, to avoid having to adapt all the rest (new features, bug fixes and tests) afterwards. A lot of lessons learned from Add `BiasCorr` classes and rename previous `coreg.BiasCorr` in `coreg.VerticalShift` #158 and Re-organize coreg.py #329.
3. Add basic architectural Tests to ensure 2. is working as intended with what we already have, and add new data for consistent (and quick!) tests (this specific point might require its own thread of discussion).
4. Add support for all new Features: should be made easier by the consistent architecture!
5. Add new Tests in parallel with each feature in 4.
6. See if the Bugs are still relevant, and fix them if they are! 😄
7. Celebrate, for we will have come a long way! 🍾
Any thoughts @adehecq @erikmannerfelt? 😄