
Conversation

@adebardo
Contributor

@adebardo adebardo commented Mar 27, 2025

(Edit from @rhugonnet to add a description)

This PR refactors BlockwiseCoreg, rewriting most of the code to make it shorter and changing its behaviour as follows:

  • The functionality now supports out-of-memory use through Multiprocessing,
  • The chunk size of the subdivision can be variable in X/Y,
  • It still only supports the Nuth and Kääb method,
  • The apply step now fits a 2D plane independently to the X, Y and Z shift outputs, using RANSAC for outlier filtering,
  • The re-gridding of the apply step now uses grid() from GeoUtils instead of the old warp_dem function, which removes the artefacts from #584 (Fix BlockwiseCoreg errors).
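As an illustration of the plane-fit-with-RANSAC idea in the apply step, here is a minimal sketch using scikit-learn's RANSACRegressor; the function name, tile coordinates and shift values below are hypothetical, and the actual implementation in the PR may differ:

```python
import numpy as np
from sklearn.linear_model import RANSACRegressor

def fit_shift_plane(tile_x, tile_y, shifts):
    """Fit a 2D plane shift = a*x + b*y + c to per-tile shift estimates,
    using RANSAC to discard tiles where the coregistration diverged."""
    coords = np.column_stack([tile_x, tile_y])
    # Default estimator is LinearRegression, i.e. a plane in (x, y)
    model = RANSACRegressor(random_state=42)
    model.fit(coords, shifts)
    return model  # model.predict(...) evaluates the plane anywhere

# Hypothetical per-tile X-shift estimates: a gentle ramp plus two outliers
tx, ty = np.meshgrid(np.arange(5.0), np.arange(5.0))
tx, ty = tx.ravel(), ty.ravel()
dx = 0.1 * tx - 0.05 * ty + 2.0
dx[[3, 17]] += 50.0  # two tiles where the per-block fit diverged
plane = fit_shift_plane(tx, ty, dx)
```

The same fit would be repeated once per component (X, Y and Z shifts), and the fitted planes evaluated on the full output grid.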

Resolves #584
Resolves #645
Resolves #697
Resolves #601
Resolves #698

Original message

Lots of work to do but we will use this PR to debate tomorrow

@adebardo
Contributor Author

(screenshot attachment)

@rhugonnet
Member

rhugonnet commented Mar 29, 2025

Great! 🙂

I went through all the code as we mentioned in our discussion yesterday. All good on the functionality. In terms of the structure and consistency with the rest of the package, here are my remarks (difficulty level of each change in parentheses):

  1. (Easy) As mentioned in the call, we should likely add plane_fit_ransac as an option of apply(), keeping the behaviour consistent with BlockwiseCoreg otherwise,
  2. (Easy-Moderate) I don't understand why reference_dem and to_be_aligned_dem need to be set in __init__. The tiling_grid could be defined during fit(), as it only relates to reference_dem (and not to the coregistration method itself), and the profile could be saved there too?
  3. (Moderate) After reviewing the functions, it would be a bit of effort but not as difficult as I anticipated to integrate these changes into BlockwiseCoreg directly, and trigger the out-of-memory behaviour by passing a MultiprocConfig to fit() and apply().

For 3.: Why? I realized we don't actually need to mirror the full complexity of #525 for BlockwiseCoreg, thanks to BlockwiseCoreg having its own fit()/apply() functions independent of Coreg.fit()/apply().
Since BlockwiseCoreg splits the input array very early on (and once we have blocks, we don't need any memory considerations), we don't need as many complex out-of-memory processing steps during the rest of the coregistration as the generic Coreg.fit()/apply() steps in #525 do. For instance, ignoring the preprocessing, coreg.NuthKaab() still requires out-of-memory subsampling, then out-of-memory interpolation that re-runs at each iteration of the algorithm, depending on the current shift.
Here, we'd only need two simple, one-time, out-of-memory steps at the very beginning, and that's it!

In short: We could pass down the MultiprocConfig to the preprocess_fit/apply functions, and define filenames in MultiprocConfig for these two "break points" (that will create a file for the "reprojected_dem" and "inlier_mask"). Then we just continue the rest of the coregistration with these two files!

So, in short, I think 1-2. shouldn't be a problem.
For 3., it's a bit more effort but not as much as initially thought, and would be ideal to avoid splitting into two classes. I'm happy to help out on this to finalize it in time 😉

Finally, I also have one question: Is there any core difference between our interp_points() and resample_dem()?
We have a grid resample wrapper based on interp_points() (tested to match exactly GDAL.warp in a same CRS), which would be easy to run with Multiprocessing:

(see `_reproject_horizontal_shift_samecrs()`)

We generally prefer to use this interp_points() implementation for consistency due to:

  • Its integrated nodata propagation scheme depending on the resampling, to avoid unexpected outliers,
  • Its understanding of pixel interpretation (center or corner of pixel), to avoid unexpected shifts,
  • It is the resampling used by the coregistration algorithms, so it ensures the resulting DEM alignment is fully consistent (otherwise, with a different resampling here, re-running the coregistration algorithm would likely find another shift instead of (0, 0, 0)).

Some of the issues I mention above sound like the ones you described in the call, so this might be part of the solution.
In terms of performance of our resampling: the default "bilinear" sampling relies on scipy.ndimage.map_coordinates (to harness the regular-grid nature of a raster, where the X and Y axes are equally spaced), and is therefore much faster than Xarray/SciPy's interpn (which works on a rectilinear grid, where X and Y can have arbitrary coordinates).
It's likely not as fast as cars-resample, but shouldn't be too far off 😉
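A quick sketch of the equivalence between the two on a regular grid (both are bilinear here, so they should agree exactly away from the boundaries; the DEM and sample points are synthetic):

```python
import numpy as np
from scipy.ndimage import map_coordinates
from scipy.interpolate import interpn

rng = np.random.default_rng(42)
dem = rng.normal(size=(100, 100))

# Interior sample points, in pixel coordinates (grid spacing of 1)
ys = rng.uniform(1, 98, size=1000)
xs = rng.uniform(1, 98, size=1000)

# map_coordinates exploits the regular grid: order=1 is bilinear
a = map_coordinates(dem, np.vstack([ys, xs]), order=1)

# interpn treats the same grid as rectilinear (same result, but slower)
b = interpn((np.arange(100), np.arange(100)), dem,
            np.column_stack([ys, xs]), method="linear")

assert np.allclose(a, b)
```

The speed gap grows with grid size, since map_coordinates never has to search the coordinate axes.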

@adebardo adebardo force-pushed the 698-scaledblockwise branch from ec5e5da to 6ce35f3 Compare April 3, 2025 15:44
@rhugonnet
Member

@adebardo I thought about the apply step again.

Performing the resampling we talked about using map_overlap (more likely a modified version resembling the original implementation of reproject_multiproc from @vschaffn that gets the right shifted block with overlap) coupled with reproject_samecrs() should work well.
My only worry is that we might get tiling artefacts at the boundaries of tiles. Having smaller tiles would solve this, but then the Nuth and Kääb algorithm would not have enough points to converge, so we'd be blocked.
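To make the boundary-artefact concern concrete, here's a minimal NumPy sketch of why processing tiles with an overlap (then trimming it, map_overlap-style) removes tile-edge artefacts for any neighbourhood-based operation; a 3x3 mean filter stands in for the real per-tile resampling:

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(1)
arr = rng.normal(size=(64, 64))
ref = uniform_filter(arr, size=3, mode="nearest")  # whole-array result

def per_tile(arr, tile=16, depth=1):
    """Apply the filter tile by tile, reading `depth` extra pixels of
    overlap around each tile and trimming it afterwards."""
    out = np.empty_like(arr)
    n = arr.shape[0]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            # Expand the tile by `depth`, clipped to the array bounds
            i, j = max(i0 - depth, 0), max(j0 - depth, 0)
            i1 = min(i0 + tile + depth, n)
            j1 = min(j0 + tile + depth, n)
            block = uniform_filter(arr[i:i1, j:j1], size=3, mode="nearest")
            # Trim the halo before writing back
            out[i0:i0 + tile, j0:j0 + tile] = \
                block[i0 - i:i0 - i + tile, j0 - j:j0 - j + tile]
    return out

# With an overlap >= the filter radius, the tiled result is exact;
# without it, tile boundaries produce artefacts.
assert np.allclose(per_tile(arr, depth=1), ref)
assert not np.allclose(per_tile(arr, depth=0), ref)
```

For resampling after a shift, the required depth would instead be the shift magnitude plus the interpolation kernel radius, which is what the modified reproject_multiproc would have to account for.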

Thinking ahead on this, I see two solutions:

So I think we should have it covered in any case 😉

@adebardo adebardo force-pushed the 698-scaledblockwise branch 2 times, most recently from 602ce15 to 08ddde3 Compare April 16, 2025 13:23
@adebardo
Contributor Author

Hi @rhugonnet , I know it's not perfect, but could you take a look? I imagine there will be quite a few changes, and I'd like to get everything aligned before I properly finish the tests.

@rhugonnet
Member

Looking good! Shorter and better code 😄

I don't have any generic comment, so will just add a couple line-by-line!

@rhugonnet
Member

rhugonnet commented Apr 16, 2025

It's great, thanks a lot for all the work!
Much more concise, readable and efficient code 🙂

I have an idea that might help with the memory-usage issues you showed (in case you don't know the I/O behaviour with chunks; if you do, sorry for explaining again!):
There are chunks "on disk", which are different from chunks "in memory". In a file (TIF, COG), a raster is written with certain chunk sizes defined in the file, say (300, 300). If you then read a (20, 20) window with Rasterio, it's actually the full (300, 300) chunk that gets loaded, before being subset to (20, 20).
So, if your memory chunks match your disk chunks at (300, 300) and you use a non-zero depth with map_overlap, you'll load 9 tiles of (300, 300) (the central one plus its 8 neighbours) all at once just to compute the overlap. That's a big memory peak.

In short, you want two things: 1/ use memory chunks that are a large multiple (x4-x10) of your disk chunks, and 2/ ensure your disk chunks aren't too big (many rasters have only 4 disk chunks, so almost everything gets loaded all the time, no matter the memory chunks).
Sometimes you need to re-encode the disk chunks (rewrite the TIF/COG) to your desired sizes to get the optimal memory usage. I do this in GeoUtils' out-of-memory test (for netCDFs, but it's similar for TIF/COG) here: https://github.com/GlacioHack/geoutils/blob/10519e99f9f8ec8ee76a72ac9702e2eac8467d90/tests/test_raster/test_distributing_computing/test_delayed_dask.py#L273
Maybe that helps for the memory tests 😉

@adebardo adebardo changed the title feat: new blockwise for large data Refactor blockwise coregistration method Apr 17, 2025
@adebardo adebardo force-pushed the 698-scaledblockwise branch 6 times, most recently from cc30471 to 3d88b62 Compare April 24, 2025 07:40
@rhugonnet
Member

Looking good, not much to say on the code side!
I'm mostly curious about what the shifted DEM looks like with the new apply using grid(), and how well the memory usage performs! 😊

@adebardo adebardo force-pushed the 698-scaledblockwise branch from 9a49305 to 64ce190 Compare April 29, 2025 12:23
@vschaffn vschaffn merged commit 8af72d8 into main Apr 29, 2025
24 checks passed
@vschaffn vschaffn deleted the 698-scaledblockwise branch April 29, 2025 12:53
@rhugonnet
Member

Great! 😄
@adebardo Would be interesting to see if we still have the edge effects with the new apply method, by updating the plot here: https://xdem.readthedocs.io/en/stable/coregistration.html#the-blockwisecoreg-object
I see it's not there in the latest docs, which updated after this PR was merged: https://xdem.readthedocs.io/en/latest/coregistration.html#the-blockwisecoreg-object

@rhugonnet
Member

There are some rendering issues with the changes in the doc too, we'll have to fix that in another PR.
