Feature request
There is room to reduce the number of reads and the memory usage in both the mapping and reducing stages of the margin generation pipeline.
For the mapping stage, we can use the same tricks LSDB uses for spatial searches: (1) filter by MOC (some partitions will simply have no matches, especially in low-density catalogs), and (2) pass `filters` to the parquet reader to select on `_healpix_29` (even filtering at HEALPix order 8-10 would help).
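A rough sketch of what (2) could look like, assuming `_healpix_29` is the nested-scheme HEALPix index at order 29, so that a pixel at a coarser order covers one contiguous range of `_healpix_29` values. The helper names `healpix29_range` and `range_filters` are hypothetical and not existing pipeline functions; the output uses the list-of-tuples form accepted by the `filters` argument of `pyarrow.parquet.read_table`.

```python
HEALPIX_MAX_ORDER = 29  # assumption: _healpix_29 is a nested order-29 index


def healpix29_range(order, pixel):
    """Return the inclusive (low, high) _healpix_29 range covered by a
    nested-scheme pixel at the given order (hypothetical helper)."""
    if not 0 <= order <= HEALPIX_MAX_ORDER:
        raise ValueError(f"order must be in [0, {HEALPIX_MAX_ORDER}]")
    # Each step down in order splits a pixel into 4 children,
    # i.e. adds 2 bits to the nested index.
    shift = 2 * (HEALPIX_MAX_ORDER - order)
    return pixel << shift, ((pixel + 1) << shift) - 1


def range_filters(pixels):
    """Build parquet filters for an iterable of (order, pixel) pairs.

    The outer list is an OR over pixels; each inner list is an AND of
    the lower and upper bound for one pixel.
    """
    filters = []
    for order, pixel in pixels:
        low, high = healpix29_range(order, pixel)
        filters.append(
            [("_healpix_29", ">=", low), ("_healpix_29", "<=", high)]
        )
    return filters
```

The result could then be passed as `pyarrow.parquet.read_table(path, filters=range_filters(pixels))`, where `pixels` would be the order 8-10 pixels overlapping the margin region of interest. How those pixels are chosen is left out here.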
For the reducing stage we can use the same tricks, and we can also query for exact `_healpix_29` values, because after the mapping stage we already know which objects we need. If there are too many objects to look up, this could turn out to be a bottleneck, but we can set a threshold on the number of objects to filter for.
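A minimal sketch of the thresholded exact-value filter for the reducing stage. The function name `exact_value_filters`, the default `max_values`, and the fallback to a bounding range are all assumptions made for illustration, not existing pipeline behavior; the `in` operator is one of the operators supported by the `filters` argument of `pyarrow.parquet.read_table`.

```python
def exact_value_filters(healpix29_values, max_values=10_000):
    """Build parquet filters selecting the given _healpix_29 values.

    max_values is a placeholder threshold (assumption): above it, an
    exact "in" filter is replaced by a cheaper bounding range.
    Returns None when there is nothing to select.
    """
    values = sorted(set(healpix29_values))
    if not values:
        return None
    if len(values) <= max_values:
        # Few enough objects: ask the reader for exactly these values.
        return [("_healpix_29", "in", values)]
    # Too many objects: fall back to the range spanned by the values.
    return [
        ("_healpix_29", ">=", values[0]),
        ("_healpix_29", "<=", values[-1]),
    ]
```

The fallback range may still select rows that are not needed, so the reducing stage would have to keep its existing in-memory selection after the read. A reasonable threshold would need to be measured rather than guessed.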
Before submitting
Please check the following: