Collaborator

@snbianco snbianco commented Jan 23, 2025

This PR implements the generalized architecture for creating cutouts from FITS files. It implements Cutout, ImageCutout, and FITSCutout. It also changes the functions in cutouts.py to call the aforementioned classes and updates the test suite with a new file, test_FITSCutout.py.

Here is a class diagram of the new generalized architecture, though it may not be fully updated: https://innerspace.stsci.edu/display/MASTC/Class+Diagram

Considerations:

  • While implementing this new generalized architecture, I'm focusing on restructuring the code rather than adding features or making significant changes. The existing API should have no breaking changes, for now.
  • Ideally, FITSCutout will eventually use the _get_cutout_data method defined in ImageCutout, which uses astropy.nddata.Cutout2D. Unfortunately, there is an issue with using the section attribute of an ImageHDU object, and I've opened a PR to Astropy to fix it. We could use data instead of section, but this significantly worsens performance. I'm not sure when Astropy's next release will be, so for now, I think we will have to keep the code that cuts out the data array and modifies the WCS for the cutout.
  • The new test class looks VERY similar to test_cutouts.py. The main difference is that test_FITSCutout.py uses the FITSCutout class directly while test_cutouts.py uses the fits_cut and img_cut functions. I've essentially copy-pasted the old code and added some assertions and cases to increase coverage.
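
As a rough illustration of the layering described above, the sketch below shows how a base Cutout class, an intermediate ImageCutout class, and a FITSCutout subclass could relate, with fits_cut kept as a thin wrapper so the existing API does not break. Only the class names, fits_cut, _cutout_file, and _input_files come from this PR; the constructor arguments, the `cutout()` method name, and the placeholder bodies are assumptions for illustration, not astrocut's actual implementation.

```python
from abc import ABC, abstractmethod
from typing import List


class Cutout(ABC):
    """Base class: stores the shared inputs and drives the per-file loop."""

    def __init__(self, input_files: List[str], cutout_size: int = 25):
        self._input_files = input_files
        self._cutout_size = cutout_size
        self.results: List[str] = []

    @abstractmethod
    def _cutout_file(self, file: str) -> None:
        """Make the cutout for a single input file."""

    def cutout(self) -> List[str]:  # hypothetical method name
        for file in self._input_files:
            self._cutout_file(file)
        return self.results


class ImageCutout(Cutout):
    """Logic shared by image formats would live here (empty in this sketch)."""


class FITSCutout(ImageCutout):
    def _cutout_file(self, file: str) -> None:
        # Placeholder: a real implementation would open the file and cut the data.
        self.results.append(f"{file}: {self._cutout_size}px FITS cutout")


def fits_cut(input_files: List[str], cutout_size: int = 25) -> List[str]:
    """Legacy entry point kept for backwards compatibility; delegates to the class."""
    return FITSCutout(input_files, cutout_size).cutout()
```

Under this layout, a future ASDFCutout would sit next to FITSCutout and override the same per-file hook.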

@snbianco snbianco marked this pull request as ready for review January 28, 2025 01:45
Collaborator

@AlexReedy AlexReedy left a comment

This is looking good to me!

Contributor

@havok2063 havok2063 left a comment

This overall looks like a good start to me. I think one thing to consider during this refactor is where spectral cutouts might fit in here. Slicing a spectrum in wavelength is fairly trivial but perhaps astrocut could provide some helper functions for efficient bulk slicing.

Or if we enable cutouts for IFU cubes, for JWST and SDSS MaNGA, we might want the option to cut both in spatial and spectral directions via this tool.
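
To make the "fairly trivial" wavelength slice concrete, here is a minimal sketch of what such a helper might look like. The function name `slice_spectrum` and its signature are hypothetical, and it assumes the wavelength axis is sorted in ascending order; it is not an existing astrocut function.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple


def slice_spectrum(wavelength: Sequence[float], flux: Sequence[float],
                   wmin: float, wmax: float) -> Tuple[List[float], List[float]]:
    """Return the samples with wmin <= wavelength <= wmax.

    Assumes `wavelength` is sorted in ascending order, so the bounds can be
    found by binary search instead of scanning every sample.
    """
    lo = bisect_left(wavelength, wmin)    # first index with wavelength >= wmin
    hi = bisect_right(wavelength, wmax)   # one past the last index with wavelength <= wmax
    return list(wavelength[lo:hi]), list(flux[lo:hi])
```

Bulk slicing could then apply the same helper across many spectra, for example in a list comprehension.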

Comment on lines +284 to +294
    def _get_cutout_wcs(self, img_wcs: WCS, cutout_lims: np.ndarray) -> WCS:
        """
        Starting with the full image WCS and adjusting it for the cutout WCS.
        Adjusts CRPIX values and adds physical WCS keywords.

        Parameters
        ----------
        img_wcs : `~astropy.wcs.WCS`
            WCS for the image the cutout is being cut from.
        cutout_lims : `numpy.ndarray`
            The cutout pixel limits in an array of the form [[ymin,ymax],[xmin,xmax]]
Contributor

Some FITS products use gwcs, e.g., some JWST MIRI or NIRSpec cubes. They actually store it within an ASDF extension. And since ASDF files are basically dictionary trees, we might eventually have ASDF files that store a FITS WCS as a dictionary.

In case we want to add support for JWST cube cutouts in the future, I wonder if we want to abstract the WCS logic out into its own classes, one each for FITS WCS and gwcs, that could be used by either FITSCutout or ASDFCutout? Not necessarily something to do for this PR, but maybe a follow-up?

Collaborator Author

@snbianco snbianco Feb 11, 2025

This is a great point; I didn't realize that some FITS products are already using GWCS. Since the WCS logic will need to be used by both FITSCutout and ASDFCutout, I think the best place to put it would be the ImageCutout class. That way, both FITSCutout and ASDFCutout have access to the logic as children of ImageCutout. I will have to think some more on the best way to parse the type of WCS that a specific file has. Since this epic focuses on porting over the existing logic, I'll leave this for a follow-up PR.
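
For the FITS WCS case, the CRPIX adjustment mentioned in the quoted docstring comes down to shifting the reference pixel by the cutout's lower pixel limits. The sketch below shows only that arithmetic. `SimpleWCS` and `shift_crpix` are hypothetical stand-ins rather than astropy or astrocut objects, and the limits are assumed to be 0-based pixel offsets into the full image, in the [[ymin, ymax], [xmin, xmax]] order given by the docstring.

```python
from dataclasses import dataclass, replace
from typing import Sequence


@dataclass(frozen=True)
class SimpleWCS:
    """Hypothetical stand-in holding only the reference-pixel keywords."""
    crpix1: float  # reference pixel along x (FITS axis 1)
    crpix2: float  # reference pixel along y (FITS axis 2)


def shift_crpix(img_wcs: SimpleWCS, cutout_lims: Sequence[Sequence[int]]) -> SimpleWCS:
    """Return a copy whose reference pixel is expressed in cutout coordinates.

    A pixel at (x, y) in the full image lands at (x - xmin, y - ymin) in the
    cutout, so the reference pixel moves by the same offsets.
    Assumed layout of `cutout_lims`: [[ymin, ymax], [xmin, xmax]].
    """
    (ymin, _), (xmin, _) = cutout_lims
    return replace(img_wcs, crpix1=img_wcs.crpix1 - xmin, crpix2=img_wcs.crpix2 - ymin)
```

A gwcs-based product would need a different adjustment, which is part of why separating the WCS handling by type may be worth a follow-up.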

Comment on lines +129 to +131
def img_cut(input_files: List[Union[str, Path, S3Path]], coordinates: Union[SkyCoord, str],
            cutout_size: Union[int, np.ndarray, Quantity, List[int], Tuple[int]] = 25, stretch: str = 'asinh',
            minmax_percent: Optional[List[int]] = None, minmax_value: Optional[List[int]] = None,
Contributor

Can you add a "Returns" entry in the docstring? The name img_cut seems a bit confusing to me. Is its primary difference from fits_cut that it makes color images from FITS? img_cut sounds like I'm making cutouts from non-FITS images, like JPEG.

With the abstraction into FITSCutout, are these two functions still needed? The primary user endpoint seems to be FITSCutout, and both just call that class with different options.

Collaborator Author

@snbianco snbianco Feb 11, 2025

Yep, I definitely see what you're saying about the function name (I don't love it either). My motivation for leaving these functions as they are is so that Astrocut stays backwards compatible.

Contributor

That makes sense! Can we then add a comment, or a sentence in the docstring, clarifying that we're keeping these for backwards compatibility?

Collaborator Author

Done!

Comment on lines 25 to 28
if request.param == 'SPOC':
    return create_test_imgs('SPOC', 50, 6, dir_name=tmpdir)
else:
    return create_test_imgs('TICA', 50, 6, dir_name=tmpdir)
Contributor

Could this be simplified to return create_test_imgs(request.param, 50, 6, dir_name=tmpdir)?

Comment on lines 34 to 40
if request.param == 'SPOC':
    return create_test_imgs('SPOC', 50, 1, dir_name=tmpdir,
                            basename="img_badsip_{:04d}.fits", bad_sip_keywords=True)[0]
else:
    return create_test_imgs('TICA', 50, 1, dir_name=tmpdir,
                            basename="img_badsip_{:04d}.fits", bad_sip_keywords=True)[0]

Contributor

Same with this: create_test_imgs(request.param, ..).

assert new_dir in cutout_files[0]
assert path.exists(new_dir) # new directory should now exist

cutout_hdulist.close()
Contributor

Since cutout_hdulist is being iterated over, should this be within the loop to ensure each instance gets closed properly?
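
One way to close each instance inside the loop is to wrap the loop body in a context manager, so the handle is released even if an assertion fails partway through. The sketch below uses a hypothetical `FakeHDUList` stand-in so it runs without any FITS files; the function name `check_cutouts` is also an assumption. In the real test, astropy's HDUList can be used directly in a `with` statement.

```python
from contextlib import closing
from typing import Iterable


class FakeHDUList:
    """Hypothetical stand-in for an open FITS handle; it only tracks close()."""

    def __init__(self, name: str):
        self.name = name
        self.closed = False

    def close(self) -> None:
        self.closed = True


def check_cutouts(hdulists: Iterable[FakeHDUList]) -> None:
    for hdulist in hdulists:
        # closing() calls hdulist.close() when the block exits, even if an
        # assertion inside it fails.
        with closing(hdulist):
            assert hdulist.name.endswith(".fits")
```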

Comment on lines +335 to +338
# Cutout each input file
for file in self._input_files:
    self._cutout_file(file)

Contributor

Did astrocut ever use multiprocessing for handling batch cutouts? Given some of the Roman requirements and the tests you did before on mass cutouts, we might want to create optional modes for batch processing files. Not something for this PR, but maybe something for later?

Collaborator Author

We use ThreadPoolExecutor in cube_cut, but I definitely agree that this is something we should add to other classes at some point.
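
A minimal sketch of what an optional threaded mode for the per-file loop could look like, using the same ThreadPoolExecutor from the standard library. The function `cutout_batch` and its parameters are hypothetical; how cube_cut actually structures its thread pool is not shown in this PR.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional


def cutout_batch(input_files: List[str], cutout_one: Callable[[str], str],
                 max_workers: Optional[int] = None) -> List[str]:
    """Apply `cutout_one` to every file on a thread pool.

    Executor.map yields results in the order of `input_files`, regardless of
    which worker finishes first.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(cutout_one, input_files))
```

Threads mainly help when the per-file work is dominated by I/O, such as reading from disk or cloud storage; CPU-bound work would likely need a process-based approach instead.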

@snbianco
Collaborator Author

Re: spectral cutouts - this is a great point and would make an excellent new feature for Astrocut. As far as where this would fit in the new architecture, I envision it as a new class inheriting directly from Cutout, so that it would be on the same level as ImageCutout and CubeCutout.

@snbianco
Collaborator Author

Is this okay to merge? The next MR will be ASDF cutouts, so changes can also be done there.

@havok2063 havok2063 self-requested a review February 17, 2025 18:23
@havok2063
Contributor

I'm good with merging.

@snbianco snbianco merged commit 6c2bca0 into main Feb 17, 2025
8 checks passed
@snbianco snbianco deleted the ASB-30252-FITS-Cutout branch February 17, 2025 19:00