-
Notifications
You must be signed in to change notification settings - Fork 17
Generalization classes for FITS cutouts #136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
AlexReedy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking good to me!
havok2063
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This overall looks like a good start to me. I think one thing to consider during this refactor is where spectral cutouts might fit in here. Slicing a spectrum in wavelength is fairly trivial but perhaps astrocut could provide some helper functions for efficient bulk slicing.
Or if we enable cutouts for IFU cubes, for JWST and SDSS MaNGA, we might want the option to cut both in spatial and spectral directions via this tool.
| def _get_cutout_wcs(self, img_wcs: WCS, cutout_lims: np.ndarray) -> WCS: | ||
| """ | ||
| Starting with the full image WCS and adjusting it for the cutout WCS. | ||
| Adjusts CRPIX values and adds physical WCS keywords. | ||
| Parameters | ||
| ---------- | ||
| img_wcs : `~astropy.wcs.WCS` | ||
| WCS for the image the cutout is being cut from. | ||
| cutout_lims : `numpy.ndarray` | ||
| The cutout pixel limits in an array of the form [[ymin,ymax],[xmin,xmax]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some FITS products use gwcs, e.g some JWST miri or nirspec cubes. They actually store it within an asdf extension. And since ASDF files are basically dictionary trees, we might eventually have asdf files that store a FITS wcs as a dictionary.
In case we want to add support for JWST cube cutouts in the future, I wonder if we want to abstract out the WCS logic into their own classes, for fits-wcs/gwcs, that could be used by either FitsCutout or ASDFCutout? Not necessarily something to do for this PR, but maybe a follow-up?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great point; I didn't realize that some FITS products are already using GWCS. Since the WCS logic will need to be used by both FITSCutout and ASDFCutout, I think the best place to put it would be the ImageCutout class. That way, both FITSCutout and ASDFCutout have access to the logic as children of ImageCutout. I will have to think some more on the best way to parse the type of WCS that a specific file has. Since this epic focuses on porting over the existing logic, I'll leave this for a follow-up PR.
| def img_cut(input_files: List[Union[str, Path, S3Path]], coordinates: Union[SkyCoord, str], | ||
| cutout_size: Union[int, np.ndarray, Quantity, List[int], Tuple[int]] = 25, stretch: str = 'asinh', | ||
| minmax_percent: Optional[List[int]] = None, minmax_value: Optional[List[int]] = None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a "Returns" entry in the docstring? The name img_cut seems a bit confusing to me. Its primary difference from fits_cut is to make color images from FITS? img_cut sounds like I'm making cutouts from non-fits images, like jpeg.
With the abstraction into FITSCutout are these two functions still needed? The primary user endpoint seems to be FITSCutout and both just call that class with different options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I definitely see what you're saying about the function name (I don't love it either). My motivation for leaving these functions as they are is so that Astrocut stays backwards compatible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense! Can we add a comment, or sentence in the docstring, then clarifying we're keeping these for backwards compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
astrocut/tests/test_FITSCutout.py
Outdated
| if request.param == 'SPOC': | ||
| return create_test_imgs('SPOC', 50, 6, dir_name=tmpdir) | ||
| else: | ||
| return create_test_imgs('TICA', 50, 6, dir_name=tmpdir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this be simplified to return create_test_imgs(request.param, 50, 6, dir_name=tmpdir)?
astrocut/tests/test_FITSCutout.py
Outdated
| if request.param == 'SPOC': | ||
| return create_test_imgs('SPOC', 50, 1, dir_name=tmpdir, | ||
| basename="img_badsip_{:04d}.fits", bad_sip_keywords=True)[0] | ||
| else: | ||
| return create_test_imgs('TICA', 50, 1, dir_name=tmpdir, | ||
| basename="img_badsip_{:04d}.fits", bad_sip_keywords=True)[0] | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same with this, create_test_imgs(request.param, ..).
astrocut/tests/test_FITSCutout.py
Outdated
| assert new_dir in cutout_files[0] | ||
| assert path.exists(new_dir) # new directory should now exist | ||
|
|
||
| cutout_hdulist.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since cutout_hdulist is being iterated over, should this be within the loop to ensure each instance gets closed properly?
| # Cutout each input file | ||
| for file in self._input_files: | ||
| self._cutout_file(file) | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did astrocut ever use multiprocessing for handling batch cutouts? Given some of the roman reqs and the tests you did before on mass cutouts, we might want to create optional modes of batch processing files. Not something for this PR, but maybe something for later?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use ThreadPoolExecutor in cube_cut, but definitely agree that this is something we should add to other classes at some point.
|
Re: spectral cutouts - this is a great point and would make an excellent new feature for Astrocut. As far as where this would fit in the new architecture, I envision it as a new class inheriting directly from |
|
Is this okay to merge? The next MR will be ASDF cutouts, so changes can also be done there. |
|
I'm good with merging. |
This PR implements the generalized architecture for creating cutouts from FITS files. It implements
Cutout,ImageCutout, andFITSCutout. It also changes the functions incutouts.pyto call the aforementioned classes and updates the test suite with a new file,test_FITSCutout.py.Here is a class diagram of the new generalized architecture, though it may not be fully updated: https://innerspace.stsci.edu/display/MASTC/Class+Diagram
Considerations:
FITSCutoutwill eventually use the_get_cutout_datamethod defined inImageCutoutthat usesastropy.nddata.Cutout2D. Unfortunately, there is an issue with using thesectionattribute of anImageHDUobject that I've put in a PR to Astropy to fix. We could choose to usedatainstead ofsection, but this significantly worsens performance. I'm not sure when Astropy's next release will be, so for now, I think we will have to leave in the code that cuts out the data array and modifies the WCS for the cutout.test_cutouts.py. The main difference is thattest_FITSCutout.pyuses theFITSCutoutclass directly whiletest_cutouts.pyuses thefits_cutandimg_cutfunctions. I've essentially copy-pasted the old code and added some assertions and cases to increase coverage.