Proposed photutils API
Initial detection of objects:
```python
object_slices, labeled_array = detect_objects(data, threshold, mask=mask)  # data is background-subtracted
object_slices  # list of slices corresponding to the minimum parallelepiped containing each object
labeled_array  # 2-d int array, same shape as data; each pixel holds the label of the object
               # it belongs to (or 0 if it belongs to no object)
```
From there you use a function that loops over object slices and uses labeled_array to estimate all kinds of parameters of each object.
It is about 3 lines of code to implement this function using scipy.ndimage.measurements. However, it would be very simplistic - no source deblending and no convolution to enhance source extraction. For that sort of thing I think we would want a C extension. But do we want a function like this, regardless?
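To make the scale of that simple version concrete, here is a sketch using `scipy.ndimage`, following the `detect_objects` signature proposed above. As noted, it is deliberately simplistic: plain thresholding with no source deblending and no convolution.

```python
import numpy as np
from scipy import ndimage

def detect_objects(data, threshold, mask=None):
    """Minimal sketch: threshold, label connected pixels, return slices."""
    segmap = data > threshold            # boolean detection map
    if mask is not None:
        segmap &= ~mask                  # drop masked pixels
    labeled_array, nobjects = ndimage.label(segmap)
    object_slices = ndimage.find_objects(labeled_array)
    return object_slices, labeled_array
```

From here, `scipy.ndimage` measurement functions (`sum`, `center_of_mass`, ...) can be applied per label to estimate per-object parameters.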
If we directly translate the existing Aperture API, it will look something like this:
```python
from photutils.regions import CircularAperture, CircularAnnulus, EllipticalAperture, EllipticalAnnulus

circ = CircularAperture(3.)
circ.extent()  # extent relative to center: (-3., 3., -3., 3.)
circ.area()    # area of region: pi * 3.**2

# Return a 2-d float array of shape (ny, nx) indicating the fraction of each
# pixel enclosed by the region. The available choices for "method" depend on
# the specific Region class.
circ.encloses(xmin, xmax, ymin, ymax, nx, ny, method='center')
```
All region classes will derive from an abstract base class Region to guarantee that they implement the above methods.
KB: The encloses method API seems a bit clunky with its 6 parameters, but I'm not sure how to improve it. Basically (xmin, xmax, ymin, ymax, nx, ny) specify the coordinates of the pixel grid we're interested in, relative to the region center. If we limit ourselves to array coordinates, we could eliminate 2 parameters (nx, ny could be determined from xmax, xmin and ymax, ymin). Example use cases for grids with coordinates where dx != dy? (I'm sure there are some, but I can't think of any.)
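For illustration, here is a toy version of the six-parameter grid discussed above, for a circle with `method='center'` only (each pixel counts as fully inside or fully outside based on its center coordinate); fractional-pixel methods are omitted, and the function name is a placeholder:

```python
import numpy as np

def encloses_circle(r, xmin, xmax, ymin, ymax, nx, ny):
    """Fraction of each pixel inside a circle of radius r, 'center' method.
    (xmin, xmax, ymin, ymax, nx, ny) define pixel-center coordinates of an
    (ny, nx) grid, relative to the region center."""
    x = np.linspace(xmin, xmax, nx)
    y = np.linspace(ymin, ymax, ny)
    xx, yy = np.meshgrid(x, y)
    return (xx**2 + yy**2 <= r**2).astype(float)
```

Note that nothing here requires dx == dy; the two axes are specified independently, which is what costs the extra parameters.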
Given a 2-d Numpy array, find the flux around a few positions with an aperture of radius 3 pixels:
```python
flux = photutils.aperture_circular(data, xc, yc, 3.)
```
If xc and yc are 1-d arrays, flux will be a 1-d array. Multiple apertures per object can be specified by making the radius parameter a 2-d array, in which case flux will be a 2-d array.
Calculate the flux and the error on flux given a constant error or error array:
```python
flux, fluxerr = photutils.aperture_circular(data, xc, yc, 3., error=error)
```
KB: How about something that simply mirrors aperture photometry?:
```python
flux = photutils.psf_photometry(data, xc, yc, psf=psf, sampling=sampling)
```
- For PSF kernels, `psf` can be a 2-d array. `sampling` is an integer (or even a float) giving the sampling of the kernel relative to the data.
- For analytic PSFs, `psf` can be a function that takes two floats `(dx, dy)` and returns a float.
TR: I think that PSFs should be stored in classes inheriting from an abstract base class PSF, because that way we can create parameterized analytical PSFs, numerical PSFs that can smartly read from files, and any other kind of PSF we haven't thought of initially. For example, Spitzer used PRFs, which are not oversampled PSFs but lookup tables: to extract a PSF for a particular offset, if the oversampling is a factor of 5, you would do `psf = prf[1::5, 1::5]`, i.e. read every 5th element. It would be nice to support this kind of PSF definition too, which is why a PSF class makes more sense (it can hide these kinds of details). The sampling then becomes a property of the PSF rather than an argument of the photometry function: the photometry function just asks the PSF for a rasterized PSF on the same grid as the data for a given offset, and the PSF class returns it.
So users might do something like:
```python
psf = PSF.load('spitzer_irac_3.6.fits')
flux = photutils.psf_photometry(data, xc, yc, psf)
```
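A sketch of this class idea, including the PRF-style lookup table TR describes. The class and method names here (`LookupTablePSF`, `raster`) are illustrative placeholders, not a proposed final API:

```python
import numpy as np

class PSF:
    """Abstract base class: the photometry code only ever calls raster()."""
    def raster(self, dx=0.0, dy=0.0):
        """Return a 2-d array sampled on the data grid at offset (dx, dy)."""
        raise NotImplementedError

class LookupTablePSF(PSF):
    """Spitzer-style PRF: a lookup table read with a stride equal to the
    oversampling factor, starting at the phase matching the offset."""
    def __init__(self, prf, oversampling):
        self.prf = np.asarray(prf)
        self.oversampling = oversampling

    def raster(self, dx=0.0, dy=0.0):
        n = self.oversampling
        # choose the sub-pixel phase closest to the requested offset
        ix = int(round(dx * n)) % n
        iy = int(round(dy * n)) % n
        return self.prf[iy::n, ix::n]
```

The sampling lives inside the object, so `psf_photometry` never needs a `sampling` argument; it just calls `psf.raster(dx, dy)` for each source.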
KB: I'm on board with a PSF class. The unification of analytical and numerical PSFs is very appealing. However, I say "cautiously" because I am always hesitant to force users to learn how to manipulate a new class unless it is really necessary. A compromise might be to allow psf to be any one of
- A `PSF` (or derived class) object
- A 2-d array
- A callable
Internally, the 2-d array and callable would be converted to PSF objects right from the start. This approach lets beginner users use the function right away without having to learn about how PSF objects work and what their API is. And of course all the flexibility is still there for more advanced users.
I agree that psf should be a positional, not optional argument (as I had it).
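The coercion compromise above might look something like the helper below. All class names here (`PSF`, `AnalyticPSF`, `KernelPSF`) are hypothetical stand-ins for whatever hierarchy is eventually chosen:

```python
import numpy as np

class PSF:
    """Abstract base class for PSF models."""

class AnalyticPSF(PSF):
    """Wraps a callable f(dx, dy) -> float."""
    def __init__(self, func):
        self.func = func

class KernelPSF(PSF):
    """Wraps a sampled 2-d kernel and its sampling factor."""
    def __init__(self, kernel, sampling=1):
        self.kernel = np.asarray(kernel)
        self.sampling = sampling

def as_psf(psf, sampling=1):
    """Coerce a PSF instance, callable, or 2-d array into a PSF object."""
    if isinstance(psf, PSF):
        return psf
    if callable(psf):
        return AnalyticPSF(psf)
    arr = np.asarray(psf)
    if arr.ndim == 2:
        return KernelPSF(arr, sampling)
    raise TypeError("psf must be a PSF instance, a callable, or a 2-d array")
```

`psf_photometry` would call this once at the top, so the rest of the code deals only with PSF objects while beginners can still pass a plain array or function.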
TR: A couple of other points: the input data should probably be an NDData by default (though plain arrays could be accepted).
KB: First, I think everyone will agree that whatever we do should be consistent between psf_photometry and aperture_photometry.
Accepting NDData may be OK, provided that ndarray is also accepted (possibly converted to NDData internally, whichever is cleaner). However, I have reservations about how error calculations would work with NDUncertainty objects. For example, aperture_photometry assumes Gaussian errors. Unless this can be changed (and I'm not sure it can without significantly affecting performance), we could only accept NDData objects with certain types of `NDUncertainty`. This seems somewhat inelegant and unpythonic to me.
An alternative that I think I prefer: create methods in NDData or a derived class such as Image that correspond to functions in photutils. For example, one would do:
```python
im = Image(...)
results = im.psf_photometry(...)
results['flux']  # the flux
results['chi2']  # chi2 of the fit
```
... and similarly for aperture_photometry. The advantage here is that the psf_photometry method inside im could do things like checking itself for the type of uncertainty, and taking care of things like units and even WCS stuff, before calling the photutils functions. I like the encapsulation of this: everything specific to the Image object is handled by that object, using self. The photutils functions don't have to go probing into the details of the Image object they are handed. For example, coordinate transformations: If one gives coordinates in (RA, Dec), photutils doesn't have to worry about coordinate transformations, or even asking the Image object for its transformation. The image object will take care of transformations itself before passing the array coordinates to photutils functions.
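A minimal sketch of that delegation pattern, assuming a hypothetical `Image` class with an optional `wcs` attribute; the module-level function here is a trivial stand-in, not real photometry:

```python
import numpy as np

def psf_photometry(data, x, y, psf=None):
    """Stand-in for the module-level photutils function: it knows nothing
    about WCS or uncertainties and works purely in pixel coordinates.
    (Here it just reads the pixel value at each position.)"""
    x = np.atleast_1d(x)
    y = np.atleast_1d(y)
    flux = np.array([data[int(round(yi)), int(round(xi))]
                     for xi, yi in zip(x, y)], dtype=float)
    return {'flux': flux}

class Image:
    """Hypothetical NDData-like container. Everything specific to the image
    (coordinate transforms, uncertainty type) is handled here, via self,
    before the plain-array photutils function is called."""
    def __init__(self, data, wcs=None):
        self.data = np.asarray(data)
        self.wcs = wcs  # object with world_to_pixel(x, y), if any

    def psf_photometry(self, x, y, psf=None):
        if self.wcs is not None:
            x, y = self.wcs.world_to_pixel(x, y)  # (RA, Dec) -> pixels
        return psf_photometry(self.data, x, y, psf)
```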
TR: The result should probably be returned as a Table object:
```python
results = photutils.psf_photometry(...)
results['flux']        # the flux
results['chi2']        # chi2 of the fit
results['background']  # estimated background
```
etc., which allows any number of fit parameters to be returned.
KB: What if a user prefers to use 'Flux' or 'bkg' (or any other name)? I'm hesitant to force certain field names on a user. Also, is this actually more flexible than returning a tuple of ndarrays? For a tuple, the calling code needs to know how many ndarrays are returned; here, it needs to know which fields are available in results (and what they are named). That seems pretty similar, but with the added complexity of a new API to learn (Table). The situation would be different if Table were a standard in the way that ndarray is, or if there were a standard for field names like 'flux', 'chi2', and 'background', but we're not there yet. I know Table is meant to be an integral component of astropy, but consider a new user who knows ndarray and has just installed photutils: if that were me, I would balk at being handed back an unfamiliar object with no apparent gain over the simple ndarrays I already know.