| layout | tutorial_hands_on | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| title | Source extractor on DESI Legacy Surveys sky images | ||||||||
| questions |
|
||||||||
| objectives |
|
||||||||
| time_estimation | 1H | ||||||||
| key_points |
|
||||||||
| requirements |
|
||||||||
| contributions |
|
||||||||
| tags |
|
One key objective in astronomy and large-scale sky surveys is to identify individual celestial sources, such as stars and galaxies, in wide-field sky images to enable further detailed scientific analyses. For instance, the DESI Legacy Surveys have imaged approximately one-third of the sky, detecting billions of luminous sources. As a follow-up, the DESI project measures individual galaxies' spectra from a subsample of about 50 million targets, selected based on their photometric properties.
SExtractor (Source Extractor) is a widely used tool in astronomy for detecting and measuring sources in astronomical images. The Galaxy source-extractor tool is built on top of SEP, a Python library derived from the core routines of SExtractor.
For more in-depth documentation, you can refer to:
- SEP documentation
- SEP paper
- Source Extractor for Dummies
- Source Extractor paper
- Source Extractor website
In this tutorial, we will cover:
- TOC {:toc}
{: .agenda}
The source-extractor tool accepts a single image file as input, with the option to provide a mask and/or a filter. In astronomy, this is typically a sky image containing luminous sources. In addition, the tool accepts several parameters related to background estimation and source detection, which are set to the suggested default values. A subset of them is described in the subsection below.
Image:
- Preferably: light sources on a dark background.
- Format: a single-channel 2D array stored as `.tiff` or `.fits` (FITS is a widely used format in the astronomy community).
Mask (Optional):
- Masks regions affected by bright sources (e.g. stars) to improve background estimation.
- Pixels with `value > maskthresh` or boolean `True` are masked.
- Format: a single-channel 2D array stored as `.tiff` or `.fits`.
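As a rough illustration of the masking rule above, the following NumPy sketch marks a pixel as masked when its value exceeds a threshold or when a boolean mask is `True`. The function name `masked_pixels` and the default `maskthresh=0.0` are assumptions made for this example; it is not the tool's own code.

```python
import numpy as np

def masked_pixels(mask, maskthresh=0.0):
    """Return a boolean array that is True where a pixel counts as masked."""
    mask = np.asarray(mask)
    if mask.ndim != 2:
        raise ValueError("the mask must be a single-channel 2D array")
    if mask.dtype == bool:
        # Boolean masks are used as they are: True means masked.
        return mask.copy()
    # Numeric masks: a pixel is masked when its value exceeds the threshold.
    return mask > maskthresh

numeric_mask = np.array([[0, 1], [0, 2]])
boolean_mask = np.array([[False, True], [False, True]])
print(masked_pixels(numeric_mask))
print(masked_pixels(boolean_mask))
```

Both example masks flag the same right-hand column, which shows that a 0/1 array and a boolean array can express the same mask.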
Checking the metadata of an image

Tip 1: Use {% tool Show image info %} to inspect `.tiff` metadata. Required:

    RGB = false (1)
    Interleaved = false
    SizeZ = 1
    SizeT = 1
    SizeC = 1

Tip 2: Use {% tool astropy fitsinfo %} to check `.fits` metadata. Required: `Dimensions (N, M)`, where `N` and `M` are pixel dimensions in 2D.
{: .comment}
Filter Kernel (Optional): The filter kernel is used to smooth the input image, which can enhance the detection of faint and extended sources. However, in crowded fields, filtering may reduce performance by blending nearby objects.
- If `Filter Case` is set to `none`, no filtering is applied.
- If `Filter Case` is `default`, a built-in smoothing kernel is used:

      1 2 1
      2 4 2
      1 2 1

- If `Filter Case` is `file`, you must provide a custom 2D array stored as a plain text file that contains whitespace-separated values.
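To make the effect of the default kernel more concrete, here is a minimal NumPy sketch that smooths an image with a normalized version of the 3×3 kernel above. It is a plain sliding-window sum written for illustration. The helper `smooth` and the edge padding are assumptions, and the tool's own filtering may differ in detail.

```python
import numpy as np

DEFAULT_KERNEL = np.array([[1, 2, 1],
                           [2, 4, 2],
                           [1, 2, 1]], dtype=float)

def smooth(image, kernel=DEFAULT_KERNEL):
    """Smooth a 2D image with a small odd-sized kernel normalized to unit sum."""
    image = np.asarray(image, dtype=float)
    kernel = kernel / kernel.sum()
    kh, kw = kernel.shape
    # Pad by repeating edge pixels so the output keeps the input shape.
    padded = np.pad(image, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(image)
    # Accumulate shifted copies of the image, weighted by the kernel values.
    # The kernel is symmetric, so correlation and convolution coincide here.
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + image.shape[0],
                                            dx:dx + image.shape[1]]
    return out

image = np.zeros((5, 5))
image[2, 2] = 16.0  # a single bright pixel
smoothed = smooth(image)
print(smoothed[1:4, 1:4])
```

The single bright pixel is spread over its neighbours in the shape of the kernel while the total flux is preserved. This is also why nearby objects in crowded fields can blend after filtering.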
Checking the format of a filter file

You can check on your computer whether the filter file has the correct format by reading it with:

    import numpy as np
    kernel = np.loadtxt("filter.txt")

since this is the way the tool's back-end implementation loads the file.
{: .comment}
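A plain-text filter file of this kind can be produced with `np.savetxt`, which writes whitespace-separated values by default. The sketch below writes the default kernel to a temporary file and reads it back with `np.loadtxt`. The file name `filter.txt` and the use of a temporary directory are only choices made for this example.

```python
import os
import tempfile

import numpy as np

kernel = np.array([[1, 2, 1],
                   [2, 4, 2],
                   [1, 2, 1]], dtype=float)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "filter.txt")
    # One kernel row per line, values separated by single spaces.
    np.savetxt(path, kernel, fmt="%g")
    # Read the file back the same way as in the check above.
    loaded = np.loadtxt(path)

print(loaded.shape)
```

If the loaded array is two-dimensional and matches the values you intended, the file has the expected layout.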
In this subsection, we describe a subset of the tool's parameters that you can change.
Before source detection, the tool estimates the image background. This is done by dividing the image into a grid of boxes, each with a default size of:

    bw = 64  # box width in pixels
    bh = 64  # box height in pixels

Within each box, the pixel histogram is filtered to remove outliers, and the background level is estimated using a mode approximation based on the median and mean of the remaining pixel values. While 64 is the default value in the SEP package, the original paper suggests that on most images a value between 32 and 128 pixels should work fine.
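The box-wise estimate described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the tool's implementation: the function names, the fixed number of clipping iterations and the 3-sigma clipping are assumptions, and the sketch omits any smoothing or interpolation between boxes. The mode approximation `2.5 * median - 1.5 * mean` is the estimator described for Source Extractor.

```python
import numpy as np

def box_background(box, n_iter=5, kappa=3.0):
    """Estimate the background level of one box after clipping outliers."""
    values = np.asarray(box, dtype=float).ravel()
    for _ in range(n_iter):
        median, std = np.median(values), values.std()
        keep = np.abs(values - median) <= kappa * std
        if keep.all():
            break
        values = values[keep]
    # Mode approximation from the median and mean of the remaining pixels.
    return 2.5 * np.median(values) - 1.5 * values.mean()

def background_grid(image, bw=64, bh=64):
    """Return one background estimate per (bh x bw) box of the image."""
    image = np.asarray(image, dtype=float)
    ny = int(np.ceil(image.shape[0] / bh))
    nx = int(np.ceil(image.shape[1] / bw))
    grid = np.empty((ny, nx))
    for j in range(ny):
        for i in range(nx):
            grid[j, i] = box_background(image[j * bh:(j + 1) * bh,
                                              i * bw:(i + 1) * bw])
    return grid

rng = np.random.default_rng(0)
image = rng.normal(loc=10.0, scale=1.0, size=(128, 128))
image[30:34, 30:34] += 500.0  # a bright "source" that should be clipped away
grid = background_grid(image)
print(grid.shape)
```

With `bw = bh = 64`, the 128×128 test image yields a 2×2 grid. Smaller boxes follow background variations more closely but are more easily biased by large sources, which is the trade-off behind the 32 to 128 pixel range mentioned above.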
After background estimation, the tool identifies groups of pixels that exceed a defined brightness threshold. The detection parameters help distinguish between real luminous sources and random fluctuations that can appear in the background.
Detection Criteria:
- Minimum Area: The number of connected pixels required to consider something a source.

      minarea = 5  # default

- Threshold: The value of the pixel `(j, i)` must exceed:

      thresh * err[j,i]

  where:

      thresh = 1.5  # default

The interpretation of `err[j,i]` depends on the `err_option` parameter:

    err_option = 'float_globalrms'  # Use global RMS (i.e. root mean square) of the background (default)
    err_option = 'array_rms'        # Use a pixel-wise RMS array of the background
    err_option = 'none'             # Use 'thresh' as an absolute threshold

It is advisable to adapt the error estimation to the studied image: e.g. if the background is reasonably uniform, using a global value should be sufficient. In contrast, if the background changes drastically in different regions of the image, a pixel-wise RMS would be preferred.
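The detection criteria above can be sketched as a threshold followed by a connected-component search. The code below is a minimal pure-NumPy illustration under stated assumptions: the function `detect` is hypothetical, neighbours are taken to include diagonals, and none of the deblending or filtering steps of the real tool are reproduced.

```python
import numpy as np

def detect(data, err, thresh=1.5, minarea=5):
    """Label groups of connected pixels whose value exceeds thresh * err.

    `err` may be a scalar (global RMS) or an array with the shape of `data`
    (pixel-wise RMS). Returns a label map and the number of sources kept.
    """
    data = np.asarray(data, dtype=float)
    above = data > thresh * np.broadcast_to(err, data.shape)
    labels = np.zeros(data.shape, dtype=int)
    visited = np.zeros(data.shape, dtype=bool)
    n_sources = 0
    for start in zip(*np.nonzero(above)):
        if visited[start]:
            continue
        visited[start] = True
        stack, members = [start], []
        # Flood fill over the 8 neighbours (an assumption of this sketch).
        while stack:
            y, x = stack.pop()
            members.append((y, x))
            for ny in range(max(y - 1, 0), min(y + 2, data.shape[0])):
                for nx in range(max(x - 1, 0), min(x + 2, data.shape[1])):
                    if above[ny, nx] and not visited[ny, nx]:
                        visited[ny, nx] = True
                        stack.append((ny, nx))
        # Keep the group only if it has at least `minarea` pixels.
        if len(members) >= minarea:
            n_sources += 1
            ys, xs = zip(*members)
            labels[list(ys), list(xs)] = n_sources
    return labels, n_sources

data = np.zeros((20, 20))
data[5:8, 5:8] = 10.0  # 9 connected pixels: kept
data[15, 15] = 10.0    # 1 isolated pixel: rejected by minarea
labels, n_sources = detect(data, err=1.0)
print(n_sources)
```

Passing a scalar `err` mimics the global RMS option, an array mimics the pixel-wise RMS option, and `err=1.0` turns `thresh` into an absolute threshold.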
Data Acquisition
Create a new history for this tutorial. You can rename the default unnamed history.
{% snippet faqs/galaxy/histories_create_new.md %}
Run the {% tool DESI Legacy Survey %} tool.
- Important: Choose the Data Product Image.
The default values are used for this tutorial. The history now contains the `.fits` image file that is used as input for the source-extractor tool.
{: .hands_on}
Once you’ve selected the source-extractor tool, choose the input file named: DESI Legacy Survey -> Image fits. After the tool has finished running, several output images and data products will be available:
- The background subtracted image with detected sources highlighted by red ellipses
- The estimated background
- The background RMS
- The segmentation map
- A catalog table listing the detected sources along with measured parameters such as flux (i.e. sum of member pixels), position, size, and shape
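To illustrate how a flux defined as the "sum of member pixels" relates to the segmentation map, the following sketch sums the pixel values belonging to each labelled source. It assumes that the segmentation map uses `0` for background and consecutive positive integers for sources, and that `data` is already background-subtracted. The helper name `fluxes_from_segmentation` is hypothetical.

```python
import numpy as np

def fluxes_from_segmentation(data, segmap):
    """Sum the pixel values of each labelled source (label 0 = background)."""
    data = np.asarray(data, dtype=float)
    segmap = np.asarray(segmap, dtype=int)
    # bincount adds up the weights (pixel values) that share the same label.
    sums = np.bincount(segmap.ravel(), weights=data.ravel())
    return sums[1:]  # drop the background entry

data = np.array([[1.0, 2.0, 0.0],
                 [3.0, 0.0, 5.0],
                 [0.0, 0.0, 7.0]])
segmap = np.array([[1, 1, 0],
                   [1, 0, 2],
                   [0, 0, 2]])
print(fluxes_from_segmentation(data, segmap))
```

Values computed this way may not match the catalog exactly if the tool applies additional corrections.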
The original image is published by Legacy Surveys / D. Lang (Perimeter Institute). The Legacy Surveys are described in {% cite legacy-survey-astronomy %}.
Ellipse drawing
The tool already provides as output an image with ellipses around detected objects. Nevertheless, if you want to create a figure yourself, you can use the table of detected sources returned by the tool, here called `objects`, in the following way:

    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.patches import Ellipse

    # `objects` is the table of detected sources with columns x, y, a, b, theta.
    # Draw the image on `ax` first (e.g. with ax.imshow) so the axes span the sources.
    fig, ax = plt.subplots()
    for i in range(len(objects)):
        # Ellipse centred on the source; theta is converted from radians to degrees.
        e = Ellipse(xy=(objects['x'][i], objects['y'][i]),
                    width=6*objects['a'][i],
                    height=6*objects['b'][i],
                    angle=objects['theta'][i] * 180. / np.pi)
        e.set_facecolor('none')
        e.set_edgecolor('red')
        ax.add_artist(e)
{: .hands_on}
Bright stars can skew background estimation and obscure nearby faint sources. In the previous output, some central sources were missed due to bright star interference.
A simple mask can help. Here's an example:
This mask can be easily created with:

    import numpy as np
    import tifffile

    # Start from an empty 360 x 360 mask (0 = not masked).
    mask = np.zeros((360, 360))
    # Flag the rectangular regions to be masked (non-zero = masked).
    mask[270:325, :] = 1
    mask[239:, :200] = 1
    tifffile.imwrite("mask.tiff", mask)

Upload the mask to Galaxy, select it in the source-extractor tool, and re-run.
You can observe that the central sources are now detected and that the dynamic range of the background has decreased thanks to the mask.
An important output of this tool is the segmentation map of the detected sources:

This map can be used as the seed image required by the [Voronoi segmentation tutorial]({% link topics/imaging/tutorials/voronoi-segmentation/tutorial.md %}). In this case, you can observe that the two bright stars still have an important effect on the source detection. Therefore, to improve the results, you can try better masking, using the array RMS as a relative error in thresholding, or different background mesh sizes.




