
ASB-29533: Adding option to save filename checker output in alternate format (csv, fits, excel, html)#32

Merged
astrojimig merged 6 commits into dev from ASB-29533_db_outputs on Jan 26, 2026

Conversation

@astrojimig

Collaborator

This MR adds alternative options for saving the output of the filename checker. In addition to the results database file (`results_<hlsp_name>.db` by default), the output can now be saved in "csv", "fits", "excel", or "html" format with the optional --output_format= flag.

You can test this out in the /TUTORIAL folder by running:

mct check_filenames mct-tutorial --directory='tutorial-data/' --output_format='html'

The new write_to_alternate_format() function is the main change behind this. It works by converting the database table to a pandas DataFrame and then writing the output in the requested format (the CSV, Excel, and HTML writers are built into pandas).
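The database-to-DataFrame flow can be sketched roughly as follows. This is not the code merged in this MR: the function name `export_results`, its arguments, and the table name used in the usage note are hypothetical, and only the CSV and HTML branches are shown, using `pandas.read_sql_query`, `DataFrame.to_csv`, and `DataFrame.to_html`.

```python
import sqlite3
from pathlib import Path

import pandas as pd


def export_results(db_path, table, output_format):
    """Read one table from a SQLite file and write it next to the database.

    Hypothetical helper for illustration only. The table name is
    interpolated into the query, so it must come from trusted code.
    """
    conn = sqlite3.connect(db_path)
    try:
        df = pd.read_sql_query(f"SELECT * FROM {table}", conn)
    finally:
        conn.close()

    # e.g. results_demo.db -> results_demo.csv
    out_path = Path(db_path).with_suffix("." + output_format)
    if output_format == "csv":
        df.to_csv(out_path, index=False)
    elif output_format == "html":
        df.to_html(out_path, index=False)
    else:
        raise ValueError(f"Unsupported format: {output_format}")
    return out_path
```

Calling `export_results("results_demo.db", "filenames", "csv")` would write `results_demo.csv` alongside the database, assuming a table named `filenames` exists in that file.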

The excel and html versions of the outputs are color-coded so that "PASS" shows up as green, "FAIL" shows up as red, etc. Here's an example of what it looks like:
[Image: html_example]

Comment thread pyproject.toml
{ name = "Mikulski Archive for Space Telescopes", email = "mast_contrib@stsci.edu" },
]
dependencies = [
"astropy >= 7.2.0",

Collaborator

astropy and pandas are big dependencies. Since, as far as I can tell, they are needed only for the specific write methods, is it worth making them optional dependencies? Or would that cause too much of a headache for users? (I don't have a strong opinion here, just posing the idea.)
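For reference, the optional-dependency idea usually pairs an extras entry under `[project.optional-dependencies]` in pyproject.toml with a deferred import at the point of use. A minimal sketch of the deferred import, where the helper name `require` and the error wording are hypothetical and not part of this package:

```python
import importlib


def require(module_name, feature):
    """Import an optional dependency, or fail with an actionable message."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} needs the optional package '{module_name}'; "
            "install it to use this feature."
        ) from exc
```

A writer for one format would then call, for example, `pd = require("pandas", "CSV output")` inside the function body, so the import cost and the install requirement apply only to users who ask for that format.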

Collaborator Author

Astropy will definitely be a strong requirement for the metadata checker, so I'm leaning towards leaving this here. We can revisit later if others feel strongly about it!

@zclaytor zclaytor left a comment

Collaborator

This looks great to me, @astrojimig. I'm really glad to see other output formats added. I have added one comment about optional dependencies, but it doesn't block the merge. Thanks for doing this!

@jinmiyoon jinmiyoon left a comment

Collaborator

@astrojimig Thanks for this awesome feature. I didn't have much time to review, so my review is fairly surface-level, though I am sure you did a good job. I left a few inline comments. The only notable one is about the FITS format: I am not sure how it would be used for inspecting the output files. It seems, at least to me, a bit cumbersome to inspect all the parameters and verdicts that way. If you have ideas for how to use the FITS output for inspection, could you share them in the README or tutorial? This request is optional.

Comment thread TUTORIAL/tutorial_readme.md Outdated
| `-e` or `--exclude` | File pattern to exclude from testing, for example '*.jpg' to test all files except the jpgs | None |
| `-n` or `--max_n` | Maximum number of files to check, for testing purposes. | None (all files) |
| `-db` or `--dbFile` | Name of Results database file | `results_<hlsp_name>.db` |
| `-f` or `--output_format` | Write output to alternate format. Currently supports "csv", "fits", "html" or "excel" | `db` |

Collaborator

I am not fully sure how useful the FITS format would be for this file. It is fine to have it as an option, but I am curious to know how I could use it effectively for inspection.

Collaborator Author

The fits file has two Table extensions, which contain all the same information as the filename checker output. You can open it however you want: with Python, with a VSCode extension, with TOPCAT, etc.

For example, in Python you can see which files failed inspection with something like this:

>>> import astropy.io.fits as fits
>>> results = fits.open('results_mct-tutorial.fits') # Open File
>>> results.info() # Print Info
Filename: results_mct-tutorial.fits
No.    Name      Ver    Type      Cards   Dimensions   Format
  0  PRIMARY       1 PrimaryHDU       4   ()      
  1  FILENAMES     1 BinTableHDU     17   7R x 4C   [1A, 61A, 12A, K]   
  2  FIELDS        1 BinTableHDU     25   58R x 8C   [61A, 12A, 12A, 4A, 4A, 4A, 12A, 12A]   

>>> # Print filenames which failed the check
>>> failed_files = results[1].data['filename'][results[1].data['final_verdict']=='FAIL']
>>> print(failed_files)
['hlsp_mct-tutorial_jwst_nirspec_GALAXY1_multi_v1_spec.fits']

I agree it's not the most practical format for this, but I thought that it would be a good option to include for astronomers more comfortable with fits files than any other format. I hope that helps!

Comment thread mast_contributor_tools/filename_check/fc_app.py Outdated
@astrojimig astrojimig merged commit f539100 into dev Jan 26, 2026
2 checks passed
3 participants