Test Data Samples for Catalog Transformers

This document lists small HEALPix samples (under 100MB each) for each catalog, for use in testing and verifying the catalog transformers.

Data Repository Base URL

https://users.flatironinstitute.org/~polymathic/data/MultimodalUniverse/v1/

Small Healpix Samples by Catalog

Spectroscopic Catalogs

SDSS (CONFIRMED - 23MB)

GAIA (CONFIRMED - 16MB)

DESI

VIPERS

DESI_PROVABGS (MCMC posteriors)

Photometric/Galaxy Morphology Catalogs

GZ10 (Galaxy Zoo)

Time Series / Lightcurve Catalogs

PLAsTiCC (Simulated transients)

TESS (Exoplanet lightcurves)

Supernova Catalogs

Foundation

SNLS (Supernova Legacy Survey)

PS1_SNE_IA (Pan-STARRS1 Type Ia)

YSE (Young Supernova Experiment)

Swift_SNE_IA (Swift Type Ia)

CFA (CFA Supernova)

DES_Y3_SNE_IA (DES Year 3 Type Ia)

CSP (Carnegie Supernova Project)

Alert/Transient Catalogs

BTSbot (Bright Transient Survey)

Imaging Catalogs

SSL_LegacySurvey (Self-supervised learning)

HSC (Hyper Suprime-Cam)

LegacySurvey (DESI Legacy Survey)

JWST (James Webb Space Telescope)

IFU Datacubes

MaNGA (SDSS-IV MaNGA IFU)

Notes

  1. All healpix=0 directories have been verified as accessible (HTTP 200 response)
  2. Actual file sizes could not be determined without downloading due to access restrictions on directory listings
  3. The SDSS healpix=583 at 23MB is confirmed working as a reference point
  4. For catalogs noted as "may be larger", switch to an alternative healpix value if the healpix=0 sample exceeds 100MB
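
The accessibility check in note 1 can be reproduced with one HEAD request per directory. The following is a minimal sketch using only Python's standard library. It assumes the server answers HEAD requests for directory URLs; `http_status` and `is_accessible` are illustrative names, not functions from this repository.

```python
import urllib.error
import urllib.request


def http_status(url, opener=urllib.request.urlopen, timeout=30):
    """Return the HTTP status code of a HEAD request to `url`.

    `opener` is injectable so the function can be exercised without
    network access; by default it is urllib's own urlopen.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; report their code.
        return err.code


def is_accessible(url, **kwargs):
    """True when the URL answers with HTTP 200, as in note 1."""
    return http_status(url, **kwargs) == 200


# Example call (requires network access, so it is left commented out):
# is_accessible("https://users.flatironinstitute.org/~polymathic/data/"
#               "MultimodalUniverse/v1/sdss/sdss/healpix=583/")
```

A 200 response only shows that the directory exists; as note 2 says, it does not reveal how large the files inside are.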

Alternative Healpix Values (if needed)

If any catalog's healpix=0 exceeds 100MB, try these alternatives in order:

  • sdss: healpix=583 (confirmed 23MB)
  • Others: try increasing values (1, 2, 3, 4, 5, 10, 50, 100, 500, and so on) until one fits under 100MB
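
The fallback rule above amounts to taking the first candidate whose download fits the size limit. Here is a hedged sketch of that selection logic. `pick_small_healpix` and the `get_size_bytes` callback are hypothetical names; how the size is measured (for example, by downloading and summing file sizes) is left to the caller, since directory listings do not expose it.

```python
# 100MB limit from this document; decimal megabytes are an assumption.
SIZE_LIMIT_BYTES = 100 * 1000 * 1000

# healpix=0 first, then the alternatives listed above, in order.
CANDIDATE_HEALPIX = (0, 1, 2, 3, 4, 5, 10, 50, 100, 500)


def pick_small_healpix(get_size_bytes,
                       candidates=CANDIDATE_HEALPIX,
                       limit=SIZE_LIMIT_BYTES):
    """Return the first healpix value whose sample does not exceed `limit`.

    `get_size_bytes(healpix)` must return the sample size in bytes, or
    None when that healpix does not exist for the catalog.
    Returns None if no candidate fits.
    """
    for healpix in candidates:
        size = get_size_bytes(healpix)
        if size is not None and size <= limit:
            return healpix
    return None


# Usage with made-up sizes: healpix=0 is too large, 1 is missing, 2 fits.
example_sizes = {0: 250_000_000, 2: 40_000_000}
chosen = pick_small_healpix(example_sizes.get)
```

Passing the size lookup in as a callback keeps the selection rule separate from whatever download or measurement step a given catalog needs.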

Download Instructions

To download a specific healpix for testing:

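# Example for SDSS healpix=583
# -r: recurse; -np: never ascend to the parent directory; -nH: omit the hostname directory;
# --cut-dirs=1: drop the leading "~polymathic" path component; -R: skip index pages; -q: quiet
wget -r -np -nH --cut-dirs=1 -R "index.html*" -q \
  https://users.flatironinstitute.org/~polymathic/data/MultimodalUniverse/v1/sdss/sdss/healpix=583/

# Generic pattern (replace {catalog}, {subcatalog} and {N} before running):
wget -r -np -nH --cut-dirs=1 -R "index.html*" -q \
  https://users.flatironinstitute.org/~polymathic/data/MultimodalUniverse/v1/{catalog}/{subcatalog}/healpix={N}/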
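
When the download is driven from a script, the generic pattern can be filled in programmatically. The sketch below builds the URL and the matching wget argument list from the values shown above. `healpix_url` and `wget_command` are illustrative helper names, not part of this repository.

```python
BASE_URL = (
    "https://users.flatironinstitute.org/~polymathic/data/"
    "MultimodalUniverse/v1/"
)


def healpix_url(catalog, subcatalog, healpix, base_url=BASE_URL):
    """Fill in {catalog}/{subcatalog}/healpix={N}/ under the base URL."""
    return f"{base_url.rstrip('/')}/{catalog}/{subcatalog}/healpix={int(healpix)}/"


def wget_command(url):
    """Return the wget invocation from this document as an argument list."""
    return [
        "wget", "-r", "-np", "-nH", "--cut-dirs=1",
        "-R", "index.html*", "-q", url,
    ]


sdss_url = healpix_url("sdss", "sdss", 583)
sdss_command = wget_command(sdss_url)
```

The list returned by `wget_command` can be passed to `subprocess.run(sdss_command, check=True)`; using a list rather than a shell string means the `index.html*` pattern needs no extra quoting.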

Verification Workflow

For each catalog:

  1. Download the specified healpix
  2. Process using the catalog's download script (datasets library)
  3. Transform using the corresponding transformer class
  4. Compare the outputs of steps 2 and 3 using the generalized compare.py script
  5. Document any discrepancies or issues
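
To make steps 4 and 5 concrete, here is a rough sketch of the kind of record-level comparison involved. It is not the repository's compare.py, whose interface is not described in this document. The `object_id` key, the tolerance, and the list-of-dicts input format are all assumptions chosen for illustration.

```python
import math


def compare_records(reference, candidate, key="object_id", rel_tol=1e-6):
    """Return a list of human-readable discrepancies between two outputs.

    Both inputs are lists of dicts that share a unique `key` field.
    Float fields are compared with a relative tolerance; every other
    value must match exactly.
    """
    ref_by_key = {row[key]: row for row in reference}
    cand_by_key = {row[key]: row for row in candidate}
    issues = []

    # Rows present on one side only.
    for k in sorted(ref_by_key.keys() - cand_by_key.keys()):
        issues.append(f"{k}: missing from candidate")
    for k in sorted(cand_by_key.keys() - ref_by_key.keys()):
        issues.append(f"{k}: missing from reference")

    # Field-by-field comparison for rows present on both sides.
    for k in sorted(ref_by_key.keys() & cand_by_key.keys()):
        ref_row, cand_row = ref_by_key[k], cand_by_key[k]
        for field in sorted(ref_row.keys() | cand_row.keys()):
            a, b = ref_row.get(field), cand_row.get(field)
            both_float = isinstance(a, float) and isinstance(b, float)
            same = math.isclose(a, b, rel_tol=rel_tol) if both_float else a == b
            if not same:
                issues.append(f"{k}.{field}: {a!r} != {b!r}")
    return issues


reference = [{"object_id": 1, "z": 0.5}, {"object_id": 2, "z": 1.0}]
candidate = [{"object_id": 1, "z": 0.6}, {"object_id": 3, "z": 2.0}]
report = compare_records(reference, candidate)
```

An empty list means the two outputs agree under these assumptions; any entries are the discrepancies to record in step 5.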