arsenelupin14/iris-solar-uv-data

IRIS Solar UV Data

CI | License: MIT | Python 3.11+

Research-first repository for an IRIS Level 2 solar-ultraviolet workflow with a deliberate SJI-first scope. It covers remote archive discovery, raw-data indexing, per-OBS analysis, duplicate-obsid comparison, and forced-ROI detector audits, and it leaves a clear expansion path toward synchronized spectroscopy and imaging work.

Highlights

  • SJI-first archive pipeline for querying, downloading, indexing, and analyzing IRIS Level 2 observations
  • Detector v0 audit trail with per-OBS summaries, event catalogs, merged catalogs, pair comparisons, and forced-ROI reruns
  • Real duplicate-obsid validation across one two-window family (4000005156) and one three-window family (4202100004)
  • Current checkpoint on real data: 7 validated SJI 1400 observations, 9 merged event rows, and 4 audited duplicate-window pairs
  • GitHub-browsable figure index under reports/README.md and reports/figures/README.md

Project Status

The repository now includes:

  • a reproducible data layout for raw, metadata, derived, and report outputs
  • a small Python CLI for layout validation and metadata indexing
  • a remote archive query/download CLI for IRIS Level 2 files
  • a starter configuration for the first SJI 1400 Å use case
  • documentation for the first practical workflow
  • a first-pass SJI analysis CLI with target/background ROI comparison and event-candidate export

Raw data are not committed yet. The current repository state is designed to be the staging ground for the first few OBS directories, including repeated obsid cases.

Current Checkpoint: 2026-03-25

The repository has now been exercised on seven real SJI 1400 observations:

  • 20130903_041222_4182010156
  • 20130903_062635_4000005156
  • 20130903_113935_4000005156
  • 20130903_080941_4000257165
  • 20130903_124144_4202100004
  • 20130903_161444_4202100004
  • 20130903_202024_4202100004

What is already proven to work:

  • remote candidate discovery and download for real Level 2 observations
  • metadata indexing for multiple local FITS files
  • per-OBS quicklook generation
  • per-OBS SJI analysis outputs with target/background ROI comparison
  • first-pass event detection on real data

Current multi-OBS status:

  • data/metadata/iris_file_index.csv contains 7 valid SJI 1400 entries
  • the duplicate-obsid pair 4000005156 is now present locally as two distinct OBS directories:
    • 20130903_062635_4000005156
    • 20130903_113935_4000005156
  • the duplicate-obsid family 4202100004 is now present locally as three distinct OBS directories:
    • 20130903_124144_4202100004
    • 20130903_161444_4202100004
    • 20130903_202024_4202100004
  • the detector currently returns the following candidate counts per OBS directory:
    • 20130903_041222_4182010156: 0 candidates
    • 20130903_062635_4000005156: 1 candidate
    • 20130903_113935_4000005156: 0 candidates
    • 20130903_080941_4000257165: 5 candidates
    • 20130903_124144_4202100004: 1 candidate
    • 20130903_161444_4202100004: 1 candidate
    • 20130903_202024_4202100004: 1 candidate
  • data/derived/catalogs/sji_event_candidates_merged_v0.csv now merges 9 event rows from 7 per-OBS catalogs
  • the audited duplicate windows now have side-by-side per-OBS summaries and repeatable pairwise comparison CSVs under data/derived/sji/
  • for the 4000005156 pair, the shorter 20130903_113935_4000005156 window shows 0 candidates versus 1 in 20130903_062635_4000005156, with a higher threshold and strongly shifted auto-selected ROIs
  • for the first 4202100004 pair (20130903_124144_4202100004 vs 20130903_161444_4202100004), the two 600-frame windows both show 1 baseline candidate, but the target ROI center still shifts by about 464 px while the background ROI shift stays near 1 px
  • for 20130903_124144_4202100004 vs 20130903_202024_4202100004, the baseline detector also reports 1 vs 1, but the target ROI shift drops to about 106 px while the background ROI shift jumps to about 1004 px
  • for 20130903_161444_4202100004 vs 20130903_202024_4202100004, the baseline detector again reports 1 vs 1, with a target ROI shift of about 374 px and a background ROI shift of about 1005 px
  • a new sji-audit-obs-pair workflow can now rerun duplicate windows under forced target/background ROI combinations and write one audit matrix plus one scenario-level comparison table
  • the 4202100004 forced-ROI audits now show a more nuanced pattern than before: target ROI choice still decides which window remains event-positive, but the third window also makes background ROI choice materially relevant once the auto-selected background shifts by about 1004 px
  • across all audited duplicate-window pairs, detector v0 still points first to target ROI selection as the main bottleneck, but it now also needs an explicit cross-OBS ROI consistency guard instead of treating background choice as negligible
  • data/derived/sji/sji_duplicate_obsid_pair_summary_v0.csv now provides one cross-pair audit summary for four audited duplicate-window pairs, including all three pairings inside the 4202100004 family
  • per-OBS figures and CSV outputs are now being written with obs_dir-specific filenames to avoid overwriting prior results
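
The ROI shifts quoted in the list above compare where the auto-selected ROI lands in two windows. The sketch below shows one way such a shift could be measured, assuming an ROI is given as (x, y, width, height) in pixels and that the shift is the Euclidean distance between ROI centers. Both the tuple ordering and the distance definition are assumptions for illustration, the helper names are hypothetical, and the coordinates are made up rather than taken from the audited observations.

```python
import math


def roi_center(roi):
    """Return the (x, y) center of an ROI given as (x, y, width, height)."""
    x, y, width, height = roi
    return (x + width / 2.0, y + height / 2.0)


def roi_center_shift(first_roi, second_roi):
    """Euclidean distance in pixels between the centers of two ROIs."""
    x1, y1 = roi_center(first_roi)
    x2, y2 = roi_center(second_roi)
    return math.hypot(x2 - x1, y2 - y1)


# Illustrative ROIs only (assumed layout: x, y, width, height).
first_window_roi = (200, 250, 40, 40)
second_window_roi = (500, 650, 40, 40)

shift_px = roi_center_shift(first_window_roi, second_window_roi)
print(f"target ROI center shift: {shift_px:.1f} px")
```

A guard of the kind discussed above could then compare such a shift against a tolerance before two windows are treated as directly comparable.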

Recent multi-OBS hardening:

  • raw downloads now land under data/raw/iris/<obs_dir>/, not just obsid folders
  • the metadata index now includes obs_dir explicitly alongside obsid
  • per-OBS event catalogs now include obs_dir, which removes ambiguity once repeated obsid values are present
  • the CLI now includes merge-event-catalogs for building one first-pass multi-OBS event table
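
Keying everything on obs_dir matters because an obsid alone no longer identifies one observation window. The sketch below groups the seven directory names listed in the checkpoint by their trailing obsid and enumerates the pairs inside each duplicate family. It assumes the names follow a date_time_obsid pattern, which matches the names shown here but is inferred from them; the function names are hypothetical and not part of the package.

```python
from collections import defaultdict
from datetime import datetime
from itertools import combinations

OBS_DIRS = [
    "20130903_041222_4182010156",
    "20130903_062635_4000005156",
    "20130903_113935_4000005156",
    "20130903_080941_4000257165",
    "20130903_124144_4202100004",
    "20130903_161444_4202100004",
    "20130903_202024_4202100004",
]


def parse_obs_dir(obs_dir):
    """Split an obs_dir name into (start datetime, obsid).

    Assumes a YYYYMMDD_HHMMSS_obsid naming pattern.
    """
    date_part, time_part, obsid = obs_dir.split("_")
    start = datetime.strptime(date_part + time_part, "%Y%m%d%H%M%S")
    return start, obsid


def duplicate_families(obs_dirs):
    """Map each obsid that occurs more than once to its sorted obs_dir list."""
    by_obsid = defaultdict(list)
    for obs_dir in obs_dirs:
        _, obsid = parse_obs_dir(obs_dir)
        by_obsid[obsid].append(obs_dir)
    return {
        obsid: sorted(dirs) for obsid, dirs in by_obsid.items() if len(dirs) > 1
    }


families = duplicate_families(OBS_DIRS)
pairs = [pair for dirs in families.values() for pair in combinations(dirs, 2)]

for obsid, dirs in families.items():
    print(obsid, "->", len(dirs), "windows")
print("duplicate-window pairs:", len(pairs))
```

One two-window family and one three-window family yield 1 + 3 = 4 pairs, which matches the number of audited duplicate-window pairs reported above.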

Scientific Direction

IRIS Level 2 is the default science-ready input for this project.

The near-term focus is:

  1. Use-case A: SJI-only
  2. Channel priority: SJI 1400 Å
  3. Event target: brightenings / compact transition-region activity
  4. Output target: metadata index, ROI time series, event catalog, figures

The spectroscopy path remains planned but is intentionally deferred until the SJI archive and metadata workflow are stable.

Repository Layout

configs/
  sji_1400_starter.toml
data/
  raw/
    iris/
      <obs_dir>/
        *.fits or *.fits.gz
  metadata/
  derived/
    sji/
      pair_audits/
    spec/
    catalogs/
    timeseries/
docs/
  project_layout.md
  references.md
  workflow_sji_first.md
reports/
  figures/
scripts/
  build_metadata_index.py
  merge_event_catalogs.py
  sji_audit_obs_pair.py
  sji_analyze.py
  sji_compare_obs.py
  sji_quicklook.py
src/
  iris_solar_uv_data/

Rationale:

  • keep original Level 2 FITS files under data/raw/iris/<obs_dir>/
  • generate a lightweight metadata table under data/metadata/
  • keep analysis-ready but non-raw products in data/derived/
  • keep presentation outputs in reports/figures/

Minimal Metadata Schema

The starter index captures fields that are immediately useful for archive management:

  • obsid
  • obs_dir
  • product_type
  • channel_or_window
  • start_time
  • end_time
  • exposure_s
  • cadence_s
  • solar_x_arcsec
  • solar_y_arcsec
  • pixel_scale_x_arcsec
  • pixel_scale_y_arcsec
  • shape
  • path

This is intentionally lightweight. It is enough to find candidate files, shortlist observations, and support later derived products.
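
As an illustration of how such an index can be used to shortlist observations, the sketch below filters rows by channel and cadence with the standard library only. The column names come from the schema above. The sample rows, the value encodings (for example "1400" for the channel and "60x400x400" for the shape), and the shortlist function are assumptions made up for this example; they do not describe the real index file.

```python
import csv
import io

# Made-up sample rows; only the column names follow the schema above.
SAMPLE_INDEX = """\
obsid,obs_dir,product_type,channel_or_window,start_time,end_time,exposure_s,cadence_s,solar_x_arcsec,solar_y_arcsec,pixel_scale_x_arcsec,pixel_scale_y_arcsec,shape,path
0000000001,20000101_000000_0000000001,SJI,1400,2000-01-01T00:00:00,2000-01-01T00:10:00,4.0,10.0,0.0,0.0,0.17,0.17,60x400x400,data/raw/iris/20000101_000000_0000000001/a.fits
0000000002,20000101_010000_0000000002,SJI,2796,2000-01-01T01:00:00,2000-01-01T01:10:00,4.0,10.0,0.0,0.0,0.17,0.17,60x400x400,data/raw/iris/20000101_010000_0000000002/b.fits
0000000003,20000101_020000_0000000003,SJI,1400,2000-01-01T02:00:00,2000-01-01T02:30:00,8.0,30.0,0.0,0.0,0.17,0.17,60x400x400,data/raw/iris/20000101_020000_0000000003/c.fits
"""


def shortlist(rows, channel, max_cadence_s):
    """Return obs_dir values whose channel matches and cadence is fast enough."""
    selected = []
    for row in rows:
        if row["channel_or_window"] != channel:
            continue
        if float(row["cadence_s"]) > max_cadence_s:
            continue
        selected.append(row["obs_dir"])
    return selected


rows = list(csv.DictReader(io.StringIO(SAMPLE_INDEX)))
selected = shortlist(rows, channel="1400", max_cadence_s=15.0)
print(selected)
```

With a real index, the same filter could read from data/metadata/iris_file_index.csv instead of the in-memory sample.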

Quick Start

  1. Create an environment and install the package when you are ready for analysis work.

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[analysis]"

  2. Validate the repository layout.

PYTHONPATH=src python3 -m iris_solar_uv_data validate-layout

  3. Place Level 2 FITS files under an OBS-specific directory, for example:

data/raw/iris/20220402_041952_3600260076/

  4. Build or refresh the metadata index.

PYTHONPATH=src python3 -m iris_solar_uv_data index-raw

Or use the wrapper:

python3 scripts/build_metadata_index.py

Remote Query And Download

Query a date or date range directly from the remote Level 2 archive:

PYTHONPATH=src python3 -m iris_solar_uv_data query-archive \
  --start-date 2022-04-02 \
  --end-date 2022-04-02 \
  --channel 1400

This writes candidate observations to:

data/metadata/iris_remote_candidates.csv

Download one selected observation directory:

PYTHONPATH=src python3 -m iris_solar_uv_data download-obs \
  --obs-dir 20220402_041952_3600260076 \
  --sji-channel 1400

Download SJI plus raster:

PYTHONPATH=src python3 -m iris_solar_uv_data download-obs \
  --obs-dir 20220402_041952_3600260076 \
  --sji-channel 1400 \
  --include-raster

Files are stored under:

data/raw/iris/<obs_dir>/

First SJI Quicklook

Once one SJI 1400 FITS file is available locally, generate the first sanity products with:

PYTHONPATH=src python3 -m iris_solar_uv_data sji-quicklook \
  --input data/raw/iris/<obs_dir>/iris_l2_..._SJI_1400_t000.fits.gz \
  --roi 200,250,40,40 \
  --frame-index 0

Or use the wrapper:

python3 scripts/sji_quicklook.py \
  --input data/raw/iris/<obs_dir>/iris_l2_..._SJI_1400_t000.fits.gz \
  --roi 200,250,40,40

Outputs:

  • reports/figures/sji_quicklook.png
  • data/derived/timeseries/sji_roi_timeseries.csv

SJI Analysis And Event v0

The next step beyond the sanity quicklook is a one-OBS exploratory analysis that:

  • resolves a target ROI manually or from the brightest window in a reference frame
  • resolves a comparison background ROI manually or from a dim non-overlapping window
  • writes target/background/net light curves
  • writes per-frame summary statistics
  • writes a first event-candidate CSV using a median + MAD threshold with a minimum peak rise

Run it with:

PYTHONPATH=src python3 -m iris_solar_uv_data sji-analyze \
  --input data/raw/iris/<obs_dir>/iris_l2_..._SJI_1400_t000.fits.gz \
  --roi-mode peak-window \
  --roi-size 40,40 \
  --reference-frame peak

Or use the wrapper:

python3 scripts/sji_analyze.py \
  --input data/raw/iris/<obs_dir>/iris_l2_..._SJI_1400_t000.fits.gz

Default outputs:

  • reports/figures/sji_analysis.png
  • data/derived/timeseries/sji_analysis_timeseries.csv
  • data/derived/sji/sji_frame_statistics.csv
  • data/derived/catalogs/sji_event_candidates.csv
  • data/derived/sji/sji_analysis_summary.csv

For multi-OBS work, prefer per-OBS filenames keyed by obs_dir, for example:

  • reports/figures/sji_analysis_20130903_080941_4000257165.png
  • data/derived/catalogs/sji_event_candidates_20130903_080941_4000257165.csv
  • data/derived/sji/sji_analysis_summary_20130903_080941_4000257165.csv
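
The event rule described above, a median + MAD threshold with a minimum peak rise, can be sketched as follows. This is a minimal illustration under stated assumptions, not the detector's actual implementation: the function name, the n_sigma multiplier, the 1.4826 factor that scales MAD to a Gaussian-equivalent sigma, and the grouping of consecutive above-threshold frames into one candidate are all choices made for this example. The input series is made up.

```python
import statistics


def detect_candidates(signal, n_sigma=5.0, min_peak_rise=0.0):
    """Find runs of frames above median + n_sigma * scaled MAD.

    Returns (candidates, threshold). A run is kept only if its peak
    rises at least min_peak_rise above the median baseline.
    """
    baseline = statistics.median(signal)
    mad = statistics.median([abs(value - baseline) for value in signal])
    threshold = baseline + n_sigma * 1.4826 * mad

    candidates = []
    start = None
    # A trailing sentinel closes any run that reaches the last frame.
    for index, value in enumerate(list(signal) + [float("-inf")]):
        if value > threshold and start is None:
            start = index
        elif value <= threshold and start is not None:
            peak = max(range(start, index), key=lambda k: signal[k])
            if signal[peak] - baseline >= min_peak_rise:
                candidates.append(
                    {
                        "start": start,
                        "end": index - 1,
                        "peak": peak,
                        "peak_value": signal[peak],
                    }
                )
            start = None
    return candidates, threshold


# Illustrative net light curve (target minus background), not real data.
net_signal = [10, 11, 10, 9, 10, 30, 45, 28, 10, 11, 9, 10, 14, 10]
candidates, threshold = detect_candidates(net_signal, n_sigma=5.0, min_peak_rise=10.0)
print(round(threshold, 3), candidates)
```

In this example the small bump at frame 12 stays below the threshold, so only the run spanning frames 5 to 7 is reported. Because the baseline and MAD are computed per series, a different ROI changes the threshold as well as the signal, which is consistent with the ROI sensitivity noted in the checkpoint.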

Compare Duplicate-OBSID Windows

When the same obsid appears under multiple obs_dir windows, compare the per-OBS summaries directly:

PYTHONPATH=src python3 -m iris_solar_uv_data sji-compare-obs \
  --first-obs-dir 20130903_062635_4000005156 \
  --second-obs-dir 20130903_113935_4000005156

Or use the wrapper:

python3 scripts/sji_compare_obs.py \
  --first-obs-dir 20130903_062635_4000005156 \
  --second-obs-dir 20130903_113935_4000005156

Default output:

  • data/derived/sji/sji_obs_pair_comparison_<first>__<second>.csv

When you need to separate auto-ROI effects from the physical signal, rerun the pair under all target/background combinations sourced from the two baseline summaries:

PYTHONPATH=src python3 -m iris_solar_uv_data sji-audit-obs-pair \
  --first-obs-dir 20130903_062635_4000005156 \
  --second-obs-dir 20130903_113935_4000005156

Or use the wrapper:

python3 scripts/sji_audit_obs_pair.py \
  --first-obs-dir 20130903_062635_4000005156 \
  --second-obs-dir 20130903_113935_4000005156

Default outputs:

  • data/derived/sji/pair_audits/sji_obs_pair_roi_audit_<first>__<second>.csv
  • data/derived/sji/pair_audits/sji_obs_pair_roi_comparisons_<first>__<second>.csv
  • rerun summaries, event catalogs, time series, frame statistics, and figures under data/derived/sji/pair_audits/<first>__<second>/ and reports/figures/pair_audits/<first>__<second>/
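
One way to picture "all target/background combinations" is as a small scenario grid. The sketch below enumerates, for each of the two windows, every choice of which baseline summary supplies the target ROI and which supplies the background ROI. The dictionary keys, the function name, and the resulting eight-scenario layout are assumptions for illustration; the actual audit matrix written by sji-audit-obs-pair may be organized differently.

```python
from itertools import product


def forced_roi_scenarios(first_obs_dir, second_obs_dir):
    """Enumerate (window, target ROI source, background ROI source) scenarios."""
    sources = (first_obs_dir, second_obs_dir)
    return [
        {
            "window": window,
            "target_roi_from": target_source,
            "background_roi_from": background_source,
        }
        for window, target_source, background_source in product(
            sources, sources, sources
        )
    ]


scenarios = forced_roi_scenarios(
    "20130903_062635_4000005156",
    "20130903_113935_4000005156",
)
for scenario in scenarios:
    print(scenario)
```

In this layout, the two scenarios where a window uses its own target and background ROIs correspond to the baseline runs, and the remaining six isolate the effect of swapping one or both ROIs.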

Merge Event Catalogs

Merge per-OBS event CSV files into one first-pass cross-OBS table:

PYTHONPATH=src python3 -m iris_solar_uv_data merge-event-catalogs

Or use the wrapper:

python3 scripts/merge_event_catalogs.py

Default merged output:

  • data/derived/catalogs/sji_event_candidates_merged_v0.csv
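
Conceptually, the merge concatenates per-OBS catalogs into one table while keeping the obs_dir column that identifies each row's origin. The sketch below shows that idea with the standard library only. It is not the package's implementation: the function signature, the glob pattern, and the sample columns are assumptions for this example, and the demo writes made-up catalogs into a temporary directory.

```python
import csv
import tempfile
from pathlib import Path


def merge_event_catalogs(catalog_dir, output_path, pattern="sji_event_candidates_*.csv"):
    """Concatenate per-OBS event CSVs into one file and return the row count."""
    output_path = Path(output_path)
    rows, fieldnames = [], []
    for path in sorted(Path(catalog_dir).glob(pattern)):
        if path.resolve() == output_path.resolve():
            continue  # never read a previous merged output back in
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            rows.extend(reader)
    with output_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Demo with made-up catalogs; the column names are illustrative.
with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    (tmp_dir / "sji_event_candidates_obs_a.csv").write_text(
        "obs_dir,peak_frame\nobs_a,12\n"
    )
    (tmp_dir / "sji_event_candidates_obs_b.csv").write_text(
        "obs_dir,peak_frame\n"  # header only: an OBS with zero candidates
    )
    (tmp_dir / "sji_event_candidates_obs_c.csv").write_text(
        "obs_dir,peak_frame\nobs_c,3\nobs_c,40\n"
    )
    merged_path = tmp_dir / "sji_event_candidates_merged_v0.csv"
    merged_count = merge_event_catalogs(tmp_dir, merged_path)
    merged_lines = merged_path.read_text().splitlines()

print(merged_count, merged_lines[0])
```

A catalog with only a header contributes no rows, which is how a merged table can hold fewer distinct obs_dir values than the number of input catalogs.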

Practical First Milestone

For the first sanity dataset, the repository should support this sequence cleanly:

  1. add one short Level 2 SJI observation
  2. build a metadata index
  3. inspect one frame with an ROI overlay
  4. extract one ROI intensity time series to CSV
  5. scale to a small multi-OBS archive

Notes

  • astropy is required for FITS header extraction.
  • matplotlib is required for quicklook plotting.
  • sunpy is not needed by the current commands; it would only become necessary if sunpy-based automated retrieval is added.
  • the PyPI package for IRISpy is irispy-lmsal, while the import name is irispy.
  • the remote query/download commands use only the Python standard library.

Next Step

The next practical step is to prototype detector v1 against the three-window 4202100004 family. The current evidence now argues for a stricter target ROI rule plus an explicit cross-OBS ROI consistency guard, especially when the background ROI relocates by about 1000 px; threshold normalization looks secondary to ROI stability at this stage.
