JP-4129: Capture pre-pipeline CRDS checks in ASDF log (try 2) by pllim · Pull Request #309 · spacetelescope/stpipe

pllim · 2026-04-07T19:43:37Z

Resolves JP-4129

This PR addresses Capture pre-pipeline CRDS checks in ASDF log jwst#9864
Close JP-4129: Capture pre-pipeline CRDS checks in ASDF log #304

Blocked by:

TODO:

Fix the problem in strun
Write tests

Tasks

update or add relevant tests
update relevant docstrings and / or docs/ page
Does this PR change any API used downstream? (if not, label with no-changelog-entry-needed)
- write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see changelog readme for instructions)
- run regression tests with this branch installed (stpipe@git+https://github.com/pllim/stpipe.git@log-all-the-things-porque-no-los-dos)
  - jwst regression test
    - https://github.com/spacetelescope/RegressionTests/actions/runs/24103349581 (passed with JP-4129: Allow run to accept kwargs (Capture pre-pipeline CRDS checks in ASDF log) jwst#10437)
    - https://github.com/spacetelescope/RegressionTests/actions/runs/25520758464
  - romancal regression test
    - https://github.com/spacetelescope/RegressionTests/actions/runs/24103582716 (passed)
    - https://github.com/spacetelescope/RegressionTests/actions/runs/25520810950

codecov · 2026-04-07T20:12:18Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.73%. Comparing base (6517a37) to head (9a34957).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #309      +/-   ##
==========================================
+ Coverage   95.71%   95.73%   +0.01%     
==========================================
  Files          35       35              
  Lines        3433     3447      +14     
==========================================
+ Hits         3286     3300      +14     
  Misses        147      147

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

for Capture pre-pipeline CRDS checks in ASDF log (try 2) spacetelescope/stpipe#309

pllim · 2026-04-07T20:57:14Z

@braingram does this approach still give you the duplicates you reported in #304 (comment) ?

braingram · 2026-04-08T13:37:47Z

Would you post some examples of log differences with this PR for some representative jwst and romancal pipeline and step runs?

pllim · 2026-04-08T14:00:43Z

I can give a jwst example. My env is not set up to run roman stuff.

pllim · 2026-04-08T14:47:59Z

For JWST, I tested using https://github.com/spacetelescope/jwst-pipeline-notebooks/tree/main/notebooks/MIRI/Imaging notebook that David Law mentioned from a previous logging ticket I worked on (JP-4128). I focused on the Stage 3 processing, as follows, though you can run whatever steps you wish to test:

if doimage3:
    i2d_result = Image3Pipeline.call(
        association_im3,
        output_dir=image3_dir,
        steps=image3dict,
        save_results=True
    )
else:
    print('Skipping Spec3 processing')

In the notebook, when that cell is run, you can immediately see these logs in the cell output. This part of the behavior is the same with and without this patch:

2026-04-08 10:35:52,566 - CRDS - INFO -  DEV calibration SW identified. Defaulting to Edit Context.
2026-04-08 10:35:54,260 - stpipe.step - INFO - PARS-TWEAKREGSTEP parameters found: crds_cache/references/jwst/miri/jwst_miri_pars-tweakregstep_0020.asdf
2026-04-08 10:35:54,274 - stpipe.step - INFO - PARS-OUTLIERDETECTIONSTEP parameters found: crds_cache/references/jwst/miri/jwst_miri_pars-outlierdetectionstep_0096.asdf
2026-04-08 10:35:54,281 - stpipe.step - INFO - PARS-RESAMPLESTEP parameters found: crds_cache/references/jwst/miri/jwst_miri_pars-resamplestep_0001.asdf
2026-04-08 10:35:54,289 - stpipe.step - INFO - PARS-SOURCECATALOGSTEP parameters found: crds_cache/references/jwst/miri/jwst_miri_pars-sourcecatalogstep_0018.asdf
2026-04-08 10:35:54,305 - stpipe.step - INFO - Image3Pipeline instance created.
2026-04-08 10:35:54,306 - stpipe.step - INFO - AssignMTWcsStep instance created.
2026-04-08 10:35:54,308 - stpipe.step - INFO - TweakRegStep instance created.
2026-04-08 10:35:54,310 - stpipe.step - INFO - SkyMatchStep instance created.
2026-04-08 10:35:54,312 - stpipe.step - INFO - OutlierDetectionStep instance created.
2026-04-08 10:35:54,313 - stpipe.step - INFO - ResampleStep instance created.
2026-04-08 10:35:54,314 - stpipe.step - INFO - SourceCatalogStep instance created.
2026-04-08 10:35:54,703 - stpipe.step - INFO - Step Image3Pipeline running with args ('./mir_im_demo_data/Obs001/jw01040-a3001_image3_00002_asn.json',).
2026-04-08 10:35:54,714 - stpipe.step - INFO - Step Image3Pipeline parameters are:
...

After that is done, now I have a ./mir_im_demo_data/Obs001/stage3/image3_association_i2d.fits output file and I print the captured log as follows:

for line in i2d_f.cal_logs.calwebb_image3:
    print(line)

With this patch, the log messages that JP-4129 said were missing now show up:

2026-04-08T14:35:54.260 - stpipe.step - INFO - PARS-TWEAKREGSTEP parameters found: jwst_miri_pars-tweakregstep_0020.asdf
2026-04-08T14:35:54.274 - stpipe.step - INFO - PARS-OUTLIERDETECTIONSTEP parameters found: jwst_miri_pars-outlierdetectionstep_0096.asdf
2026-04-08T14:35:54.281 - stpipe.step - INFO - PARS-RESAMPLESTEP parameters found: jwst_miri_pars-resamplestep_0001.asdf
2026-04-08T14:35:54.289 - stpipe.step - INFO - PARS-SOURCECATALOGSTEP parameters found: jwst_miri_pars-sourcecatalogstep_0018.asdf
2026-04-08T14:35:54.305 - stpipe.step - INFO - Image3Pipeline instance created.
2026-04-08T14:35:54.306 - stpipe.step - INFO - AssignMTWcsStep instance created.
2026-04-08T14:35:54.308 - stpipe.step - INFO - TweakRegStep instance created.
2026-04-08T14:35:54.310 - stpipe.step - INFO - SkyMatchStep instance created.
2026-04-08T14:35:54.312 - stpipe.step - INFO - OutlierDetectionStep instance created.
2026-04-08T14:35:54.313 - stpipe.step - INFO - ResampleStep instance created.
2026-04-08T14:35:54.314 - stpipe.step - INFO - SourceCatalogStep instance created.
2026-04-08T14:35:54.703 - stpipe.step - INFO - Step Image3Pipeline running with args ('.jw01040-a3001_image3_00002_asn.json',).
2026-04-08T14:35:54.714 - stpipe.step - INFO - Step Image3Pipeline parameters are:
...

Without this patch (e.g., using stpipe from current main), you will see this instead:

2026-04-08T14:46:21.363 - stpipe.step - INFO - Step Image3Pipeline running with args ('.jw01040-a3001_image3_00002_asn.json',).
2026-04-08T14:46:21.374 - stpipe.step - INFO - Step Image3Pipeline parameters are:
...

braingram · 2026-04-08T15:24:51Z

Thanks. I'd like to see what this does for romancal runs before approving.

Also, would you add some tests for this change in behavior?

braingram · 2026-04-08T16:56:49Z

Testing with romancal and running one of the L1 regtest files

strun roman_elp ~/test_bigdata/roman-pipeline/dev/WFI/image/r0000101001001001001_0001_wfi01_f158_uncal.asdf

I see the following in the log output:

2026-04-08 12:07:22,538 - CRDS - ERROR -  Error determining best reference for 'pars-dqinitstep'  =   No match found.
2026-04-08 12:07:22,539 - CRDS - ERROR -  Error determining best reference for 'pars-saturationstep'  =   No match found.
2026-04-08 12:07:22,539 - CRDS - ERROR -  Error determining best reference for 'pars-refpixstep'  =   No match found.
2026-04-08 12:07:22,539 - CRDS - ERROR -  Error determining best reference for 'pars-darkdecaystep'  =   Unknown reference type 'pars-darkdecaystep'
2026-04-08 12:07:22,540 - CRDS - ERROR -  Error determining best reference for 'pars-wfi18transientstep'  =   Unknown reference type 'pars-wfi18transientstep'
2026-04-08 12:07:22,540 - CRDS - ERROR -  Error determining best reference for 'pars-linearitystep'  =   No match found.
2026-04-08 12:07:22,540 - CRDS - ERROR -  Error determining best reference for 'pars-darkcurrentstep'  =   No match found.
2026-04-08 12:07:22,541 - CRDS - ERROR -  Error determining best reference for 'pars-rampfitstep'  =   No match found.
2026-04-08 12:07:22,541 - CRDS - ERROR -  Error determining best reference for 'pars-assignwcsstep'  =   No match found.
2026-04-08 12:07:22,541 - CRDS - ERROR -  Error determining best reference for 'pars-flatfieldstep'  =   No match found.
2026-04-08 12:07:22,541 - CRDS - ERROR -  Error determining best reference for 'pars-photomstep'  =   No match found.
2026-04-08 12:07:22,542 - CRDS - ERROR -  Error determining best reference for 'pars-sourcecatalogstep'  =   Unknown reference type 'pars-sourcecatalogstep'
2026-04-08 12:07:22,542 - CRDS - ERROR -  Error determining best reference for 'pars-tweakregstep'  =   No match found.
2026-04-08 12:07:22,542 - CRDS - ERROR -  Error determining best reference for 'pars-exposurepipeline'  =   No match found.
2026-04-08 12:07:22,547 - stpipe.step - INFO - ExposurePipeline instance created.
2026-04-08 12:07:22,547 - stpipe.step - INFO - DQInitStep instance created.
2026-04-08 12:07:22,547 - stpipe.step - INFO - SaturationStep instance created.
2026-04-08 12:07:22,548 - stpipe.step - INFO - RefPixStep instance created.
2026-04-08 12:07:22,548 - stpipe.step - INFO - DarkDecayStep instance created.
2026-04-08 12:07:22,548 - stpipe.step - INFO - WFI18TransientStep instance created.
2026-04-08 12:07:22,548 - stpipe.step - INFO - LinearityStep instance created.
2026-04-08 12:07:22,549 - stpipe.step - INFO - DarkCurrentStep instance created.
2026-04-08 12:07:22,549 - stpipe.step - INFO - RampFitStep instance created.
2026-04-08 12:07:22,549 - stpipe.step - INFO - AssignWcsStep instance created.
2026-04-08 12:07:22,550 - stpipe.step - INFO - FlatFieldStep instance created.
2026-04-08 12:07:22,550 - stpipe.step - INFO - PhotomStep instance created.
2026-04-08 12:07:22,550 - stpipe.step - INFO - SourceCatalogStep instance created.
2026-04-08 12:07:22,551 - stpipe.step - INFO - TweakRegStep instance created.
2026-04-08 12:07:22,581 - stpipe.step - INFO - Step ExposurePipeline running with args ('/Users/bgraham/test_bigdata/roman-pipeline/dev/WFI/image/r0000101001001001001_0001_wfi01_f158_uncal.asdf',).
2026-04-08 12:07:22,589 - stpipe.step - INFO - Step ExposurePipeline parameters are:

However the cal_logs is missing messages prior to running with args...:

>> m.meta.cal_logs
AsdfListNode(["2026-04-08T16:07:22.581Z :: stpipe.step :: INFO :: Step ExposurePipeline running with args ('/Users/bgraham/test_bigdata/roman-pipeline/dev/WFI/image/r0000101001001001001_0001_wfi01_f158_uncal.asdf',).",
              "2026-04-08T16:07:22.589Z :: stpipe.step :: INFO :: Step ExposurePipeline parameters are:\n  pre_hooks: []\n  post_hooks: []\n  output_file: None\n  output_dir: None\n  output_ext:
...

Would you rerun your test this time using strun?

pllim · 2026-04-08T17:28:19Z

Re: strun -- Good catch. Looks like I have to also patch here:

stpipe/src/stpipe/cmdline.py

Line 469 in 1459190

step.run(*positional)

Also, question: Why doesn't the strun interface use .call?

pllim · 2026-04-08T20:30:14Z

Follow-up on strun problem: Turns out to be more complicated than I thought. I need the step_class to set up log capture, but the function that gives me the step_class also sets up the CRDS stuff that emits the missing log messages.

stpipe/src/stpipe/cmdline.py

Line 457 in 1459190

step, step_class, positional, debug_on_exception = just_the_step_from_cmdline(

To avoid breaking existing behavior, I guess I can call that line twice, once to just get the step_class and set up log.record_logs(), and again to run it within the log capture, but that is very inefficient. Any ideas?

and also add a change log first to reserve PR num [ci skip]

Co-authored-by: Melanie Clarke <mclarke@stsci.edu>

[ci skip]

pllim · 2026-05-09T15:21:56Z

+            log_names=step_class.get_stpipe_loggers(),
+            formatter=step_class._log_records_formatter,
+        ) as log_records:
+            step, step_class, positional, debug_on_exception = (


The only non-destructive way I can think of right now is calling just_the_step_from_cmdline twice but this does not seem good enough because:

It is calling the same function twice

It causes original log to screen to repeat some stuff twice but finally it captures stuff only once in the datamodel log

It still misses the CRDS logs because the CRDS logs already emitted in the first call

I am still stuck here:

stpipe/src/stpipe/cmdline.py

Lines 282 to 285 in 14c9627

config = config_parser.load_config_file(config_file)

step_class, name = Step._parse_class_and_name(

config, config_file=config_file

)

I cannot figure out an easy way to get the step_class before load_config_file. I need the former to build the log context manager, but the latter emits log messages.

Or maybe we can just give up on fixing this for strun and tell the user to call it the Pythonic way if they really care about capturing those few lines of log messages in datamodels.

The lines in cmdline.py in the above comment are no longer needed following the removal of step_script. I took a stab at cleaning some of it up in #317 which allowed attaching the recording handler earlier in the process. The whole submodule is quite convoluted so maybe we want to do even more cleanup (or split #317 up). I'm open to suggestions.

I don't know why the CLI is designed this way. As an outsider, I am curious why CLI doesn't just use Step.call or something from the Step class directly. Why all this extra processing that is only available via CLI?

Some of it is unavoidably linked to the CLI/strun if we want to support mapping the spec to the argparser to get things like useful --help messages. strun also supports a few things that call does not including:

running from a config/asdf file

saving parameters (although this is a bit weird since it then tries to run the step which I wouldn't expect)

"debug" mode

Using Step.call could simplify things and seems worth looking into. One complication is that we'll still have to handle merging config and CLI arguments (since they can be used together when both are provided). We can't just defer to Step.from_config_file/section (since that only uses the config file). We might be able to use the config_file argument to call but I'm not sure if it's equivalent. It's very confusing that there so many ways to do purportedly the same thing.

running from a config/asdf file

I think this is the source of my problem. It reads the config first in order to even find step class. But I need the step class for logging. So when it runs this way, by the time I get the step class to set up logging, the config logs are already gone. I don't think I can win here.

It's very confusing that there so many ways to do purportedly the same thing.

Exactly!

pllim · 2026-05-22T12:15:08Z

Update: I am guessing that this PR will have a lot of conflicts after #317 so I am prepared to close this without merge and then rethink my approach, if there is even a problem left to be solved then. This PR is in a "wait and see" mode. Just want to let you know I am aware of the situation. Thanks for your help!

pllim · 2026-05-27T16:30:47Z

Superseded by #317 thanks to Brett Graham. Melanie Clarke will close out the JIRA ticket as complete. No more work is needed, so closing this without merge. Thanks, all!

pllim mentioned this pull request Apr 7, 2026

JP-4129: Capture pre-pipeline CRDS checks in ASDF log #304

Closed

7 tasks

pllim added a commit to pllim/jwst that referenced this pull request Apr 7, 2026

JP-4129: Allow run to accept kwargs

6868e7d

for Capture pre-pipeline CRDS checks in ASDF log (try 2) spacetelescope/stpipe#309

pllim mentioned this pull request Apr 7, 2026

JP-4129: Allow run to accept kwargs (Capture pre-pipeline CRDS checks in ASDF log) spacetelescope/jwst#10437

Closed

10 tasks

stscijirahub mentioned this pull request Apr 7, 2026

Capture pre-pipeline CRDS checks in ASDF log spacetelescope/jwst#9864

Open

pllim marked this pull request as ready for review April 7, 2026 22:10

pllim requested a review from a team as a code owner April 7, 2026 22:10

pllim force-pushed the log-all-the-things-porque-no-los-dos branch from 23526ae to 221f113 Compare April 8, 2026 14:34

pllim marked this pull request as draft April 8, 2026 20:38

melanieclarke reviewed May 4, 2026

View reviewed changes

Comment thread src/stpipe/step.py Outdated

pllim and others added 3 commits May 7, 2026 16:26

Not functional change

34b8e3a

and also add a change log first to reserve PR num [ci skip]

The fix hopefully

8675254

Preserve original run signature

9a34957

Co-authored-by: Melanie Clarke <mclarke@stsci.edu>

pllim force-pushed the log-all-the-things-porque-no-los-dos branch from 221f113 to 9a34957 Compare May 7, 2026 20:36

melanieclarke mentioned this pull request May 8, 2026

JP-4129: Add CRDS to the stpipe loggers spacetelescope/jwst#10523

Merged

10 tasks

hackfix for cmdline.py

e4131c2

[ci skip]

pllim commented May 9, 2026

View reviewed changes

melanieclarke mentioned this pull request May 22, 2026

capture pre-pipeline/step logs and cleanup cmdline #317

Merged

7 tasks

pllim closed this May 27, 2026

pllim deleted the log-all-the-things-porque-no-los-dos branch May 27, 2026 16:30

	config = config_parser.load_config_file(config_file)
	step_class, name = Step._parse_class_and_name(
	config, config_file=config_file
	)

Conversation

pllim commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Tasks

Uh oh!

codecov Bot commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

pllim commented Apr 7, 2026

Uh oh!

braingram commented Apr 8, 2026

Uh oh!

pllim commented Apr 8, 2026

Uh oh!

pllim commented Apr 8, 2026

Uh oh!

braingram commented Apr 8, 2026

Uh oh!

braingram commented Apr 8, 2026

Uh oh!

pllim commented Apr 8, 2026

Uh oh!

pllim commented Apr 8, 2026

Uh oh!

Uh oh!

pllim May 9, 2026

Choose a reason for hiding this comment

Uh oh!

pllim May 9, 2026

Choose a reason for hiding this comment

Uh oh!

pllim May 9, 2026

Choose a reason for hiding this comment

Uh oh!

braingram May 19, 2026

Choose a reason for hiding this comment

Uh oh!

pllim May 19, 2026

Choose a reason for hiding this comment

Uh oh!

braingram May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pllim May 20, 2026

Choose a reason for hiding this comment

Uh oh!

pllim commented May 22, 2026

Uh oh!

pllim commented May 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

pllim commented Apr 7, 2026 •

edited

Loading

codecov Bot commented Apr 7, 2026 •

edited

Loading

braingram May 20, 2026 •

edited

Loading