Fix logic to determine non-desired variables in split-vars #864

Open
ceblanton wants to merge 6 commits into main from 862.split-netcdf-fix

@ceblanton ceblanton commented Apr 13, 2026

The existing logic combined two sources to decide which variables to skip: a set of known name patterns, and a list of variables whose number of dimensions makes them look like metadata. Pattern matching was applied to the known patterns, which was fine, but also to the "short var" list, e.g. "a" and "b". As a result, any variable whose name contains "b", such as "drybc", was excluded from the output.
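
To illustrate the failure mode, here is a minimal sketch. It assumes the old code searched for each entry anywhere in the variable name; the pattern values and the function name `is_excluded_old` are hypothetical and are not taken from `split_netcdf_script.py`.

```python
import re

# Hypothetical stand-ins; the real pattern list in split_netcdf_script.py may differ.
example_patterns = ["_bnds", "average_"]
# Short names flagged as metadata-like because of their dimension count.
short_vars = ["a", "b"]


def is_excluded_old(name):
    """Assumed old behavior: short names are searched as substrings, like patterns."""
    return any(re.search(p, name) for p in example_patterns + short_vars)


for name in ["time_bnds", "b", "drybc"]:
    print(name, is_excluded_old(name))
```

All three names are reported as excluded here. "drybc" is caught only because it contains the letter "b", which is the behavior this PR removes.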

Describe your changes

Issue ticket number and link (if applicable)

Fixes #862

Checklist before requesting a review

  • I ran my code
  • I tried to make my code readable
  • I tried to comment my code
  • I wrote a new test, if applicable
  • I wrote new instructions/documentation, if applicable
  • I ran pytest and inspected its output
  • I ran pylint and attempted to implement some of its feedback
  • No print statements; all user-facing info uses logging module

Note: If you are a code maintainer updating the tag or releasing a new fre-cli version, please use the release_procedure.md template. To quickly use this template, open a new pull request, choose your branch, and add ?template=release_procedure.md to the end of the url.


codecov bot commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.15%. Comparing base (dea2038) to head (49ba283).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #864      +/-   ##
==========================================
+ Coverage   84.14%   84.15%   +0.01%     
==========================================
  Files          71       71              
  Lines        4975     4981       +6     
==========================================
+ Hits         4186     4192       +6     
  Misses        789      789              
Flag Coverage Δ
unittests 84.15% <100.00%> (+0.01%) ⬆️

Files with missing lines Coverage Δ
fre/pp/split_netcdf_script.py 74.30% <100.00%> (+1.11%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Comment thread fre/pp/split_netcdf_script.py Outdated

@cwhitlock-NOAA cwhitlock-NOAA left a comment

So to make sure I'm understanding the problem...

Formula terms like "a" and "b" that have a single dimension were previously classified as shortvars_to_exclude and then added to VAR_EXCLUDE_PATTERNS, which led to every variable with an "a" or a "b" in its name being classed as a metadata-like variable. The solution was to apply a different level of matching to each list: VAR_EXCLUDE_PATTERNS is matched as a pattern, while shortvars_to_exclude requires an exact match.

This looks perfectly reasonable given my understanding of what's going on - I might want to add a bit more documentation on the var classification, but that's a separate PR.
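
As a check on that reading, here is a minimal sketch of the two matching levels. Only the names VAR_EXCLUDE_PATTERNS and shortvars_to_exclude come from the PR; the pattern values and the helper `is_metadata_like` are hypothetical, so this is an illustration of the idea rather than the code in the diff.

```python
import re

# Hypothetical values; the real VAR_EXCLUDE_PATTERNS may differ.
VAR_EXCLUDE_PATTERNS = ["_bnds", "average_"]


def is_metadata_like(name, shortvars_to_exclude):
    """Return True if a variable should be left out of the split output."""
    # Known patterns: a match anywhere in the name counts.
    if any(re.search(p, name) for p in VAR_EXCLUDE_PATTERNS):
        return True
    # Short vars: only an exact name match counts.
    return name in shortvars_to_exclude


shortvars = {"a", "b"}
names = ["a", "b", "drybc", "time_bnds"]
kept = [n for n in names if not is_metadata_like(n, shortvars)]
print(kept)
```

With exact matching for the short names, "drybc" is the only entry kept, while "a", "b", and "time_bnds" are still treated as metadata-like.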

Comment thread fre/pp/split_netcdf_script.py Outdated
@ceblanton ceblanton requested a review from singhd789 April 15, 2026 21:22

Development

Successfully merging this pull request may close these issues.

'fre pp split-netcdf' incorrectly skips some aerosol variables

4 participants