@charles-turner-1
Collaborator

Change Summary

Fixes to address issues on this forum post

TL;DR: updates the Cmip6Builder to separate datasets on member if the ensemble kwarg is passed as True. The reasoning is that ensemble runs of a model will all be on the same grid (and so are mergeable), but should not be merged into the same dataset.
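
As a rough illustration of the intended behaviour (not the builder's actual code), the sketch below groups files into datasets and only splits on member when an `ensemble` flag is set. The record fields (`path`, `file_id`, `member`) and the function `group_into_datasets` are hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical per-file records; the field names are illustrative only.
records = [
    {"path": "tas_r1_2000.nc", "file_id": "Amon_tas", "member": "r1"},
    {"path": "tas_r1_2001.nc", "file_id": "Amon_tas", "member": "r1"},
    {"path": "tas_r2_2000.nc", "file_id": "Amon_tas", "member": "r2"},
]


def group_into_datasets(records, ensemble=False):
    """Group file paths into datasets, splitting on member only if ensemble is True."""
    datasets = defaultdict(list)
    for rec in records:
        # Without the ensemble flag, files sharing a file_id land in one dataset.
        key = (rec["file_id"], rec["member"]) if ensemble else (rec["file_id"],)
        datasets[key].append(rec["path"])
    return dict(datasets)


merged = group_into_datasets(records, ensemble=False)
split = group_into_datasets(records, ensemble=True)
print(len(merged), len(split))
```

With these three records, the non-ensemble call produces a single dataset containing all files, while the ensemble call produces one dataset per member.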

Related issue number

None, but see forum post

Please add any other relevant info below:

This is something we never really noticed as an issue before, but I think the changes to the mergeability-based file_id have created it. This will probably also need to be updated in the AccessEsm15Builder.

@dougiesquire can you confirm I'm barking up the right tree with the intent of the ensemble kwarg here?

@codecov

codecov bot commented Oct 2, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.15%. Comparing base (12c1c63) to head (94a9d5c).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #510   +/-   ##
=======================================
  Coverage   99.14%   99.15%           
=======================================
  Files          17       17           
  Lines        1516     1531   +15     
=======================================
+ Hits         1503     1518   +15     
  Misses         13       13           


@access-hive-bot

This pull request has been mentioned on ACCESS Hive Community Forum. There might be relevant details there:

https://forum.access-hive.org.au/t/building-intake-catalog-parser-returns-no-valid-assets-error/5320/13

@charles-turner-1
Collaborator Author

@dougiesquire Any chance you could take a look at this when you get the chance?

@dougiesquire
Collaborator

Added to the todo list

@dougiesquire
Collaborator

Sorry @charles-turner-1, I've taken only a cursory look before going on leave, but yes, this looks good to me. If you wanted someone to test, you might be able to rope in @jemmajeffree, who has asked about this functionality on CMIP data in the past.

@jemmajeffree

What do you need me to do?

Comment on lines 1062 to 1077
    if cls.ensemble:
        with xr.open_dataset(
            file,
            chunks={},
            decode_cf=False,
            decode_times=False,
            decode_coords=False,
        ) as ds:
            member_id = ds.attrs.get("realization_index", None)
            if member_id is None:
                raise ParserError(
                    f"Cannot determine member for file {file} - "
                    "realization_index attribute missing"
                )
            ncinfo_dict["member"] = f"r{int(member_id)}"

Collaborator

Can we use some clever refactoring of the base builder to avoid opening the file more than once?

Collaborator

I've just pushed an update so we now use a @cache version of open_dataset.

All the builder tests still pass locally, and over 300 repeats of the two CMIP6 test_builder_build tests the @cache version is ~40% quicker.
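
For anyone following along, here is a minimal sketch of the general idea, assuming a `functools.cache` wrapper around the expensive open. It is not the code pushed to this branch: `open_attrs` is a hypothetical stand-in for the real dataset open and returns a hard-coded attribute dict so the example runs without any files.

```python
from functools import cache

open_count = 0  # counts real "opens", for illustration only


@cache
def open_attrs(path):
    """Hypothetical stand-in for an expensive dataset open; returns global attrs."""
    global open_count
    open_count += 1
    # A real implementation would read these attributes from the file at `path`.
    return {"realization_index": 3}


def member_label(path):
    """Derive a member label such as 'r3' from the cached attributes."""
    member_id = open_attrs(path).get("realization_index")
    if member_id is None:
        raise ValueError(f"Cannot determine member for file {path}")
    return f"r{int(member_id)}"


first = member_label("example.nc")
second = member_label("example.nc")  # served from the cache, no second open
```

Two caveats with this pattern: the cached object is shared between callers, so it should be treated as read-only, and entries persist until `open_attrs.cache_clear()` is called, which matters if files change on disk or memory is tight.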

@charles-turner-1
Copy link
Collaborator Author

I think I wrote this as a comment but it seems to have disappeared - I'm thinking it might be worth cherry-picking the caching stuff into a separate feature branch.

NB: CI is failing because pixi doesn't seem to play nicely with commits made directly on GitHub. I'll try to work that one out ASAP.

@charles-turner-1 changed the title from "Cmip6Builder Ensemble bugfix" to "Cmip6Builder Ensemble bugfix & dataset open caching" on Oct 31, 2025
@charles-turner-1 merged commit b99ee26 into main on Oct 31, 2025
12 of 13 checks passed
6 participants