Skip to content

Cache cached data elements in cfdm.read #334

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

davidhassell
Copy link
Contributor

Fixes #313

@davidhassell davidhassell added the performance Relating to speed and memory performance label Apr 7, 2025
@davidhassell davidhassell added this to the NEXTVERSION milestone Apr 7, 2025
Copy link
Member

@sadielbartholomew sadielbartholomew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question, but overall perfect so please merge once answered. Thanks.

@@ -11273,6 +11288,9 @@ def _cache_data_elements(self, data, ncvar):

elements[index] = value

# Cache the cached data elements for this variable
g["cached_data_elements"][ncvar] = elements
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we run also run data._set_cached_elements(elements) in the line after this? It that (still) necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Sadie - data._set_cached_elements(elements) caches the elements on the Data object, but the way the code is at the moment, all metadata netCDF variables get read twice (once as a potential field, and once as metadata), so we want to cache the elements at the netcdfread.py level, too, so that the second read doesn't have to go back to the file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, hang on! But elements aren't cached for Field constructs, so there should be no double-reading at all ... In that case maybe this issue is not an issue at all! I'll have a think and a dig ....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Relating to speed and memory performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve performance in cfdm.read by caching any array values retrieved from disk
2 participants