
feat: add tree to virtual array conversion #1393


Draft
wants to merge 3 commits into
base: main

Conversation

pfackeldey
Collaborator

This waits for scikit-hep/awkward#3364 (and a corresponding awkward release).

I have likely not found the optimal solution here. Happy to get feedback and input.

@ianna
Collaborator

ianna commented Apr 11, 2025

@pfackeldey - what is the plan for this PR? thanks!

@ikrommyd
Contributor

> @pfackeldey - what is the plan for this PR? thanks!

I think we were discussing with Peter adding some caching support. Uproot will deserialize each electron branch, for example, separately with its own offsets. They all share the same count_branch, however, so it's probably best not to deserialize the same offsets dozens of times. It's probably best to cache the count_branch deserialization result (length) and reuse it for the other branches that have the same count_branch.
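A minimal sketch of the caching idea described above, not the implementation in this PR. `OffsetsCache`, `read_counts`, and the branch name `nElectron` are hypothetical names used only for illustration, and the dictionary of counts stands in for the real (expensive) count_branch deserialization.

```python
from itertools import accumulate


class OffsetsCache:
    """Hypothetical cache: compute offsets once per count branch."""

    def __init__(self, read_counts):
        # read_counts: callable mapping a count-branch name to per-event counts.
        self._read_counts = read_counts
        self._offsets = {}
        self.n_reads = 0  # how many times the counts were actually read

    def offsets(self, count_branch):
        if count_branch not in self._offsets:
            self.n_reads += 1
            counts = self._read_counts(count_branch)
            # Offsets are the running sum of the counts, starting at 0.
            self._offsets[count_branch] = [0, *accumulate(counts)]
        return self._offsets[count_branch]


# Stand-in data for the expensive deserialization (assumption, not real I/O).
FAKE_COUNTS = {"nElectron": [2, 0, 3]}


def read_counts(name):
    return FAKE_COUNTS[name]


cache = OffsetsCache(read_counts)
pt_offsets = cache.offsets("nElectron")   # first request reads the counts
eta_offsets = cache.offsets("nElectron")  # second request reuses the cached offsets
```

Every branch that shares the same count branch would then receive the same offsets object, so the counts are read once regardless of how many branches depend on them.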

@ariostas
Collaborator

> I think we were discussing with Peter to add some caching support.

Isn't there already some caching being done in Uproot? When trying to read the count_branch multiple times it should already be hitting the cache

@ikrommyd
Contributor

ikrommyd commented Apr 11, 2025

> > I think we were discussing with Peter to add some caching support.
>
> Isn't there already some caching being done in Uproot? When trying to read the count_branch multiple times it should already be hitting the cache

Yeah, I need to check whether it's hitting it; I haven't done that yet. Will do today. Do you know the best way to log that (the number of deserializations per branch)?

@ariostas
Collaborator

> Do you know the best way to log that (the number of deserializations per branch)?

I'm not sure. I've just skimmed the code since at some point I'll have to do that for RNTuples
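One generic way to count deserializations per branch, independent of Uproot internals, is to wrap whatever callable loads a branch and tally the calls. This is only a sketch under that assumption: `load_branch`, `count_calls`, and the branch names are hypothetical, and the loader body is a placeholder rather than a real read.

```python
from collections import Counter
from functools import wraps

# Tally of how often each branch name was requested from the loader.
call_counts = Counter()


def count_calls(func):
    """Decorator that records one call per branch name before delegating."""

    @wraps(func)
    def wrapper(branch_name, *args, **kwargs):
        call_counts[branch_name] += 1
        return func(branch_name, *args, **kwargs)

    return wrapper


@count_calls
def load_branch(branch_name):
    # Hypothetical stand-in for the real deserialization of a branch.
    return [1, 2, 3]


for name in ["nElectron", "Electron_pt", "nElectron"]:
    load_branch(name)
```

A count above one for a count branch would suggest it is being deserialized repeatedly at the wrapped level, although a cache further down could still be serving those calls.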

@pfackeldey
Collaborator Author

> @pfackeldey - what is the plan for this PR? thanks!

I'm not sure. I'm not a big fan of this implementation, but I also don't know how it can be done in a better way. I was hoping for some input here.


4 participants