Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse array in get_squeeze_factors to avoid excessive memory allocation #1588

Merged
merged 1 commit into from
Feb 13, 2025

Conversation

tamuri
Copy link
Collaborator

@tamuri tamuri commented Feb 12, 2025

Profiling revealed that even under small population (10k) sizes, this method was using several gigabytes of temporary allocations for squeeze factor lists. This PR reuses an array to prevent repeatedly allocating memory. Allocating these small amounts of memory can memory fragmentation.

  • Add a numpy array, and grow it as needed to accommodate number of events being processed.
  • Reuse the numpy array to store squeeze factors.
  • We don't need to overwrite the array, because we always fill items 0 to n, where n is number of events.
  • No need to check whether every officer is unavailable. Break on the first match.
  • The store is only written to inside the method, transparent to caller.

…tion

- Add a numpy array, and grow it as needed to accommodate number of events being processed
- No need to check whether every officer is unavailable. Break on the first match
- The store is only used inside the method, transparent to caller
Copy link
Collaborator

@marghe-molaro marghe-molaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. I think typically the required size will be much smaller than 500 (I think (!) usually max 3-4 officers per appt footprint, and max 3-4 appt footprints per event, so something ten times smaller (50) should already be more than enough. But I doubt reducing it would make much difference!).
Interesting to learn that a for loop can be faster than the any() function!

@tamuri
Copy link
Collaborator Author

tamuri commented Feb 13, 2025

Looks good to me. I think typically the required size will be much smaller than 500 (I think (!) usually max 3-4 officers per appt footprint, and max 3-4 appt footprints per event

The _get_squeeze_factors_store array is used to hold the squeeze factors calculated for every HSI event for the day. These can be quite a few, especially at larger population sizes.

@tamuri tamuri merged commit 16fbf7b into master Feb 13, 2025
62 checks passed
@tamuri tamuri deleted the tamuri/hs-get_squeeze_factors branch February 13, 2025 12:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants