fix: #761, don't store mails - store their IDs #896

Draft
isosphere wants to merge 4 commits into kdeldycke:main from isosphere:main

@isosphere commented on Oct 27, 2025

Summary

WIP. Successfully hashes all mails and executes actions with < 200 MB of RAM for 107k emails.

Requires a bit more testing to ensure I didn't mess anything up. I've run it on my own mail archives.

Solves the out-of-memory condition by not keeping mail instances in memory.

This PR fixes #761.

Previously we would require 570 kB - 1.8 MB per email in memory to complete execution. This was primarily due to this line:

self.mails.setdefault(mail_hash, set()).add(mail)

... which kept the mail reference alive outside of the inner loop of iteritems, which kept the object in memory. I solved this by storing a reference to the mail box + mail id instead, and fetching those later as needed.
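The idea described above can be sketched with the standard-library `mailbox` module. This is only an illustration under stated assumptions, not the code in this PR: the helper names `index_by_hash` and `fetch` are hypothetical, and hashing the `Message-ID` header alone is a placeholder for the project's real hashing logic.

```python
import hashlib
import mailbox
import tempfile
from email.message import Message
from pathlib import Path


def index_by_hash(boxes):
    """Map a digest to a set of (box_path, mail_id) references.

    Only small keys are retained, so each message object can be
    garbage-collected once its loop iteration ends.
    """
    index = {}
    for box_path, box in boxes.items():
        for mail_id, mail in box.iteritems():
            # Placeholder hash (assumption): Message-ID header only.
            raw = (mail.get("Message-ID") or "").encode()
            digest = hashlib.sha256(raw).hexdigest()
            index.setdefault(digest, set()).add((box_path, mail_id))
    return index


def fetch(boxes, ref):
    """Re-read a single message from its mailbox when it is needed."""
    box_path, mail_id = ref
    return boxes[box_path].get_message(mail_id)


def make_mail(message_id, subject):
    """Build a tiny message for the demo below."""
    mail = Message()
    mail["Message-ID"] = message_id
    mail["Subject"] = subject
    mail.set_payload("body\n")
    return mail


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "demo.mbox")
        box = mailbox.mbox(path)
        box.add(make_mail("<a@example.org>", "first"))
        box.add(make_mail("<a@example.org>", "first, duplicated"))
        boxes = {path: box}
        for digest, refs in index_by_hash(boxes).items():
            subjects = sorted(fetch(boxes, ref)["Subject"] for ref in refs)
            print(digest[:8], subjects)
        box.close()
```

The trade-off is extra I/O: a message is parsed again each time it is fetched, in exchange for memory use that scales with the number of references rather than with the size of the mails.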

Preliminary checks

New Features Submissions:

  • Does your submission pass tests?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

@isosphere

This comment was marked as resolved.

@kdeldycke added the "✨ enhancement" (Improvement or change to an existing feature) label on Oct 27, 2025

@kdeldycke (Owner) commented:

Thanks @isosphere for trying to tackle this issue!

> ... which kept the mail reference alive outside of the inner loop of iteritems, which kept the object in memory. I solved this by storing a reference to the mail box + mail id instead, and fetching those later as needed.

I'm OK with this approach. In fact, I am OK with any approach as long as all unit tests are passing! 😅

If you manage to find a way to shave off some memory consumption, I'll be happy to merge your PR and cut a release.

@kdeldycke (Owner) commented:

It's been a long time since I dived into the core of this project, so I trust you to make sensible choices, and I won't be too nitpicky about the code. But I'll make the effort of properly releasing it once this PR is merged.

@kdeldycke mentioned this pull request on Oct 27, 2025 (3 tasks)

Development

Successfully merging this pull request may close these issues.

Infamous memory usage
