Implement storage- and relationship-aware cleanup #973
We certainly stand to gain from not having to initialize and tear down the pool every time.
But it seems that the calls to `cleanup_files_batched` themselves are not concurrent. Is that on purpose? The way it is now, it seems that at any given time there would always be at most one pool running anyway.
Indeed, it is running only one pool at a time.
The regular deletions still run serially in topological sort order, so it does not concurrently delete two unrelated models.
I thought about that a bit, but it seemed too complex to actually implement. Also, reusing the thread pool would mean introducing some kind of "waitpool" to make sure all the file deletes are complete before returning, while still keeping the pool alive.
I wanted to avoid that complexity for now.
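For context, the "waitpool" idea above could be sketched roughly like this. All names here (`cleanup_all`, `batches_for`, `delete_batch`) are hypothetical, not taken from the PR; the point is only that a single long-lived `ThreadPoolExecutor` can be reused across models while `wait()` on each model's futures preserves the serial, topological ordering:

```python
from concurrent.futures import ThreadPoolExecutor, wait

def cleanup_all(models_in_topo_order, batches_for, delete_batch, max_workers=4):
    """Hypothetical sketch: one shared pool, but models are still
    processed serially in topological sort order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for model in models_in_topo_order:
            # Fan out this model's file-delete batches onto the shared pool.
            futures = [pool.submit(delete_batch, batch) for batch in batches_for(model)]
            # The "waitpool": block until this model's deletes are all done
            # before moving on to the next model, keeping the pool alive.
            wait(futures)
```

This keeps at most one model's deletes in flight at a time, so it matches the current serial semantics while avoiding repeated pool setup and teardown.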
[very nit] maybe a `NamedTuple` would make the code more ergonomic here (but less performant, for sure).
I don’t fully understand? Would you prefer not to destructure the tuple into variables right away, but rather access the fields through a `NamedTuple` object? I’m not sure that would make things more readable 🤔
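For anyone following along, the two styles being compared might look like this. `CleanupResult` and its fields are hypothetical names for illustration, not identifiers from the PR:

```python
from typing import NamedTuple

class CleanupResult(NamedTuple):
    cleaned_models: int
    cleaned_files: int

# Plain tuple: destructured into local variables immediately.
cleaned_models, cleaned_files = (3, 17)

# NamedTuple: fields accessed by name, no destructuring needed,
# and it still behaves like a regular tuple where one is expected.
result = CleanupResult(cleaned_models=3, cleaned_files=17)
```

A `NamedTuple` adds self-documenting field access at the cost of a small attribute-lookup overhead, which is the performance trade-off the comment alludes to.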