Skip to content

Fix moved_records not cleaned #1076

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

ksiv
Copy link

@ksiv ksiv commented Oct 5, 2024

Added separate thread to clean records from moved_events that are used for source_path acquisition with some delay for events not to get cleaned before they have a chance to be used as for source_path acquisition

Dict optimized to hold smaller object with only src_path and timestamp.

This PR supposed to deal with #1041
And possibly partly with #587

Konstantin Ivanov added 4 commits October 4, 2024 00:07
… to make sure events remove after they can be used as a "source". Dict content changed from holding whole event to hold only src_path and time, since only src_path is used, time added to use it as a threshold to clean.
Extra test folder deleted

 On branch fix_moved_records_not_cleaned
 Changes to be committed:
	modified:   src/watchdog/observers/inotify_c.py
	modified:   tests/test_inotify_c.py
	deleted:    tests/threaded/conftest.py
	deleted:    tests/threaded/test_inotify_c.py
 Changes to be committed:
	modified:   src/watchdog/observers/inotify_c.py
@conara
Copy link

conara commented Mar 3, 2025

Are you still working on this? Currently I have an application that consumes after some time a lot of memory caused by this bug. So what is a safe approach to clean this?

@conara
Copy link

conara commented Mar 4, 2025

The proposed solution feels a bit bulky for me. Can you explain the reason these events are cached and why it is safe to remove them after x amount of time?

@ksiv
Copy link
Author

ksiv commented Apr 28, 2025

Are you still working on this?

No. I do not. The reason is I could not get through the pipeline check and It get no attention for a long time. Longer than my attention span allowed.
just left it here in case someone would be interested in the topic later.

@ksiv
Copy link
Author

ksiv commented Apr 28, 2025

The proposed solution feels a bit bulky for me.

Thank you for your evaluation of my code I will try to improve my approach in the future.

Can you explain the reason these events are cached and why it is safe to remove them after x amount of time?

These events are cached because some of them are needed to reconstruct a compound event like a move to a different folder, here FolderA_OUT event is merged with FolderB_IN one. After such event is reconstructed it is safe to remove it.

I have chosen post-cleaning as the fastest approach.
Updated :The downside of this solution is it still not clean memory it does not need anymore. I think I introduced it where the comment points.

Sorry for a late reply, I hope it helps some.

@@ -208,8 +220,9 @@ def remember_move_from_event(self, event: InotifyEvent) -> None:
"""Save this event as the source event for future MOVED_TO events to
reference.
"""
self._moved_from_events[event.cookie] = event

path_and_time = EventPathAndTime(event.src_path)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a parasite key

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self._moved_from_events[event.cookie] = EventPathAndTime(event.src_path)
"""like this should be ok
"""

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants