-
-
Notifications
You must be signed in to change notification settings - Fork 738
fix: add events_checker() fixture, use for tests #1151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hmm this seems like a critical change: events order is actually relevant to test. We need to be sure that the library will emit events in order, else we will have weird issues. And not testing that, we will move that implicit check to downstream projects: this is not OK. I get it that tests are quite flaky, and this is a problem for the comminity. The proposed change reduce the code to write, that's cool and more readable. Honestly, I do not know what to do with that right now. Lets move forward on the rest of the stuff, and look back to this one later. Maybe others have opinions too. Anyway, thanks a lot the for effort you've put here. I am not throwing away everything but delaying for now. |
ngoldbaum
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like flaky is only used in three places once this PR is applied:
goldbaum at Nathans-MBP in ~/Documents/watchdog on events-checker!
± rg flaky
requirements-tests.txt
2:flaky==3.8.1
tests/utils.py
94: Provides some robustness for the otherwise flaky nature of asynchronous notifications.
tests/test_0_watchmedo.py
176:@pytest.mark.flaky(max_runs=5, min_passes=1)
tests/test_delayed_queue.py
10:@pytest.mark.flaky(max_runs=5, min_passes=1)
21:@pytest.mark.flaky(max_runs=5, min_passes=1)
Are those still needed?
It's also not clear to me when one would use the new EventChecker vs the old EventQueue. Should all the tests that still use EventQueue get migrated?
If not, maybe it's worth adding some text to docs/source/hacking.rst that explains when you'd want to use one or the other and also just generally explains how to work with the test suite.
Reading over the PR, it occurs to me that you could make save a bit of boilerplate by making the events_checker a context manager. Then there's no need to explicitly call check_events - exiting the context manager does that automatically.
I only got about halfway through test_mitter.py before running out of steam. If the missing checks weren't intentional it may be a good idea to go over test_emitter.py one more time to make sure all the old checks are still there.
tests/utils.py
Outdated
|
|
||
| class _EventsChecker: | ||
| # If True, output verbose debugging to stderr. | ||
| DEBUG = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's a personal style thing but I'd tend to leave out disabled debug code once I have the code working. IMO code that is disabled statically and not tested is destined to inevitably bitrot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would remove it but when you have tests failing, it's extremely useful to have this debugging output to see what events are expected and what events are generated. Maybe it should just always be turned on. I'd like to keep the debug logic in any case since I think you'd likely re-write it down the road when adding a new test or trying to debug a failing one.
I tried changing the code to be DEBUG = os.environ.get('EVENTS_CHECKER_DEBUG') == '1' but it doesn't work with tox. It seems tox will sanitize the process environment before running tests.
| expect_event(FileCreatedEvent(p("a"))) | ||
|
|
||
| if not platform.is_windows(): | ||
| expect_event(DirModifiedEvent(p())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did this check get intentionally deleted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup. I can put it back though. I think that event doesn't appear on other platforms as well so the condition might need to be if platform.is_linux() instead. I'll check. My logic was that you really care about the FileCreatedEvent event being generated.
| checker.add(FileCreatedEvent, "a_\udce4") | ||
| if not platform.is_windows(): | ||
| event = event_queue.get(timeout=5)[0] | ||
| assert os.path.normpath(event.src_path) == os.path.normpath(p("")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this assert preserved?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's checked by checker.add(DirModifiedEvent, ".")
|
|
||
| event = event_queue.get(timeout=5)[0] | ||
| assert event.src_path in [p("dir1"), p("dir2")] | ||
| assert isinstance(event, DirModifiedEvent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this check intentionally skipped?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new checking code is as follows and I think it matches the spirit of the original test. The order of the two DirModifiedEvents is not specified by the original test logic.
checker = events_checker()
if not platform.is_windows():
checker.add(FileMovedEvent, "dir1/a", dest_path="dir2/b")
checker.add(DirModifiedEvent, "dir1")
checker.add(DirModifiedEvent, "dir2")
else:
checker.add(FileDeletedEvent, "dir1/a")
checker.add(FileCreatedEvent, "dir2/b")
checker.check_events()
tests/test_emitter.py
Outdated
| assert event.src_path in [p("dir1"), p("dir2")] | ||
| assert isinstance(event, DirModifiedEvent) | ||
| checker.add(DirModifiedEvent, "dir1") | ||
| checker.add(DirModifiedEvent, "dir2") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only getting checked on Windows now, but there were checks outside this if statement that are missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New code is as follows. Again I think that matches the spirit of the original test:
checker = events_checker()
if not platform.is_windows():
checker.add(FileMovedEvent, "dir1/file", dest_path="dir2/FILE")
checker.add(DirModifiedEvent, "dir1")
checker.add(DirModifiedEvent, "dir2")
else:
checker.add(FileDeletedEvent, "dir1/file")
checker.add(FileCreatedEvent, "dir2/FILE")
checker.check_events()
| if platform.is_windows(): | ||
| expected_events = [a_deleted, d_created] | ||
|
|
||
| if platform.is_bsd(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't needed anymore because we no longer care about ordering, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup.
tests/test_emitter.py
Outdated
| @pytest.mark.flaky(max_runs=5, min_passes=1, rerun_filter=rerun_filter) | ||
|
|
||
| checker = events_checker() | ||
| for _ in range(times): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the old code used times * 4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, using times is correct. We add 4 expected events each time through loop. Old code consumed events one-by-one and so needed the 4x.
| DirCreatedEvent: times, | ||
| DirModifiedEvent: times * 2, | ||
| DirDeletedEvent: times, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any way to get these count checks back using EventChecker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checker is already doing that. New code is as follows:
checker = events_checker()
for _ in range(times):
checker.add(DirCreatedEvent, "dir1/sub_dir1")
checker.add(DirModifiedEvent, "dir1")
checker.add(DirDeletedEvent, "dir1/sub_dir1")
checker.add(DirModifiedEvent, "dir1")
The checker uses a list not a set for the expected events. So there will be times*2 occurrences of DirModifiedEvent expected. Using times rather than times*4 in the loop is correct.
I'm not intimately familiar with the OS APIs being used but, based on my readings, we cannot assume they give you file change events in some specific order. It depends on the OS and even the underlying filesystem being used. So checking for event ordering in tests doesn't seem like a good idea. Also, I think telling users that events are returned in some defined order when the underlying API can't assure that seems bad. The users will build unreliable software as well. I can understand that watchdog is doing a "best effort" in trying to return events in some order. So we want the unit tests to try to check that we haven't broken that. I can modify the PR to keep checking order, that's not a hard change. I'll modify the
That's fine and I appreciate the prompt feedback. It's a large PR and so I know that's a problem for reviewing. Also, the change in what we are actually checking makes it more difficult in determining if it's okay. What I could do to make progress is to turn on the ordering checking by default and then see which tests become unreliable and on what platforms. Then we can potentially deal with these on a case-by-case basis. An example where I think it doesn't make sense to check the order. I haven't tested this but I would assume those two events can be swapped, depending on OS and filesystem. rm("dir1/a")
expect_event(FileDeletedEvent("dir1/a"))
expect_event(DirModifiedEvent("dir1")) |
|
Some progress on improving this. It seems that on Linux (with inotify?) the ordering of events is deterministic. For Windows (at least on my Windows 10 VM) they are also deterministic. On my MacBook, they are not (get both events out of order and sometimes unexpected events). I'm working on an improved version of this patch. |
5c16703 to
7cbc5ec
Compare
|
Revised version, passing on MacOS, Linux and Windows for me. I didn't test BSD. The I switching to using it as a context manager, as suggested by Nathan. It only saves one line of code but it seems slightly cleaner to me. I also removed the This branch has been rebased on top of the double-buffer winapi change. That's required to make the Windows tests reliable, based on my testing. Once GH-1152 is passing CI and is merged, I can rebase this again if needed or merge with "master". |
I suspect they are not but I was going to leave them in until we can confirm, via CI runs, that they can be removed.
It would probably be cleanest to migrate all tests. However, it takes some work since some test files don't use the same fixtures. In order to use
I think
Good idea, changed.
Now that the fixture is checking for exactly the expected list of events, I think it's unlikely things were missed (otherwise the test would be failing). My only concern now is that different OS versions or filesystems might generate different sequences of events. I tested on MacOS Sonoma, Windows 10, Debian Linux on a XFS filesystem. Perhaps if CI passes we can consider it good enough? |
|
That's way better with ordering preserved, thank you! |
This makes the unit tests more reliable by allowing events to be received in different orders and by allowing additional (non-expected) events.
This matches the original test logic, before addition of the events_checker() fixture.
Rather than allow events in any order, match the order of the events as added. On MacOS, fsevents does not produce events in a determistic order so allow any ordering on that platform.
This reduces the code a little. Use `ec` as the short local name of the checker instance. Add `verbose` argument to `events_checker()` that turns on verbose debugging output.
Remove events that are sometimes not generated.
7cbc5ec to
93b8d86
Compare
|
CI testing will hopefully be fixed now. There was some MacOS failures because the CI runs return different events vs what I get on my MacBook. E.g. for But in CI it only gives the On Windows, the The Windows Note that the I see one failure yet on Linux 3.13t but I think that's unrelated to this change. It is |
|
If 3.13t is annoying to support, it's ok to skip IMO. Anyone newly working with the free-threaded will probably be using 3.14t. |
This makes the unit tests more reliable by using a helper class to handle the checking of received events. On the MacOS platform, events are allowed to be received in different orders and additional (non-expected) events are also allowed. On other platforms (Linux, Windows, BSD), the checker confirms that exactly the events expected are received and confirms the ordering.
There is a new pytest fixture
events_checker(). Typical usage is as follows:I did not change all unit tests to use this fixture, mostly just the ones in
test_emitter. This change removes most of thepytest.mark.flaky()markers since the tests are more reliable.