Refactor FileWatchSensor to remove logshipper #5096


Draft: blag wants to merge 44 commits into master from the remove-logshipper branch
Conversation

@blag (Contributor) commented Nov 29, 2020

This PR refactors the FileWatchSensor in the linux pack to remove its dependency on logshipper and use watchdog instead.

Upstream logshipper hasn't been touched in years, so we switched to our own fork. That hasn't been well maintained either, and the Tail implementation, which is all we use from it, is a bit of a mess on top of depending on pyinotify, which has itself been unmaintained for even longer.

This PR borrows some code from the previous sensor, but breaks it up into well-structured classes that each have a single focus, possibly allowing this code to be reused elsewhere. Additionally, the sensor Python script can be run directly, which should help it serve as an example of how to write a "good" sensor.

The watchdog project is actively maintained, supports Linux, macOS, BSD, and Windows, and includes a polling fallback implementation. The pyinotify package refuses to even install on macOS (not being able to run there is understandable, but preventing installation is too limiting IMO).
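
For reference, the new sensor drives watchdog roughly like this (a minimal sketch, not the actual sensor code; the LineHandler name and the watched path are illustrative):

import time

from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer


class LineHandler(FileSystemEventHandler):
    """Illustrative handler; the real sensor wraps this logic in its own classes."""

    def on_modified(self, event):
        # Called whenever the watched file (or its parent directory) changes;
        # the sensor reads any newly appended lines at this point.
        print(f"modified: {event.src_path}")


observer = Observer()  # Observer picks a platform-appropriate backend, with a polling fallback
observer.schedule(LineHandler(), path="/var/log/syslog", recursive=False)
observer.start()
try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    observer.stop()
observer.join()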

I realize that it's a bit odd to make the linux pack depend less on Linux-specific features, but since pyinotify refuses to even install on macOS, I couldn't do simple things like run make requirements without running Linux in a full Docker or VM environment.

This PR also fixes a bug carried over from logshipper (it has to do with lines that do not end in a newline: the previous code only kept the last character, while the new code keeps the whole line), and also presents a workaround for another possible bug in the comments.
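
To illustrate the newline bug (a sketch of the idea only, not the PR's actual code): when the appended data does not end in a newline, the old logic kept only the last character of the unterminated line, while the new logic carries the whole partial line forward until the rest of it arrives.

def read_new_lines(fp, partial):
    """Return complete new lines from fp, carrying a trailing partial line forward."""
    data = partial + fp.read()
    lines = data.split("\n")
    # split() leaves "" at the end when data ends with a newline, otherwise it
    # leaves the unterminated partial line; either way, keep the whole thing
    # (not just its last character) as the carry-over for the next read.
    partial = lines.pop()
    return [line for line in lines if line], partial

Each call would be used as new_lines, partial = read_new_lines(fp, partial), starting with partial = "".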

Currently being held up by #5095.

@blag blag added this to the 3.4.0 milestone Nov 29, 2020
@pull-request-size pull-request-size bot added the size/L PR that changes 100-499 lines. Requires some effort to review. label Nov 29, 2020
@blag blag force-pushed the remove-logshipper branch from 3b77378 to 0a2f6ae on December 12, 2020 04:39
@blag blag force-pushed the remove-logshipper branch from 285c6ba to d175507 on December 13, 2020 01:24
@arm4b arm4b requested a review from Kami December 14, 2020 12:00
@Kami (Member) commented Dec 16, 2020

Overall, I'm fine with the change, but some tests would be good :)

@pull-request-size pull-request-size bot added size/XXL PR that changes 1000+ lines. You should absolutely split your PR into several. and removed size/L PR that changes 100-499 lines. Requires some effort to review. labels Dec 22, 2020
@blag (Contributor, Author) commented Dec 22, 2020

Just running these tests gives us excellent coverage:

................................................
Name                   Stmts   Miss Branch BrPart  Cover
--------------------------------------------------------
file_watch_sensor.py     221      8     68     11    93%
----------------------------------------------------------------------
Ran 48 tests in 14.800s

OK

Almost all of the untested code was in the sensor itself.

@arm4b arm4b requested a review from Kami December 22, 2020 23:56
@blag (Contributor, Author) commented Dec 25, 2020

I tested this manually, both on macOS bare metal and in a Docker container (running on macOS).

contrib/linux/Dockerfile:

FROM ubuntu:18.04

# System packages needed to fetch and build the test dependencies
RUN apt update && apt install --yes git make python3-dev python3-pip

# Test-only Python dependencies
RUN python3 -m pip install coverage eventlet mock nose watchdog

# The pack source is mounted here at runtime
VOLUME /code

WORKDIR /code
ENV PYTHONPATH=actions:sensors
ENV NOSE_COVER_CONFIG_FILE=.coveragerc
CMD nosetests --with-coverage --cover-branches --cover-html tests/test_file_watch_sensor.py

And the coverage configuration, contrib/linux/.coveragerc:

[run]
include = sensors/file_watch_sensor.py

[report]
include = sensors/file_watch_sensor.py

Commands (on bare metal):

cd contrib/linux
docker build --tag contrib-linux-pack-test .
docker run --rm --volume $(pwd):/code --workdir /code contrib-linux-pack-test

If you want to debug anything, run:

docker run -it --rm --volume $(pwd):/code --workdir /code contrib-linux-pack-test /bin/bash

Commands (within running Docker container):

# run a single test (specified at the bottom of test_file_watch_sensor.py)
python3 tests/test_file_watch_sensor.py

# run all tests and report coverage results in HTML
nosetests --with-coverage --cover-branches --cover-html tests/test_file_watch_sensor.py

Then you can view the coverage results in your web browser.

I'm having an issue with the Travis tests where inotify events don't seem to be delivered. I'll have to debug that. But these testing instructions should suffice for anybody else who is interested in testing this out.
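
If it turns out the CI environment really doesn't deliver inotify events, one thing worth trying (purely a hypothesis, not something this PR currently does) is forcing watchdog's polling backend for the tests, since it is a drop-in replacement for the default Observer:

# Hypothetical workaround for environments where inotify events are unreliable;
# PollingObserver exposes the same schedule()/start() API as Observer.
from watchdog.observers.polling import PollingObserver as Observer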

@cognifloyd (Member) commented:

The requirements compile issues have been resolved.
The pylint error about importing pika in a tools/ script has been resolved.

Now there's one consistent failure left in the unit tests, though I don't know why this PR would cause it:

======================================================================
1) FAIL: test_process_error_handling (tests.unit.test_workflow_engine.WorkflowExecutionHandlerTest)
----------------------------------------------------------------------
   Traceback (most recent call last):
    virtualenv/lib/python3.6/site-packages/mock/mock.py line 1346 in patched
      return func(*newargs, **newkeywargs)
    st2actions/tests/unit/test_workflow_engine.py line 203 in test_process_error_handling
      self.assertEqual(t1_ex_db.status, wf_statuses.FAILED)
   AssertionError: 'succeeded' != 'failed'
   - succeeded
   + failed

@cognifloyd (Member) commented:

Well, I can't reproduce that test failure in CI. It might be related to #5358, which was just merged and touched the test that was failing. But it's not failing anymore. If it shows up again, we can look into it.

Anyway, all tests are green now.

@cognifloyd (Member) commented Oct 15, 2021

I'm currently testing this.

  • ✅ The first issue is that the README is not correct (and wasn't correct for the previous sensor either). You do NOT add a list of files to watch in the pack config. Instead, you must create a rule that uses the linux.file_watch.line trigger and define the file_path trigger parameter in that rule (see the example rule after this list).

  • Next issue: Using the original sensor, I've tried several variations on the rule, but I don't see any trigger instances or rule enforcements. So, the status quo does not work for me. Interestingly, the sensorcontainer logs do show that it adds and watches the correct file.
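
For the first point above, the kind of rule the README should document looks roughly like this (a sketch; the rule name, pack, action, and file path are placeholders, and the relevant parts are the linux.file_watch.line trigger type and its file_path parameter):

---
name: watch_syslog_lines          # placeholder rule name
pack: examples                    # placeholder pack
description: Run an action for each new line appended to the watched file
enabled: true

trigger:
  type: linux.file_watch.line
  parameters:
    file_path: /var/log/syslog    # the file the sensor should tail

criteria: {}

action:
  ref: core.local                 # placeholder action
  parameters:
    cmd: echo "{{ trigger.line }}"   # "line" is the payload field emitted per new line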

@cognifloyd (Member) commented Oct 15, 2021

  • ✅ Next issue: If a rule is updated (e.g. the file path being watched is changed), then the sensor needs to handle the update and change the watched path. (Testing the sensor in this PR.)

@cognifloyd (Member) commented:

OK. With debug logging (using the sensor from this PR), I can see that:

  • TailManager.run() logs its Running TailManager line
  • FileWatchSensor.add_trigger() gets called as expected due to a rule that uses the appropriate parametrized trigger
    • tail_manager.tail_file() gets called
      • SingleFileTail inits correctly
        • opens the file
        • runs observer.schedule() to start the file and parent dir watches
    • tail_manager.start() gets called logging its Starting TailManager line
      • the observer.start() method does not seem to be returning. I do not see the Started Observer, emitters log line.
        • observer.start() loops through its emitters, calling emitter.start(), but that is already called in observer.schedule() so that's probably not it
        • observer.start() also calls BaseThread.start() (via super(); approximate MRO: BaseObserver → EventDispatcher → watchdog.utils.BaseThread)
        • BaseThread(threading.Thread).start() calls self.on_thread_start()
          • This inits an InotifyBuffer which is itself a BaseThread.
            • inotify_buffer is getting the file change notifications correctly because its log lines are showing up
        • BaseThread(threading.Thread).start() should finally call threading.Thread.start(self) but I can't tell if this gets called or if it completes
    • so, add_trigger never returns - this is probably one reason updates don't happen.

observer is a watchdog.observers.Observer instance.

We might also need to modify:

  • TailManager.run() as it uses time.sleep ... not sure if we need to use eventlet's version

@cognifloyd cognifloyd marked this pull request as draft October 15, 2021 04:32
@cognifloyd (Member) commented:

Since this doesn't work with eventlet yet, I marked it as draft.

@cognifloyd (Member) commented:

Using eventlet.debug.spew(), it looks like watchdog's observer is blocking on trying to read events with os.read(self._inotify_fd, event_buffer_size). If the thread patching had worked, it would have switched contexts back to something else (another greenlet / thread).

When I modify the file, this forces it to continue on to the next line (because it was able to read the fd). Then it seems to get lost in the logger.

Subsequent file edits never spew anything else from watchdog. Instead there are some lines from eventlet.green.threading and a loop in eventlet.patcher and a couple lines from threading about a limbo lock... Something is not respecting eventlet. Not sure what.
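
For anyone reproducing this, the spew tracing comes from something like the following near the top of the sensor process (where exactly to call it depends on how the sensor is launched):

import eventlet

# Monkey patch first, then turn on eventlet's line-by-line tracing so you can
# see where a greenthread blocks instead of yielding back to the hub.
eventlet.monkey_patch()

import eventlet.debug
eventlet.debug.spew()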

Lockfile diff: lockfiles/st2.lock [st2]

==                      Added dependencies                      ==

  watchdog                       3.0.0

==                     Removed dependencies                     ==

  logshipper                     0.0.0
  pyinotify                      0.9.6
@cognifloyd (Member) commented:

...

  • tail_manager.start() gets called logging its Starting TailManager line

    • the observer.start() method does not seem to be returning. I do not see the Started Observer, emitters log line.

      • observer.start() loops through its emitters, calling emitter.start(), but that is already called in observer.schedule() so that's probably not it

      • observer.start() also calls BaseThread.start() (via super(); approximate MRO: BaseObserver → EventDispatcher → watchdog.utils.BaseThread)

      • BaseThread(threading.Thread).start() calls self.on_thread_start()

        • This inits an InotifyBuffer which is itself a BaseThread.

          • inotify_buffer is getting the file change notifications correctly because its log lines are showing up
      • BaseThread(threading.Thread).start() should finally call threading.Thread.start(self) but I can't tell if this gets called or if it completes

  • so, add_trigger never returns - this is probably one reason updates don't happen.

observer is a watchdog.observers.Observer instance.

Watchdog uses threading. But the threading module is not getting monkey patched. I tried adjusting the monkey patching, monkey_patch(patch_thread=True), but that made no difference.

Actually, it looks like threading is partially getting monkey patched, but adjusting that monkey patch call didn't change anything.

For reference, I used this to determine the state of monkey patching, just before the observer "thread" is supposed to start. In every scenario I tried, I got eventlet.green.threading and False, so it is monkey patched, but eventlet doesn't think it is. 🤷 So, this is going to have to wait until we can get rid of eventlet. 😭

    def start(self):
        if self.tails and not self.started:
            self.logger.debug("Starting TailManager")
            import threading
            self.logger.debug(f"threading monkey_patch: {threading.current_thread.__module__}")
            import eventlet
            self.logger.debug(f"threading monkey_patch: {eventlet.patcher.is_monkey_patched(threading)}")

            self.observer.start()

We might also need to modify:

  • TailManager.run() as it uses time.sleep ... not sure if we need to use eventlet's version

I tested using eventlet.sleep, but that didn't make a difference because the context never goes back to that loop.
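
The swap itself was just the sleep call in that loop, roughly (an illustrative stand-in, not the actual TailManager code):

import eventlet

def run_loop(logger):
    # Stand-in for TailManager.run(): eventlet.sleep() always yields to the
    # eventlet hub, whereas time.sleep() only cooperates if the time module
    # was successfully monkey patched.
    logger.debug("Running TailManager")
    while True:
        eventlet.sleep(1)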

Labels: enhancement, external dependency, packs, refactor, size/XXL
6 participants