Feature/value logger #84
base: master
Conversation
…ade of fine logging
…ing and implemented in ValueLogger
…d of the newest value
… adds numpy for processing of buffered data. smooth function copied over from Ejler/DataTreatment. Contains development debugging attributes.
…ted into ValueLogger. Also added 'simulate' attribute to run it on prerecorded unfiltered data
…lters a prerecorded dataset
robertjensen
left a comment
I have not yet read the entire PR, but here are first thoughts:
- This is truly amazing 🤩
- We should remember to close issue #74 when this is merged
- This PR is nothing less than a work of wonder 😎👌🔥
- I think the test-code should be moved to a different folder, possibly a new one
- Have I mentioned that this PR is really, really great?🏆
- I am in doubt what the `simulate` keyword does?
- I am really, really a fan of this PR 🎯
- If at all possible, it would be preferable to re-use the existing class-names - I am contemplating that it would actually be acceptable even if it breaks existing loggers 🤔
- All in all - this is very very very nice 👍👍👍👍🤗
Hi again. Also, remember to remove the indication of Python 2 and 3 support; we now only support Python 3, and all references to the python_2_and_3 module should be removed.
I had also considered your comment and have already almost finished the implementation. To deal with the deprecation issue, I decided to print a warning if the save pool is not empty, since that would indicate use of the old method. The warning can be disabled by setting an attribute to False. Otherwise, when the updated version is used the old way, the check method of the LoggingCriteriumChecker will keep appending new values to a list that is never emptied. The ValueLogger implements the 'model' keyword, which defaults to 'sparse' so as not to break current behaviour. We could consider whether this should be changed as well? I will push the new changes later today so you can see.
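To illustrate the idea being discussed, here is a minimal sketch of the save-pool deprecation check. The attribute name `warn_on_nonempty_pool` and the method bodies are illustrative assumptions, not the actual code in this PR; only the class name, `check`, `get_data` and the save-pool concept come from the conversation:

```python
import warnings


class LoggingCriteriumChecker:
    """Sketch of the save-pool deprecation check described above.

    The attribute `warn_on_nonempty_pool` is a hypothetical name for the
    "set an attribute to False" switch mentioned in the comment.
    """

    def __init__(self, warn_on_nonempty_pool=True):
        self.warn_on_nonempty_pool = warn_on_nonempty_pool
        self.save_pool = []  # points waiting to be fetched via get_data()

    def check(self, value):
        # A non-empty pool at check time suggests the caller never drains
        # it, i.e. the checker is being used the old (pre-PR) way.
        if self.save_pool and self.warn_on_nonempty_pool:
            warnings.warn(
                "Save pool is not empty; are you using the old interface? "
                "Call get_data() to drain it, or set "
                "warn_on_nonempty_pool = False to silence this warning.",
                DeprecationWarning,
            )
        self.save_pool.append(value)

    def get_data(self):
        # Return the sorted pool and empty it, as described in the PR text.
        data, self.save_pool = sorted(self.save_pool), []
        return data
```

The point of warning rather than raising is that old scripts keep running unchanged; they just accumulate data until someone notices the message.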
Also, the 'simulate' keyword was named as such to "simulate" the behaviour of the ValueLogger on an existing dataset. Since this logger is threaded, it bypasses the time.sleep(1) in the main thread and looks for the timestamp from the reader instance instead of doing the timestamping itself. Do we have a better keyword for it, or an alternative implementation? I decided not to tamper with the regular threaded behaviour of the ValueLogger; for those desiring fine control over the timestamps, I suggest using the LoggingCriteriumChecker instead.
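A toy version of the `simulate` idea, to make the two modes concrete. Everything here except the keyword name and the sleep/replay distinction is an assumption; the real ValueLogger has a different interface:

```python
import time
import threading


class ValueLoggerSketch(threading.Thread):
    """Illustration of the 'simulate' keyword, not the actual ValueLogger.

    In normal operation the thread sleeps one second per iteration and
    timestamps values itself; with simulate=True it replays a prerecorded
    (timestamp, value) stream from the reader, trusting its timestamps.
    """

    def __init__(self, reader, simulate=False):
        super().__init__()
        self.reader = reader  # assumed to yield (timestamp, value) pairs
        self.simulate = simulate
        self.log = []

    def run(self):
        for timestamp, value in self.reader:
            if self.simulate:
                # Replay mode: keep the prerecorded timestamp, no sleep.
                self.log.append((timestamp, value))
            else:
                time.sleep(1)  # regular threaded behaviour
                self.log.append((time.time(), value))
```

Usage would be along the lines of `logger = ValueLoggerSketch(iter(dataset), simulate=True); logger.start(); logger.join()`, after which `logger.log` holds the filtered/replayed points.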
Do I understand you correctly, that you have now removed the keyword and asked people to use
No, the keyword is still there, but it is only used if you want to use the ValueLogger to filter an existing/prerecorded data set. I'll have the doc string include an example - we can revisit it then :)
…riteriumChecker thereby overwriting the old version of the checker. If the save pool is not emptied before a new check, a DeprecationWarning is printed to warn old users of values being appended to an internal list. Doc string modified slightly.
… The internal save pool is reset when trigger is cleared, so even if the extra data is not used, it does not break current scripts. The extra overhead can be avoided by explicitly choosing the 'sparse' model
That was all I had planned for this pull request, so you can go ahead with the review 👍 😄
robertjensen
left a comment
This is amazingly good and a feature that has been missed for a long, long while.
I did not manage to get 100% through tonight, but I will read the remaining part as soon as possible.
robertjensen
left a comment

Big updates to common/value_logger.py to improve the quality of the data collection.
Compared to the old LoggingCriteriumChecker, the EjlaborateLoggingCriteriumChecker uses the old algorithm for event detection, but buffers all data between events so it can loop through the buffer and save additional points that capture the features around the event. It does this by following the data backwards from the event until the slope changes sign. It then compares the linearity of the remaining data against the average linearity of the data in timeout events; any significant deviations are added to the save pool. This means the logger/checker appends data points that should be saved to a list, which is sorted, emptied and returned by calling the 'get_data' method. This also clears any triggers that are set to True.
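The backtracking step ("follow the data backwards from the event until the slope changes sign") could be sketched like this. The function name and return convention are made up for illustration; only the walk-backwards-until-sign-change logic comes from the description above:

```python
import numpy as np


def backtrack_event(values, event_index):
    """Walk backwards from an event until the slope changes sign.

    Returns the index where the monotone run leading into the event
    starts. Simplified sketch: the real checker additionally runs a
    linearity/deviation test on the remaining buffered data.
    """
    i = event_index
    # Sign of the slope immediately before the event
    sign = np.sign(values[i] - values[i - 1])
    # Step back while consecutive differences keep the same sign
    while i > 1 and np.sign(values[i - 1] - values[i - 2]) == sign:
        i -= 1
    return i - 1  # first point of the monotone run
```

For example, in the series `[5, 4, 3, 1, 2, 4, 8]` with an event detected at the last point, the slope is positive and stays positive back to the value `1` at index 3, where the sign flips, so the run leading into the event starts there.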
I have only tested the classes on prerecorded datasets and not yet on live data collection. Any comments so far?
numpy is currently imported at the top of the module (even if the chosen logger does not need it).
The name of the new logger is really long - any alternatives?
ValueLogger - blue dots are raw data. Red line shows the old or "sparse" algorithm. The green dots are points saved by the new "event" algorithm:

EjlaborateLoggingCriteriumChecker - blue dots are raw data. The red line corresponds to data represented through the old LoggingCriteriumChecker, while the green dots are saved by the EjlaborateLoggingCriteriumChecker's event-based algorithm. The yellow dots are additional points saved from the deviation check (linearity) on timeout events. The magenta and black data was temporary data collected to help fine-tune the algorithms. The black data is basically the deviation from the average linearity of the data in timeout events. Any data points with this value exceeding the magenta dotted line times self.deviation_factor=50 (magenta solid line) are added to the save pool.
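The thresholding rule described for the yellow dots can be sketched as follows. This is a guess at the mechanism under the stated assumptions (a linear fit as the linearity measure, mean deviation as the baseline); only `deviation_factor=50` and the "deviation exceeds baseline times factor" rule come from the description:

```python
import numpy as np


def deviation_outliers(values, deviation_factor=50):
    """Sketch of the linearity/deviation check on a timeout buffer.

    Fit a straight line to the buffered values, measure each point's
    absolute deviation from it, and flag points whose deviation exceeds
    the mean deviation (the "dotted line") times deviation_factor.
    Function name and the exact linearity measure are assumptions.
    """
    x = np.arange(len(values), dtype=float)
    slope, intercept = np.polyfit(x, values, 1)    # linear fit
    deviation = np.abs(values - (slope * x + intercept))
    threshold = deviation.mean() * deviation_factor
    return np.flatnonzero(deviation > threshold)   # indices to save
```

With a large factor like 50, only points that deviate dramatically from an otherwise linear stretch survive the cut, which matches the sparse yellow dots in the plots.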

