
Conversation

@Ejler Ejler commented Nov 25, 2025

Big updates to common/value_logger.py to improve the quality of the data collection.

  • minor bug fixes in LoggingCriteriumChecker
  • new class, EjlaborateLoggingCriteriumChecker (this should deprecate the old one, but replacing it outright would break current scripts - hence the semi-new name)
  • ValueLogger updated with part of the Ejlaborate algorithm
  • Docstring added with usage examples and stats

Compared to the old LoggingCriteriumChecker, the EjlaborateLoggingCriteriumChecker uses the old algorithm for event detection, but buffers all data between events so it can loop through the buffer and save additional points relevant to the features around the event. It does this by following the data backwards from the event until the slope changes sign. It then compares the linearity of the remaining data against the average linearity of the data in timeout events; any significant deviations are added to the save pool. In other words, the logger/checker appends data points that should be saved to a list, which is sorted, emptied and returned by calling the 'get_data' method. This also clears any triggers that are set to True.
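To make the description above concrete, here is a minimal sketch of the buffering, backtracking and 'get_data' behaviour. The class and attribute names (EventCheckerSketch, buffer, save_pool) are illustrative only, not the actual value_logger.py API:

```python
class EventCheckerSketch:
    """Illustrative sketch, not the real EjlaborateLoggingCriteriumChecker."""

    def __init__(self):
        self.buffer = []     # (time, value) pairs buffered between events
        self.save_pool = []  # points queued for saving
        self.trigger = False

    def add_point(self, timestamp, value):
        self.buffer.append((timestamp, value))

    def backtrack_from_event(self):
        """Walk backwards from the newest point (the event) until the
        slope changes sign relative to the slope at the event."""
        pts = self.buffer
        if len(pts) < 3:
            return list(pts)
        i = len(pts) - 1
        event_slope = pts[i][1] - pts[i - 1][1]
        while i > 1:
            slope = pts[i - 1][1] - pts[i - 2][1]
            if slope * event_slope < 0:  # sign change -> stop walking back
                break
            i -= 1
        return pts[i - 1:]

    def get_data(self):
        """Return the sorted save pool, empty it, and clear the trigger."""
        data = sorted(self.save_pool)
        self.save_pool = []
        self.trigger = False
        return data
```

With a value trace that falls and then rises (e.g. 0, 1, 2, 1, 2, 3, 4), backtracking from the final point stops at the local minimum where the slope flips sign.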

I have only tested the classes on prerecorded datasets and not yet on live data collection. Any comments so far?
numpy is currently imported at the top of the module (even if the chosen logger does not need it).
The name of the new logger is really long - any alternatives?

ValueLogger - blue dots are raw data, the red line shows the old or "sparse" algorithm, and the green dots are points saved by the new "event" algorithm:
[Figure: ValueLogger_simple_rough_pressure_4]

EjlaborateLoggingCriteriumChecker - blue dots are raw data. The red line corresponds to data represented through the old LoggingCriteriumChecker, while the green dots are saved by EjlaborateLoggingCriteriumChecker's event-based algorithm. The yellow dots are additional points saved by the deviation (linearity) check on timeout events. The magenta and black data were temporary data collected to help fine-tune the algorithms: the black data is essentially the deviation from the average linearity of the data in timeout events, and any data points whose value exceeds the magenta dotted line times self.deviation_factor=50 (magenta solid line) are added to the save pool.
[Figures: rough_pressure_1, rough_pressure_3]
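The deviation check described above can be sketched roughly as follows: fit a straight line to the buffered segment, compute each point's residual, and queue points whose residual exceeds a baseline times the deviation factor. The function names and the plain least-squares fit are illustrative assumptions, not the actual implementation:

```python
def linear_residuals(points):
    """Absolute residuals of (t, v) points against a least-squares line."""
    n = len(points)
    ts = [p[0] for p in points]
    vs = [p[1] for p in points]
    t_mean = sum(ts) / n
    v_mean = sum(vs) / n
    denom = sum((t - t_mean) ** 2 for t in ts) or 1.0
    slope = sum((t - t_mean) * (v - v_mean) for t, v in points) / denom
    intercept = v_mean - slope * t_mean
    return [abs(v - (slope * t + intercept)) for t, v in points]

def deviating_points(points, baseline, deviation_factor=50):
    """Points whose residual exceeds baseline * deviation_factor
    (the 'magenta dotted line times self.deviation_factor' idea)."""
    residuals = linear_residuals(points)
    threshold = baseline * deviation_factor
    return [p for p, r in zip(points, residuals) if r > threshold]
```

On a nearly linear segment with one spike, only the spike clears the threshold and is added to the save pool.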

@robertjensen robertjensen left a comment


I have not yet read the entire PR, but here are first thoughts:

  • This is truly amazing 🤩
  • We should remember to close issue #74 when this is merged
  • This PR is nothing less than a work of wonder 😎👌🔥
  • I think the test-code should be moved to a different folder, possibly a new one
  • Have I mentioned that this PR is really, really great?🏆
  • I am in doubt about what the simulate keyword does?
  • I am really, really a fan of this PR 🎯
  • If at all possible, it would be preferable to re-use the existing class-names - I am contemplating that it would actually be acceptable even if it breaks existing loggers 🤔
  • All in all - this is very very very nice 👍👍👍👍🤗

@robertjensen
Hi again
I have now made up my mind: my personal opinion is to break the historic behaviour and simply keep the names, even if it breaks the logger syntax - I am quite convinced this would be the best solution in the long run.

Also, remember to remove the indication of python 2 and 3 support; we now only support python 3, and all references to the python_2_and_3 module should be removed.

Ejler commented Dec 3, 2025

I had also considered this and have already almost finished the implementation. To deal with the deprecation issue, I decided to print a warning if the save pool is not empty, as this would indicate use of the old method. The warning can be disabled by setting an attribute to False. Otherwise, when using the updated version the old way, the check method of the LoggingCriteriumChecker will keep appending new values to a list that is never emptied.
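A minimal sketch of that warning mechanism might look like this. The attribute name warn_on_full_pool is a placeholder for whatever attribute the real class exposes, and the class itself is illustrative:

```python
import warnings

class CheckerSketch:
    """Illustrative: warn old-style callers who never drain the save pool."""

    def __init__(self):
        self.save_pool = []
        self.warn_on_full_pool = True  # set False to silence the warning

    def check(self, value):
        # A non-empty pool at check time suggests get_data() is never
        # called, i.e. the caller is using the old API.
        if self.save_pool and self.warn_on_full_pool:
            warnings.warn(
                "Save pool not empty - call get_data() to drain it, or "
                "disable this warning via warn_on_full_pool=False.",
                DeprecationWarning,
            )
        self.save_pool.append(value)

    def get_data(self):
        data = sorted(self.save_pool)
        self.save_pool = []
        return data
```

The first check() on an empty pool stays silent; every subsequent check() without an intervening get_data() triggers the DeprecationWarning.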

The ValueLogger implements the 'model' keyword, which defaults to 'sparse' so as not to break current behaviour. We could consider whether this should be changed as well?
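For illustration, the backwards-compatible keyword could be structured like this; the constructor signature and the validation are assumptions, not the actual ValueLogger code:

```python
class ValueLoggerSketch:
    """Illustrative: 'sparse' keeps the old behaviour, 'event' opts in
    to the new buffered algorithm."""

    VALID_MODELS = ('sparse', 'event')

    def __init__(self, model='sparse'):
        if model not in self.VALID_MODELS:
            raise ValueError(f"model must be one of {self.VALID_MODELS}")
        self.model = model

    def uses_event_buffer(self):
        return self.model == 'event'
```

Existing scripts that never pass model keep the sparse behaviour unchanged.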

I will push the new changes later today so you can see.

Ejler commented Dec 3, 2025

Also, the 'simulate' keyword was named as such to "simulate" the behaviour of the ValueLogger on an existing dataset. Since this logger is threaded, it bypasses the time.sleep(1) in the main thread and then looks for the timestamp from the reader instance instead of doing the timestamping itself. Do we have a better keyword for it, or an alternative implementation? I decided not to tamper with the regular threaded behaviour of the ValueLogger; for those desiring fine control over the timestamps, I suggest using the LoggingCriteriumChecker instead.
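A rough sketch of the two code paths described above, assuming a reader that yields (timestamp, value) pairs; the class, the reader protocol, and run_once are all illustrative, not the real threaded ValueLogger:

```python
import time

class LoggerSketch:
    """Illustrative: live mode paces itself and timestamps readings;
    simulate mode replays a prerecorded dataset using its own timestamps."""

    def __init__(self, reader, simulate=False):
        self.reader = reader  # assumed to yield (timestamp, value) pairs
        self.simulate = simulate
        self.log = []

    def run_once(self):
        if self.simulate:
            # Skip the sleep and trust the prerecorded timestamp
            timestamp, value = next(self.reader)
        else:
            time.sleep(1)             # live pacing
            _, value = next(self.reader)
            timestamp = time.time()   # stamp the reading ourselves
        self.log.append((timestamp, value))
```

In simulate mode the logger chews through a prerecorded dataset at full speed while preserving the original timestamps.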

@robertjensen
> Also the 'simulate' keyword was named as such to "simulate" the behaviour of the ValueLogger on an existing dataset. Since this logger is threaded, it bypasses the time.sleep(1) in the main thread and then looks for the timestamp from the reader instance instead of doing the timestamping itself. Do we have a better keyword for it or an alternative implementation? I decided not to tamper with regular threaded behaviour of the ValueLogger and for those desiring fine control over the timestamps, I suggest using the LoggingCriteriumChecker instead.

Do I understand you correctly that you have now removed the keyword and are asking people to use LoggingCriteriumChecker instead? If this is the case, then I fully agree.

Ejler commented Dec 3, 2025

No, the keyword is still there, but it is only used if you want to use the ValueLogger to filter an existing/prerecorded data set. I'll have the docstring include an example - we can revisit it then :)

Ejler added 4 commits December 3, 2025 15:16
…riteriumChecker thereby overwriting the old version of the checker. If the save pool is not emptied before a new check, a DeprecationWarning is printed to warn old users of values being appended to an internal list. Doc string modified slightly.
… The internal save pool is reset when trigger is cleared, so even if the extra data is not used, it does not break current scripts. The extra overhead can be avoided by explicitly choosing the 'sparse' model
Ejler commented Dec 3, 2025

That was all I had planned for this pull request, so you can go ahead with the review 👍 😄

@robertjensen robertjensen left a comment


This is amazingly good, and a feature that has been missed for a long, long while.

I did not manage to get 100% through tonight, but I will read the remaining part as soon as possible.
