Collaborator

@edeno edeno commented Apr 28, 2025

StateScript log parsing and processing module.

This module provides tools for parsing, interpreting, and processing .stateScriptLog files generated by Trodes (https://docs.spikegadgets.com/en/latest/basic/StateScript.html). It handles the conversion of Trodes timestamps, alignment with external time sources, interpretation of Digital Input/Output (DIO) or analog states, and processing of various common log line formats.

The main task for the user is to convert Trodes timestamps to a more precise time base using a DIO reference. The calculate_time_offset method does this by aligning a StateScript event to a DIO event. The get_events_dataframe method then returns a dataframe with a timestamp_sync column and the events parsed into their different line types.
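
To illustrate the alignment idea, here is a minimal sketch. It is not the API added in this PR: `estimate_offset_seconds` and `to_synced_seconds` are hypothetical helpers, and the sketch assumes the StateScript events and the DIO events have already been matched one-to-one, with DIO times given in seconds.

```python
from statistics import median

MILLISECONDS_PER_SECOND = 1000


def estimate_offset_seconds(statescript_ms, dio_seconds):
    """Estimate the offset between matched StateScript and DIO events.

    Hypothetical helper: assumes statescript_ms[i] and dio_seconds[i]
    describe the same physical event.
    """
    if len(statescript_ms) != len(dio_seconds) or not statescript_ms:
        raise ValueError("need equally sized, non-empty event lists")
    diffs = [
        dio - ms / MILLISECONDS_PER_SECOND
        for ms, dio in zip(statescript_ms, dio_seconds)
    ]
    # The median tolerates an occasional badly matched pair.
    return median(diffs)


def to_synced_seconds(timestamp_ms, offset_seconds):
    """Convert a StateScript timestamp (ms) to the DIO time base (s)."""
    return timestamp_ms / MILLISECONDS_PER_SECOND + offset_seconds


# Made-up DIO times that sit 1000 s after the StateScript clock.
offset = estimate_offset_seconds(
    [648028, 658285, 665780], [1648.028, 1658.285, 1665.780]
)
```

Once an offset is known, every logged timestamp can be shifted with `to_synced_seconds`, which is roughly the role the timestamp_sync column plays in the dataframe.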

There is no NWB conversion yet, but there might be a simple way to do it.
I'm not sure where this would go. It is a time series of events, so perhaps https://github.com/rly/ndx-events would fit? @rly?

One challenge is that the user can technically print anything to the log, but there tend to be common formats, explained below:

Notes

Source Files:
- Log files parsed by this module typically have the .stateScriptLog extension.
- These files are generated by Trodes during data acquisition sessions.

Timestamp Information:
- The primary timestamp (<timestamp_int>) found in these logs is a 64-bit integer.
- It represents the number of milliseconds elapsed since the start of the
Trodes recording session.
- This is often referred to as the 'Trodes timestamp'.

Log Line Formats:
.stateScriptLog files usually contain lines adhering to several common formats.
The module aims to parse lines matching these structures:

``ts_int_int`` : `<timestamp_int> <int> <int>`
    Represents timestamp and two integers. These integers often function as
    bitwise masks representing the state of DIO pins.
    Example: ``1817158 128 512``

``ts_str_int`` : `<timestamp_int> <str> <int>`
    Represents timestamp, a string label, and an integer value. Frequently
    used for user-defined messages logging DIO pin state changes (e.g., pin name and state).
    Example: ``8386500 DOWN 3``

``ts_str_eq_int`` : `<timestamp_int> <str> = <int>`
    Represents timestamp and a named integer variable assignment, useful for
    tracking counters or state variables within the StateScript.
    Example: ``3610855 totRewards = 70``

``ts_str`` : `<timestamp_int> <str...>`
    Represents timestamp followed by one or more space-separated strings.
    Commonly used for logging event markers or descriptive text messages.
    Example: ``1678886401 LOCKEND``

``comment_or_empty`` : Lines starting with `#` or completely empty lines.
    Lines starting with '#' are treated as comments. Empty lines may also occur.
    These are typically ignored during data extraction.
    Example: ``# Starting new trial block``

``unknown`` : Lines that do not conform to the patterns listed above.
    These might include initial header lines, formatting variations, or unexpected entries.
    Example: ``initiated``
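
The formats above can be told apart with a few ordered regular expressions. The following is a minimal sketch under stated assumptions, not the parser in this PR: `LINE_PATTERNS` and `classify_line` are hypothetical names, and the patterns cover only the cases shown in the examples.

```python
import re

# Patterns are tried in order; the first match wins, so the more
# specific formats come before the catch-all ts_str.
LINE_PATTERNS = [
    ("ts_int_int", re.compile(r"^(\d+)\s+(\d+)\s+(\d+)$")),
    ("ts_str_eq_int", re.compile(r"^(\d+)\s+(\S+)\s+=\s+(-?\d+)$")),
    ("ts_str_int", re.compile(r"^(\d+)\s+(\S+)\s+(-?\d+)$")),
    ("ts_str", re.compile(r"^(\d+)\s+(\S.*)$")),
]


def classify_line(line):
    """Return the format name for one log line (hypothetical helper)."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return "comment_or_empty"
    for name, pattern in LINE_PATTERNS:
        if pattern.match(stripped):
            return name
    return "unknown"
```

With these patterns, a line such as ``648083 lastPort = -1 to currPort = 1`` falls through to ``ts_str``, because it does not end right after the first assignment.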

Component Definitions:
- <timestamp_int>: 64-bit integer; milliseconds since session start (Trodes timestamp).
- <int>: Integer value; often used as a bitwise mask for DIO pin states.
- <str>: String value; can represent an event name, variable name, message component, etc.
- <str...>: Denotes one or more space-separated strings.
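
Because the integers in ``ts_int_int`` lines often act as bitwise masks, decoding them amounts to listing the set bits. A minimal sketch, assuming bit 0 corresponds to pin 1 and an upper bound of 32 pins (both are assumptions, and `active_pins` is a hypothetical helper rather than part of the module):

```python
def active_pins(mask, max_pins=32):
    """Return the 1-based pin numbers whose bit is set in mask.

    Assumes bit 0 maps to pin 1; max_pins is an assumed upper bound.
    """
    return [bit + 1 for bit in range(max_pins) if mask & (1 << bit)]


# The example line ``1817158 128 512`` would decode to one pin per mask.
first_mask_pins = active_pins(128)
second_mask_pins = active_pins(512)
```

Under this assumption, a mask of 0 yields an empty list, meaning no pins are active in that field.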

Example log file snippet:

#
648028 UP 2
648028 2 0
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
~~~
648083 lastPort = -1 to currPort = 1
~~~
658285 DOWN 2
658285 0 0
~~~
658312 contingency = 0
658313 trialThresh = 78
658313 timeMaxOut = 30
658313 timeElapsed = 0
658314 totalPokes = 1
658314 totalRewards = 0
658315 countPokes1 = 0
658315 countRewards1 = 0
658316 portProbs1 = 90
658316 countPokes2 = 1
658317 countRewards2 = 0
658337 portProbs2 = 10
658338 countPokes3 = 0
658338 countRewards3 = 0
658339 portProbs3 = 50
~~~
665780 UP 1
665780 1 0
~~~
~~~
~~~
~~~
~~~
665808 1 8388608

Sample dataframe output

[image: sample dataframe output]

To-Do

  • Better API for time alignment
  • Improve notebook documentation
  • How to store in NWB/Spyglass
  • detect max_DIO from data

@edeno edeno requested a review from Copilot April 28, 2025 16:26
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces statescript parsing tools by updating the documentation and function signatures in the behavioral events processing code.

  • Updated the docstring in _get_channel_name_map to reflect a nested dictionary structure.
  • Provides clarification on the expected mapping between hardware events and their human-readable names.

@edeno edeno requested a review from Copilot April 28, 2025 21:24
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds statescript parsing tools with an updated conversion function to support detailed metadata definitions.

  • Updated the return type of _get_channel_name_map to a nested dictionary.
  • Modified the docstring to include the new key structure with "name" and "comments".
Comments suppressed due to low confidence (1)

src/trodes_to_nwb/convert_dios.py:12

  • The function now returns a nested dictionary instead of a flat mapping. Update the docstring summary to clearly reflect that each hardware event maps to a dictionary with 'name' and 'comments' keys.
def _get_channel_name_map(metadata: dict) -> dict[str, dict[str, str]]:

codecov bot commented Apr 29, 2025

Codecov Report

Attention: Patch coverage is 75.91973% with 72 lines in your changes missing coverage. Please review.

Project coverage is 87.05%. Comparing base (73e14e6) to head (b300cb9).

Files with missing lines Patch % Lines
src/trodes_to_nwb/convert_statescript.py 75.83% 72 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #115      +/-   ##
==========================================
- Coverage   89.32%   87.05%   -2.28%     
==========================================
  Files          12       13       +1     
  Lines        1471     1769     +298     
==========================================
+ Hits         1314     1540     +226     
- Misses        157      229      +72     

@edeno edeno requested review from Copilot and samuelbray32 April 29, 2025 11:43
Contributor

Copilot AI left a comment

Pull Request Overview

This PR refactors the log parsing module for StateScript by enhancing the metadata mapping for behavioral events.

  • Updated the return type of _get_channel_name_map to a nested dictionary structure.
  • Clarified the documentation to accurately reflect the updated mapping format.

@edeno edeno marked this pull request as ready for review April 30, 2025 16:32
Collaborator

@samuelbray32 samuelbray32 left a comment

Can this PR include something like add_coverted_statescript(file, nwb) that uses these to add a DynamicTable to the nwb file?

)


def _parse_int(s: str) -> Optional[int]:
Collaborator

Would it make more sense to name this function _is_int and return a True/False value? Converting to int in the calling function is straightforward, and the current name doesn't communicate as clearly how it's being used below.

Collaborator Author

The function is still parsing a string to an int though, right?

df["timestamp"].astype(float) / self.MILLISECONDS_PER_SECOND
)
# Ensure original timestamp remains integer
df["timestamp"] = df["timestamp"].astype(int)
Collaborator

A more useful conversion might be to Unix time, since the NWB conversion already links Trodes timestamps to it.

Collaborator Author

Yeah, the timestamp_sync column holds the converted timestamps. You need to run calculate_time_offset to align it to a specific DIO.

Collaborator

I might be confused. Are these timestamps not the same as the trodes timestamps we're using in convert position?

Collaborator Author

edeno commented Apr 30, 2025

Can this PR include something like add_coverted_statescript(file, nwb)[sic] that uses these to add a DynamicTable to the nwb file?

I guess the question is whether we want to make these specific behavioral events or if we want to just dump the whole thing into a table.
