Skip to content

Add mixin class for data types with un-named (sub)group properties, e.g. ProcessingModule #705

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 37 commits into
base: main
Choose a base branch
from

Conversation

ehennestad
Copy link
Collaborator

@ehennestad ehennestad commented May 1, 2025

Motivation

fix #625

How to test the behavior?

processingModule = types.core.ProcessingModule();
processingModule.add('MyTimeSeries', types.core.TimeSeries);
processingModule.MyTimeSeries
ans = 

  TimeSeries with properties:

     starting_time_unit: 'seconds'
    timestamps_interval: 1
        timestamps_unit: 'seconds'
                   data: []
              data_unit: ''
               comments: 'no comments'
                control: []
    control_description: ''
        data_continuity: ''
        data_conversion: 1
            data_offset: 0
        data_resolution: -1
            description: 'no description'
          starting_time: []
     starting_time_rate: []
             timestamps: []

Cases to write unit tests for:

Edge cases mentioned in issue

Adding data types to a container type

  • Data type is added to a container, i.e aProcessingModule.add('namedDataType')
    • Is the data type added to the underlying types.untyped.Set?
  • Is the data type added as a dynamic property to the container type (i.e a ProcessingModule) on nwbRead?

Removing data types from a container type

  • Test that data object is removed from underlying subgroup (Set object) when using remove on the parent data object
  • Test that data object is removed from the parent data object if the object is removed from the subgroup (Set object)

Checklist

  • Have you ensured the PR description clearly describes the problem and solutions?
  • Have you checked to ensure that there aren't other open or previously closed Pull Requests for the same change?
  • If this PR fixes an issue, is the first line of the PR description fix #XX where XX is the issue number?

Try using redefinesParen
Add HasGroup mixin class to classes with anonymous subgroups
Fix HasGroups with redefinesDot. This is one option, but only supported from R2021b and later
Utility method for listing names of generated classes for neurodata types
Add TypeName property as a convenience for getting the short name of a data type
- Renamed to HasUnnamedGroups
- Use dynamic props instead of overriding indexing
- add "add" method
Use callback functions instead of event listeners. The dependency of container groups and their sets are very specific, and an event/listener system would be too general for this case. I.e a Set does not generally have to notify about Items added/removed.
- Add callback function properties accessible by matnwb.mixin.HasUnnamedGroups
- Add add method
- Add optional inputs for controlling behavior of set method
Rename and reorder methods for better logical composition
- Add the HasUnnamedGroups mixin to the relevant data type classes
Add minimal support for contained "types.untyped.Anon" objects add and NotImplemented warnings/errors in cases where these are not supported
Suppress warning
Rename variable depnm to superclassNames
Copy link

codecov bot commented May 1, 2025

Codecov Report

Attention: Patch coverage is 97.76952% with 6 lines in your changes missing coverage. Please review.

Project coverage is 94.85%. Comparing base (16862b7) to head (bba8564).

Files with missing lines Patch % Lines
+matnwb/+mixin/HasUnnamedGroups.m 97.39% 5 Missing ⚠️
+types/+untyped/Anon.m 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #705      +/-   ##
==========================================
- Coverage   95.05%   94.85%   -0.21%     
==========================================
  Files         161      163       +2     
  Lines        5868     6124     +256     
==========================================
+ Hits         5578     5809     +231     
- Misses        290      315      +25     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ehennestad
Copy link
Collaborator Author

This is currently possible in MatNWB, but creates an invalid group in the NWB-file where two different data type objects are merged in the same ground:

nwbFile = tests.factory.NWBFile();
module = types.core.ProcessingModule();

nwbFile.processing.set('test', module);

T = struct2table(struct('name', {'a', 'b'}, 'value', {1, 2}));
dynamicTable = util.table2nwb(T);

nwbFile.processing.get('test').description = 'a test';
nwbFile.processing.get('test').nwbdatainterface.set('test', tests.factory.TimeSeriesWithTimestamps)
nwbFile.processing.get('test').dynamictable.set('test', dynamicTable)

nwbExport( nwbFile, 'test_same_name_in_different_subgroups.nwb')

Easy to prevent for this new syntax, but not for the legacy syntax.

ehennestad added 3 commits May 3, 2025 22:11
Improve variable naming
Remove unused code
Simplify some code sections
Create nameExists method to check i name is already used in subgroup containers
@ehennestad ehennestad marked this pull request as ready for review May 4, 2025 19:56
Add verification the test that object is added and removed on the underlying subgroup object when using methods on the parent object
@ehennestad ehennestad requested a review from bendichter May 4, 2025 21:33
@ehennestad ehennestad changed the title Add mixin class for data types with un-named (sub)group properties, i.e ProcessingModule Add mixin class for data types with un-named (sub)group properties, e.g. ProcessingModule May 5, 2025
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • renamed depnm to superclassNames
  • add matnwb.mixin.HasUnnamedGroups as superclass to data types with un-named subgroups

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Convenience method for listing class names (or short names) of all generated types

@bendichter
Copy link
Contributor

Do we need all of this renaming logic? What if someone tries to add two objects, Time-Series and Time_Series?

Why can't we just change module.datainterfaces.get("Foo") to module.get("Foo") or module("Foo")? Then we don't need to mess with any of this name changing stuff. We can expose module.Foo if the name allows it, and if not then revert to the other syntax.

@ehennestad
Copy link
Collaborator Author

ehennestad commented May 5, 2025

Do we need all of this renaming logic? What if someone tries to add two objects, Time-Series and Time_Series?

That will look like this:

module = types.core.ProcessingModule('description', 'a module with two timeseries');
module.add('time_series', types.core.TimeSeries)
module.add('time-series', types.core.TimeSeries)

>> module

module = 

Warning: The following named elements of "ProcessingModule" are remapped to have valid MATLAB names, but will be written to file with
their actual names:
       ValidName        ActualName
    _______________    _____________

    "time_series_1"    "time-series"
 
  ProcessingModule with properties:

      description: 'a module with two timeseries'
    time_series_1: [1×1 types.core.TimeSeries]
      time_series: [1×1 types.core.TimeSeries]

Which is actually quite similar to how MATLAB does it in e.g readtable.

given a test.csv file:

time-series	time_series
1	2
3	4
>> readtable('test.csv')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the
table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names. 

ans =

  2×2 table

    time_series    time_series_1
    ___________    _____________

         1               2      
         3               4      

Why can't we just change module.datainterfaces.get("Foo") to module.get("Foo") or module("Foo")? Then we don't need to mess with any of this name changing stuff. We can expose module.Foo if the name allows it, and if not then revert to the other syntax.

First, I would like to avoid using module.get because its not a very common MATLAB syntax, and it's difficult for a user to know when to use module.get and when to use standard dot-indexing, i.e module.desription.

Implementing parenthesis indexing is quite straightforward with the matlab.mixin.indexing.RedefinesParen class, but it is only supported from R2021b. It is possible to override subsref for compatibility in older releases, but I would really not recommend doing that.

Also, one syntax for "valid" names and another syntax for other names I think would be confusing.

In summary, the renaming logic would provide full support for reading files from pynwb where names are specified with spaces or or other symbols that are not supported in MATLAB but also take care of edge cases like the one you used as an example, and in general be easier to use as there would only be one type of indexing syntax to remember

@bendichter
Copy link
Contributor

You're not warning on the add command? I'm really not into magically renaming. Now the order in which the data elements are added to the file object matters and the user needs to learn the renaming rule. I think module.get('my-name') is so much simpler and sufficiently solves the problem without getting into tricky territory. I understand it's a bit unusual syntax but I've just been in too many situations where renaming causes more confusion than it is worth.

@ehennestad
Copy link
Collaborator Author

ehennestad commented May 5, 2025

You're not warning on the add command?

Good point, there could be an extra warning on the add command.

I'm really not into magically renaming

It's not really renaming, just creating an alias which is used for the dynamic properties that enables dot-indexing and autocompletion. module.nwbdatainterface.get('my-name') would still work and creating a shortcut as module.get('my-name') would be a quick addition.

How about keeping/providing both modes?

A typical MATLAB user would likely specify names using valid camelCase or snake_case and not encounter the name remapping at all. If they loaded a file created with python using different naming convention, their preferred modes might vary. For example, I prefer to explore new objects in the command window, in which the alias-names are presented and where I can use them for dot-indexing. If someone else prefers to use the "real" names they could do so with module.get('my-name')

@ehennestad
Copy link
Collaborator Author

ehennestad commented May 8, 2025

  • 1. warning on read if name aliases are used
  • 2. .get syntax
  • 3. name mapping should be publicly accessible
  • 4. how-to-guide

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: Change implementation of containers types to simplify API
2 participants