Something I've had in the back of my mind for a while.
Open to other suggestions, or if people can think of some consequences I haven't accounted for.
Might even be a 1.0 consideration.
The problem
There are 3 types of indices that are supported in the package: tuples; 'symmetric'; and 'all'. However, the line between 'all' and 'symmetric' is quite blurred in places, and there are inconsistencies in how we handle these.
Docstring & returned values inconsistency (and inefficiency)
The main culprit is spectral_connectivity_epochs and spectral_connectivity_time. Here, the default value for indices is None, which the docstrings describe as corresponding to "connectivity between all channels". However, these functions don't compute all-to-all connectivity when indices=None, they compute the lower-triangular part (i.e., assuming the connectivity is symmetric), but then store this in what is a flattened all-to-all connectivity array.
The assumption of symmetric connectivity is fair to make (only the DPLI method isn't; Granger causality is an exception because this requires indices is specified as a tuple), but it's curious why in the docstring it is: a) framed as being "connectivity between all channels"; and b) stored in an array where half the entries are zeros.
This also extends to the phase_slope_index function (which internally uses spectral_connectivity_epochs), where according to the docstring indices=None computes results for all connections, but only computes lower-triangular connections and stores this in an all-to-all connectivity array. There's the added issue that true all-to-all connectivity makes sense for PSI over symmetric connectivity, since it's a directed measure.
Undocumented supported indices
As an aside, indices=None actually propagates into the connectivity object, where the only accepted values are stated to be tuples, 'symmetric', or 'all'. It just so happens that the logic for checking indices inadvertently treats None as "all".
Lack of validation
When indices="symmetric" or a tuple, the connectivity objects check that the size of the connectivity data is as expected. No such checks are performed when indices="all", which also is the default value for the connectivity objects.
This breaks when you try to return the data in a 'dense' form, and will also break things with the new plotting functionality I'm working on.
Proposed solution
I'd say the least controversial points are fixing the undocumented/unvalidated behaviour, so stop indices=None being accepted by connectivity classes, and add checks in them for the data size when indices="all".
How 'all' vs, 'symmetric' connectivity gets handled in spectral_connectivity_epochs, spectral_connectivity_time, and phase_slope_index is less trivial.
For spectral_connectivity_epochs and spectral_connectivity_time, I would propose doing away with the default indices=None and making the default 'symmetric' (also changing the docstring description). Not only would it be a more accurate representation of the data we return, but we would also reduce memory consumption by not storing the lower-triangular data in a flattened all-to-all array.
On top of this, I would add the option for indices="all" to return true all-to-all connectivity, as you might want for DPLI.
In terms of what having 'symmetric' as default would break for users, for those using connectivity.get_data('raveled'), they would get a flattened lower-triangular rather than an all-to-all array. I'd argue this is an improvement long-term, since they don't have to sift through those redundant zero-valued entries, but it could require adapting existing code. In contrast, nothing should change for those using get_data('dense'), since we would internally pad this to an all-to-all array.
phase_slope_index could take advantage of the indices="all" option of spectral_connectivity_epochs to actually return all-to-all connectivity results, though this would change the behaviour of existing code.
Something I've had in the back of my mind for a while.
Open to other suggestions, or if people can think of some consequences I haven't accounted for.
Might even be a 1.0 consideration.
The problem
There are 3 types of
indicesthat are supported in the package: tuples; 'symmetric'; and 'all'. However, the line between 'all' and 'symmetric' is quite blurred in places, and there are inconsistencies in how we handle these.Docstring & returned values inconsistency (and inefficiency)
The main culprit is
spectral_connectivity_epochsandspectral_connectivity_time. Here, the default value forindicesisNone, which the docstrings describe as corresponding to "connectivity between all channels". However, these functions don't compute all-to-all connectivity whenindices=None, they compute the lower-triangular part (i.e., assuming the connectivity is symmetric), but then store this in what is a flattened all-to-all connectivity array.The assumption of symmetric connectivity is fair to make (only the DPLI method isn't; Granger causality is an exception because this requires
indicesis specified as a tuple), but it's curious why in the docstring it is: a) framed as being "connectivity between all channels"; and b) stored in an array where half the entries are zeros.This also extends to the
phase_slope_indexfunction (which internally usesspectral_connectivity_epochs), where according to the docstringindices=Nonecomputes results for all connections, but only computes lower-triangular connections and stores this in an all-to-all connectivity array. There's the added issue that true all-to-all connectivity makes sense for PSI over symmetric connectivity, since it's a directed measure.Undocumented supported
indicesAs an aside,
indices=Noneactually propagates into the connectivity object, where the only accepted values are stated to be tuples, 'symmetric', or 'all'. It just so happens that the logic for checkingindicesinadvertently treatsNoneas"all".Lack of validation
When
indices="symmetric"or a tuple, the connectivity objects check that the size of the connectivity data is as expected. No such checks are performed whenindices="all", which also is the default value for the connectivity objects.This breaks when you try to return the data in a 'dense' form, and will also break things with the new plotting functionality I'm working on.
Proposed solution
I'd say the least controversial points are fixing the undocumented/unvalidated behaviour, so stop
indices=Nonebeing accepted by connectivity classes, and add checks in them for the data size whenindices="all".How 'all' vs, 'symmetric' connectivity gets handled in
spectral_connectivity_epochs,spectral_connectivity_time, andphase_slope_indexis less trivial.For
spectral_connectivity_epochsandspectral_connectivity_time, I would propose doing away with the defaultindices=Noneand making the default 'symmetric' (also changing the docstring description). Not only would it be a more accurate representation of the data we return, but we would also reduce memory consumption by not storing the lower-triangular data in a flattened all-to-all array.On top of this, I would add the option for
indices="all"to return true all-to-all connectivity, as you might want for DPLI.In terms of what having 'symmetric' as default would break for users, for those using
connectivity.get_data('raveled'), they would get a flattened lower-triangular rather than an all-to-all array. I'd argue this is an improvement long-term, since they don't have to sift through those redundant zero-valued entries, but it could require adapting existing code. In contrast, nothing should change for those usingget_data('dense'), since we would internally pad this to an all-to-all array.phase_slope_indexcould take advantage of theindices="all"option ofspectral_connectivity_epochsto actually return all-to-all connectivity results, though this would change the behaviour of existing code.