
A documentation suggestion on the Inputs section, for ease of use. #37

@jnjnnjzch

Description


TL;DR: Add a hint or reminder to the documentation and GUI: "Change st_bin to ST_BIN_MS * SAMPLE_RATE to load files whose spike times are in samples instead of seconds, e.g. kilosort outputs."


Hi, I have a small suggestion about loading .npy files into rastermap.

In the Inputs section, it says:

If you have a spike_times.npy and spike_clusters.npy, create your time-binned data matrix with, where the bin size st_bin is in milliseconds (assuming your spike times are in seconds)

In my usage, the two .npy files are generated by kilosort or mountainsort. These files do not always store spike times in seconds; instead, they often store sample indices at the recording's sampling rate. In that case, st_bin is no longer in milliseconds as intended.
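To make the distinction concrete, here is a minimal sketch of the two conventions, assuming a hypothetical 30 kHz recording (the sample rate and the spike values below are made up for illustration):

```python
import numpy as np

SAMPLE_RATE = 30000  # Hz; hypothetical -- check your recording's actual rate

# Kilosort-style spike_times.npy: integer sample indices, not seconds
st_samples = np.array([1141530, 1143810])

# dividing by the sampling rate recovers the times in seconds
st_seconds = st_samples / SAMPLE_RATE
print(st_seconds)  # [38.051 38.127]
```

If such sample-index times are fed to rastermap unmodified, every value is larger than the intended time by a factor of SAMPLE_RATE.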

To see how to correct this, look at the function io.load_spike_times:

import numpy as np
from scipy.sparse import csr_array

def load_spike_times(fname, fname_cluid, st_bin=100):
    # spike times (assumed to be in seconds) and cluster ids
    st = np.load(fname).squeeze()
    clu = np.load(fname_cluid).squeeze()
    # row = cluster id, column = time bin index; duplicate entries are summed
    spks = csr_array((np.ones(len(st), "uint8"), 
                    (clu, np.floor(st / st_bin * 1000).astype("int"))))
    spks = spks.todense().astype("float32")
    return spks

Here a sparse array is created: the row index is the cluster id, the column index is the spike time rounded down to a bin index, and every value is 1.

For example, suppose the spikes before rounding are:

| cluster_id | spike_time (in seconds) | value |
|------------|-------------------------|-------|
| 57         | 38.051                  | 1     |
| 57         | 38.127                  | 1     |

After rounding with st_bin=1000 as an example, this becomes:

| cluster_id | time bin | value |
|------------|----------|-------|
| 57         | 38       | 1     |
| 57         | 38       | 1     |

Then, after todense(), the entry at (57, 38) is 2, because duplicate coordinates are summed when the sparse array is built.
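The worked example above can be reproduced with a short standalone sketch (same logic as the function, with the two example spikes hard-coded):

```python
import numpy as np
from scipy.sparse import csr_array

st = np.array([38.051, 38.127])   # spike times in seconds
clu = np.array([57, 57])          # both spikes come from cluster 57
st_bin = 1000                     # bin size in milliseconds

# floor(38.051 / 1000 * 1000) = 38 for both spikes -> same (row, col) pair
cols = np.floor(st / st_bin * 1000).astype("int")
spks = csr_array((np.ones(len(st), "uint8"), (clu, cols)))

# duplicate coordinates are summed when densifying
dense = spks.todense()
print(dense[57, 38])  # 2
```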

Now look at the key rounding step:

np.floor(st / st_bin * 1000)

Here st is the spike time (assumed to be in seconds), st_bin is the desired bin size in milliseconds, and the factor of 1000 converts seconds to milliseconds.

When st is not in seconds but in samples, st = ST_IN_S * SAMPLE_RATE, and the conversion breaks: st / st_bin * 1000 = ST_IN_S * SAMPLE_RATE / st_bin * 1000. The extra factor of SAMPLE_RATE enters the calculation and makes the effective bins much smaller than expected.

The fix is to multiply st_bin by the same SAMPLE_RATE, i.e. st_bin = ST_BIN_MS * SAMPLE_RATE.
After SAMPLE_RATE cancels, the expression becomes ST_IN_S / ST_BIN_MS * 1000, which is what the code intends.
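The cancellation can be verified numerically. Again assuming a hypothetical 30 kHz sampling rate, spike times given in samples with the scaled st_bin land in the same bins as spike times given in seconds with the original formula:

```python
import numpy as np

SAMPLE_RATE = 30000   # Hz; hypothetical
ST_BIN_MS = 1000      # desired bin size in milliseconds

st_seconds = np.array([38.051, 38.127])
st_samples = (st_seconds * SAMPLE_RATE).astype("int64")  # kilosort-style times

# documented case: times in seconds, st_bin in milliseconds
bins_seconds = np.floor(st_seconds / ST_BIN_MS * 1000).astype("int")

# sample-index case: scale st_bin by SAMPLE_RATE so the factor cancels
st_bin = ST_BIN_MS * SAMPLE_RATE
bins_samples = np.floor(st_samples / st_bin * 1000).astype("int")

print(bins_seconds, bins_samples)  # both [38 38]
```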


The code is correct, but the documentation could state more prominently that spike_times.npy should be in seconds, to avoid misunderstanding.

As far as I know, issue #33 hit the same problem with the st_bin setting while trying to load kilosort result files directly into rastermap. I did not find a hint in the GUI either, and it may be difficult to work out how to change st_bin when this problem occurs.

Thus, for ease of use, I suggest adding a hint to both the README.md file and the GUI panel: "You may need to change st_bin to ST_BIN_MS * SAMPLE_RATE to load files whose spike times are in samples instead of seconds, e.g. kilosort outputs."
