TL;DR: Add a hint or reminder to the documentation and GUI: "Change `st_bin` to `ST_BIN_MS * SAMPLE_RATE` to load files that count time in samples instead of seconds, e.g. Kilosort outputs."
Hi, I have a small suggestion about loading .npy files into rastermap.
The Inputs section of the documentation says:
> If you have a `spike_times.npy` and `spike_clusters.npy`, create your time-binned data matrix with the following, where the bin size `st_bin` is in milliseconds (assuming your spike times are in seconds)
In my usage, the two .npy files are generated by Kilosort or MountainSort. These files do not always store times in seconds; instead they store them in samples (i.e. in units of the sampling rate). In that case, `st_bin` is no longer in milliseconds as intended.
To see how to correct this, look at the function `io.load_spike_times`:
```python
import numpy as np
from scipy.sparse import csr_array

def load_spike_times(fname, fname_cluid, st_bin=100):
    st = np.load(fname).squeeze()         # spike times, assumed to be in seconds
    clu = np.load(fname_cluid).squeeze()  # cluster id of each spike
    # rows = cluster ids, cols = spike times binned at st_bin milliseconds
    spks = csr_array((np.ones(len(st), "uint8"),
                      (clu, np.floor(st / st_bin * 1000).astype("int"))))
    spks = spks.todense().astype("float32")
    return spks
```

Here a sparse array is created, where the row index is the cluster id, the column index is the spike time rounded to the bin size, and each entry's value is 1.
For example, assume the spikes before rounding are:
| cluster_id | spike_time (in seconds) | value |
|---|---|---|
| 57 | 38.051 | 1 |
| 57 | 38.127 | 1 |
after the rounding process with `st_bin=1000` (i.e. 1-second bins) as an example, it becomes:
| cluster_id | time bin (index) | value |
|---|---|---|
| 57 | 38 | 1 |
| 57 | 38 | 1 |
Then after `todense()`, there will be a 2 at (57, 38), because duplicate entries are summed.
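The binning and duplicate-summing behavior can be sketched with a tiny standalone example (values taken from the table above; `csr_array` sums duplicate `(row, col)` entries when densified):

```python
import numpy as np
from scipy.sparse import csr_array

# Two spikes from cluster 57, spike times in seconds (from the example above)
st = np.array([38.051, 38.127])
clu = np.array([57, 57])
st_bin = 1000  # bin size in milliseconds -> 1-second bins

# Both spikes land in column 38
cols = np.floor(st / st_bin * 1000).astype("int")
spks = csr_array((np.ones(len(st), "uint8"), (clu, cols)))
dense = spks.todense().astype("float32")
print(dense[57, 38])  # duplicate entries are summed -> 2.0
```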
Now check the key step, the rounding:

```python
np.floor(st / st_bin * 1000)
```

Here `st` is the spike time (assumed to be in seconds), `st_bin` is the desired bin size in milliseconds, and the factor 1000 converts the spike times from seconds to milliseconds.
When `st` is not in seconds but in samples, i.e. `st = ST_IN_S * SAMPLE_RATE`, the conversion breaks: `st / st_bin * 1000 = ST_IN_S * SAMPLE_RATE / st_bin * 1000`. The extra `SAMPLE_RATE` factor enters the calculation and makes the effective bin size much smaller than expected.
The fix is to multiply `st_bin` by the same factor, i.e. `st_bin = ST_BIN_MS * SAMPLE_RATE`.
After `SAMPLE_RATE` cancels, the expression becomes `ST_IN_S / ST_BIN_MS * 1000`, which matches the original intent of the code.
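A minimal sketch of the correction, assuming a hypothetical 30 kHz sample rate (a common Kilosort default); the numbers here are illustrative only:

```python
import numpy as np

SAMPLE_RATE = 30000  # hypothetical: 30 kHz recording
ST_BIN_MS = 100      # desired bin size in milliseconds

# A spike at 38.051 s, stored as a sample count (as Kilosort does):
st = 1141530  # = 38.051 * 30000

# Passing st_bin=ST_BIN_MS directly treats samples as seconds,
# so the bin index is inflated by a factor of SAMPLE_RATE:
wrong_bin = int(np.floor(st / ST_BIN_MS * 1000))

# Fix: scale the bin size by the sample rate, st_bin = ST_BIN_MS * SAMPLE_RATE
st_bin = ST_BIN_MS * SAMPLE_RATE
right_bin = int(np.floor(st / st_bin * 1000))
print(right_bin)  # 380, i.e. floor(38.051 s / 0.1 s), as intended
```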
The code is correct, but the documentation could state more prominently that `spike_times.npy` is expected to be in seconds, to avoid misunderstanding.
As far as I know, issue #33 ran into the same problem with the `st_bin` setting when trying to load Kilosort result files directly into rastermap. I didn't find a hint in the GUI either, and it can be difficult to work out how to change `st_bin` when this problem occurs.
Thus, for ease of use, I suggest adding a hint to both the README.md file and the GUI panel: "You may need to change `st_bin` to `ST_BIN_MS * SAMPLE_RATE` to load files that count time in samples instead of seconds, e.g. Kilosort outputs."