Conversation

@chrishalcrow
Member

@chrishalcrow chrishalcrow commented Nov 6, 2025

This PR adds a converter that takes Kilosort output and builds a SortingAnalyzer from it. Ideal for using the GUI on your sorter output.

Unsure how to test this - any good ideas?

To do:

  • Figure out tests
  • Decide on default compute_extras
  • Test sparsity when some channels missing
  • Check no extra metadata leaks due to recording=generated
  • Implement Kilosort version guesser

@chrishalcrow chrishalcrow added the enhancement New feature or request label Nov 7, 2025
Collaborator

@JoeZiminski JoeZiminski left a comment


Hey @chrishalcrow this is awesome! This has made many of my dreams come true. For sure this must be restricted to KS4 as you have done, as it is simply not possible on older versions 🥲. Please see some suggestions, but feel free to ignore any that aren't helpful. I'm not so familiar with the internal workings of the sorting_analyzer so can't comment much on that aspect. *

On the testing front, I wonder about something like: 1) have some Kilosort output, 2) read the locations and amplitudes from the .npy files on the test side and compare them to those in the loaded sorting analyzer. Similarly for templates.npy, maybe a few templates can be cross-checked. It's not ideal because it essentially re-computes everything in the test environment, but at least it will protect against regressions. Maybe this could be added to the kilosort4 test suite? A danger is that the KS4 output format changes in some way, and this would catch that. Otherwise it might be necessary to manually generate some mock KS4 data, which could be a pain.

*[EDIT]
I think my only worry there is whether adding the mock recording object stores some metadata on the analyzer under the hood that is not erased when ._recording is set to None. I took a quick look at the code and it doesn't seem to at the moment. Maybe in future this could cause a problem, but I think there is no workaround 🤔 maybe an assert somewhere that the key metadata fields are empty?
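The cross-check idea from the testing suggestion above could look something like the sketch below. The analyzer-side access is stood in for by a plain dict (the real test would pull templates out of the SortingAnalyzer returned by the PR's converter), and the mock KS4 output is just a saved `templates.npy`; everything analyzer-specific here is an assumption.

```python
# Sketch: compare analyzer-held templates against the raw templates.npy
# from a Kilosort output folder, to protect against regressions.
import tempfile
from pathlib import Path

import numpy as np

def cross_check_templates(ks_folder, analyzer_templates, unit_ids):
    """Compare a few templates in the (mock) analyzer against templates.npy."""
    raw = np.load(Path(ks_folder) / "templates.npy")  # (n_units, n_samples, n_channels)
    for uid in unit_ids:
        np.testing.assert_allclose(analyzer_templates[uid], raw[uid], rtol=1e-5)

# demo with mock KS4-style output
with tempfile.TemporaryDirectory() as d:
    rng = np.random.default_rng(0)
    templates = rng.normal(size=(5, 61, 16)).astype("float32")
    np.save(Path(d) / "templates.npy", templates)
    # in the real test these would come from the converter's SortingAnalyzer
    loaded = {i: templates[i] for i in range(5)}
    cross_check_templates(d, loaded, unit_ids=[0, 2, 4])
```

The same pattern would extend to locations and amplitudes by swapping in the corresponding .npy files.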

def kilosort_output_to_analyzer(folder_path, compute_extras=False, unwhiten=True) -> SortingAnalyzer:
"""
Load Kilosort output into a SortingAnalyzer.
Output from Kilosort version 4.1 and above is supported.
Collaborator


Is there any way to check the version from the output? If not, we should ask for it to be added on the KS repo, as this would be a useful general addition. But maybe we could check that the version in kilosortX.log is not < 4? (IIRC the logs are formatted in this way.)

Member Author


Have asked on KiloSort

Member Author


If you've run directly from Kilosort, you'll have the version in kilosort4.log. So we could check there, and if it's not there, we make a guess...

Collaborator


Great, that sounds good. If this tool is only for Kilosort 4 then just checking for the existence of that log file should do (unless it's extended to other versions).

Member Author


Hello, the log files only appeared at v4.0.33. Thinking of other ways to check...

Member Author


Do you know of any files which KS2/2.5 definitely don't have in their output, that KS4 does?

Member Author


Hello, I've added a _guess_kilosort_version function to isolate this logic. Let's compare some outputs and see if we can make it reasonable.
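A possible shape for such a guesser, sketched from the thread above (the PR's actual _guess_kilosort_version may differ): kilosort4.log only exists from v4.0.33 onward, so for older output a file-presence heuristic is needed. The log-line format and the `ops.npy` heuristic here are assumptions.

```python
# Sketch of a best-effort Kilosort version guesser (hypothetical logic).
import re
from pathlib import Path

def guess_kilosort_version(folder_path):
    folder = Path(folder_path)
    log_file = folder / "kilosort4.log"
    if log_file.is_file():
        # assumed log format: a line containing e.g. "version 4.0.33"
        match = re.search(
            r"version\s*[:=]?\s*(\d+(?:\.\d+)*)", log_file.read_text(), re.IGNORECASE
        )
        if match:
            return match.group(1)
        return "4"  # the log file itself only exists for KS4
    if (folder / "ops.npy").is_file():
        return "4"  # assumption: ops.npy is written by the KS4 python pipeline
    return None  # unknown: caller should warn and refuse to unwhiten
```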

template_extension = ComputeTemplates(sa)

whitened_templates = np.load(phy_path / "templates.npy")
wh_inv = np.load(phy_path / "whitening_mat_inv.npy")
Collaborator


I think this is only reliable for KS4 (in KS2.5–3 the inverse whitening matrix is a scaled identity). It is clear from the docstring that this is only for KS4, but I'm sure people will try it on other versions. Checking the log file is probably the easiest way to catch this, but another possible safety net could be to check that wh_inv is not all zeros off the diagonal.
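That off-diagonal safety net could be as simple as the sketch below (function name and tolerance are illustrative):

```python
# Sketch: a whitening_mat_inv that is (scaled) diagonal suggests
# pre-KS4 output, where unwhitening the templates this way would
# produce meaningless results.
import numpy as np

def wh_inv_looks_like_ks4(wh_inv, atol=1e-10):
    off_diagonal = wh_inv - np.diag(np.diag(wh_inv))
    return not np.allclose(off_diagonal, 0.0, atol=atol)
```

If this returns False, the converter could warn and fall back to whitened templates rather than unwhitening.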

Member Author


Ok, so we could try to check the version number, and if it's less than 4, only allow whitened templates?

Collaborator


I think this would work, though I'm not 100% confident without checking across versions; I'll try to sort out and send some older datasets tomorrow! I think indeed unwhitening would not be possible for KS2.5 and KS3.

@chrishalcrow
Member Author

*[EDIT] I think my only worry there is whether adding the mock recording object stores some metadata on the analyzer under the hood that is not erased when ._recording is set to None. I took a quick look at the code and it doesn't seem to at the moment. Maybe in future this could cause a problem, but I think there is no workaround 🤔 maybe an assert somewhere that the key metadata fields are empty?

It does save some metadata related to the probe, and some basic info like sampling frequency, in rec_attributes, but this is required for the analyzer to work at all. So we do want to keep that, and since we don't know what future changes we'd be protecting against, I'm not sure an assert would help. I'd say it's probably fine.

@chrishalcrow
Member Author

Hey @JoeZiminski . Just realised that the amplitudes saved by Kilosort are the "Per-spike amplitudes, computed as the L2 norm of the PC features for each spike." https://kilosort.readthedocs.io/en/latest/export_files.html

So I don't think it's fair to call them spike_amplitudes as we do in spikeinterface? Thoughts?
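For reference, the quantity the Kilosort docs describe is the L2 norm of each spike's PC feature vector, which is dimensionless, not a voltage. A quick illustration (shapes are made up for the example):

```python
# amplitudes.npy holds, per spike, the L2 norm of its PC features,
# not a microvolt amplitude. Illustrative shapes only.
import numpy as np

rng = np.random.default_rng(1)
pc_features = rng.normal(size=(100, 3, 6))  # (n_spikes, n_pcs, n_channels)
ks_style_amplitude = np.linalg.norm(pc_features.reshape(len(pc_features), -1), axis=1)
# a spikeinterface "spike_amplitude" is instead a waveform extremum in
# voltage units, so the two quantities are not directly comparable
```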

@alejoe91
Member

Hey @JoeZiminski . Just realised that the amplitudes saved by Kilosort are the "Per-spike amplitudes, computed as the L2 norm of the PC features for each spike." https://kilosort.readthedocs.io/en/latest/export_files.html

So I don't think it's fair to call them spike_amplitudes as we do in spikeinterface? Thoughts?

Agree!

@chrishalcrow chrishalcrow added this to the 0.104.0 milestone Nov 27, 2025