@nikhilchandra

feat: extract_raw_waveforms() uses joblib's 'Parallel' class to speed up the processing of multiple units' worth of data. joblib's default backend, 'loky', is process-based. Some use cases work better with the thread-based backend. I have introduced a new parameter, 'joblib_backend_preference', with the default value 'processes'; its value is passed to Parallel's 'prefer' argument. Leaving it untouched should produce no change in behavior. Users who want the thread-based backend can set this parameter to 'threads'.

@Julie-Fabre
Owner

Hi Nikhil, thanks a lot! That looks good. Could you make sure a default parameter value is set when using it in extractRawWaveforms? (e.g. if a user has a custom parameter file that is missing this parameter, can it default to the previous behavior, 'processes'?) Thanks a lot!

@nikhilchandra
Author

Hi Julie, in default_parameters.py, there is a new parameter 'joblib_backend_preference' that accepts string values of 'processes' or 'threads' and defaults to 'processes' - leaving this as is will allow BombCell to work as it already does.

Correspondingly, in extract_raw_waveforms, just before we use joblib.Parallel to run process_a_unit() on available units, I pull this parameter from the param dictionary with this code:

        prefer = param["joblib_backend_preference"]
        all_waveforms = Parallel(n_jobs=-1, verbose=10, mmap_mode="r", max_nbytes=None, prefer=prefer)
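For context, `Parallel(...)` on its own only builds the executor; the work is dispatched when that object is called with a generator of `delayed(...)` tasks. Here is a small self-contained sketch of that pattern, assuming joblib is installed. The `process_a_unit` below is a toy stand-in rather than BombCell's function, and `n_jobs=2` and the unit IDs are chosen purely for illustration:

```python
from joblib import Parallel, delayed


def process_a_unit(unit_id):
    # Toy stand-in for the real per-unit work (hypothetical).
    return unit_id * unit_id


# Same lookup as in the snippet above; 'threads' selects the thread-based backend.
param = {"joblib_backend_preference": "threads"}
prefer = param["joblib_backend_preference"]

# Build the executor, then call it with a generator of delayed tasks.
# Results come back in the same order as the inputs.
results = Parallel(n_jobs=2, prefer=prefer)(
    delayed(process_a_unit)(unit_id) for unit_id in range(4)
)
```

`prefer` is a soft hint: it selects the default backend family but can be overridden by an enclosing `parallel_backend` context.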

@Julie-Fabre
Owner

Hi Nikhil,
Thanks! Lots of users copy the default parameter file and then make their own - so there will be no default 'joblib_backend_preference' value in that case. Can you make sure we have a fallback default option when it is called in extract_raw_waveforms? Thanks a lot!
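A minimal sketch of the kind of fallback being asked for here, assuming `param` is a plain Python dict. The helper name `resolve_backend_preference` and the validation step are hypothetical and not taken from the PR; only the key 'joblib_backend_preference' and the values 'processes' and 'threads' come from the discussion above.

```python
# Hypothetical helper, shown only to illustrate the fallback; not BombCell code.
VALID_PREFERENCES = ("processes", "threads")


def resolve_backend_preference(param, default="processes"):
    """Return the backend preference, using `default` when the key is absent."""
    prefer = param.get("joblib_backend_preference", default)
    if prefer not in VALID_PREFERENCES:
        raise ValueError(
            f"joblib_backend_preference must be one of {VALID_PREFERENCES}, got {prefer!r}"
        )
    return prefer


# A parameter dict copied before the new key existed (contents are made up).
legacy_param = {"some_other_setting": 1}
legacy_prefer = resolve_backend_preference(legacy_param)

# A parameter dict that opts into the thread-based backend.
threaded_prefer = resolve_backend_preference({"joblib_backend_preference": "threads"})
```

With `dict.get`, an older custom parameter file keeps the process-based behavior without the user having to add the new key.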

… extract_raw_waveforms.py to prevent breaks for users who maintain their own copy of default_parameters.py
@nikhilchandra
Author

I've made the change you requested. However, it might be a good idea to recommend that users, instead of directly modifying default_parameters.py, write a convenience function (maybe in custom_parameters.py?) that takes the param dictionary as input, modifies it, and returns it. This would eliminate the need for this kind of check.
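A rough sketch of what such a convenience function might look like. The name `apply_custom_parameters`, the `overrides` argument, and the choice to return a copy rather than mutate the input are all assumptions made for illustration; `defaults` is a stand-in for whatever the packaged defaults return.

```python
import copy


def apply_custom_parameters(param, overrides):
    """Return a copy of `param` with `overrides` applied; the input is left untouched."""
    updated = copy.deepcopy(param)
    updated.update(overrides)
    return updated


# Stand-in for the packaged defaults (only the key discussed in this PR is shown).
defaults = {"joblib_backend_preference": "processes"}

# The user changes only what they need; every other default is inherited.
custom = apply_custom_parameters(defaults, {"joblib_backend_preference": "threads"})
```

Because the user's code starts from the current defaults each time, any parameter added in a later release is present automatically.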

@nikhilchandra
Author

nikhilchandra commented Nov 20, 2025

Hi @Julie-Fabre, just checking in. I haven't heard from you in a while. Is there anything you'd like me to do so this can get approved?
