
Conversation

@a5hun
Contributor

@a5hun a5hun commented Jun 15, 2025

Replaced the Harmonic Product Spectrum (HPS) pitch estimator with an ultra-slim implementation of SWIPE (Sawtooth Waveform Inspired Pitch Estimator), adapted from libf0 and based on Arturo Camacho's original algorithm:

https://ufdcimages.uflib.ufl.edu/UF/E0/02/15/89/00001/camacho_a.pdf

SWIPE is an extremely accurate and performant pitch estimator, working on log-spaced frequency bins for precision, and it works well for most voices. Arturo's original implementation and both libf0 versions (full and slim) use multiple FFT sizes [128, 256, 512, 1024, 2048, and 4096] to reduce the pitch strength loss for different frequencies when using suboptimal window sizes, but pitch estimation is still very good with just a single FFT size (4096 for sr > 44100). I've also slightly modified the kernels to weigh the fundamental and first harmonic evenly, which helps prevent cases where a higher strength first harmonic (as in a sung or voiced "eh" ⟨e⟩) is falsely detected as the pitch.
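The kernel idea can be sketched roughly as follows. This is a minimal, illustrative Python sketch, not the friture implementation: the cosine-lobe shapes follow Camacho's thesis, but the harmonic count, the 1/√h decay, and the equal weighting of the fundamental and first harmonic are simplified assumptions based on the description above.

```python
import numpy as np

def swipe_kernel(f0, freqs, n_harmonics=8):
    """Toy SWIPE-style kernel sampled at `freqs` (Hz) for candidate pitch f0.

    Cosine lobes are centered on each harmonic of f0; weights decay roughly
    as 1/sqrt(h), except that h=1 and h=2 are weighted equally (a hypothetical
    version of the tweak described above, not the exact friture code).
    """
    r = freqs / f0                              # frequency in units of f0
    k = np.zeros_like(r, dtype=float)
    for h in range(1, n_harmonics + 1):
        w = 1.0 if h <= 2 else 1.0 / np.sqrt(h)  # equal weight for f0 and 2*f0
        d = np.abs(r - h)                        # distance to harmonic h
        lobe = np.cos(2 * np.pi * r)
        # Full lobe near the harmonic, half-height side lobes, zero elsewhere.
        k += w * np.where(d < 0.25, lobe, np.where(d < 0.75, 0.5 * lobe, 0.0))
    return k

def pitch_strength(f0, freqs, mag):
    """Inner product of the sqrt-magnitude spectrum with the kernel,
    normalized by the spectrum energy under the kernel's support."""
    k = swipe_kernel(f0, freqs)
    s = np.sqrt(mag)
    denom = np.linalg.norm(s * (k != 0)) or 1.0
    return float(np.dot(k, s) / denom)
```

In use, the strength would be evaluated over a grid of candidate pitches and the maximum taken per frame; the real algorithm also interpolates the strength curve for sub-bin precision.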

To better align with other voice pitch trackers, a new frequency scale has been added, centered around C and with ticks for each note on the chromatic scale.
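The tick frequencies of such a scale follow the equal-tempered relation f(n) = C0 · 2^(n/12). A minimal sketch of how the ticks could be generated (the function name and frequency range are illustrative, not the actual friture code):

```python
import numpy as np

def chromatic_ticks(f_lo=60.0, f_hi=1000.0, a4=440.0):
    """Frequencies and names of chromatic-scale notes within [f_lo, f_hi].

    Notes are spaced a factor of 2**(1/12) apart; C is derived from the
    A4 reference (C4 = a4 * 2**(-9/12)), so ticks land on every semitone.
    """
    names = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
    c0 = a4 * 2.0 ** (-9 / 12) / 16.0   # C0, four octaves below C4 (~16.35 Hz)
    ticks = []
    n = 0
    while True:
        f = c0 * 2.0 ** (n / 12)        # n semitones above C0
        if f > f_hi:
            break
        if f >= f_lo:
            ticks.append((f'{names[n % 12]}{n // 12}', f))
        n += 1
    return ticks
```

On a log-frequency axis these ticks are evenly spaced, which is what makes a note-centered scale convenient for reading vocal pitch.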

@tlecomte
Owner

tlecomte commented Aug 4, 2025

Thank you very much @a5hun for your contribution, it's very much appreciated.

I have a couple questions:

  • should we keep both pitch estimators behind a setting? Or is SWIPE strictly always better than HPS?
  • the code uses scipy for interp1d and CubicSpline. Unfortunately, we cannot use scipy directly because that would make the friture package much bigger. Possible alternatives would be to use methods that already exist in the friture source (I think there may be one for interp1d), to copy the relevant piece of code from scipy, or to reimplement it independently.

@a5hun
Contributor Author

a5hun commented Aug 25, 2025

Apologies for the delayed response! Here's the current HPS pitch tracker compared with the SWIPE-like algo on Salvatore Fisichella's "O muto asil del pianto":

[Image: friture-pt-comp — HPS vs. SWIPE pitch-track comparison]

Pitch and time resolution are much better with my SWIPE implementation, so my vote would be to replace HPS.

Scipy's interp1d and CubicSpline can be replaced with numpy's np.interp without hurting the resolution of the pitch tracker, so I'll make the changes. Is there any way to not plot zero/NaN pitch estimates? That would clean up the plot a little bit for unvoiced sections.
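The substitution is a one-liner; np.interp does linear interpolation with no scipy dependency (the variable names below are illustrative, not the actual friture code):

```python
import numpy as np

# Hypothetical pitch-strength samples at a few candidate pitches.
candidates = np.array([100.0, 200.0, 400.0, 800.0])   # candidate pitches (Hz)
strength = np.array([0.1, 0.9, 0.4, 0.05])            # strength per candidate

# scipy.interpolate.interp1d(candidates, strength)(xs) can be replaced by
# np.interp(xs, candidates, strength) for the linear case:
xs = np.array([150.0, 300.0])
ys = np.interp(xs, candidates, strength)              # → [0.5, 0.65]
```

np.interp is linear only, so it trades the smoothness of CubicSpline for zero extra dependencies; per the comment above, that trade did not hurt the tracker's resolution in practice.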

@tlecomte
Owner

Thanks for the comparison screenshots; it does indeed look more precise.

Thanks also for the move from scipy to numpy.

Regarding hiding zeroes or NaN in plots, I think this is something that could be addressed. It would also allow cleaning up other plots, like the long-time level widget.
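One common way to get this behavior is to map unvoiced estimates to NaN before plotting, since many plotting backends break the line at NaN samples (e.g. matplotlib does this by default, and pyqtgraph does with connect='finite'; whether friture's own renderer behaves the same is an assumption to verify). A minimal sketch:

```python
import numpy as np

def mask_unvoiced(pitch):
    """Replace zero or invalid pitch estimates with NaN so that line plots
    show gaps at unvoiced frames instead of dropping to 0 Hz."""
    pitch = np.asarray(pitch, dtype=float)
    return np.where(pitch > 0, pitch, np.nan)
```

Applied just before the data reaches the plot, this would leave the estimator itself untouched.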

@tlecomte
Owner

@a5hun Would you be able to rebase your changes please? There are merge conflicts following changes I made in pitch_tracker.py... Sorry about that!

@tlecomte
Owner

(I'm also curious if it could be realistic to use CREPE or SwiftF0)

Pitch tracker now works with the latest friture changes. Kernel generation is slightly different, tuned with vocal stems (male/female) to return fewer spurious voiced pitches.
@a5hun
Contributor Author

a5hun commented Sep 22, 2025

Not sure this was what you asked, exactly. A git rebase seems to be more difficult than writing a SWIPE-like pitch tracker! Everything on my branch should now be current with friture's master.

CREPE is too slow for realtime. I've spent a lot of time testing out various pitch tracking algorithms with high quality isolated vocals (https://cambridge-mt.com/ms3/mtk/), and I settled on a customized SWIPE-ish one because it's fast, very precise, and can work on independent audio frames. I really like pYIN, but it's also too slow. AI models can produce good results (CREPE is good, SPICE not so much, and I'm playing with SwiftF0 now), but they're either too slow or don't work well on 4096 samples.

https://github.com/lars76/pitch-benchmark/
I know SWIPE looks bad in this (not sure what SPTK is doing exactly with their SWIPE algo), but my version performs nearly identically to SwiftF0 on most voices. Here's a 16 second clip from the vocal stem on Jesse Joy's 'Release':

[Image: swift-swipe-comp — SwiftF0 vs. SWIPE pitch-track comparison]

For this clip, calc time for my SWIPE algo is 0.122 seconds. SwiftF0 takes 1.52 seconds. That's at 4096 FFT length, hop at 1/4 that.
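A timing like this can be reproduced with a simple frame loop; the sketch below is a hypothetical harness (the estimator callable and clip are placeholders), using the same framing as above: 4096-sample frames with a hop of one quarter of that.

```python
import time
import numpy as np

def time_per_clip(estimate, signal, fft_len=4096, hop=1024):
    """Time a frame-based pitch estimator over a whole clip.

    `estimate` is any callable taking one frame of samples; hop defaults
    to fft_len // 4 as in the comparison above.
    """
    frames = [signal[i:i + fft_len]
              for i in range(0, len(signal) - fft_len + 1, hop)]
    t0 = time.perf_counter()
    for frame in frames:
        estimate(frame)
    return time.perf_counter() - t0
```

The 0.122 s vs. 1.52 s figures above are of course specific to the author's machine and the 16-second clip.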

I tried to integrate SwiftF0 into Friture (it's very probably the better pitch estimator), but I ran into a lot of issues. It needs librosa and scipy (and more!), and I can't get onnxruntime to work without crashing. If you can untangle the dependencies, it's really easy to integrate in code, but I don't know enough about which versions of what friture requires.

@a5hun a5hun reopened this Sep 22, 2025
@tlecomte
Owner

tlecomte commented Oct 6, 2025

Thank you very much @a5hun, that looks great!

I will merge as is.

I am also curious whether there are algorithms capable of handling multiple sources, because I can imagine that being interesting in a setting with several voices and instruments.

@tlecomte tlecomte merged commit 24661ae into tlecomte:master Oct 7, 2025
8 checks passed