Commit 9bbaa88

palonso and claude committed
Add base-freesound-small and base-freesound-large models
Add the new Freesound-trained models to tests and README. No metrics available yet.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent c110659 commit 9bbaa88

2 files changed

Lines changed: 6 additions & 0 deletions

README.md

Lines changed: 5 additions & 0 deletions
@@ -133,13 +133,16 @@ Output:
 | **multifeature** | audio | 18.75 | 24000 | .467 | 1.76 | .938 | .734 | .833 | .623 |
 | **multifeature-25hz** | audio | 25 | 24000 | .463 | 1.79 | .932 | .728 | .848 | .628 |
 | **multifeature-25hz-fsq** | audio | 25 | 24000 | .463 | 1.71 | **.940**| **.749**| **.855**| .628 |
+| **base-freesound-small** | mel | 15.63 | 16000 | - | - | - | - | - | - |
+| **base-freesound-large** | mel | 15.63 | 16000 | - | - | - | - | - | - |
 
 > **Note:** Different models were trained with different sample rates.
 > It is responsibility of the user to ensure that the input audio is sampled at the correct rate.
 
 OMAR-RQ models are offered in different configurations, each with its own strengths and weaknesses.
 Models based on mel spectrogram (**base** and **multicodebook**) tend to perform better on semantic tasks such as auto-tagging, structure recognition, and difficulty estimation.
 On the other hand, **multifeature-24hz-fsq** offers the best performance in tonal and temporal tasks such as pitch and chord estimation, and beat tracking.
+The **base-freesound-small** and **base-freesound-large** models were trained with [Freesound](https://freesound.org/) data.
 
 ### Hugging Face Model IDs
 
@@ -148,6 +151,8 @@ On the other hand, **multifeature-24hz-fsq** offers the best performance in tona
 - [mtg-upf/omar-rq-multifeature](https://huggingface.co/mtg-upf/omar-rq-multifeature)
 - [mtg-upf/omar-rq-multifeature-25hz](https://huggingface.co/mtg-upf/omar-rq-multifeature-25hz)
 - [mtg-upf/omar-rq-multifeature-25hz-fsq](https://huggingface.co/mtg-upf/omar-rq-multifeature-25hz-fsq)
+- [mtg-upf/omar-rq-base-freesound-small](https://huggingface.co/mtg-upf/omar-rq-base-freesound-small)
+- [mtg-upf/omar-rq-base-freesound-large](https://huggingface.co/mtg-upf/omar-rq-base-freesound-large)
 
 ## Pre-training OMAR-RQ models
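
The README note in the hunk above leaves sample-rate checking to the user. A minimal sketch of such a check follows. It assumes that the fourth table column is the expected input sample rate in Hz (the table header is not part of this diff). The names `EXPECTED_SAMPLE_RATE`, `needs_resampling`, and `check_sample_rate` are hypothetical helpers written for illustration and are not part of the OMAR-RQ package.

```python
# Hypothetical helper, not part of the OMAR-RQ package.
# Values are copied from the README table rows visible in this diff,
# ASSUMING the fourth column is the expected input sample rate in Hz.
EXPECTED_SAMPLE_RATE = {
    "multifeature": 24000,
    "multifeature-25hz": 24000,
    "multifeature-25hz-fsq": 24000,
    "base-freesound-small": 16000,
    "base-freesound-large": 16000,
}


def needs_resampling(model_name: str, sample_rate: int) -> bool:
    """Return True if audio at `sample_rate` does not match the model's rate."""
    return sample_rate != EXPECTED_SAMPLE_RATE[model_name]


def check_sample_rate(model_name: str, sample_rate: int) -> None:
    """Raise ValueError when the audio must be resampled before inference."""
    if needs_resampling(model_name, sample_rate):
        expected = EXPECTED_SAMPLE_RATE[model_name]
        raise ValueError(
            f"{model_name} expects {expected} Hz audio, got {sample_rate} Hz; "
            "resample the input first."
        )
```

Calling `check_sample_rate("base-freesound-small", 44100)` raises a `ValueError`, while a matching rate such as 16000 passes silently. The resampling itself is left to whatever audio library the caller already uses.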

tests/test_omar_rq.py

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@
 "mtg-upf/omar-rq-multifeature-25hz",
 "mtg-upf/omar-rq-multifeature-25hz-fsq",
 "mtg-upf/omar-rq-base-freesound-small",
+"mtg-upf/omar-rq-base-freesound-large",
 ]
 
