Commit b5a2f28

Merge pull request #205 from McVyp/fix-whisper-sampling-rate-doc
Add audio resampling to fix Whisper sampling rate mismatch
2 parents 5e4717e + 283cb2c commit b5a2f28

1 file changed: +7 -0 lines changed

chapters/en/chapter1/preprocessing.mdx

Lines changed: 7 additions & 0 deletions
@@ -152,6 +152,13 @@ Next, you can write a function to pre-process a single audio example by passing
 ```py
 def prepare_dataset(example):
     audio = example["audio"]
+
+    if audio["sampling_rate"] != 16000:
+        audio_array = librosa.resample(
+            audio["array"], orig_sr=audio["sampling_rate"], target_sr=16000
+        )
+        audio = {"array": audio_array, "sampling_rate": 16000}
+
     features = feature_extractor(
         audio["array"], sampling_rate=audio["sampling_rate"], padding=True
     )
