Commit 57b779c

Merge pull request #228 from huggingface/update-dataset-duration
change the way to get audio duration
2 parents f4d27e5 + 20470e5 commit 57b779c
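
As a rough illustration of what the new duration computation relies on: for a decoded, one-dimensional waveform, the duration in seconds is the number of samples divided by the sampling rate, which is the value `librosa.get_duration(y=..., sr=...)` returns for such a signal. The sketch below reproduces that arithmetic without librosa. The `examples` list and the `duration_seconds` helper are hypothetical stand-ins for the entries of `minds["audio"]`, not part of the commit.

```python
# Hypothetical stand-ins for decoded entries of minds["audio"]:
# each entry holds a 1-D sample array and its sampling rate.
examples = [
    {"array": [0.0] * 16000, "sampling_rate": 8000},  # 16000 samples at 8 kHz
    {"array": [0.0] * 4000, "sampling_rate": 8000},   # 4000 samples at 8 kHz
]


def duration_seconds(example):
    """Duration of a decoded mono clip: number of samples / sampling rate."""
    return len(example["array"]) / example["sampling_rate"]


# Same list-comprehension shape as the changed line in the diff,
# iterating over decoded audio entries rather than file paths.
new_column = [duration_seconds(x) for x in examples]
print(new_column)  # [2.0, 0.5]
```

Working from the decoded array and sampling rate means the duration does not depend on the original file path being readable on disk.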

1 file changed (+3, -1 lines changed)

chapters/en/chapter1/preprocessing.mdx

Lines changed: 3 additions & 1 deletion
@@ -95,7 +95,9 @@ dataset. However, we can create one, filter based on the values in that column,
 
 ```py
 # use librosa to get example's duration from the audio file
-new_column = [librosa.get_duration(path=x) for x in minds["path"]]
+new_column = [
+    librosa.get_duration(y=x["array"], sr=x["sampling_rate"]) for x in minds["audio"]
+]
 minds = minds.add_column("duration", new_column)
 
 # use 🤗 Datasets' `filter` method to apply the filtering function