
Commit 31ebf33

Merge pull request #5589 from avidale/patch-1
Update the pointer to the Hokkien benchmark dataset
2 parents 1a54ad1 + 8a7519f commit 31ebf33

1 file changed: +1 -1 lines changed

examples/hokkien/README.md

@@ -11,7 +11,7 @@ See our interactive [Demo page](https://huggingface.co/spaces/facebook/Hokkien_T
 We create and release a Hokkien-English parallel speech dataset that is available for benchmarking Hokkien<>English speech to speech translation systems. The dataset was derived from TAT-Vol1-eval-lavalier (dev) and TAT-Vol1-test-lavalier (test) based on [Taiwanese Across Taiwan (TAT) corpus](https://sites.google.com/speech.ntut.edu.tw/fsw/home/tat-corpus), which contained audio recordings and transcripts in Taiwanese Hokkien.
 We created the parallel dataset by first concatenating neighboring sentences to form longer utterances, translating the Hokkien text transcriptions into English via Hokkien-English bilinguals, and recording the English translations with human voices. Below are some summary statistics of the dataset.
 
-The dataset is available [HERE](https://sites.google.com/nycu.edu.tw/speechlabx/tat_s2st_benchmark).
+The benchmark dataset is available at https://sites.google.com/nycu.edu.tw/sarc/tat_s2st_benchmark.
 
 ## Open Sourced English-Hokkien S2ST Models
 