You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+4-1Lines changed: 4 additions & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -6,10 +6,11 @@ WhisperS2T is an optimized lightning-fast speech-to-text pipeline tailored for t
6
6
7
7
## Benchmark and Technical Report
8
8
9
-
Stay tuned for a technical report comparing WhisperS2T against other whisper pipelines. Meanwhile, check some quick benchmarks on A30 GPU.
9
+
Stay tuned for a technical report comparing WhisperS2T against other whisper pipelines. Meanwhile, check some quick benchmarks on A30 GPU. See `scripts/` directory for the benchmarking scripts that I used.
10
10
11
11

12
12
13
+
**NOTE:** I ran all the benchmarks with `without_timestamps` parameter as `True`. Setting `without_timestamps` as `False` may improve the WER of HuggingFace pipiline at the expense of additional inference time.
13
14
14
15
## Features
15
16
@@ -77,6 +78,8 @@ print(out[0][0])
77
78
78
79
Check this [Documentation](docs.md) for more details.
79
80
81
+
**NOTE:** For first run the model may give slightly slower inference speed. After 1-2 runs it will give better inference speed. This is due to the JIT tracing of the VAD model.
82
+
80
83
81
84
## Acknowledgements
82
85
-[**OpenAI Whisper Team**](https://github.com/openai/whisper): Thanks to the OpenAI Whisper Team for open-sourcing the whisper model.
0 commit comments