Open
This commit introduces support for Soniox's speech-to-text API for both English and multilingual benchmarks. A new `soniox/` directory has been added, containing the necessary scripts to run the evaluations:

- `run_eval.py`: for English benchmarks (async and real-time).
- `run_eval_ml.py`: for multilingual benchmarks (async and real-time).
- `run_soniox.sh` and `run_soniox_ml.sh`: shell scripts to run the evaluations.
- `requirements.txt`: specifies the dependencies for the Soniox integration.
- Fix audio decoding by disabling automatic torchcodec dependency and manually loading from bytes
- Add proper PYTHONPATH to shell script to locate normalizer module
- Ensure compatibility with datasets library without requiring torchcodec installation

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
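The "load from bytes" idea in the commit above can be sketched as follows. This is a minimal illustration, not the PR's code: it assumes each sample is a dict with a `bytes` field (the shape the `datasets` library returns when an audio column is cast with `Audio(decode=False)`), and it assumes 16-bit PCM WAV input so that only the standard library is needed. The helper names `decode_wav_bytes` and `make_wav_bytes` are hypothetical.

```python
import io
import struct
import wave


def decode_wav_bytes(sample):
    """Decode an undecoded audio sample into (float samples, sample rate).

    `sample` is assumed to look like {"bytes": b"...", "path": ...}.
    Only 16-bit PCM WAV is handled in this sketch.
    """
    with wave.open(io.BytesIO(sample["bytes"]), "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV")
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.readframes(wav.getnframes())

    # Unpack little-endian signed 16-bit integers.
    ints = struct.unpack("<%dh" % (len(frames) // 2), frames)
    # Keep the first channel only and scale to the range [-1.0, 1.0).
    first_channel = ints[::channels]
    return [value / 32768.0 for value in first_channel], rate


def make_wav_bytes(values, rate=16000):
    """Build a small mono 16-bit WAV in memory (used only to try the decoder)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(values), *values))
    return buffer.getvalue()
```

Real benchmark audio is often FLAC or another compressed format, in which case a decoder such as `soundfile` would replace the `wave` module here; the structure of reading from an in-memory buffer stays the same.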
- Remove hardcoded SONIOX_API_KEY from shell scripts for security
- Add environment variable checks with helpful error messages
- Fix run_eval.py dataset loading to avoid torchcodec dependency
- Update data_utils.prepare_data() to support undecoded audio
- Add comprehensive requirements.txt with all dependencies
- Create .env.example template for API key configuration

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
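The environment variable check described in this commit lives in the shell scripts; the same guard could also be applied on the Python side. The sketch below is an assumption about how that might look, not code from the PR. The function name `require_api_key` is hypothetical; `SONIOX_API_KEY` and `.env.example` are the names mentioned in the commit message.

```python
import os


def require_api_key(env=None, name="SONIOX_API_KEY"):
    """Return the API key from the environment or fail with a clear message.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell, or copy "
            ".env.example to .env and fill in your key."
        )
    return key
```

Failing early with an explicit message avoids starting a long evaluation run that would only fail at the first API call.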
Collaborator
Hi @MyButtermilk
Author
Thanks for looking into it. Maybe it works when you try again? You get 200 USD in free credits at Soniox for subscribing. That should easily be enough to run the whole benchmark, both English-only and multilingual, considering that the price is only 0.10 USD per hour for async and 0.12 USD per hour for real-time transcription.
Author
@Deep-unlearning Did you look at the PR?
Author
@Deep-unlearning Any update on the PR?
@anej-soniox:
Here is a PR to test Soniox on the Open ASR benchmark. I tried running it, and it worked fine until the transcription stopped after around seven hours of transcription. I suspect something is blocking the run when too many async API calls are made. Maybe you can run it on your own hardware. It would be beneficial for Soniox to be present in the quite popular Open ASR benchmark.
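If the stall does come from too many simultaneous async calls, one common mitigation is to cap the number of requests in flight and retry failures with exponential backoff. The sketch below shows that pattern with `asyncio` only. It does not use the Soniox client: `worker` stands in for whatever coroutine performs one transcription, and `run_bounded` and its parameters are hypothetical names chosen for illustration.

```python
import asyncio
import random


async def run_bounded(items, worker, max_concurrency=8, max_retries=5, base_delay=1.0):
    """Run `worker(item)` for every item with at most `max_concurrency` in flight.

    A failing call is retried up to `max_retries` times, waiting
    base_delay * 2**attempt plus random jitter between attempts.
    Results are returned in the same order as `items`.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def run_one(item):
        async with semaphore:
            for attempt in range(max_retries + 1):
                try:
                    return await worker(item)
                except Exception:
                    # Give up after the last allowed retry.
                    if attempt == max_retries:
                        raise
                    delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
                    await asyncio.sleep(delay)

    return await asyncio.gather(*(run_one(item) for item in items))
```

The slot is held while a call backs off, so a burst of failures naturally lowers the request rate instead of letting new calls pile in. Whether this resolves the stall reported above is untested.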