This example shows how to use ONNX Runtime (ORT) to perform speech recognition with the Wav2Vec 2.0 model.
It is heavily inspired by this PyTorch example.
The application lets the user make an audio recording, then recognizes the speech from that recording and displays a transcript.
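Under the hood, Wav2Vec 2.0 emits per-frame character predictions that are turned into a transcript with greedy CTC decoding: collapse runs of repeated labels, then drop the blank token. The app does this in Swift; the following is only an illustrative, language-agnostic sketch of that post-processing step (the blank symbol, vocabulary, and function name are assumptions, not the model's actual ones):

```ruby
# Greedy CTC decoding sketch:
# 1) collapse consecutive repeated labels, 2) remove the blank token.
# BLANK is an illustrative placeholder for the model's blank symbol.
BLANK = "-"

def ctc_greedy_decode(frame_labels)
  # chunk_while groups consecutive equal labels; map(&:first) collapses each run
  collapsed = frame_labels.chunk_while { |a, b| a == b }.map(&:first)
  collapsed.reject { |c| c == BLANK }.join
end
```

For example, the per-frame labels `["h", "h", "-", "e", "l", "-", "l", "o"]` decode to `"hello"`: the doubled `h` collapses to one, and the blank between the two `l`s is what keeps them from being merged.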
See the general prerequisites here.
Additionally, you will need to be able to record audio, either on a simulator or a device.
The model should be generated in this location: <this directory>/SpeechRecognition/model
See instructions here for how to generate the model.
For example, with the model generation script dependencies installed, from this directory, run:
../model/gen_model.sh ./SpeechRecognition/model

From this directory, run:
pod install

Open the generated SpeechRecognition.xcworkspace file in Xcode to build and run the example.
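The example's dependency on ONNX Runtime comes in via CocoaPods. If you are recreating the project setup rather than using the checked-in Podfile, a minimal sketch might look like the following; `onnxruntime-objc` is ONNX Runtime's Objective-C/Swift pod, while the platform version shown is an assumption, so defer to the Podfile in this directory:

```
# Hypothetical minimal Podfile; the platform version is an assumption.
platform :ios, '13.0'

target 'SpeechRecognition' do
  use_frameworks!
  # ONNX Runtime Objective-C/Swift API package
  pod 'onnxruntime-objc'
end
```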
