Description
I need a script to understand model performance, evaluate its accuracy against a reference text, and log the results to a CSV file. Each run should append one row to a CSV file with the following columns:
date: The current date in YYYY-MM-DD format.
timestamp: A timestamp in HHMMSS format.
model: The name or path of the model (Whisper in this case).
model_name: The name of the Whisper model used (e.g., tiny.en).
model_dir: The directory where the model is stored.
input_file: The path to the input audio file.
output_dir: The directory where the output is stored.
start_time: The start time of the transcription process.
end_time: The end time of the transcription process.
duration: The time taken for the transcription process (end_time - start_time).
wer: The Word Error Rate between the hypothesis and reference texts.
mer: The Match Error Rate between the hypothesis and reference texts.
wil: The Word Information Lost metric.
wip: The Word Information Preserved metric.
cer: The Character Error Rate.
floating_point_format: The floating-point precision the model runs in (FP16 or FP32), detected using PyTorch or TensorFlow.
executed_command: The command that was executed (for logging purposes).
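
As a starting point, the logging part could be sketched roughly as below. This is a minimal, standard-library-only sketch, not a finished script. The helper names (`word_error_rate`, `build_row`, `append_result`) are hypothetical, and several details are assumptions: the date and timestamp are taken from the start of the run, start_time and end_time are written in ISO 8601 form, and duration is in seconds. Only WER is computed here, as a plain word-level edit distance; the remaining metrics and floating_point_format are expected to be passed in by the caller (a library such as jiwer could supply wer, mer, wil, wip, and cer).

```python
import csv
import os
from datetime import datetime

# Column order taken from the list above.
FIELDNAMES = [
    "date", "timestamp", "model", "model_name", "model_dir",
    "input_file", "output_dir", "start_time", "end_time", "duration",
    "wer", "mer", "wil", "wip", "cer",
    "floating_point_format", "executed_command",
]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference text must contain at least one word")
    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def build_row(start, end, **fields):
    """Fill the date and timing columns from two datetime objects.

    Assumption: date and timestamp describe the start of the run.
    """
    row = {
        "date": start.strftime("%Y-%m-%d"),
        "timestamp": start.strftime("%H%M%S"),
        "start_time": start.isoformat(timespec="seconds"),
        "end_time": end.isoformat(timespec="seconds"),
        "duration": (end - start).total_seconds(),
    }
    row.update(fields)
    return row


def append_result(csv_path, row):
    """Append one row, writing the header first if the file is new or empty."""
    needs_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        # Columns missing from `row` are written as empty strings;
        # keys that are not in FIELDNAMES raise ValueError.
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if needs_header:
            writer.writeheader()
        writer.writerow(row)
```

A call might look like `append_result("results.csv", build_row(start, end, model="whisper", model_name="tiny.en", wer=word_error_rate(reference_text, hypothesis_text)))`, where `results.csv`, `reference_text`, and `hypothesis_text` are placeholder names. Opening the file in append mode keeps earlier runs intact, and the header is written only once.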