Model performance metrics #55

Open
@dmarcus-wire

I need a script that measures model performance: it should evaluate the transcription's accuracy against a reference text and log the results to a CSV file. Each run should append a row to the CSV file, which contains the following columns:

date: The current date in YYYY-MM-DD format.
timestamp: A timestamp in HHMMSS format.
model: The name or path of the model (Whisper in this case).
model_name: The name of the Whisper model used (e.g., tiny.en).
model_dir: The directory where the model is stored.
input_file: The path to the input audio file.
output_dir: The directory where the output is stored.
start_time: The start time of the transcription process.
end_time: The end time of the transcription process.
duration: The time taken for the transcription process (end time - start time).
wer: The Word Error Rate between the hypothesis and reference texts.
mer: The Match Error Rate between the hypothesis and reference texts.
wil: The Word Information Lost metric.
wip: The Word Information Preserved metric.
cer: The Character Error Rate.
floating_point_format: The floating-point precision of the model (FP16 or FP32), detected with PyTorch or TensorFlow.
executed_command: The command that was executed (for logging purposes).
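
As a starting point, here is a minimal sketch of the metric and logging half of such a script, using only the Python standard library. It does not run Whisper or detect FP16/FP32; those values are assumed to be supplied by the caller. The function names (`edit_counts`, `error_metrics`, `build_row`, `append_row`) are hypothetical. The formulas follow the usual definitions in terms of hits, substitutions, deletions, and insertions. The sketch also assumes a non-empty reference text, plain whitespace tokenization, a CER that counts spaces as characters, and `date`/`timestamp` taken from the run's start time. A dedicated library may break alignment ties differently, so MER, WIL, and WIP can differ slightly from its output.

```python
import csv
import os
from datetime import datetime

# Column order requested in the issue.
COLUMNS = [
    "date", "timestamp", "model", "model_name", "model_dir", "input_file",
    "output_dir", "start_time", "end_time", "duration",
    "wer", "mer", "wil", "wip", "cer",
    "floating_point_format", "executed_command",
]


def edit_counts(ref, hyp):
    """Return (hits, subs, dels, ins) for one minimum-edit alignment."""
    # prev[j] = (cost, hits, subs, dels, ins) for ref[:i-1] vs hyp[:j]
    prev = [(j, 0, 0, 0, j) for j in range(len(hyp) + 1)]
    for i, r in enumerate(ref, start=1):
        cur = [(i, 0, 0, i, 0)]
        for j, h in enumerate(hyp, start=1):
            c, H, S, D, I = prev[j - 1]
            diag = (c, H + 1, S, D, I) if r == h else (c + 1, H, S + 1, D, I)
            c, H, S, D, I = prev[j]
            dele = (c + 1, H, S, D + 1, I)
            c, H, S, D, I = cur[j - 1]
            ins = (c + 1, H, S, D, I + 1)
            # Ties are broken in favour of match/substitution (an assumption).
            cur.append(min(diag, dele, ins, key=lambda t: t[0]))
        prev = cur
    return prev[-1][1:]


def error_metrics(reference, hypothesis):
    """Compute wer, mer, wil, wip, and cer; assumes a non-empty reference."""
    ref_words, hyp_words = reference.split(), hypothesis.split()
    H, S, D, I = edit_counts(ref_words, hyp_words)
    cH, cS, cD, cI = edit_counts(list(reference), list(hypothesis))
    wip = (H / (H + S + D)) * (H / (H + S + I)) if hyp_words else 0.0
    return {
        "wer": (S + D + I) / (H + S + D),
        "mer": (S + D + I) / (H + S + D + I),
        "wil": 1 - wip,
        "wip": wip,
        "cer": (cS + cD + cI) / (cH + cS + cD),
    }


def build_row(start, end, metrics, **fields):
    """Assemble one CSV row; `fields` carries model, input_file, etc."""
    row = {
        "date": start.strftime("%Y-%m-%d"),
        "timestamp": start.strftime("%H%M%S"),
        "start_time": start.isoformat(),
        "end_time": end.isoformat(),
        "duration": (end - start).total_seconds(),
    }
    row.update(metrics)
    row.update(fields)
    return row


def append_row(csv_path, row):
    """Append a row, writing the header only when the file is new or empty."""
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A caller would record `datetime.now()` before and after the transcription, pass the reference and hypothesis texts to `error_metrics`, and hand the remaining columns (for example `model_name="tiny.en"`) to `build_row` as keyword arguments. Columns that are not supplied are written as empty cells.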

Labels: Crawl (Experimenting, model training, and evaluating metrics), MVP
