Model performance metrics #55

Open
@dmarcus-wire

Description

I need a script that measures model performance: it should evaluate transcription accuracy against a reference text and log the results to a CSV file. Each run should append one row to the CSV file with the following columns:

date: The current date in YYYY-MM-DD format.
timestamp: A timestamp in HHMMSS format.
model: The name or path of the model (Whisper in this case).
model_name: The name of the Whisper model used (e.g., tiny.en).
model_dir: The directory where the model is stored.
input_file: The path to the input audio file.
output_dir: The directory where the output is stored.
start_time: The start time of the transcription process.
end_time: The end time of the transcription process.
duration: The time taken for the transcription process (end time - start time).
wer: The Word Error Rate between the hypothesis and reference texts.
mer: The Match Error Rate between the hypothesis and reference texts.
wil: The Word Information Lost metric.
wip: The Word Information Preserved metric.
cer: The Character Error Rate.
floating_point_format: The floating-point precision of the model (FP16 or FP32), detected with PyTorch or TensorFlow.
executed_command: The command that was executed (for logging purposes).
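
A minimal sketch of how the metrics and the CSV logging above could fit together, using only the Python standard library. The function names (`align_counts`, `error_metrics`, `build_row`, `append_row`) are hypothetical, and taking `date` and `timestamp` from the start time is an assumption. The metrics follow the usual definitions based on hits (H), substitutions (S), deletions (D) and insertions (I) from a minimal-edit alignment; when several minimal alignments exist, the tie-breaking here may give slightly different MER/WIL/WIP values than a dedicated library. CER is computed over raw characters, spaces included, and the reference is assumed to be non-empty.

```python
import csv
import os
from datetime import datetime

# Column order taken from the list above.
FIELDS = ["date", "timestamp", "model", "model_name", "model_dir",
          "input_file", "output_dir", "start_time", "end_time", "duration",
          "wer", "mer", "wil", "wip", "cer",
          "floating_point_format", "executed_command"]


def align_counts(ref, hyp):
    """Return (hits, subs, dels, ins) for one minimal-edit alignment."""
    n, m = len(ref), len(hyp)
    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + cost,  # match / substitution
                             dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1)         # insertion
    # Walk back from the bottom-right corner and count each operation.
    hits = subs = dels = ins = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and ref[i - 1] == hyp[j - 1] \
                and dist[i][j] == dist[i - 1][j - 1]:
            hits, i, j = hits + 1, i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            subs, i, j = subs + 1, i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            dels, i = dels + 1, i - 1
        else:
            ins, j = ins + 1, j - 1
    return hits, subs, dels, ins


def error_metrics(reference, hypothesis):
    """Compute wer, mer, wil, wip and cer (reference must be non-empty)."""
    h, s, d, i = align_counts(reference.split(), hypothesis.split())
    n_ref, n_hyp = h + s + d, h + s + i
    wip = (h / n_ref) * (h / n_hyp) if n_hyp else 0.0
    ch, cs, cd, ci = align_counts(list(reference), list(hypothesis))
    return {
        "wer": (s + d + i) / n_ref,
        "mer": (s + d + i) / (h + s + d + i),
        "wil": 1.0 - wip,
        "wip": wip,
        "cer": (cs + cd + ci) / (ch + cs + cd),
    }


def build_row(start, end, reference, hypothesis, **info):
    """Assemble one CSV row; `info` carries the remaining columns."""
    row = {
        "date": start.strftime("%Y-%m-%d"),    # assumption: date of the start time
        "timestamp": start.strftime("%H%M%S"),
        "start_time": start.isoformat(),
        "end_time": end.isoformat(),
        "duration": (end - start).total_seconds(),
    }
    row.update(error_metrics(reference, hypothesis))
    row.update(info)
    return row


def append_row(csv_path, row):
    """Append one row, writing the header only if the file is new or empty."""
    new_file = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        writer.writerow(row)
```

A caller would record `datetime.now()` before and after the transcription, then call `append_row(path, build_row(start, end, reference, hypothesis, model="whisper", model_name="tiny.en", ...))`. Columns not passed in `info` (for example `floating_point_format`, which would come from inspecting the loaded model's dtype) are left empty in the row.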

Labels: Crawl (Experimenting, model training, and evaluating metrics)
