This repository was archived by the owner on Jun 19, 2025. It is now read-only.

Separate engine from linguistic data and come up with a set of conventions for training models  #2592

@ftyers

At the moment, parts of the data are mixed with parts of the code. It would be good to fully separate data from code and to agree on a set of conventions for formatting and organising the data, so that training DeepSpeech would be as easy as getting a directory of data in the right format and executing something like "DeepSpeech.py --train ../commonvoice-fra" or "DeepSpeech.py --train ../commonvoice-chv --transfer-from ../commonvoice-eng".

This is not a call to immediate action, but more of a placeholder for future work.
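
To make the proposal a bit more concrete, here is a minimal sketch of what such an entry point could look like. It is not how DeepSpeech.py works today: the --train and --transfer-from flags are the ones proposed above, and the directory layout in REQUIRED_ENTRIES (a clips directory, train/dev/test TSV files, and an alphabet file) is purely an assumption standing in for whatever conventions end up being agreed on. The helper names are hypothetical too.

```python
import argparse
from pathlib import Path

# Hypothetical convention: entries every language data directory must contain.
# The real list would come out of the conventions this issue asks for.
REQUIRED_ENTRIES = ("clips", "train.tsv", "dev.tsv", "test.tsv", "alphabet.txt")


def missing_entries(data_dir):
    """Return the required entries that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in REQUIRED_ENTRIES if not (root / name).exists()]


def parse_args(argv=None):
    """Parse the two flags proposed in this issue (they do not exist yet)."""
    parser = argparse.ArgumentParser(
        description="Sketch of a trainer driven only by data directories"
    )
    parser.add_argument(
        "--train", required=True, metavar="DATA_DIR",
        help="directory holding the target-language data",
    )
    parser.add_argument(
        "--transfer-from", default=None, metavar="DATA_DIR",
        help="optional directory holding the source-language data",
    )
    return parser.parse_args(argv)


def main(argv=None):
    """Validate the data directories, then hand over to the engine."""
    args = parse_args(argv)
    # Check every directory that was given; --transfer-from may be absent.
    for data_dir in filter(None, (args.train, args.transfer_from)):
        missing = missing_entries(data_dir)
        if missing:
            raise SystemExit(f"{data_dir}: missing {', '.join(missing)}")
    # Placeholder: the actual training call would go here.
    print(f"would train on {args.train}, transferring from {args.transfer_from}")
    return args
```

Calling main(["--train", "../commonvoice-chv", "--transfer-from", "../commonvoice-eng"]) would check both directories against the assumed layout before any training code runs, which keeps everything language-specific in the data directory and out of the engine.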
