🔰 Request: Support Importing YouTube JSON and VTT #29

@natelawrence

In a recent hunt for more ASR providers that offer per-word timecodes, I found some I already knew of and a few I hadn't heard of before. Among them is YouTube.

You can generate automatic-caption test data with a free YouTube account.
To obtain per-word timecodes for any YouTube video that has automatic captions generated and published, open your web browser's Developer Tools [F12], select the Network tab, and type timed into the filter box. Then load your video of choice and enable automatic captions in the player on the video's watch page. Exactly one URL should appear under the 'timed' filter; opening it downloads a text file named f.txt to your Downloads folder. You can then rename it to your video's filename with a .json extension.
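
For anyone picking this up, here is a rough Python sketch of how the downloaded JSON might be flattened into per-word timings. It assumes the payload has a top-level `events` array whose entries carry `tStartMs` and a `segs` list with `utf8` text and an optional `tOffsetMs`; those field names come from files I have looked at and are not documented by YouTube, so treat them as assumptions. The function name `words_from_youtube_json` and the sample data are made up for illustration.

```python
import json


def words_from_youtube_json(text):
    """Return (start_ms, word) pairs from an ASSUMED YouTube timed-text JSON layout."""
    data = json.loads(text)
    words = []
    for event in data.get("events", []):
        base = event.get("tStartMs", 0)  # assumed: event start in milliseconds
        for seg in event.get("segs", []):
            word = seg.get("utf8", "").strip()
            if not word:
                continue  # skip segments that hold only whitespace or a newline
            # assumed: tOffsetMs is relative to the event start, absent for the first word
            words.append((base + seg.get("tOffsetMs", 0), word))
    return words


# Hand-written sample in the assumed layout, not real YouTube output.
sample = (
    '{"events": ['
    '{"tStartMs": 1000, "dDurationMs": 2000, '
    '"segs": [{"utf8": "hello"}, {"utf8": " world", "tOffsetMs": 400}]}, '
    '{"tStartMs": 3000, "segs": [{"utf8": "\\n"}]}'
    ']}'
)

for start_ms, word in words_from_youtube_json(sample):
    print(start_ms, word)
```

The output is deliberately a flat list of start times and words, since that seems like the easiest shape to map onto whatever the editor uses internally.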

If you want to harvest per-word time data for automatic captions on a YouTube video on your own channel, you can use YouTube Studio to download your video's automatic captions in VTT format.
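
A similar sketch for the VTT variant, assuming the per-word times appear as inline WebVTT timestamp tags (`<HH:MM:SS.mmm>`) inside the cue text, with words optionally wrapped in `<c>` tags. That matches the automatic-caption VTT files I have seen, but the exact layout is an assumption, and `words_from_vtt_cue` is a hypothetical helper that handles a single cue only.

```python
import re

# Inline timestamp tag inside cue text, e.g. <00:00:01.400>
TIMESTAMP = re.compile(r"<(\d{2}):(\d{2}):(\d{2})\.(\d{3})>")
# Start time at the beginning of a cue timing line, e.g. 00:00:01.000 --> ...
CUE_START = re.compile(r"(\d{2}):(\d{2}):(\d{2})\.(\d{3}) -->")
# Opening and closing <c> class tags, which carry no timing information
C_TAG = re.compile(r"</?c[^>]*>")


def to_ms(hours, minutes, seconds, millis):
    """Convert timestamp parts (as strings) to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def words_from_vtt_cue(timing_line, text_line):
    """Return (start_ms, text) pairs for one cue, split at inline timestamp tags."""
    current = to_ms(*CUE_START.match(timing_line).groups())
    text = C_TAG.sub("", text_line)
    words = []
    pos = 0
    for match in TIMESTAMP.finditer(text):
        chunk = text[pos:match.start()].strip()
        if chunk:
            words.append((current, chunk))
        current = to_ms(*match.groups())  # the next chunk starts at this tag's time
        pos = match.end()
    tail = text[pos:].strip()
    if tail:
        words.append((current, tail))
    return words


# Hand-written sample cue in the assumed layout.
timing = "00:00:01.000 --> 00:00:03.000 align:start position:0%"
cue_text = "hello<00:00:01.400><c> world</c><00:00:02.100><c> again</c>"

for start_ms, word in words_from_vtt_cue(timing, cue_text):
    print(start_ms, word)
```

If a cue has no inline tags, the whole line comes back as one chunk at the cue's start time, so an importer would still need a fallback for captions without per-word data.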

Or, if you would like test data that is already generated:
ASR Timed Text Format Test 2 (View Link)
ASR Timed Text Format Test 2 [YouTube].zip (contains JSON and VTT variants)

Being able to import YouTube's per-word captions format would allow YouTube users to bring their YouTube automatic captions transcripts and finish editing them in HyperAudio Lite Editor, if desired.
