omgitsgio/ytvacious

YTVacious

Description:

YTVacious is a browser extension for Chrome (or any Chromium-based browser) that I created as my final project for CS50x. It takes inspiration from the MPV add-on MPVacious and follows the same workflow: it semi-automates sentence mining from YouTube videos for language learners (primarily learners of Japanese following the AJATT method).

The extension integrates with Anki and streamlines the creation of content-rich sentence mining cards by extracting the audio, a screenshot and the text of a given subtitle segment. When used in conjunction with tools like Goldendict and Rikaitan (for dictionary definitions and translations), it becomes a very powerful tool for sentence mining on the fly.

If you don't know what AJATT or sentence mining are, I suggest reading more here.

Minimum requirements:

  • Chrome (or any Chromium-based browser)
  • Anki with the AnkiConnect add-on installed
  • A language-learning oriented note (see usage below)

Suggested additional requirements:

If building from source:

Make sure you are running at least Node.js v18:

node --version

If not, run:

nvm install 18

To install dependencies and build from source:

npm install
npm run build

Then open Chrome -> Extensions -> Manage Extensions -> Load unpacked and select the 'dist' folder created by the build.

Usage:

On first use, please review the settings page. You can reach it either from the Chrome extensions menu or from the "open settings" button in the extension's popup. There you can adjust the Anki settings that allow the extension to communicate with Anki. If you use the Japanese Sentences note type from Ajatt-Tools, you don't need to change anything unless you'd like to use a different deck name or add some tags.

The rest is pretty basic. Just open the popup and load your desired subtitles. This loads a custom subtitles container which allows for some customisation. The following shortcuts are available:

  • Show/Hide 'V'
  • Size Down/Up 'E/R'
  • Move Down/Up 'U/Y'
  • Delay Down/Up 'G/H'
  • New Note GUI 'S'
  • Update Last Note 'X'

The extension automatically copies the currently visible subtitle line to the clipboard. This comes in handy if you are using the same workflow as MPVacious with Rikaitan or any dictionary which supports word parsing and Anki integration.

The 2 main functions are:

  • Note creation (shortcut 's'): when you need to create a simple note based on the content of the current subtitle.
  • Last note updating (shortcut 'x'): use this if you start by creating a word-definition note from Rikaitan (or similar) and want to add the context from the currently visible subtitle. This always picks the last note, so be careful before updating if you have created multiple notes.

Please leave the video alone while the recording is captured. Due to the limitations of a Chrome extension, the audio recording is performed by leveraging the Media Capture and Streams API, which produces a 1:1 recording of what you hear from your speakers. When triggered, the extension therefore brings the video back to the start of the subtitle and plays it until the end. If you interfere with the video during this process (e.g. muting, changing the volume, pausing), the final recording will also be affected.

Design choices and structure walkthrough (for CS50x final project requirements)

This project was inspired by the AJATT (All Japanese All The Time) learning philosophy, which is applicable to learning any language. One of its main pillars is language immersion, which consists of immersing oneself in the language as much as possible. Alongside immersion, you need to review and learn the new words you come across during the process. This is where Anki comes into play. However, finding material with reliable subtitles can be a time-consuming task. YouTube offers readily available content and some decent auto-generated subtitles, which can be leveraged to create content-rich sentence mining notes with Anki. You should not rely too heavily on auto-generated subtitles, however; prioritize human-created and curated ones, at least until AI solutions get better.

To create content-rich notes, audio, a screenshot and text are essential. I started by considering what I would need to implement:

  • audio/video processor (like ffmpeg)
  • youtube video and subs downloader (like yt-dl)
  • ability to communicate with AnkiConnect (simple JSON-RPC calls)
  • some subtitles processing and cleaning
  • ease of use / portability

At first I thought that, because of the need to process and record media, I would have to opt for a script that runs locally: it would use yt-dl to download the video, turn it into audio and crop the small segment needed, then use ffmpeg for some format conversion flexibility. However, this approach would have been redundant, as users could simply download the video along with the subtitles and use MPV with MPVacious. My goal was to simplify this as much as possible so users don't need to install anything new.

At this point a browser extension was clearly the best option. It can leverage the browser's sync and APIs, and everyone has a browser. I had to learn some more advanced JavaScript since CS50 covered it only briefly, and Chrome extensions use some very specific APIs and have many restrictions. I was initially planning to use the wasm versions of ffmpeg and yt-dl. However, I found this guide, which explains the use of the MediaStream API and introduced me to the idea of recording each chunk "live" instead of downloading and processing the entire video every time the user wants to save a note. This approach seemed annoying at first. However, after using the extension myself, I found that replaying the segment I want to save is a nice feature that helps memorization, and the wait is bearable for segments of up to 10 seconds (which would already be quite long for an Anki note).

media-handler.js

So I started from here. I created the media-handler.js module and took most of the code from the guide cited above, adapting it to record audio in segments and to base64-encode the result for Anki. After creating and debugging more modules, I quickly realized that import statements do not work in browser extensions the way they normally would in plain JavaScript web pages, so I opted for webpack to manage dependencies.
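As a rough illustration of the segment recording and base64 encoding described above, here is a minimal sketch. It is not the code from media-handler.js: the function names `arrayBufferToBase64` and `recordSegment` are hypothetical, and how the `stream` is obtained (for example through tab capture) is left out as an assumption.

```javascript
// Hypothetical helper: convert raw bytes into a base64 string,
// the format AnkiConnect accepts for media data.
function arrayBufferToBase64(buffer) {
  const bytes = new Uint8Array(buffer);
  const CHUNK = 0x8000; // stay well below the argument limit of fromCharCode
  let binary = "";
  for (let i = 0; i < bytes.length; i += CHUNK) {
    binary += String.fromCharCode(...bytes.subarray(i, i + CHUNK));
  }
  return btoa(binary);
}

// Hypothetical helper (browser only): record `stream` while `video`
// plays from `start` to `end` (both in seconds), then resolve with base64 audio.
function recordSegment(stream, video, start, end) {
  return new Promise((resolve, reject) => {
    const recorder = new MediaRecorder(stream);
    const chunks = [];
    recorder.ondataavailable = (event) => chunks.push(event.data);
    recorder.onerror = (event) => reject(event.error);
    recorder.onstop = async () => {
      try {
        const blob = new Blob(chunks, { type: recorder.mimeType });
        resolve(arrayBufferToBase64(await blob.arrayBuffer()));
      } catch (error) {
        reject(error);
      }
    };

    // Stop as soon as playback passes the end of the subtitle segment.
    const stopAtEnd = () => {
      if (video.currentTime >= end) {
        video.removeEventListener("timeupdate", stopAtEnd);
        video.pause();
        recorder.stop();
      }
    };

    video.currentTime = start; // rewind to the start of the subtitle
    video.addEventListener("timeupdate", stopAtEnd);
    recorder.start();
    video.play().catch(reject);
  });
}
```

Because the recorder captures whatever is playing in real time, anything the user does to the video between `recorder.start()` and `recorder.stop()` ends up in the clip.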

bubbles.js

Next, I implemented bubbles.js, youtube-subs-scraper.js and parts of content.js. I found bubbles.js in this project. This code can add custom subs to an HTML5 video and offers some customization. It took a while to make it work, but after recycling some code from the original project and some iteration with claude.ai, I was able to get it to work decently.

youtube-subs-scraper.js

youtube-subs-scraper.js comes from this project and was quite straightforward to use. It just took some testing to understand what objects it was expecting/returning. This is the only external module which I used as-is.

content.js

content.js was developed in stages, but at this point most of the structure was defined. The core function is to respond to requests from the Chrome API onMessage messaging system, mainly from popup.js which is where the user interaction happens and the subtitles list is loaded via the subs scraper. Based on the language selected, content.js receives the message and calls handleLoadSubtitles() which loads the subs using bubbles.js. From here, subs are constantly monitored and the current status is kept in the currentSubState object, which serves as storage for primarily start and end time of the subs segment. This information is then used when the user triggers one of the 2 Anki functions:

  • Send to GUI: which simply populates a new note via Anki's GUI with the media from the subs segment

  • Send to last: which is used together with a local dictionary with the ability to listen to the subs copied to clipboard by the extension (like Rikaitan) and create definition cards. This function will simply add the media and the text from the video/subs.
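The subtitle tracking described above could be sketched roughly as follows. Only `currentSubState`, `handleLoadSubtitles` and the `onMessage` system are named in this README; the cue shape (`start`, `end`, `text`), the helper names, the `delay` parameter and the `"loadSubtitles"` action are assumptions made for illustration.

```javascript
// Assumed shape of the state object mentioned above.
const currentSubState = { text: "", start: null, end: null };

// Hypothetical helper: return the cue visible at `time` (seconds),
// shifted by a user-set `delay`, or null if no cue is active.
function findActiveCue(cues, time, delay = 0) {
  return (
    cues.find((cue) => time >= cue.start + delay && time < cue.end + delay) ?? null
  );
}

// Hypothetical helper: remember the timing of the active cue so the
// Anki functions know which segment to record later.
function updateSubState(state, cues, time, delay = 0) {
  const cue = findActiveCue(cues, time, delay);
  if (cue) {
    state.text = cue.text;
    state.start = cue.start + delay;
    state.end = cue.end + delay;
  }
  return state;
}

// Extension-only wiring; skipped when the chrome API is not available.
if (typeof chrome !== "undefined" && chrome.runtime?.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message.action === "loadSubtitles") {
      // The real content.js calls handleLoadSubtitles() here;
      // its arguments are not documented, so the call is omitted.
      sendResponse({ ok: true });
    }
  });
}
```

Keeping the lookup in a pure function makes it easy to check the delay handling without a video element.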

popup.js

This module is the user entry point and sits behind popup.html. The structure is very simple and only includes a selector, 2 buttons, a status section and a quick start guide. The selector is populated with the list of available subs by sending a request (through the Chrome API messaging system) to youtube-subs-scraper.js. When the user loads the selected language, another message is sent to content.js to initiate the process explained above.
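A minimal sketch of how a popup can message the content script in the active tab is shown below. The message format (`action`, `languageCode`) and both function names are assumptions, not the actual popup.js code; `chrome.tabs.query` and `chrome.tabs.sendMessage` are the standard extension APIs.

```javascript
// Hypothetical message format for asking content.js to load subtitles.
function buildLoadMessage(languageCode) {
  return { action: "loadSubtitles", languageCode };
}

// Send a message to the content script of the currently active tab.
// Only works inside an extension page such as the popup.
async function sendToActiveTab(message) {
  const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
  if (!tab) {
    throw new Error("No active tab found");
  }
  return chrome.tabs.sendMessage(tab.id, message);
}
```

In the popup this would be used as `sendToActiveTab(buildLoadMessage("ja"))`, with the language code taken from the selector.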

config.js

Here I wrote the logic that loads and saves the settings shown in settings.html. The code uses the Chrome storage API and also handles restoring the default values. I left in an unused function that exports the current settings to JSON. This might be used in the future for a backup feature (i.e. import/export). It is worth noting that I opted for the sync storage area, which saves the settings in the user's Chrome account and across devices.
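The settings handling described above might look roughly like this. The keys and values in `DEFAULT_SETTINGS` and all function names are hypothetical; `chrome.storage.sync.get` and `chrome.storage.sync.set` are the standard storage API calls.

```javascript
// Hypothetical defaults; the real keys and values in config.js may differ.
const DEFAULT_SETTINGS = {
  deckName: "Default",
  noteType: "Japanese sentences",
  tags: [],
};

// Fill in any missing keys from the defaults.
function withDefaults(stored) {
  return { ...DEFAULT_SETTINGS, ...stored };
}

// Passing an object to get() makes Chrome fall back to its values
// for keys that have not been saved yet.
async function loadSettings() {
  return chrome.storage.sync.get(DEFAULT_SETTINGS);
}

// Overwrite whatever is stored with the defaults.
async function restoreDefaults() {
  await chrome.storage.sync.set(DEFAULT_SETTINGS);
  return DEFAULT_SETTINGS;
}

// Serialize settings for a possible future export/backup feature.
function exportSettings(settings) {
  return JSON.stringify(settings, null, 2);
}
```

Using the sync area instead of `chrome.storage.local` is what lets the same settings follow the user across devices signed in to the same Chrome account.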

background.js

I really struggled to understand how to make this useful for the project initially. Chrome can run some code locally in the background as opposed to injecting it in the web page. This became very useful when I had to develop a resetting mechanism for bubbles. In fact, subs from the previous video were displayed in the next video if selected from the same page (e.g. recommendations). This happened because YouTube does not refresh the page when a new video is selected or auto-played. However, the URL will change. The background script monitors for these changes and forces a full refresh to get rid of previously loaded subs. I'm aware this is not the most elegant solution, but it works and I've implemented a control to prevent a full refresh if subs are not loaded. This script was also quite useful for AnkiConnect as explained below.
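The reset mechanism described above can be sketched as follows, assuming the background script listens to `chrome.tabs.onUpdated` (which reports URL changes when the extension has the required permissions). The helper names and the `tabsWithSubs` bookkeeping are hypothetical, and the real background.js may track state differently.

```javascript
// Hypothetical helper: extract the video id from a YouTube watch URL.
function getVideoId(url) {
  try {
    const parsed = new URL(url);
    return parsed.pathname === "/watch" ? parsed.searchParams.get("v") : null;
  } catch {
    return null; // not a valid URL
  }
}

// Reload only when subs are loaded and the tab moved to a different video.
function shouldReload(previousUrl, nextUrl, subsLoaded) {
  const nextId = getVideoId(nextUrl);
  return subsLoaded && nextId !== null && nextId !== getVideoId(previousUrl);
}

// In-memory bookkeeping (assumption): last URL per tab, and which tabs
// currently have custom subs loaded, as reported by the content script.
const lastUrlByTab = new Map();
const tabsWithSubs = new Set();

// Extension-only wiring; skipped when the chrome API is not available.
if (typeof chrome !== "undefined" && chrome.tabs?.onUpdated) {
  chrome.tabs.onUpdated.addListener((tabId, changeInfo) => {
    if (!changeInfo.url) return;
    const previousUrl = lastUrlByTab.get(tabId) ?? "";
    lastUrlByTab.set(tabId, changeInfo.url);
    if (shouldReload(previousUrl, changeInfo.url, tabsWithSubs.has(tabId))) {
      tabsWithSubs.delete(tabId);
      chrome.tabs.reload(tabId); // full refresh clears the old subs
    }
  });
}
```

The `subsLoaded` check corresponds to the control mentioned above that avoids a full refresh when no subs were loaded.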

anki-connect.js

This was the most challenging but also the most rewarding part of the project: I was finally able to see the output of all the modules above in a practical application. Here I mostly followed the AnkiConnect documentation. I spent some time finding a workaround for CORS restrictions, because I didn't want to force users to change their AnkiConnect configuration (to keep everything as user-friendly as possible). By default, AnkiConnect does not accept requests from anything other than localhost, so instead of generating these requests directly from anki-connect.js, I again leveraged the Chrome API messaging system: requests go from anki-connect.js to background.js, which forwards them to AnkiConnect itself. The second main challenge was managing smaller details like focusing the new note after the data was sent, working around some AnkiConnect limitations, etc. For this, I took inspiration from the MPVacious project itself.
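The relay through background.js could be sketched like this. The request envelope (`action`, `version`, `params`), the `result`/`error` response shape and the default address follow the AnkiConnect documentation; the `"ankiRequest"` message type and all function names are assumptions for illustration, not the actual code.

```javascript
const ANKI_CONNECT_URL = "http://127.0.0.1:8765"; // AnkiConnect's default address

// Build the JSON envelope AnkiConnect expects (API version 6).
function buildAnkiRequest(action, params = {}) {
  return { action, version: 6, params };
}

// AnkiConnect replies with { result, error }; surface errors as exceptions.
function unwrapAnkiResponse(response) {
  if (response.error) {
    throw new Error(response.error);
  }
  return response.result;
}

// Hypothetical helper for "update last note": Anki note ids are based on
// creation time, so the largest id is taken to be the newest note.
function pickLastNoteId(noteIds) {
  return noteIds.length > 0 ? Math.max(...noteIds) : null;
}

// background.js side: forward requests to AnkiConnect on behalf of other scripts.
if (typeof chrome !== "undefined" && chrome.runtime?.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message.type !== "ankiRequest") return false;
    fetch(ANKI_CONNECT_URL, {
      method: "POST",
      body: JSON.stringify(buildAnkiRequest(message.action, message.params)),
    })
      .then((response) => response.json())
      .then(sendResponse)
      .catch((error) => sendResponse({ result: null, error: String(error) }));
    return true; // keep the message channel open for the async reply
  });
}

// Caller side (e.g. anki-connect.js): ask the background script to do the call.
async function invokeAnki(action, params) {
  const response = await chrome.runtime.sendMessage({
    type: "ankiRequest",
    action,
    params,
  });
  return unwrapAnkiResponse(response);
}
```

With this in place, a call such as `invokeAnki("findNotes", { query: "added:1" })` followed by `pickLastNoteId` would be one way to locate the most recently created note.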

helpers.js

I initially struggled to use import statements, so my instinct was to avoid external references as much as possible. However, after setting up webpack successfully, I expanded the project and became less shy about splitting code across modules. helpers.js is an attempt at tidying up some areas of the code, but more work can definitely be done.

html pages and styles

Some work should definitely be done to make styles more consistent and improve the layout of the popup and settings pages. I have mostly used HTML snippets from previous problem sets and lectures.

Overall this has been an amazing experience, and I want it to be only the beginning of a long list of successfully completed projects. I also hope this can be useful to fellow language learners. I am planning to tidy up the code a bit more and add more features, such as:

  • adding more config parameters
  • loading available fields and decks automatically in the settings (rather than typing them manually)
  • adding default language options
  • adding second-language subs (for translations)

In hindsight, I would probably have spent a bit more time thinking about the implementation of the custom subs. The current bubbles.js is a very old library, and I could not find any documentation for it. While it gets the job done, it is somewhat limited in functionality and complex to maintain.

About

Adds YouTube keybindings to create Anki cards from movies and TV shows.
