Replies: 3 comments 6 replies
-
(I'm in kind of a rush so I'll keep it short)
Until Autocards is as efficient as it can be, we have no idea what it would cost to host for other users.
What do you think? Those three points seem to be good objectives.
-
@deklanw To better understand the use case you envision, I'm curious about your take on the following: searching for one card about "napoleon rosetta" and getting a premade one wouldn't be that much faster than creating your own. Wouldn't the point be to grab a batch of flashcards related to a topic, like Napoleon? That would go through the semantic search brought up by @thiswillbeyourgithub, but with the specific purpose of getting the top several results rather than the single best one.
-
How would you prioritize the content which has been "indexed" as flashcards (and eventually structured through a ToC)? Most used textbooks of all time? User requests?
-
I think the advice to make your own cards is overrated. It seems to me that if you have already 'learned' the context of the card, the card is well-made, and you care to remember it, then it's a good card.
I wish every textbook/learning course came with its own expertly-made Anki deck. Fantasy aside, what can we do? I wish I just had a search-engine of premade cards. Then, I could search for cards about topics I'm learning and quickly import them. Well, what if you just look at other people's cards and pick and choose them as you learn relevant material? AnkiWeb doesn't support searching or downloading individual cards from shared decks, and they have taken measures to discourage scraping. Ok, well, there's an add-on for importing from Quizlet. Maybe we can just pick and choose cards from Quizlet and import them to Anki. But, Quizlet also doesn't support searching or downloading individual cards. Hmph.
We find Autocards. Great, now we can make cards from any material. Did you know that the Rosetta Stone was discovered by Napoleon's army while he was in Egypt? That's pretty cool, I'd like to have a card about that. Ok... so I just... feed in this paragraph from my book. I don't have a good dev environment ready. Gotta wait 5 minutes for my premade Colab notebook to start up and install... got the results. Ok, there's a few cards from this paragraph. A couple of them make sense, but none really capture what I wanted to remember. Oh well, I guess I'll just make it myself.
What if we combine the idea of a card search-engine with Autocards? We could generate a ton of cards (at least hundreds of thousands) from pre-selected material and then search, filter, rank, select, etc. from a GUI.
This scheme addresses some of the core weaknesses of Autocards: it's slow and inconvenient to generate cards, and the cards are of variable quality. The first one should be obvious: we're computing ahead of time (AOT). So, consider the second. The quality of Autocards' question generation is highly dependent on the phrasing of the source text. An easy solution is ensuring redundancy in the source text. How many history books explain how the Rosetta Stone was discovered by Napoleon's army in Egypt? Surely one of these books phrases it just so for Autocards. And, if we happen to find multiple high-quality differently-phrased cards about our subject from different sources, that's only a bonus (see point 17).
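To make the redundancy idea a bit more concrete, here's a minimal sketch of how overlapping cards from several books could be grouped so the best phrasing surfaces first. Everything in it is an assumption for illustration: the `(question, answer, source, score)` tuple layout, the `min_score` threshold, and grouping by a normalized answer string are all hypothetical, and none of it is part of Autocards itself.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    # Crude grouping key (an assumption): lowercase and collapse whitespace.
    return " ".join(answer.lower().split())


def best_phrasings(candidates, min_score=0.7):
    """Group candidate cards by answer and rank the phrasings in each group.

    candidates: iterable of (question, answer, source, score) tuples, where
    score is a hypothetical quality score (higher is better).
    """
    groups = defaultdict(list)
    for question, answer, source, score in candidates:
        if score >= min_score:  # drop low-quality generations
            groups[normalize(answer)].append((score, source, question))
    # Within each group, put the best-scoring phrasing first.
    return {key: sorted(group, reverse=True) for key, group in groups.items()}
```

With cards about the same fact generated from three books, the poorly phrased one drops out and the remaining two come back ranked, which is the "bonus" case of several good phrasings for one fact.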
History seems like a subject particularly amenable to this approach: there are many comprehensive texts, knowledge is redundant across texts, facts alone (without broader reasoning) can get you pretty far, images/videos aren't critical, etc. The practical plan is something like this: select dozens of good history books, make cards for them all, measure linguistic acceptability and perplexity scores, dump everything into a database, find a search backend like Elasticsearch or some alternative, and throw on a GUI. From the GUI you can search, filter, rerank, see the original context, easily select multiple cards and then one-click import, see previously-imported cards, etc. I'm thinking an Electron app would be nice so no one needs to worry about servers, and we can remember previously-imported cards without user accounts.
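As a rough sketch of the search/filter/rank step in that plan, here's what the core could look like in plain Python. To be clear about assumptions: the `Card` fields, the `acceptability` score range, and the `min_acceptability` cutoff are hypothetical, and the keyword-overlap scoring is only a toy stand-in for a real backend like Elasticsearch or a semantic search model.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    question: str
    answer: str
    source: str           # book the card was generated from
    context: str          # original passage, so the GUI can show it
    acceptability: float  # hypothetical 0..1 quality score, higher is better


def tokenize(text: str) -> set:
    # Lowercase alphanumeric tokens; a real backend would do much more.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(cards, query, k=5, min_acceptability=0.5):
    """Return up to k cards matching the query, best match first."""
    terms = tokenize(query)
    scored = []
    for card in cards:
        if card.acceptability < min_acceptability:
            continue  # filter out low-quality generations
        text = " ".join((card.question, card.answer, card.context))
        overlap = len(terms & tokenize(text))
        if overlap:
            scored.append((overlap, card.acceptability, card))
    # Rank by number of matching query terms, then by quality score.
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [card for _, _, card in scored[:k]]
```

Calling `search(cards, "napoleon rosetta", k=20)` would return a batch of the top matching cards rather than a single best hit, and each result carries its `source` and `context` so a GUI could display them before import.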
Copyright could be a problem, I'm not sure. I don't think it would be possible to reconstruct an entire text from this, even with full contexts. If anything this whole scheme would only encourage people to go buy the books.
Long-term reach ideas:
I would like to experiment with training QAG on better datasets before trying all of this. But, I think it's not too big of a project relative to the potential.
Thoughts?