Following openzim/zimit#113, we should think about implementing an easily editable list (hosted on drive.kiwix.org?) of blacklisted sites that cannot be requested on zimit, e.g.:
- kiwix.org subdomains (download and library);
- very large corporate websites (e.g. Facebook, Twitter, Reddit, YouTube);
- websites that have been scraped in the past and failed.
This is probably a matter for a separate ticket, but requests for websites we already have a scraper for (Wikipedia, Stack Overflow, etc.) should also be soft-blocked, with the user offered a direct link to the ZIM file.
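
To make the idea concrete, here is a rough sketch of what the check could look like. Everything in it is an assumption for illustration only: the names `BLOCKED_HOSTS`, `SOFT_BLOCKED_HOSTS` and `check_request` are hypothetical, the entries are hard-coded rather than loaded from the hosted list (whose location and format are still to be decided), and the ZIM links are placeholders.

```python
from urllib.parse import urlsplit

# Hypothetical hard-block entries. In practice these would be loaded from the
# editable list; its location and format are not decided yet.
BLOCKED_HOSTS = {
    "download.kiwix.org",
    "library.kiwix.org",
    "facebook.com",
    "twitter.com",
    "reddit.com",
    "youtube.com",
}

# Hypothetical soft-block entries: sites we already have a scraper for,
# mapped to a placeholder link to an existing ZIM file.
SOFT_BLOCKED_HOSTS = {
    "wikipedia.org": "https://example.org/wikipedia.zim",
    "stackoverflow.com": "https://example.org/stackoverflow.zim",
}


def _match(host, domains):
    """Return the entry equal to `host` or a parent domain of it, else None."""
    for domain in domains:
        if host == domain or host.endswith("." + domain):
            return domain
    return None


def check_request(url):
    """Classify an absolute URL as blocked, soft-blocked, or allowed.

    Returns a (status, zim_link) tuple; zim_link is only set for soft blocks.
    A URL without a scheme has no parseable hostname and is treated as allowed
    here, so real code would need to validate the URL first.
    """
    host = (urlsplit(url).hostname or "").lower()
    if _match(host, BLOCKED_HOSTS) is not None:
        return ("blocked", None)
    soft = _match(host, SOFT_BLOCKED_HOSTS)
    if soft is not None:
        return ("soft_blocked", SOFT_BLOCKED_HOSTS[soft])
    return ("allowed", None)
```

Matching on the hostname suffix rather than the full URL means a single `facebook.com` entry also covers `www.facebook.com` and other subdomains, which should keep the list short and easy to edit.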