Commit 24857f4

mvolz and Xe authored
feat(data): add Citoid to good bots list (#1524)
* Add Wikimedia Foundation citoid services file

Wikimedia Foundation runs a service called citoid which retrieves citation metadata from urls in order to create formatted citations.

This file contains the ip ranges allocated to the WMF (https://wikitech.wikimedia.org/wiki/IP_and_AS_allocations) from which the services make requests, as well as regex for the User-Agents from both services used to generate citations (citoid, and Zotero's translation-server which citoid makes requests to as well in order to generate the metadata).

Signed-off-by: Marielle Volz <marielle.volz@gmail.com>

* Add Wikimedia Citoid crawler to allowed list

Signed-off-by: Marielle Volz <marielle.volz@gmail.com>

* chore: update spelling

Signed-off-by: Xe Iaso <me@xeiaso.net>

---------

Signed-off-by: Marielle Volz <marielle.volz@gmail.com>
Signed-off-by: Xe Iaso <me@xeiaso.net>
Co-authored-by: Xe Iaso <me@xeiaso.net>
1 parent e0ece7d commit 24857f4

3 files changed

Lines changed: 21 additions & 0 deletions

.github/actions/spelling/allow.txt

Lines changed: 2 additions & 0 deletions
@@ -31,3 +31,5 @@ Stargate
 FFXIV
 uvensys
 de
+envoyproxy
+unipromos

data/crawlers/_allow-good.yaml

Lines changed: 1 addition & 0 deletions
@@ -8,4 +8,5 @@
 - import: (data)/crawlers/marginalia.yaml
 - import: (data)/crawlers/mojeekbot.yaml
 - import: (data)/crawlers/commoncrawl.yaml
+- import: (data)/crawlers/wikimedia-citoid.yaml
 - import: (data)/crawlers/yandexbot.yaml

data/crawlers/wikimedia-citoid.yaml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+# Wikimedia Foundation citation services
+# https://www.mediawiki.org/wiki/Citoid
+
+- name: wikimedia-citoid
+  user_agent_regex: "Citoid/WMF"
+  action: ALLOW
+  remote_addresses: [
+    "208.80.152.0/22",
+    "2620:0:860::/46",
+  ]
+
+- name: wikimedia-zotero-translation-server
+  user_agent_regex: "ZoteroTranslationServer/WMF"
+  action: ALLOW
+  remote_addresses: [
+    "208.80.152.0/22",
+    "2620:0:860::/46",
+  ]
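
As a rough illustration of what the two new rules express, here is a minimal Python sketch that checks a request against the `wikimedia-citoid` entry. It is not the project's matching code. The `rule_matches` helper and the `RULE` dictionary layout are hypothetical, and the sketch assumes that a rule with both `user_agent_regex` and `remote_addresses` applies only when both conditions hold. The regex and CIDR ranges are copied from the diff above, and the User-Agent strings in the usage lines are made up.

```python
import ipaddress
import re

# Values copied from the wikimedia-citoid rule in the diff above.
RULE = {
    "name": "wikimedia-citoid",
    "user_agent_regex": "Citoid/WMF",
    "remote_addresses": ["208.80.152.0/22", "2620:0:860::/46"],
}


def rule_matches(rule, user_agent, remote_ip):
    """Hypothetical matcher.

    Assumption: the User-Agent must match the regex AND the client
    address must fall inside one of the listed CIDR ranges.
    """
    # Unanchored search, so the pattern may appear anywhere in the header.
    ua_ok = re.search(rule["user_agent_regex"], user_agent) is not None

    # ip_address() accepts both IPv4 and IPv6 literals; a membership test
    # against a network of the other IP version simply evaluates to False.
    addr = ipaddress.ip_address(remote_ip)
    ip_ok = any(
        addr in ipaddress.ip_network(cidr)
        for cidr in rule["remote_addresses"]
    )
    return ua_ok and ip_ok


# Made-up User-Agent strings, used only for illustration.
print(rule_matches(RULE, "Citoid/WMF (example)", "208.80.154.10"))
print(rule_matches(RULE, "Citoid/WMF (example)", "203.0.113.7"))
```

Under that assumption, a client that sends the Citoid User-Agent from an address outside the WMF ranges would not be allowed by this rule, which seems to be the reason for pairing the regex with the address list.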
