Commit 60a7d5a

Merge branch 'internetarchive:master' into fix/mobile-dropdown-cell-height
2 parents d7ca2ed + 8193d72 commit 60a7d5a

63 files changed: 1067 additions and 307 deletions

.github/ISSUE_TEMPLATE/bug_report.yaml

Lines changed: 2 additions & 1 deletion
@@ -20,6 +20,7 @@ body:
  label: Reproducing the bug
  description: Being as specific as possible, what steps result in triggering this bug?
  value: |
+
  1. Go to ...
  2. Do ...
@@ -61,4 +62,4 @@ body:
  - type: markdown
  attributes:
  value: |
- Thanks for taking the time to fill out this bug report! 👍
+ Thanks for taking the time to fill out this bug report! 👍

Readme.md

Lines changed: 5 additions & 5 deletions
@@ -56,11 +56,11 @@ You can also find more information regarding Developer Documentation for Open Li

  ## Code Organization

- * openlibrary/core - core openlibrary functionality, imported and used by www
- * openlibrary/plugins - other models, controllers, and view helpers
- * openlibrary/views - views for rendering web pages
- * openlibrary/templates - all the templates used in the website
- * openlibrary/macros - macros are like templates, but can be called from wikitext
+ * [*openlibrary/core*](/openlibrary/core) - core openlibrary functionality, imported and used by www
+ * [*openlibrary/plugins*](/openlibrary/plugins) - other models, controllers, and view helpers
+ * [*openlibrary/views*](/openlibrary/views) - views for rendering web pages
+ * [*openlibrary/templates*](/openlibrary/templates) - all the templates used in the website
+ * [*openlibrary/macros*](/openlibrary/macros) - macros are like templates, but can be called from wikitext

  ## Architecture

Readme_chinese.md

Lines changed: 5 additions & 5 deletions
@@ -53,11 +53,11 @@

  ## 代码组成

- * openlibrary/core - 公共图书馆的核心功能,由www导入和使用
- * openlibrary/plugins - 其它模型、控制器和视图帮助器
- * openlibrary/views - 网页视图的呈现
- * openlibrary/templates - 所有在网页里使用的模板
- * openlibrary/macros - macros和模板类似,但可以被wikitext调用
+ * [*openlibrary/core*](/openlibrary/core) - 公共图书馆的核心功能,由www导入和使用
+ * [*openlibrary/plugins*](/openlibrary/plugins) - 其它模型、控制器和视图帮助器
+ * [*openlibrary/views*](/openlibrary/views) - 网页视图的呈现
+ * [*openlibrary/templates*](/openlibrary/templates) - 所有在网页里使用的模板
+ * [*openlibrary/macros*](/openlibrary/macros) - macros和模板类似,但可以被wikitext调用

  ## 结构

Readme_es.md

Lines changed: 5 additions & 5 deletions
@@ -54,11 +54,11 @@ También puedes encontrar más información sobre la Documentación para Desarro

  ## Organización del Código

- * openlibrary/core - funcionalidad central de Open Library, importada y utilizada por www
- * openlibrary/plugins - otros modelos, controladores y ayudantes de vista (view helpers)
- * openlibrary/views - vistas para renderizar páginas web
- * openlibrary/templates - todas las plantillas utilizadas en el sitio web
- * openlibrary/macros - los macros son similares a las plantillas, pero pueden ser llamados desde wikitext
+ * [*openlibrary/core*](/openlibrary/core) - funcionalidad central de Open Library, importada y utilizada por www
+ * [*openlibrary/plugins*](/openlibrary/plugins) - otros modelos, controladores y ayudantes de vista (view helpers)
+ * [*openlibrary/views*](/openlibrary/views) - vistas para renderizar páginas web
+ * [*openlibrary/templates*](/openlibrary/templates) - todas las plantillas utilizadas en el sitio web
+ * [*openlibrary/macros*](/openlibrary/macros) - los macros son similares a las plantillas, pero pueden ser llamados desde wikitext

  ## Arquitectura

Readme_vn.md

Lines changed: 5 additions & 5 deletions
@@ -52,11 +52,11 @@ Bạn cũng có thể tìm kiếm thêm thông tin về Tài liệu dành cho nh

  ## Tổ chức code

- * openlibrary/core - chức năng cốt lõi của openlibrary, được nhập và sử dụng bởi www
- * openlibrary/plugins - các mô hình, các bộ điều khiển và trình trợ giúp hiển thị khác.
- * openlibrary/views - các chế độ xem để hiển thị trang web
- * openlibrary/templates - tất cả những templates được dùng trên trang web.
- * openlibrary/macros - các macro tương tự như mẫu, nhưng có thể được gọi từ wikitext.
+ * [*openlibrary/core*](/openlibrary/core) - chức năng cốt lõi của openlibrary, được nhập và sử dụng bởi www
+ * [*openlibrary/plugins*](/openlibrary/plugins) - các mô hình, các bộ điều khiển và trình trợ giúp hiển thị khác.
+ * [*openlibrary/views*](/openlibrary/views) - các chế độ xem để hiển thị trang web
+ * [*openlibrary/templates*](/openlibrary/templates) - tất cả những templates được dùng trên trang web.
+ * [*openlibrary/macros*](/openlibrary/macros) - các macro tương tự như mẫu, nhưng có thể được gọi từ wikitext.

  ## Kiến trúc

compose.production.yaml

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ services:
  restart: unless-stopped
  hostname: "$HOSTNAME"
  environment:
- - GUNICORN_OPTS= --workers 1 --timeout 300 --max-requests 500
+ - GUNICORN_OPTS= --workers 2 --timeout 300 --max-requests 500
  - OL_CONFIG=/olsystem/etc/openlibrary.yml
  volumes:
  - ../booklending_utils:/booklending_utils

compose.staging.yaml

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ services:
  restart: unless-stopped
  hostname: "$HOSTNAME"
  environment:
- - GUNICORN_OPTS= --workers 1 --timeout 180 --max-requests 500
+ - GUNICORN_OPTS= --workers 2 --timeout 180 --max-requests 500
  - OL_CONFIG=/olsystem/etc/openlibrary.yml
  volumes:
  - ../olsystem:/olsystem

compose.yaml

Lines changed: 1 addition & 1 deletion
@@ -20,7 +20,7 @@ services:
  image: "${OLIMAGE:-oldev:latest}"
  environment:
  - OL_CONFIG=${OL_CONFIG:-/openlibrary/conf/openlibrary.yml}
- - GUNICORN_OPTS=${GUNICORN_OPTS:- --reload --workers 1 --timeout 180 --max-requests 500}
+ - GUNICORN_OPTS=${GUNICORN_OPTS:- --reload --workers 2 --timeout 180 --max-requests 500}
  command: docker/ol-web-fastapi-start.sh
  ports:
  - ${FAST_WEB_PORT:-18080}:8080

docker/nginx.conf

Lines changed: 35 additions & 10 deletions
@@ -32,6 +32,9 @@ http {
  include /olsystem/etc/nginx/logging.conf;
  access_log /var/log/nginx/access.log iacombined;

+ js_shared_dict_zone zone=crawler_ips:10M type=number timeout=300s;
+ js_import /olsystem/etc/nginx/tagger.js;
+
  client_max_body_size 50m;

  sendfile on;
@@ -52,20 +55,42 @@ http {
  # These rules only do anything if invoked, e.g., in web_nginx.conf.
  # TLDR: these rules can be disabled in `docker/web_nginx.conf`
  # and `docker/covers_nginx.conf`.
- geo $should_apply_limit {
-     # No rate limit when IP obfuscation is not applied, as every IP is 255.0.0.0.
-     255.0.0.0 0;
-     # In cluster traffic
-     207.241.224.0/20 0;
-     # All other traffic
-     default 1;
+ geo $is_blessed_ip {
+     255.0.0.0 1; # Internal
+     207.241.224.0/20 1; # In cluster traffic
+     default 0; # All other traffic
  }

- map $should_apply_limit $rate_limit_key {
-     0 '';
-     1 $binary_remote_addr;
+ # Provides $is_blessed_ua
+ include /olsystem/etc/nginx/is_blessed_ua.map;
+
+ map "$is_blessed_ip:$is_blessed_ua" $rate_limit_key {
+     "0:0" $binary_remote_addr; # Rate-limit by IP
+     default ''; # Don't rate-limit
  }

+ # check if user-agent provides a means of identification
+ map $http_user_agent $is_identifying_ua {
+     default 0;
+     "~*bot" 1;
+     "~*spider" 1;
+     "~*crawl" 1;
+     "~*google" 1; # sometimes just GoogleOther
+     "~*http" 1; # Includes url
+     "~*@" 1; # Includes email
+ }
+
+ js_set $has_hit_crawler_links tagger.check;
+
+ # The only crawlers we want to limit are the ones that don't identify themselves as such
+ map "$is_blessed_ip:$is_identifying_ua:$has_hit_crawler_links" $global_nonidentifying_crawler_rate_limit_key {
+     default ''; # No shared rate limiting
+     "0:0:1" '1'; # Shared rate limit
+ }
+
+ # Limit the crawlers that scrape links but don't ID themselves globally
+ limit_req_zone $global_nonidentifying_crawler_rate_limit_key zone=global_crawler_limit:5m rate=15r/s;
+
  # Matches other sites
  limit_req_zone $rate_limit_key zone=web_limit:10m rate=1r/s;
  # Higher rate for APIs since they are cheaper and we often hit them
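
The key selection in the new `map` blocks above can be read as a small decision table. The following is a minimal Python sketch of that logic, not the nginx implementation: the function names and the plain-string `remote_addr` (standing in for `$binary_remote_addr`) are assumptions made for illustration, and the values of `$is_blessed_ua` and `$has_hit_crawler_links` are taken as inputs because `is_blessed_ua.map` and `tagger.js` are not part of this diff.

```python
import re

# Patterns copied from the `$is_identifying_ua` map. In nginx, a "~*"
# prefix means a case-insensitive regular expression match.
IDENTIFYING_PATTERNS = ["bot", "spider", "crawl", "google", "http", "@"]


def is_identifying_ua(user_agent: str) -> int:
    """Return 1 if the user agent matches any identifying pattern, else 0."""
    matched = any(
        re.search(pattern, user_agent, re.IGNORECASE)
        for pattern in IDENTIFYING_PATTERNS
    )
    return int(matched)


def rate_limit_key(is_blessed_ip: int, is_blessed_ua: int, remote_addr: str) -> str:
    """Mirror `$rate_limit_key`: per-IP key only for the "0:0" case."""
    if (is_blessed_ip, is_blessed_ua) == (0, 0):
        return remote_addr  # rate-limit by IP
    return ""  # empty key: nginx does not account the request


def global_crawler_key(is_blessed_ip: int, identifying: int, hit_links: int) -> str:
    """Mirror `$global_nonidentifying_crawler_rate_limit_key` ("0:0:1" case)."""
    if (is_blessed_ip, identifying, hit_links) == (0, 0, 1):
        return "1"  # one shared key, so all such clients share one limit
    return ""  # no shared rate limiting


if __name__ == "__main__":
    googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    browser = "Mozilla/5.0 (X11; Linux x86_64; rv:128.0) Gecko/20100101 Firefox/128.0"
    print(is_identifying_ua(googlebot), is_identifying_ua(browser))
    print(repr(global_crawler_key(0, is_identifying_ua(browser), 1)))
```

Because every matching client maps to the same key `'1'`, the `global_crawler_limit` zone applies one 15r/s budget to all non-identifying crawlers combined, while `web_limit` keeps a separate budget per IP.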

docker/web_nginx.conf

Lines changed: 6 additions & 2 deletions
@@ -105,8 +105,12 @@ server {

  location / {
      limit_req zone=web_limit burst=100 delay=10;
+     limit_req zone=global_crawler_limit nodelay;
      limit_req_status 429;

+     js_set $is_crawler_link tagger.tag_crawler;
+     add_header X-SPS $is_crawler_link; # Need to reference the variable for the js method to be executed
+
      # For returning 200 when someone tries to randomly sort author results.
      if ($is_sus_random_sort) {
          return 200;
@@ -117,7 +121,7 @@ server {
      }

      if ($is_sus_referer) {
-         return 444;
+         return 403;
      }

      # Haproxy to better handle load/traffic
@@ -138,7 +142,7 @@ server {
      limit_req_status 429;

      if ($http_user_agent ~* (bytespider|meta-externalagent) ) {
-         return 444;
+         return 403;
      }

      # Haproxy to better handle load/traffic
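
The added `limit_req zone=global_crawler_limit nodelay;` line sets no `burst`, so requests beyond the zone's 15r/s rate are rejected (here with status 429) rather than queued. The sketch below is a simplified Python model of that leaky-bucket accounting for a single shared key, written under the assumption that the default burst is 0; the `SharedBucket` class and its fields are hypothetical names, and real nginx tracks time in milliseconds per worker-shared zone.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedBucket:
    """Simplified leaky bucket for one rate-limit key (an illustration only)."""

    rate: float                   # allowed requests per second
    burst: float = 0.0            # extra requests tolerated above the rate
    excess: float = 0.0           # current backlog, in requests
    last: Optional[float] = None  # time of the last accepted request, seconds

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at `now` is accepted."""
        if self.last is None:
            # The first request for this key is always accepted.
            self.last = now
            return True
        # Drain the backlog for the elapsed time, then add this request.
        excess = max(self.excess - self.rate * (now - self.last) + 1.0, 0.0)
        if excess > self.burst:
            return False  # rejected; state is left unchanged
        self.excess = excess
        self.last = now
        return True


if __name__ == "__main__":
    bucket = SharedBucket(rate=15.0)
    # Three requests from any mix of non-identifying crawlers, in seconds.
    print([bucket.allow(t) for t in (0.0, 0.01, 0.1)])
```

With a rate of 15 requests per second, a request arriving 10 ms after an accepted one is refused, while one arriving 100 ms later is accepted, regardless of which client sent it.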
