Commit 6231ee7

Tweak worker and queue config
1 parent e1bf411 commit 6231ee7

2 files changed: +23 -8 lines changed

appengine/queue.yaml

Lines changed: 21 additions & 6 deletions
@@ -1,17 +1,32 @@
-total_storage_limit: 2000M
+total_storage_limit: 2.0G
+
+# General notes:
+# It appears that appengine scales on the basis of whether SOME of the instances
+# are running hot. This means that if just one or two instances have high cpu
+# for a few minutes, more instances will be started. The cpu utilization
+# closely reflects the number of concurrent tasks, together with the percentage
+# of time spent blocked on annotation and insertion requests.
+#
+# This makes it difficult to achieve both stability and high cpu utilization.
+# Could we monitor the cpu utilization, or perhaps the time each task spends
+# blocked on I/O? If each task is spending 50% of wall time blocked on I/O,
+# then 4 tasks should be enough to produce fairly good utilization? We want
+# to reject additional tasks if they would push us over utilization target,
+# so that they can be directed to other instances.
 
 queue:
 - name: etl-ndt-queue
   target: etl-ndt-parser
   # Average rate at which to release tasks to the service. Default is 5/sec
   # This is actually the rate at which tokens are added to the bucket.
-  # 1.0 allow processing a day's data (about 11K tasks) in 3 to 4 hours.
-  rate: 1.0/s
+  # 1.0 allow processing a day's data (about 16K tasks) in about 4 hours.
+  # 0.3 keeps the load close to 2 instances, processing whole day in about 14 hours.
+  rate: 0.3/s
   # Number of tokens that can accumulate in the bucket. Default is 5. This should
   # have very little impact for our environment.
-  bucket_size: 10
-  # Maximum number of concurrent requests.
-  max_concurrent_requests: 360
+  bucket_size: 20 # To quickly fill the minimum two instances.
+  # Maximum number of concurrent requests. Should be 0.9 * max concurrent tasks.
+  max_concurrent_requests: 110 # For max of 10 instances, 12 workers per instance.
 
 - name: etl-ndt-batch-queue
   target: etl-ndt-batch-parser
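
Review note: the drain times quoted in the `rate` comments can be sanity-checked with a little arithmetic. The sketch below assumes the queue stays saturated, so tasks are released at exactly the configured token rate, and it takes the task counts (about 11K and about 16K) from the comments. `drain_hours` is a hypothetical helper written for this note, not something in the repository.

```python
def drain_hours(tasks: int, rate_per_sec: float) -> float:
    """Hours needed to release `tasks` tasks at a steady `rate_per_sec`.

    Assumes the bucket never runs dry of queued tasks, so the long-run
    release rate equals the token refill rate.
    """
    if rate_per_sec <= 0:
        raise ValueError("rate_per_sec must be positive")
    return tasks / rate_per_sec / 3600.0


if __name__ == "__main__":
    # (task count, rate) pairs taken from the old and new comments.
    for tasks, rate in [(11_000, 1.0), (16_000, 1.0), (16_000, 0.3)]:
        print(f"{tasks} tasks at {rate}/s: {drain_hours(tasks, rate):.1f} h")
```

Under these assumptions, 16K tasks take roughly 4.4 hours at 1.0/s and roughly 14.8 hours at 0.3/s, which is consistent with the "about 4 hours" and "about 14 hours" in the new comments.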

cmd/etl_worker/app-ndt.yaml

Lines changed: 2 additions & 2 deletions
@@ -18,7 +18,7 @@ resources:
 automatic_scaling:
   # We expect fairly steady load, so a modest minimum will rarely cost us anything.
   min_num_instances: 2
-  max_num_instances: 20
+  max_num_instances: 10
   # Very long cool down period, to reduce the likelihood of tasks being truncated.
   cool_down_period_sec: 1800
   # We don't care much about latency, so a high utilization is desireable.
@@ -37,7 +37,7 @@ network:
     - 9090/tcp
 
 env_variables:
-  MAX_WORKERS: 20
+  MAX_WORKERS: 12
   BIGQUERY_PROJECT: 'mlab-sandbox'
   BIGQUERY_DATASET: 'mlab_sandbox'
   ANNOTATE_IP: 'true'
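
Review note: the two files are coupled through the "0.9 * max concurrent tasks" rule stated in `queue.yaml`, so it may help to check the numbers together. The sketch below applies that rule to `max_num_instances` and `MAX_WORKERS`. It also gives a rough reading of the "50% blocked on I/O, 4 tasks" question from the general notes. That second part is purely an assumption of this note: it models tasks as independent and the instance as a single CPU, neither of which the commit states. Both function names are hypothetical.

```python
def concurrency_cap(max_instances: int, workers_per_instance: int,
                    headroom: float = 0.9) -> float:
    """Queue-wide concurrent request cap: headroom * total worker slots."""
    return headroom * max_instances * workers_per_instance


def cpu_busy_fraction(tasks: int, blocked_fraction: float) -> float:
    """Chance that at least one of `tasks` tasks is runnable.

    Assumes independent tasks and a single CPU (an illustrative model,
    not a measurement of the etl workers).
    """
    if not 0.0 <= blocked_fraction <= 1.0:
        raise ValueError("blocked_fraction must be in [0, 1]")
    return 1.0 - blocked_fraction ** tasks


if __name__ == "__main__":
    print(f"old cap: {concurrency_cap(20, 20):.0f}")  # 20 instances, 20 workers
    print(f"new cap: {concurrency_cap(10, 12):.0f}")  # 10 instances, 12 workers
    print(f"busy fraction, 4 tasks at 50% blocked: {cpu_busy_fraction(4, 0.5):.4f}")
```

With the old settings (20 instances, 20 workers) the rule gives 360, matching the previous `max_concurrent_requests`. With the new settings (10 instances, 12 workers) it gives 108, which the commit rounds up to 110. Under the single-CPU, independent-task assumption, four tasks that each block half the time would keep the CPU busy about 94% of the time.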
