# - Signaled: pftaskqueue worker signaled and the task handler was interrupted
395
395
# - InternalError: pftaskqueue worker faced some error processing a task
396
396
reason: Succeeded
397
-
# Returned values from the task handlers.
398
-
# See 'Task Handler Specification' section for how pftaskqueue worker communicates
397
+
# Returned values from the task handlers.
398
+
# See 'Task Handler Specification' section for how pftaskqueue worker communicates
399
399
# with its task handler processes.
400
400
# max payload size varies by backend type to prevent overloading the backend.
401
401
# redis: 1KB
@@ -405,7 +405,7 @@ status:
405
405
# redis: 1KB
406
406
# If size exceeded, the contents will be truncated automatically
407
407
message: ""
408
-
# The two fields below are set if the worker that was processing the task was
408
+
# The two fields below are set if the worker that was processing the task was
409
409
# lost and salvaged by another worker.
410
410
# See "Worker lifecycle" section below for details.
411
411
# salvagedBy: <workerUID>
@@ -428,10 +428,10 @@ status:
428
428
429
429
When you queue a `TaskSpec`, `pftaskqueue` assigns a UID to it and generates a `Task` in the `Pending` phase. When a worker pulls a `Pending` task from the queue, the `Task` transitions to the `Received` phase. When the task handler process actually starts processing the `Task`, it transitions to the `Processing` phase.
430
430
431
-
Once the task handler process succeeds, the `Task` transitions to the `Succeeded` phase. If the task handler process fails, `pftaskqueue` can retry the task automatically according to `TaskSpec.retryLimit`. If the task has not reached its retry limit, `pftaskqueue` re-queues it by setting its phase to `Pending` again. Otherwise, `pftaskqueue` gives up retrying and marks the task as `Failed`. You can see the record of every processing attempt in the `Task` status.
431
+
Once the task handler process succeeds, the `Task` transitions to the `Succeeded` phase. If the task handler process fails, `pftaskqueue` can retry the task automatically according to `TaskSpec.retryLimit`. If the task has not reached its retry limit, `pftaskqueue` re-queues it by setting its phase to `Pending` again. Otherwise, `pftaskqueue` gives up retrying and marks the task as `Failed`. You can see the record of every processing attempt in the `Task` status.
432
432
433
433
If a worker is signaled, its tasks in the `Received` or `Processing` phase are treated as failures, and `pftaskqueue` applies the same automatic retry handling.
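
The phase transitions described above can be sketched as a small state machine. This is only an illustration, not `pftaskqueue`'s implementation: the `Task` class, the `retries_used` counter, and the helper functions are hypothetical names, and the exact comparison against the retry limit is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Illustrative stand-in for a Task; not the real pftaskqueue data model."""
    retry_limit: int                 # plays the role of TaskSpec.retryLimit
    phase: str = "Pending"           # a newly queued task starts as Pending
    retries_used: int = 0            # hypothetical counter used by this sketch
    history: list = field(default_factory=list)  # one entry per attempt


def receive(task: Task) -> None:
    """A worker pulled the task from the queue."""
    task.phase = "Received"


def start(task: Task) -> None:
    """The task handler process started working on the task."""
    task.phase = "Processing"


def finish(task: Task, succeeded: bool) -> None:
    """Record one attempt, then pick the next phase."""
    task.history.append("Succeeded" if succeeded else "Failed")
    if succeeded:
        task.phase = "Succeeded"
    elif task.retries_used < task.retry_limit:
        # Assumption: a retry is allowed while fewer than retry_limit
        # retries have been used; the task goes back to Pending.
        task.retries_used += 1
        task.phase = "Pending"
    else:
        task.phase = "Failed"


def run_attempt(task: Task, succeeded: bool) -> None:
    """Walk one full Pending -> Received -> Processing -> result cycle."""
    receive(task)
    start(task)
    finish(task, succeeded)
```

In this sketch, a signaled worker would be modeled by calling `finish(task, succeeded=False)` on a task that is still in the `Received` or `Processing` phase, so it follows the same retry path as an ordinary failure.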
# This value will be used when TaskSpec.timeoutSeconds is not set or 0.
518
518
defaultTimeout: 30m0s
519
519
# Task Handler Command
520
-
# A Worker spawns a process with the command for each received task
520
+
# A Worker spawns a process with the command for each received task
521
521
commands:
522
522
- cat
523
523
# Worker heartbeat configuration to detect worker process existence
524
524
# Please see "Worker lifecycle" section
525
525
heartBeat:
526
-
# A Worker process tries to update its Worker.Status.lastHeartBeatAt field
526
+
# A Worker process tries to update its Worker.Status.lastHeartBeatAt field
527
527
# stored in queue backend in this interval
528
528
interval: 2s
529
529
# A Worker.Status.lastHeartBeatAt will be determined "expired"
@@ -538,7 +538,7 @@ worker:
538
538
exitOnEmpty: false
539
539
# If exitOnEmpty is true, worker waits for exit in the grace period
540
540
exitOnEmptyGracePeriod: 10s
541
-
# If the value was positive, worker will exit
541
+
# If the value was positive, worker will exit
542
542
# after processing the number of tasks
543
543
numTasks: 1000
544
544
# Base directory to create workspace for task handler processes
@@ -597,10 +597,10 @@ status:
597
597
+------------+
598
598
```
599
599
600
-
Once a worker starts, it is in the `Running` phase. During startup, the worker registers itself to the queue and gets its UID, which identifies the worker. If the worker exits normally (with `exit-code=0`), it transitions to the `Succeeded` phase. If `exit-code` is not 0, it transitions to the `Failed` phase.
600
+
Once a worker starts, it is in the `Running` phase. During startup, the worker registers itself to the queue and gets its UID, which identifies the worker. If the worker exits normally (with `exit-code=0`), it transitions to the `Succeeded` phase. If `exit-code` is not 0, it transitions to the `Failed` phase.
601
601
602
602
However, a worker process can disappear for various reasons (`SIGKILL`-ed, `OOMKiller`, etc.). How can the state of such workers be detected? `pftaskqueue` applies a simple timeout-based heuristic. While it runs, a worker process keeps sending heartbeats to the queue at the configured interval by updating its `Status.lastHeartBeatAt` field. If the heartbeat becomes older than the configured expiration duration, the worker is considered to be in the `Lost` state (`phase=Failed, reason=Lost`). Moreover, when a worker detects that its own heartbeat has expired, it exits by itself and waits to be salvaged by another worker.
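
The expiration check itself is a plain timestamp comparison. A minimal sketch, assuming the expiration duration is available as a `timedelta`; the function name `is_lost` and its parameters are hypothetical and do not come from `pftaskqueue`:

```python
from datetime import datetime, timedelta


def is_lost(last_heartbeat_at: datetime, now: datetime,
            expiration: timedelta) -> bool:
    """Return True when the last heartbeat is older than the expiration duration."""
    return now - last_heartbeat_at > expiration


# Example values chosen only for illustration.
last_beat = datetime(2020, 1, 1, 12, 0, 0)
still_alive = is_lost(last_beat, last_beat + timedelta(seconds=4),
                      expiration=timedelta(seconds=10))
gone = is_lost(last_beat, last_beat + timedelta(seconds=30),
               expiration=timedelta(seconds=10))
```

A worker could run the same check against its own `Status.lastHeartBeatAt` to decide that it must exit and wait for salvation.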
603
-
603
+
604
604
On every startup, a worker tries to find `Lost` workers that are safe to salvage. `pftaskqueue` uses a simple timeout-based heuristic for salvation, too. If `Worker.HeartBeat.SalvagedDuration` has passed since a worker's heartbeat expired, that worker becomes a salvation target. Once the starting worker finds salvation targets, it salvages them. "Salvation" means
605
605
606
606
- marks the target as `Salvaged` (`phase=Failed, reason=Salvaged`)
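
The salvation-target rule can be sketched as a second timeout layered on top of heartbeat expiration. This is an assumption-laden illustration: `is_salvage_target` and its parameters are hypothetical names, `salvage_duration` stands in for `Worker.HeartBeat.SalvagedDuration`, and treating the boundary as inclusive is a guess.

```python
from datetime import datetime, timedelta


def is_salvage_target(last_heartbeat_at: datetime, now: datetime,
                      expiration: timedelta,
                      salvage_duration: timedelta) -> bool:
    """Return True when salvage_duration has passed since the heartbeat expired."""
    expired_at = last_heartbeat_at + expiration  # moment the heartbeat expired
    return now - expired_at >= salvage_duration
```

Under this sketch a worker whose heartbeat has expired is first only `Lost`, and becomes eligible for salvation once the additional duration has elapsed.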
`pftaskqueue` has many configuration parameters and provides multiple ways to configure them. It reads configuration parameters in the following precedence order. Each item takes precedence over the item below it:
694
694
@@ -745,16 +745,16 @@ redis:
745
745
# key prefix of redis database
746
746
# all the keys used by pftaskqueue are prefixed by '_pftaskqueue:{keyPrefix}:'
747
747
keyPrefix: omura
748
-
748
+
749
749
# redis server information(addr, password, db)
750
750
addr: ""
751
751
password: ""
752
752
db: 0
753
-
753
+
754
754
#
755
755
# timeout/connection pool setting
756
756
# see also: https://github.com/go-redis/redis/blob/a579d58c59af2f8cefbb7f90b8adc4df97f4fd8f/options.go#L59-L95
757
-
#
757
+
#
758
758
dialTimeout: 5s
759
759
readTimeout: 3s
760
760
writeTimeout: 3s
@@ -764,9 +764,9 @@ redis:
764
764
poolTimeout: 4s
765
765
idleTimeout: 5m0s
766
766
idleCheckFrequency: 1m0s
767
-
767
+
768
768
#
769
-
# pftaskqueue retries when a redis operation fails
769
+
# pftaskqueue retries when a redis operation fails