# - Signaled: pftaskqueue worker signaled and the task handler was interrupted
388
395
# - InternalError: pftaskqueue worker faced some error processing a task
389
396
reason: Succeeded
390
-
# Returned values from the task handlers.
391
-
# See 'Task Handler Specification' section for how pftaskqueue worker communicates
397
+
# Returned values from the task handlers.
398
+
# See 'Task Handler Specification' section for how pftaskqueue worker communicates
392
399
# with its task handler processes.
393
400
# The maximum payload size varies by backend type to prevent overloading the backend.
394
401
# redis: 1KB
@@ -398,7 +405,7 @@ status:
398
405
# redis: 1KB
399
406
# If size exceeded, the contents will be truncated automatically
400
407
message: ""
401
-
# The two fields below will be set if the worker which processed the task was
408
+
# The two fields below will be set if the worker which processed the task was
402
409
# lost and salvaged by another worker.
403
410
# See "Worker lifecycle" section below for details.
404
411
# salvagedBy: <workerUID>
@@ -421,10 +428,10 @@ status:
421
428
422
429
When you queue a `TaskSpec`, `pftaskqueue` assigns a UID to it and generates a `Task` in the `Pending` phase. When a worker pulls a `Pending` task from the queue, the `Task` transitions to the `Received` phase. When the task handler process actually starts processing the `Task`, it transitions to the `Processing` phase.
423
430
424
-
Once the task handler process succeeds, the `Task` transitions to the `Succeeded` phase. If the task handler process fails, `pftaskqueue` can retry automatically according to `TaskSpec.retryLimit`. If the task handler process failed and the task has not reached its retry limit, `pftaskqueue` re-queues the task by setting its phase to `Pending` again. Otherwise `pftaskqueue` gives up retrying and marks the task as `Failed`. You can see the full processing record in the `Task` status.
431
+
Once the task handler process succeeds, the `Task` transitions to the `Succeeded` phase. If the task handler process fails, `pftaskqueue` can retry automatically according to `TaskSpec.retryLimit`. If the task handler process failed and the task has not reached its retry limit, `pftaskqueue` re-queues the task by setting its phase to `Pending` again. Otherwise `pftaskqueue` gives up retrying and marks the task as `Failed`. You can see the full processing record in the `Task` status.
425
432
426
433
If the worker is signaled, tasks in the `Received` or `Processing` phase are treated as failures and `pftaskqueue` applies the automatic retry logic to them.
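
The phase transitions described above can be sketched as a small state machine. This is only an illustration, not `pftaskqueue`'s implementation: the `Task` class, its `failures` counter, and the exact comparison against the retry limit (a task is re-queued while `failures <= retry_limit`) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    retry_limit: int            # stands in for TaskSpec.retryLimit
    phase: str = "Pending"      # a freshly queued task starts as Pending
    failures: int = 0           # hypothetical counter, not a real pftaskqueue field
    history: List[str] = field(default_factory=lambda: ["Pending"])

    def _move(self, phase: str) -> None:
        self.phase = phase
        self.history.append(phase)

    def receive(self) -> None:
        # A worker pulled the task from the queue.
        if self.phase != "Pending":
            raise ValueError(f"cannot receive a task in phase {self.phase}")
        self._move("Received")

    def start(self) -> None:
        # The task handler process started working on the task.
        if self.phase != "Received":
            raise ValueError(f"cannot start a task in phase {self.phase}")
        self._move("Processing")

    def succeed(self) -> None:
        if self.phase != "Processing":
            raise ValueError(f"cannot succeed a task in phase {self.phase}")
        self._move("Succeeded")

    def fail(self) -> None:
        # Covers both a failed handler and a signaled worker, which is why
        # a task may fail while still in the Received phase.
        if self.phase not in ("Received", "Processing"):
            raise ValueError(f"cannot fail a task in phase {self.phase}")
        self.failures += 1
        # Assumption: retry while the failure count is within the limit.
        if self.failures <= self.retry_limit:
            self._move("Pending")
        else:
            self._move("Failed")
```

With `retry_limit=1`, a task that fails twice goes back to `Pending` once and then ends in `Failed`, and `history` records every phase it passed through.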
# This value will be used when TaskSpec.timeoutSeconds is not set or 0.
511
518
defaultTimeout: 30m0s
512
519
# Task Handler Command
513
-
# A Worker spawns a process with the command for each received task
520
+
# A Worker spawns a process with the command for each received task
514
521
commands:
515
522
- cat
516
523
# Worker heartbeat configuration to detect worker process existence
517
524
# Please see "Worker lifecycle" section
518
525
heartBeat:
519
-
# A Worker process tries to update its Worker.Status.lastHeartBeatAt field
526
+
# A Worker process tries to update its Worker.Status.lastHeartBeatAt field
520
527
# stored in queue backend in this interval
521
528
interval: 2s
522
529
# A Worker.Status.lastHeartBeatAt will be determined "expired"
@@ -531,7 +538,7 @@ worker:
531
538
exitOnEmpty: false
532
539
# If exitOnEmpty is true, worker waits for exit in the grace period
533
540
exitOnEmptyGracePeriod: 10s
534
-
# If the value is positive, the worker will exit
541
+
# If the value is positive, the worker will exit
535
542
# after processing the number of tasks
536
543
numTasks: 1000
537
544
# Base directory to create workspace for task handler processes
@@ -590,10 +597,10 @@ status:
590
597
+------------+
591
598
```
592
599
593
-
A worker starts in the `Running` phase. On startup, the worker registers itself to the queue and gets its UID, which becomes the worker's identifier. If the worker exits normally (with `exit-code=0`), it transitions to the `Succeeded` phase. If `exit-code` is not 0, it transitions to the `Failed` phase.
600
+
A worker starts in the `Running` phase. On startup, the worker registers itself to the queue and gets its UID, which becomes the worker's identifier. If the worker exits normally (with `exit-code=0`), it transitions to the `Succeeded` phase. If `exit-code` is not 0, it transitions to the `Failed` phase.
594
601
595
602
However, a worker process can disappear for various reasons (`SIGKILL`-ed, `OOMKiller`, etc.). How can the state of such workers be detected? `pftaskqueue` applies a simple timeout-based heuristic. While it runs, a worker process keeps sending heartbeats to the queue at the configured interval by updating its `Status.lastHeartBeatAt` field. If the heartbeat becomes older than the configured expiration duration, the worker is considered to be in the `Lost` state (`phase=Failed, reason=Lost`). Moreover, when a worker detects that its own heartbeat has expired, it exits by itself and waits to be salvaged by another worker.
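
As a rough illustration of the heartbeat heuristic, the check below compares the age of the last heartbeat against an expiration duration. The function name, its parameters, and the strict "older than" comparison are assumptions for this sketch, not `pftaskqueue`'s actual code.

```python
from datetime import datetime, timedelta, timezone


def is_heartbeat_expired(
    last_heartbeat_at: datetime,  # stands in for Status.lastHeartBeatAt
    expiration: timedelta,        # the configured expiration duration
    now: datetime,
) -> bool:
    """Return True if the last heartbeat is older than the expiration duration."""
    return now - last_heartbeat_at > expiration


if __name__ == "__main__":
    last = datetime(2024, 1, 1, tzinfo=timezone.utc)
    # A heartbeat 11 seconds old with a 10 second expiration counts as expired.
    print(is_heartbeat_expired(last, timedelta(seconds=10), last + timedelta(seconds=11)))
```

A worker could run the same check against its own heartbeat to decide to exit, which matches the self-exit behavior described above.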
596
-
603
+
597
604
On every startup, a worker tries to find `Lost` workers that are safe to salvage. `pftaskqueue` uses a simple timeout-based heuristic for salvation, too. If `Worker.HeartBeat.SalvagedDuration` has passed since a worker's heartbeat expired, that worker becomes a salvation target. Once the starting worker finds salvation targets, it salvages them. "Salvation" means
598
605
599
606
- marks the target `Salvaged` phase (`phase=Failed, reason=Salvaged`)
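
The salvation-target heuristic described above can be sketched as a single time comparison. This is a minimal sketch under stated assumptions: the function and parameter names are hypothetical, `salvage_duration` stands in for `Worker.HeartBeat.SalvagedDuration`, and the heartbeat is assumed to expire at `last_heartbeat_at + expiration`.

```python
from datetime import datetime, timedelta, timezone


def is_salvage_target(
    last_heartbeat_at: datetime,  # stands in for Status.lastHeartBeatAt
    expiration: timedelta,        # the configured heartbeat expiration duration
    salvage_duration: timedelta,  # stands in for Worker.HeartBeat.SalvagedDuration
    now: datetime,
) -> bool:
    """Return True once salvage_duration has passed since the heartbeat expired."""
    expired_at = last_heartbeat_at + expiration  # assumed expiration instant
    return now >= expired_at + salvage_duration


if __name__ == "__main__":
    last = datetime(2024, 1, 1, tzinfo=timezone.utc)
    print(is_salvage_target(last, timedelta(seconds=10), timedelta(seconds=15),
                            last + timedelta(seconds=30)))
```

The extra waiting period after expiration gives a lost worker time to notice its own expired heartbeat and exit before another worker takes over its tasks.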
`pftaskqueue` has many configuration parameters and provides multiple ways to set them. It reads configuration parameters in the following precedence order. Each item takes precedence over the item below it:
687
694
@@ -738,16 +745,16 @@ redis:
738
745
# key prefix of redis database
739
746
# all keys used by pftaskqueue are prefixed with '_pftaskqueue:{keyPrefix}:'
740
747
keyPrefix: omura
741
-
748
+
742
749
# redis server information(addr, password, db)
743
750
addr: ""
744
751
password: ""
745
752
db: 0
746
-
753
+
747
754
#
748
755
# timeout/connection pool setting
749
756
# see also: https://github.com/go-redis/redis/blob/a579d58c59af2f8cefbb7f90b8adc4df97f4fd8f/options.go#L59-L95
750
-
#
757
+
#
751
758
dialTimeout: 5s
752
759
readTimeout: 3s
753
760
writeTimeout: 3s
@@ -757,9 +764,9 @@ redis:
757
764
poolTimeout: 4s
758
765
idleTimeout: 5m0s
759
766
idleCheckFrequency: 1m0s
760
-
767
+
761
768
#
762
-
# pftaskqueue will retry when a redis operation fails
769
+
# pftaskqueue will retry when a redis operation fails