Keep worker threads alive to wait for jobs #97

@grst

Given a DAG like the one below, running paraffin worker --jobs 8 executes the first job, while the other seven worker threads exit due to a timeout.
As a result, all remaining jobs are executed sequentially on the one surviving worker thread, even though they could run in parallel once the root node has finished.

flowchart TD
        node1["stage1/dvc.yaml:stage1"]
        node2["stage2a/dvc.yaml:stage1"]
        node3["stage2b/dvc.yaml:stage1"]
        node4["stage2c/dvc.yaml:stage1"]
        node5["stage2d/dvc.yaml:stage1"]
        node1-->node2
        node1-->node3
        node1-->node4
        node1-->node5
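To make the available parallelism concrete, here is a minimal sketch that feeds the same graph to `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). It is not paraffin code; the `dag` dict and the `ready_batches` helper are hypothetical names used only for illustration.

```python
from graphlib import TopologicalSorter

ROOT = "stage1/dvc.yaml:stage1"

# Job -> set of jobs it depends on, mirroring the flowchart above.
dag = {
    ROOT: set(),
    "stage2a/dvc.yaml:stage1": {ROOT},
    "stage2b/dvc.yaml:stage1": {ROOT},
    "stage2c/dvc.yaml:stage1": {ROOT},
    "stage2d/dvc.yaml:stage1": {ROOT},
}


def ready_batches(graph):
    """Return the jobs grouped into batches that could run concurrently."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Everything returned by get_ready() has all prerequisites finished.
        batch = sorted(sorter.get_ready())
        batches.append(batch)
        # Mark the whole batch as finished to unlock the next one.
        sorter.done(*batch)
    return batches
```

Calling `ready_batches(dag)` yields two batches: the root job alone, followed by the four `stage2*` jobs together. With at least four live workers, the second batch could therefore run fully in parallel.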

Repex

git clone https://github.com/grst/paraffin-repex.git
cd paraffin-repex
paraffin submit
paraffin worker --jobs 8

Log

INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Running job 'stage1/dvc.yaml:stage1'
INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Timeout reached - exiting.
INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Timeout reached - exiting.
INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Timeout reached - exiting.
INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Timeout reached - exiting.
INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Timeout reached - exiting.
INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Timeout reached - exiting.
Running stage 'stage1/dvc.yaml:stage1':
> sleep 10
INFO:paraffin.cli:Listening on queues: ['default']
INFO:paraffin.cli:Timeout reached - exiting.
> touch stage1.txt
WARNING: 'stage1/stage1.txt' is empty.
Generating lock file 'stage1/dvc.lock'
Updating lock file 'stage1/dvc.lock'

To track the changes with git, run:

        git add stage1/dvc.lock

To enable auto staging, run:

        dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
WARNING: 'stage1/stage1.txt' is empty.
WARNING:dvc.output:'stage1/stage1.txt' is empty.
INFO:paraffin.cli:Running job 'stage2d/dvc.yaml:stage1'
WARNING: 'stage1/stage1.txt' is empty.
Running stage 'stage2d/dvc.yaml:stage1':
> sleep 10
WARNING: 'stage1/stage1.txt' is empty.
Generating lock file 'stage2d/dvc.lock'
Updating lock file 'stage2d/dvc.lock'

To track the changes with git, run:

        git add stage2d/dvc.lock

To enable auto staging, run:

        dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
WARNING: 'stage1/stage1.txt' is empty.
WARNING:dvc.output:'stage1/stage1.txt' is empty.
INFO:paraffin.cli:Running job 'stage2c/dvc.yaml:stage1'
WARNING: 'stage1/stage1.txt' is empty.
Running stage 'stage2c/dvc.yaml:stage1':
> sleep 10
WARNING: 'stage1/stage1.txt' is empty.
Generating lock file 'stage2c/dvc.lock'
Updating lock file 'stage2c/dvc.lock'

To track the changes with git, run:

        git add stage2c/dvc.lock

To enable auto staging, run:

        dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
WARNING: 'stage1/stage1.txt' is empty.
WARNING:dvc.output:'stage1/stage1.txt' is empty.
INFO:paraffin.cli:Running job 'stage2b/dvc.yaml:stage1'
WARNING: 'stage1/stage1.txt' is empty.
Running stage 'stage2b/dvc.yaml:stage1':
> sleep 10
WARNING: 'stage1/stage1.txt' is empty.
Generating lock file 'stage2b/dvc.lock'
Updating lock file 'stage2b/dvc.lock'

To track the changes with git, run:

        git add stage2b/dvc.lock

To enable auto staging, run:

        dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
WARNING: 'stage1/stage1.txt' is empty.
WARNING:dvc.output:'stage1/stage1.txt' is empty.
INFO:paraffin.cli:Running job 'stage2a/dvc.yaml:stage1'
WARNING: 'stage1/stage1.txt' is empty.
Running stage 'stage2a/dvc.yaml:stage1':
> sleep 10
WARNING: 'stage1/stage1.txt' is empty.
Generating lock file 'stage2a/dvc.lock'
Updating lock file 'stage2a/dvc.lock'

To track the changes with git, run:

        git add stage2a/dvc.lock

To enable auto staging, run:

        dvc config core.autostage true
Use `dvc push` to send your updates to remote storage.
WARNING: 'stage1/stage1.txt' is empty.
WARNING:dvc.output:'stage1/stage1.txt' is empty.
INFO:paraffin.cli:Timeout reached - exiting.
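
The behaviour requested in the title could look roughly like the sketch below. It uses only the Python standard library and makes no claim about how paraffin's workers or queues are implemented: `run_dag`, `run_job`, and `EXAMPLE_DAG` are hypothetical names. The idea is that idle workers block on the queue without a timeout and exit only when a sentinel tells them that the whole DAG is finished.

```python
import queue
import threading
from graphlib import TopologicalSorter

# Hypothetical stand-in for the DAG in this issue: job -> prerequisite jobs.
EXAMPLE_DAG = {
    "stage1": set(),
    "stage2a": {"stage1"},
    "stage2b": {"stage1"},
    "stage2c": {"stage1"},
    "stage2d": {"stage1"},
}


def run_dag(dag, n_workers, run_job):
    """Run every job in `dag` on threads that stay alive until the DAG is done.

    Error handling is omitted: `run_job` is assumed not to raise.
    """
    sorter = TopologicalSorter(dag)
    sorter.prepare()
    ready = queue.Queue()     # jobs whose prerequisites have finished
    finished = queue.Queue()  # jobs reported back by the workers

    def worker():
        while True:
            job = ready.get()  # block without a timeout while idle
            if job is None:    # sentinel: no more work will ever arrive
                return
            run_job(job)
            finished.put(job)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for thread in threads:
        thread.start()

    while sorter.is_active():
        # Hand out everything that has become runnable.
        for job in sorter.get_ready():
            ready.put(job)
        # Wait for one job to complete, which may unlock further jobs.
        sorter.done(finished.get())

    # All jobs are done: release every worker with one sentinel each.
    for _ in threads:
        ready.put(None)
    for thread in threads:
        thread.join()
```

With `run_dag(EXAMPLE_DAG, 8, run_job)`, one worker picks up `stage1` while the other seven wait; once it completes, four of the waiting workers pick up the `stage2*` jobs at the same time. A real fix would still need some upper bound on idle time, for example exiting only when no job is queued or running.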
