Commit 1e9f970

Initial commit
0 parents  commit 1e9f970

26 files changed: +980 -0 lines changed

.gitignore

+1
@@ -0,0 +1 @@
venv/

LICENSE

+21
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 Parnassus Labs, Inc.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Makefile

+3
@@ -0,0 +1,3 @@
.PHONY: test
test:
	pytest dramatiq_workflow

README.md

+96
@@ -0,0 +1,96 @@
# dramatiq-workflow

`dramatiq-workflow` allows running workflows (chains and groups of tasks) using
the Python background task processing library [dramatiq](https://dramatiq.io/).

A workflow runs tasks in parallel and in sequence. It is defined as a
combination of chains and groups of tasks, in any order and nested as
needed.

## Features

- Define workflows with tasks running in parallel and in sequence.
- Nest chains and groups of tasks to create complex workflows.
- Schedule workflows to run in the background using dramatiq.

**Note:** `dramatiq-workflow` does not support passing the results from one task
to the next one in a chain. We recommend using a database to store intermediate
results if needed.
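
One way to follow the recommendation in the note is to have each task write its output to a shared table keyed by a run identifier and have the next task read it back. The sketch below is an illustration under assumptions, not part of the library: it uses SQLite from the standard library, plain functions stand in for dramatiq actors, and `init_store`, `save_result`, `load_result`, and the `run_id` argument are hypothetical names.

```python
import json
import sqlite3


# Hypothetical helpers; dramatiq-workflow does not provide these.
def init_store(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "run_id TEXT, step TEXT, payload TEXT, PRIMARY KEY (run_id, step))"
    )


def save_result(conn, run_id, step, value) -> None:
    # INSERT OR REPLACE means a retried step overwrites its earlier result.
    conn.execute(
        "INSERT OR REPLACE INTO results (run_id, step, payload) VALUES (?, ?, ?)",
        (run_id, step, json.dumps(value)),
    )
    conn.commit()


def load_result(conn, run_id, step):
    row = conn.execute(
        "SELECT payload FROM results WHERE run_id = ? AND step = ?",
        (run_id, step),
    ).fetchone()
    if row is None:
        raise KeyError(f"no result for {step!r} in run {run_id!r}")
    return json.loads(row[0])


# Plain functions standing in for actors. Only the run_id would travel in
# the message arguments; the data itself lives in the database.
def fetch_numbers(conn, run_id):
    save_result(conn, run_id, "fetch_numbers", [1, 2, 3])


def sum_numbers(conn, run_id):
    numbers = load_result(conn, run_id, "fetch_numbers")
    save_result(conn, run_id, "sum_numbers", sum(numbers))


conn = sqlite3.connect(":memory:")
init_store(conn)
fetch_numbers(conn, "run-1")
sum_numbers(conn, "run-1")
total = load_result(conn, "run-1", "sum_numbers")
```

The in-memory database is only for demonstration. Workers usually run in separate processes, so a real setup would need a database that all of them can reach.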

## Installation

You can install `dramatiq-workflow` from PyPI:

```sh
pip install dramatiq-workflow
```

## Example

Let's assume we want a workflow that looks like this:

```text
             ╭────────╮  ╭────────╮
             │ Task 2 │  │ Task 5 │
          ╭──┼●      ●┼──┼●      ●┼╮
╭────────╮│  ╰────────╯  ╰────────╯│  ╭────────╮
│ Task 1 ││  ╭────────╮            │  │ Task 8 │
│       ●┼╯  │ Task 3 │            ╰──┼●       │
│       ●┼───┼●      ●┼───────────────┼●       │
│       ●┼╮  ╰────────╯             ╭─┼●       │
╰────────╯│  ╭────────╮   ╭────────╮│╭┼●       │
          │  │ Task 4 │   │ Task 6 │││╰────────╯
          ╰──┼●      ●┼───┼●      ●┼╯│
             │       ●┼╮  ╰────────╯ │
             ╰────────╯│             │
                       │  ╭────────╮ │
                       │  │ Task 7 │ │
                       ╰──┼●      ●┼─╯
                          ╰────────╯
```

We can define this workflow as follows:

```python
from dramatiq_workflow import Workflow, Chain, Group

workflow = Workflow(
    Chain(
        task1.message(),
        Group(
            Chain(
                task2.message(),
                task5.message(),
            ),
            task3.message(),
            Chain(
                task4.message(),
                Group(
                    task6.message(),
                    task7.message(),
                ),
            ),
        ),
        task8.message(),
    ),
)
workflow.run()  # Schedules the workflow to run in the background
```

### Execution Order

In this example, the execution would look like this:

1. Task 1 runs
2. Tasks 2, 3, and 4 run in parallel once Task 1 finishes
3. Task 5 runs once Task 2 finishes
4. Tasks 6 and 7 run in parallel once Task 4 finishes
5. Task 8 runs once Tasks 3, 5, 6, and 7 finish

*This is a simplified example. The actual execution order may vary because
tasks that can run in parallel (i.e., in a Group) are not guaranteed to run in
the order they are defined in the workflow.*
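
The ordering in the diagram can be derived mechanically from the nesting of chains and groups. The sketch below is a simplified model and not the library's scheduling mechanism: `Chain` and `Group` here are local stand-ins rather than the classes exported by `dramatiq_workflow`, tasks are plain strings, and `wire` is a hypothetical helper that records which tasks each task waits for.

```python
class Chain:
    """Stand-in: children run one after another."""

    def __init__(self, *tasks):
        self.tasks = tasks


class Group:
    """Stand-in: children may run in parallel."""

    def __init__(self, *tasks):
        self.tasks = tasks


def wire(node, waits_for, deps):
    """Fill `deps` with each task's prerequisites; return the tasks that finish `node`."""
    if isinstance(node, str):  # a leaf task, identified by name
        deps[node] = set(waits_for)
        return {node}
    if isinstance(node, Chain):
        # Each child waits for whatever finished the previous child.
        for child in node.tasks:
            waits_for = wire(child, waits_for, deps)
        return waits_for
    if isinstance(node, Group):
        # Every child waits for the same predecessors; the group is done
        # when all of its children are done.
        finished = set()
        for child in node.tasks:
            finished |= wire(child, waits_for, deps)
        return finished or set(waits_for)
    raise TypeError(f"Unsupported node type: {type(node)}")


workflow = Chain(
    "task1",
    Group(
        Chain("task2", "task5"),
        "task3",
        Chain("task4", Group("task6", "task7")),
    ),
    "task8",
)
deps = {}
wire(workflow, set(), deps)
```

In this model `deps["task8"]` contains `task3`, `task5`, `task6`, and `task7`, which matches the four edges that enter Task 8 in the diagram.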

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

dramatiq_workflow/__init__.py

+13
@@ -0,0 +1,13 @@
from ._base import Workflow
from ._middleware import WorkflowMiddleware
from ._models import Chain, Group, Message, WithDelay, WorkflowType

__all__ = [
    "Chain",
    "Group",
    "Message",
    "WithDelay",
    "Workflow",
    "WorkflowMiddleware",
    "WorkflowType",
]
Binary file not shown.

dramatiq_workflow/_barrier.py

+38
@@ -0,0 +1,38 @@
import dramatiq.rate_limits


class AtMostOnceBarrier(dramatiq.rate_limits.Barrier):
    """
    The AtMostOnceBarrier is a barrier that ensures that it is released at most
    once.

    We use this because we want to avoid running callbacks in chains multiple
    times. Running callbacks more than once can have compounding effects
    especially when groups are involved.

    The downside of this is that we cannot guarantee that the barrier will be
    released at all. Theoretically a worker could die after releasing the
    barrier but just before it has a chance to schedule the callbacks.
    """

    def __init__(self, backend, key, *args, ttl=900000):
        super().__init__(backend, key, *args, ttl=ttl)
        self.ran_key = f"{key}_ran"

    def create(self, parties):
        self.backend.add(self.ran_key, 0, self.ttl)
        return super().create(parties)

    def wait(self, *args, block=True, timeout=None):
        if block:
            # Blocking with an AtMostOnceBarrier is not supported as it could
            # lead to clients waiting indefinitely if the barrier already
            # released.
            raise ValueError("Blocking is not supported by AtMostOnceBarrier")

        released = super().wait(*args, block=False)
        if released:
            never_released = self.backend.incr(self.ran_key, 1, 1, self.ttl)
            return never_released

        return False
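
The at-most-once guard in `_barrier.py` above can be illustrated without a real backend. The sketch below is a toy model under assumptions and does not reproduce dramatiq's implementation: `ToyBackend` is a hypothetical in-memory, single-process store without TTLs, and `ToyAtMostOnceBarrier` models the underlying barrier as a simple countdown. What it shares with the original is the idea of a second counter, capped at 1, that only the first releasing caller can increment.

```python
import threading


class ToyBackend:
    """Hypothetical in-memory counter store (single process, no TTLs)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def set_if_absent(self, key, value):
        with self._lock:
            self._values.setdefault(key, value)

    def decrement(self, key):
        with self._lock:
            self._values[key] -= 1
            return self._values[key]

    def incr_up_to(self, key, amount, maximum):
        # Succeeds only while the incremented value stays within `maximum`.
        with self._lock:
            new_value = self._values[key] + amount
            if new_value > maximum:
                return False
            self._values[key] = new_value
            return True


class ToyAtMostOnceBarrier:
    def __init__(self, backend, key):
        self.backend = backend
        self.key = key
        self.ran_key = f"{key}_ran"

    def create(self, parties):
        self.backend.set_if_absent(self.ran_key, 0)
        self.backend.set_if_absent(self.key, parties)

    def wait(self):
        # In this model the plain countdown reports "released" for every
        # arrival once all parties are in, including duplicate arrivals.
        released = self.backend.decrement(self.key) <= 0
        if not released:
            return False
        # Only the first caller can move the flag from 0 to 1.
        return self.backend.incr_up_to(self.ran_key, 1, maximum=1)


barrier = ToyAtMostOnceBarrier(ToyBackend(), "demo")
barrier.create(2)
# Two expected arrivals followed by one duplicate (e.g. a redelivered message).
outcomes = [barrier.wait(), barrier.wait(), barrier.wait()]
```

Only the second call reports a release. The duplicate third call is rejected by the capped counter, which is the behavior the class docstring describes as avoiding repeated callbacks.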

dramatiq_workflow/_base.py

+185
@@ -0,0 +1,185 @@
import logging
import time
from uuid import uuid4

import dramatiq
import dramatiq.rate_limits

from ._constants import CALLBACK_BARRIER_TTL, OPTION_KEY_CALLBACKS
from ._helpers import workflow_with_completion_callbacks
from ._middleware import WorkflowMiddleware, workflow_noop
from ._models import Barrier, Chain, CompletionCallbacks, Group, Message, WithDelay, WorkflowType
from ._serialize import serialize_callbacks, serialize_workflow

logger = logging.getLogger(__name__)


class Workflow:
    """
    A workflow runs tasks in parallel and in sequence. It is defined as a
    combination of chains and groups of tasks, in any order and nested as
    needed.

    Example:

    Let's assume we want a workflow that looks like this:

                 ╭────────╮  ╭────────╮
                 │ Task 2 │  │ Task 5 │
              ╭──┼●      ●┼──┼●      ●┼╮
    ╭────────╮│  ╰────────╯  ╰────────╯│  ╭────────╮
    │ Task 1 ││  ╭────────╮            │  │ Task 8 │
    │       ●┼╯  │ Task 3 │            ╰──┼●       │
    │       ●┼───┼●      ●┼───────────────┼●       │
    │       ●┼╮  ╰────────╯             ╭─┼●       │
    ╰────────╯│  ╭────────╮   ╭────────╮│╭┼●       │
              │  │ Task 4 │   │ Task 6 │││╰────────╯
              ╰──┼●      ●┼───┼●      ●┼╯│
                 │       ●┼╮  ╰────────╯ │
                 ╰────────╯│             │
                           │  ╭────────╮ │
                           │  │ Task 7 │ │
                           ╰──┼●      ●┼─╯
                              ╰────────╯

    We can define this workflow as follows:

    ```python
    from dramatiq_workflow import Workflow, Chain, Group

    workflow = Workflow(
        Chain(
            task1.message(),
            Group(
                Chain(
                    task2.message(),
                    task5.message(),
                ),
                task3.message(),
                Chain(
                    task4.message(),
                    Group(
                        task6.message(),
                        task7.message(),
                    ),
                ),
            ),
            task8.message(),
        ),
    )
    workflow.run()  # Schedules the workflow to run in the background
    ```

    In this example, the execution would look like this*:
    1. Task 1 runs
    2. Tasks 2, 3, and 4 run in parallel once Task 1 finishes
    3. Task 5 runs once Task 2 finishes
    4. Tasks 6 and 7 run in parallel once Task 4 finishes
    5. Task 8 runs once Tasks 3, 5, 6, and 7 finish

    * This is a simplified example. The actual execution order may vary because
    tasks that can run in parallel (i.e. in a Group) are not guaranteed to run
    in the order they are defined in the workflow.
    """

    def __init__(
        self,
        workflow: WorkflowType,
        broker: dramatiq.Broker | None = None,
    ):
        self.workflow = workflow
        self.broker = broker or dramatiq.get_broker()

        self._delay = None
        self._completion_callbacks = []

        while isinstance(self.workflow, WithDelay):
            self._delay = (self._delay or 0) + self.workflow.delay
            self.workflow = self.workflow.task

    def run(self):
        current = self.workflow
        completion_callbacks = self._completion_callbacks.copy()

        if isinstance(current, Message):
            current = self.__augment_message(current, completion_callbacks)
            self.broker.enqueue(current, delay=self._delay)
            return

        if isinstance(current, Chain):
            tasks = current.tasks[:]
            if not tasks:
                self.__schedule_noop(completion_callbacks)
                return

            task = tasks.pop(0)
            if tasks:
                completion_id = self.__create_barrier(1)
                completion_callbacks.append((completion_id, Chain(*tasks), False))
            self.__workflow_with_completion_callbacks(task, completion_callbacks).run()
            return

        if isinstance(current, Group):
            tasks = current.tasks[:]
            if not tasks:
                self.__schedule_noop(completion_callbacks)
                return

            completion_id = self.__create_barrier(len(tasks))
            completion_callbacks.append((completion_id, None, True))
            for task in tasks:
                self.__workflow_with_completion_callbacks(task, completion_callbacks).run()
            return

        raise TypeError(f"Unsupported workflow type: {type(current)}")

    def __workflow_with_completion_callbacks(self, task, completion_callbacks) -> "Workflow":
        return workflow_with_completion_callbacks(
            task,
            self.broker,
            completion_callbacks,
            delay=self._delay,
        )

    def __schedule_noop(self, completion_callbacks: CompletionCallbacks):
        noop_message = workflow_noop.message()
        noop_message = self.__augment_message(noop_message, completion_callbacks)
        self.broker.enqueue(noop_message, delay=self._delay)

    def __augment_message(self, message: Message, completion_callbacks: CompletionCallbacks) -> Message:
        return message.copy(
            # We reset the message timestamp to better represent the time the
            # message was actually enqueued. This is to avoid tripping the max_age
            # check in the broker.
            message_timestamp=time.time() * 1000,
            options={OPTION_KEY_CALLBACKS: serialize_callbacks(completion_callbacks)},
        )

    @property
    def __rate_limiter_backend(self):
        # The cache attribute uses a single underscore on purpose: a
        # double-underscore attribute is name-mangled on assignment, so a
        # hasattr() check against the unmangled string would never match.
        if not hasattr(self, "_cached_rate_limiter_backend"):
            for middleware in self.broker.middleware:
                if isinstance(middleware, WorkflowMiddleware):
                    self._cached_rate_limiter_backend = middleware.rate_limiter_backend
                    break
            else:
                raise RuntimeError(
                    "WorkflowMiddleware middleware not found! Did you forget "
                    "to set it up? It is required if you want to use "
                    "workflows."
                )
        return self._cached_rate_limiter_backend

    def __create_barrier(self, count: int):
        if count == 1:
            # No need to create a distributed barrier if there is only one task
            return None

        completion_uuid = str(uuid4())
        completion_barrier = Barrier(self.__rate_limiter_backend, completion_uuid, ttl=CALLBACK_BARRIER_TTL)
        completion_barrier.create(count)
        logger.debug("Barrier created: %s (%d tasks)", completion_uuid, count)
        return completion_uuid

    def __str__(self):
        return f"Workflow({serialize_workflow(self.workflow)})"
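
The `while isinstance(self.workflow, WithDelay)` loop in `Workflow.__init__` above sums the delays of nested wrappers before the inner workflow is scheduled. The sketch below isolates that loop as a minimal illustration. The `WithDelay` dataclass is a local stand-in that only carries the `task` and `delay` attributes the loop reads, and `unwrap_delay` is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class WithDelay:
    # Stand-in for illustration, not the class from dramatiq_workflow._models.
    task: Any
    delay: int


def unwrap_delay(workflow: Any) -> Tuple[Any, Optional[int]]:
    """Strip nested WithDelay wrappers and add up their delays."""
    delay = None
    while isinstance(workflow, WithDelay):
        delay = (delay or 0) + workflow.delay
        workflow = workflow.task
    return workflow, delay


inner, total_delay = unwrap_delay(WithDelay(WithDelay("task", 500), 1000))
plain, no_delay = unwrap_delay("task")
```

A workflow without any wrapper keeps a delay of `None` rather than `0`, and that value is what gets passed on as `delay=` to `broker.enqueue`.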

dramatiq_workflow/_constants.py

+2
@@ -0,0 +1,2 @@
CALLBACK_BARRIER_TTL = 86_400_000
OPTION_KEY_CALLBACKS = "workflow_completion_callbacks"

dramatiq_workflow/_helpers.py

+18
@@ -0,0 +1,18 @@
import dramatiq

from ._models import CompletionCallbacks, WorkflowType


def workflow_with_completion_callbacks(
    workflow: WorkflowType,
    broker: dramatiq.Broker,
    completion_callbacks: CompletionCallbacks,
    delay: int | None = None,
):
    from ._base import Workflow

    w = Workflow(workflow, broker)
    w._completion_callbacks = completion_callbacks
    if delay is not None:
        w._delay = (w._delay or 0) + delay
    return w
