DM-51738 : Implement Campaign and Node FSM #198
tcjennings merged 7 commits into tickets/DM-51337/release
- move campaign test fixture to module conftest
- scope campaign test fixture to function
- add auto fixture for patched config

- Implement Campaign/Node FSM using pytransitions
- Establish daemon v2 iteration loop
- Establish E2E FSM and Daemon tests
- machines get their own sessions
- add config flags for daemon v2 campaign/node processing

- validate graph and effect status update in background task
- update fastapi session override in tests
- add activity log route for campaigns
- implement campaign status updates via PATCH and FSM

- chore: refactor test fixtures
- fix: improve session hygiene in tests
- chore: denormalize timestamp columns for activitylog
- fix: FSM finalize based on activitylog finished_at instead of detail
- fix: handle full lifecycle of session in session factory
- feat(daemon): refactor task callbacks as modular callables
- fix(daemon): set submitted_at and finished_at for tasks instead of deleting
- fix(graph): use session.get() instead of select()
- fix: allow null node in activity log

- rename "session manager" class as "database manager"
- replace "get_async_session" function with calls to sessionmaker().
- use "uuid" instead of "id" in graph node simple mode
- require namespace when GET node by name
- include additional links in campaign response header
- campaign post returns existing campaign on duplicate
- improve consistency in route parameter naming
- delegate more route typing to pydantic models
- use sorting in activity log route
ctslater left a comment
Impressive amount of substantive code here, really fun to read through (and sorry it took me forever!) Just a few questions on things I didn't understand, no real disagreements. Going to be fun to start trying it out.
    # TODO: notification callback
    finalizer = create_task(finalize_runner_callback(context))
    finalizer.add_done_callback(callbacks.discard)
    callbacks.add(finalizer)
Doesn't something need to await `callbacks`?
Not exactly. This is a bit of a Rube Goldberg sequence of async tasks. It starts with the primary task that the daemon adds to a task group, which is awaited at the end of the `TaskGroup` context manager. Each of these tasks has a callback coro (`task_runner_callback`) which is awaited when the original task completes, and that coro in turn has its own callback coro to be awaited when it finishes. This last step is where `callbacks` comes in: `callbacks` is a set that holds a strong reference to the callback tasks set up by `task_runner_callback` so they don't get lost in the shuffle. The final done callback (`callbacks.discard`) cleans up the collection by dropping the strong reference to each task as it completes. A simpler example of this pattern is in the Python docs.
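The strong-reference pattern described in this reply can be sketched with the standard library alone. This is a minimal illustration rather than the daemon's code: `finalize` and `schedule_finalizers` are hypothetical stand-ins, and the final `gather` is there only so the demo finishes deterministically.

```python
import asyncio

# Strong references to in-flight tasks; the event loop keeps only weak ones.
callbacks: set[asyncio.Task] = set()


async def finalize(name: str, done: list[str]) -> None:
    """Hypothetical stand-in for a finalizer coroutine."""
    await asyncio.sleep(0)
    done.append(name)


async def schedule_finalizers(names: list[str]) -> list[str]:
    done: list[str] = []
    for name in names:
        finalizer = asyncio.create_task(finalize(name, done))
        # Drop the strong reference once the task completes.
        finalizer.add_done_callback(callbacks.discard)
        callbacks.add(finalizer)
    # Demo only: wait for the tasks so the result can be inspected.
    await asyncio.gather(*list(callbacks))
    await asyncio.sleep(0)  # give any pending done callbacks a chance to run
    return done


finished = asyncio.run(schedule_finalizers(["a", "b"]))
```

Nothing awaits the set itself; it exists only to keep each task alive until its done callback removes it.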
    happy_path = [StatusEnum.waiting, StatusEnum.ready, StatusEnum.running, StatusEnum.accepted]
    if self in happy_path:
        i = happy_path.index(self)
        return happy_path[i + 1]
Hope nobody calls this on StatusEnum.accepted?
This is true by fiat but in uncontrolled hands it could be an IndexError waiting to happen. I'll make a note to guard against that.
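One possible shape for that guard, as a sketch only. The method name `next_status`, the extra `failed` member, and the choice to return the current status for terminal or off-path values are all assumptions made for illustration; the PR does not show how the real method handles those cases.

```python
from enum import Enum


class StatusEnum(Enum):
    waiting = "waiting"
    ready = "ready"
    running = "running"
    accepted = "accepted"
    failed = "failed"  # hypothetical off-path status, for illustration only

    def next_status(self) -> "StatusEnum":
        """Return the next status on the happy path.

        Assumption: a terminal or off-path status maps to itself.
        """
        happy_path = [
            StatusEnum.waiting,
            StatusEnum.ready,
            StatusEnum.running,
            StatusEnum.accepted,
        ]
        # Excluding the last element keeps `i + 1` inside the list.
        if self in happy_path[:-1]:
            i = happy_path.index(self)
            return happy_path[i + 1]
        return self
```

With this guard, calling `StatusEnum.accepted.next_status()` returns `StatusEnum.accepted` instead of raising `IndexError`.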
    class InvalidCampaignGraphError(Exception): ...


    class CampaignMachine(NodeMachine):
Am I understanding correctly that the implementations of the `start`, `finish`, and `resume` triggers will go in here?
    assert self.db_model is not None

    if self.activity_log_entry is not None:
        return None
I don't understand why this gives up here.
This is a guard against the case where a log entry object already exists when this callback is invoked. It doesn't "give up" exactly; it returns early instead of clobbering an existing instance attribute.
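The early-return guard can be illustrated in isolation. In this sketch `MachineSketch`, `prepare_activity_log`, and the dictionary standing in for a log entry are hypothetical; only the attribute name `activity_log_entry` comes from the diff.

```python
from datetime import datetime, timezone
from typing import Optional


class MachineSketch:
    """Hypothetical stand-in for a machine with a lazily created log entry."""

    def __init__(self) -> None:
        self.activity_log_entry: Optional[dict] = None

    def prepare_activity_log(self, detail: str) -> None:
        # Return early rather than overwrite an entry that already exists.
        if self.activity_log_entry is not None:
            return None
        self.activity_log_entry = {
            "detail": detail,
            "created_at": datetime.now(timezone.utc),
        }


machine = MachineSketch()
machine.prepare_activity_log("first call")
machine.prepare_activity_log("second call")  # no-op: the first entry is kept
```

The callback is therefore safe to invoke more than once for the same transition.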
    async def do_finish(self, event: EventData) -> None: ...

    async def is_successful(self, event: EventData) -> bool:
        """Checks whether the WMS job is finished or not based on the result of
I know this is a placeholder for future work, but this only makes sense for Groups.
Yes, the organization of callbacks and docstrings will be changed substantially as we implement specific types of Node machines in the next iteration.
    while end_node.status is not StatusEnum.accepted:
        i -= 1
        await consider_campaigns(session)
        await consider_nodes(session)
I'm not understanding how the daemon reaches the nodes beyond the first iteration in this loop? Don't the nodes need to advance to some "finished" state before the next layer in the graph can be reached?
The loop is watching the status of the campaign's end node. Because this is a test, it is a contrived example where the number of iterations is "known" (hence the countdown variable `i`), but each node in turn is advanced by one status along the happy path on each pass through the loop (the daemon "task" for this is generated by `consider_campaigns` and resolved by `consider_nodes`). After the first iteration the start node will have advanced from `waiting` to `ready`, after the second from `ready` to `running`, and so on. The loop ends only when `end_node`'s status reaches `accepted`, which can happen only once the entire graph has been "considered".
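A toy simulation may make the iteration count easier to see. It is a sketch under stated assumptions: the three-node graph, the `consider` function, and the rule that a node becomes eligible only once all of its predecessors are `accepted` are illustrative and are not taken from the daemon's implementation.

```python
from enum import Enum


class Status(Enum):
    waiting = 0
    ready = 1
    running = 2
    accepted = 3


HAPPY_PATH = [Status.waiting, Status.ready, Status.running, Status.accepted]


def advance(status: Status) -> Status:
    """Move one step along the happy path, stopping at the last status."""
    i = HAPPY_PATH.index(status)
    return HAPPY_PATH[min(i + 1, len(HAPPY_PATH) - 1)]


def consider(graph: dict[str, list[str]], statuses: dict[str, Status]) -> None:
    """One pass: advance each unfinished node whose predecessors are accepted."""
    # Eligibility is decided before any status changes in this pass.
    eligible = [
        node
        for node, predecessors in graph.items()
        if statuses[node] is not Status.accepted
        and all(statuses[p] is Status.accepted for p in predecessors)
    ]
    for node in eligible:
        statuses[node] = advance(statuses[node])


# Hypothetical linear campaign graph: node -> list of predecessor nodes.
graph = {"START": [], "A": ["START"], "END": ["A"]}
statuses = {node: Status.waiting for node in graph}

iterations = 0
while statuses["END"] is not Status.accepted:
    consider(graph, statuses)
    iterations += 1
```

In this toy graph each of the three nodes needs three passes to go from `waiting` to `accepted`, and a node starts only after its predecessor finishes, so the loop runs nine times before the end node is accepted.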
Implements core logic for campaign and node evolution through an FSM built with the `pytransitions` package.

Provides scaffolding for iterative implementation of business logic (i.e., translating v1 "scripts" to v2 "triggers") with ABCs and E2E tests.