Skip to content

Add workflow system info in Workflow Execution context #5405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

khushboobhatia01
Copy link
Contributor

@khushboobhatia01 khushboobhatia01 commented Oct 24, 2021

This PR adds workflow engine system information to WorkflowExecution context.
Ref #5373

{
    "engine_info": {
        "hostname": "st2", 
        "pid": 1085262
    }, 
    "parent": {
        "user": "stanley", 
        "pack": "examples"
    }, 
    "st2": {
        "action_execution_id": "6174ce479195da5bbddf5443", 
        "api_url": "http://127.0.0.1:9101/v1", 
        "runner": "orquesta", 
        "workflow_execution_id": "6174ce48f3b5fca8fa795f97", 
        "user": "stanley", 
        "action": "examples.orquesta-sequential", 
        "pack": "examples"
    }
}

This information will be used to identify orphaned workflow execution.

@pull-request-size pull-request-size bot added the size/XS PR that changes 0-9 lines. Quick fix/merge. label Oct 24, 2021
@m4dcoder m4dcoder added this to the 3.7.0 milestone Oct 26, 2021
Copy link
Contributor

@m4dcoder m4dcoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. The action execution that corresponds to the workflow execution should have the system info recorded. Do we need to duplicate that here in the workflow execution document?
  2. The workflow execution can be processed by 1 or more different workflow engine during its execution. I see the need to update the system info in the action execution whenever the workflow execution is being picked up again and processed. Can you visit the workflow service at https://github.com/StackStorm/st2/blob/master/st2common/st2common/services/workflows.py and review functions that start with handle_*? These functions are usually the entry point for workflow engine to rehydrate a workflow execution.
  3. Can you explain how you will determine that a workflow execution is orphaned? The engine info here is not sufficient because this just tells us workflow engine "X" is the last process that worked on this workflow execution and it's not actually orphaned.

@m4dcoder
Copy link
Contributor

@khushboobhatia01 Can you also clean up this PR because the branch is littered with Trigger/Retrigger CI? This will get into the commit history for master branch.

@pull-request-size pull-request-size bot added size/M PR that changes 30-99 lines. Good size to review. and removed size/XS PR that changes 0-9 lines. Quick fix/merge. labels Nov 11, 2021
@khushboobhatia01 khushboobhatia01 force-pushed the update_workflow_system_info branch from adbbf24 to 59b5ced Compare November 11, 2021 10:58
@khushboobhatia01 khushboobhatia01 requested review from a team as code owners November 11, 2021 10:58
@khushboobhatia01 khushboobhatia01 force-pushed the update_workflow_system_info branch from d8514b9 to de8a64c Compare November 11, 2021 11:16
@Kami
Copy link
Member

Kami commented Nov 11, 2021

I won't have time to perform a detailed review, but just two quick comments / questions:

  1. Performance implications

It appears that depending on the object passed to the function, we will either perform 1 or 2 additional GET one and 1 (2? I don't know why we need that delete at the end in finally) additional UPDATE query each time handle_workflow_execution and handle_action_execution is called.

A while back I spent a substantial amount of time of profiling and optimizing performance of that code path (which is considered a hot code path since it's executed often and has major performance implication on how long running / scheduling executions and workflows takes).

And per @m4dcoder coder, that code can be executed multiple times (each time execution it's picked up by workflow engine) so this makes it even more problematic.

Even though we made a lot of performance improvements which should make those GET and UPDATE queries much faster now, we should still think about it some more and see if there is any way to reduce number of queries and make that code more efficient.

So in short, each time we touch those functions we should think hard when adding new DB operations and if they are really needed.

  1. Code formatting / style

I don't think we should use a decorator.

It makes code harder to read and understand.

It should be a regular function / method (or two of them so we can get rid of *args and **kwargs stuff which also makes code hard to read and understand) which gets called in places where it's needed.

@@ -1561,3 +1564,38 @@ def identify_orphaned_workflows():
continue

return orphaned


def add_system_info_to_action_context(func):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my other comment, I think we should refactor this 1 or 2 separate functions and not use a decorator.

except Exception as e:
raise e
finally:
lv_ex = lv_db_access.LiveAction.get_by_id(lv_ex.id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear why we need this code path so we should add a comment on why this delete is here and why it's needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kami The intention here is to add engine information to the context at the start of handling and remove it once handling is complete.
Any action in running/pausing/cancelling state, having engine information about a deregistered service can be classified as orphaned workflow.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the workflow execution can be processed by 1 or more different workflow engine during its execution, it's difficult to know which workflow engine is currently handling the execution.

@cognifloyd
Copy link
Member

@khushboobhatia01 what are your plans with this PR? Do you know when you'll be able to pick it up again and address some of the feedback here?

@arm4b arm4b removed this from the 3.7.0 milestone Sep 12, 2022
@CLAassistant
Copy link

CLAassistant commented Jan 24, 2024

CLA assistant check
All committers have signed the CLA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/M PR that changes 30-99 lines. Good size to review. status: potential performance implications
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants