fix(python apps): fork child + exec appcmd in child to avoid issues #1580
Description
Forking is unsafe after threads have been started, at least in Python. This is a known issue in the Python community. See:
Forking after threads have been started can cause deadlocks or hangs when a process created via fork uses threading primitives such as Lock, Condition, or Queue. All of this happens inside the internals of durabletask-python, so it affects any app using the Python SDK, whether Dapr Agents or general Python workflows. I tried various Python versions and they all hang at various points because of this.
This PR fixes this by using syscall.ForkExec to fork a child and then exec python (or whatever the app language is) within that child. The app runtime starts via exec, so it never knows about the fork. One downside is that I lose the blue APP prefix on the log lines, which unfortunately changes the terminal UX. I tried adding the prefix back with pipes, but the pipes broke my Python app again and caused the same hang.
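For illustration, the fork-then-exec approach described above could look roughly like the sketch below. It is not the code in this PR: `startApp` and `waitApp` are hypothetical helper names, the child simply inherits the parent's stdin/stdout/stderr (which is why no log prefix can be added), and it assumes a Unix-like OS where `syscall.ForkExec` and `syscall.Wait4` are available.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// startApp is a hypothetical helper: it forks a child and immediately execs
// the app command in it, so the app runtime starts from a fresh exec.
func startApp(argv []string) (int, error) {
	if len(argv) == 0 {
		return 0, errors.New("empty app command")
	}
	// ForkExec needs a path to the binary, so resolve argv[0] via PATH.
	path, err := exec.LookPath(argv[0])
	if err != nil {
		return 0, err
	}
	attr := &syscall.ProcAttr{
		Env: os.Environ(),
		// Assumption: the child inherits the parent's stdio directly,
		// with no pipes in between.
		Files: []uintptr{os.Stdin.Fd(), os.Stdout.Fd(), os.Stderr.Fd()},
	}
	return syscall.ForkExec(path, argv, attr)
}

// waitApp is a hypothetical helper: it reaps the child so no zombie is
// left behind and returns the child's exit status.
func waitApp(pid int) (int, error) {
	var ws syscall.WaitStatus
	for {
		_, err := syscall.Wait4(pid, &ws, 0, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal, retry
		}
		if err != nil {
			return 0, err
		}
		return ws.ExitStatus(), nil
	}
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: forkexec <app command> [args...]")
		return
	}
	pid, err := startApp(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, "start failed:", err)
		os.Exit(1)
	}
	code, err := waitApp(pid)
	if err != nil {
		fmt.Fprintln(os.Stderr, "wait failed:", err)
		os.Exit(1)
	}
	os.Exit(code)
}
```

Run it as, for example, `go run . python3 app.py`. Because the child writes straight to the inherited file descriptors, its output appears unmodified in the terminal.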
This is what it looks like with this PR. Blue INFO lines are from the sidecar, and the app logs are in white without the ===APP=== prefix that we used to have:
I still need to test this on a Windows machine to make sure I don't leave zombie processes. I think at this point the change is good enough to fix the other OSes. I could leave a note in the docs that running Python apps on Windows could lead to unpredictable behavior for now, and create a follow-up issue for that. Thoughts?
Issue reference
Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list: