
Add a sample that uses a workflow to lock resources #172


Merged

nagl-temporal merged 23 commits into temporalio:main on Apr 16, 2025

Conversation

@nagl-temporal (Contributor) commented on Apr 1, 2025

What was changed

Adds a new sample that demonstrates how to use a workflow to mediate access to a small pool of resources.

Why?

A customer asked for an example of how to do this kind of thing. We ruled out worker-specific task queues and sessions, since those work best when you can run the worker in or near the protected resource, which this customer cannot do for security reasons. That leaves us with:

  • non-Temporal approaches to resource locking, or
  • something like this sample.

Checklist

  1. How was this tested:
    Ran the worker and starter per the readme. New unit test passes.

  2. Any docs updates needed?
    No - it's just a sample.

@CLAassistant commented on Apr 1, 2025

CLA assistant check: all committers have signed the CLA.


@cretz (Member) commented on Apr 2, 2025

There is a pattern for doing this with signal-with-start. You can see a sample in .NET, Go, and TypeScript.

Would totally support such a sample in Python. Needs to be a simple Lock class usable in a workflow with similar semantics to https://docs.python.org/3/library/asyncio-sync.html#asyncio.Lock but does signal with start and such. Also make sure we have a test or two. Feel free to adapt this PR (or close it and I can make an issue tracking that we need such a sample).
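(For illustration, a minimal sketch of what such a Lock-style helper could look like on the calling workflow's side. Everything here is a placeholder assumption rather than the API this PR ships: the WorkflowLock name, the "signal_with_start_lock" activity, and the release-key handshake are invented, and releasing the lock is omitted.)

from datetime import timedelta
from typing import Optional

from temporalio import workflow


class WorkflowLock:
    """Hypothetical asyncio.Lock-style helper usable inside a workflow.

    acquire() runs an activity that signal-with-starts the lock workflow, then
    waits for the lock workflow to signal back with a release key. The owning
    workflow forwards that callback signal to assigned().
    """

    def __init__(self) -> None:
        self._release_key: Optional[str] = None

    def assigned(self, release_key: str) -> None:
        # Call this from the owning workflow's signal handler.
        self._release_key = release_key

    async def acquire(self) -> str:
        # "signal_with_start_lock" is a hypothetical activity that uses the
        # client's signal-with-start to reach the lock workflow (a sketch of
        # that activity appears further down in this conversation).
        await workflow.execute_activity(
            "signal_with_start_lock",
            workflow.info().workflow_id,
            start_to_close_timeout=timedelta(seconds=10),
        )
        await workflow.wait_condition(lambda: self._release_key is not None)
        assert self._release_key is not None
        return self._release_key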

        resource.autorelease = True
        yield resource
    finally:
        if resource.autorelease:
nagl-temporal (Contributor, Author):

Blah. I want there to be a way to tell whether the workflow code is CAN'ing (continuing-as-new) here, but I believe there isn't one.

cretz (Member):

I think you should always release it if a user is using async with. It will be very confusing if I as a user use async with and the resource remains when that exits. If we need to let callers keep the resource across continue as new, they should "detach" it (e.g. can offer a helper for this that returns some type) and "reattach" it on next workflow use. Granted I also think including continue as new from the caller side in this sample is a bit confusing, but maybe it's needed.
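(A minimal sketch of the always-release-on-exit behavior being suggested here, assuming a hypothetical ResourcePoolClient wrapper; acquire() and release() are stubs standing in for the signal exchange with the pool workflow.)

from contextlib import asynccontextmanager
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class AcquiredResource:
    resource: str
    release_key: str


class ResourcePoolClient:
    """Hypothetical caller-side wrapper around the pool workflow's signals."""

    async def acquire(self) -> AcquiredResource:
        # Would signal-with-start the pool workflow and wait for an assignment.
        ...

    async def release(self, acquired: AcquiredResource) -> None:
        # Would signal the pool workflow with the release key.
        ...

    @asynccontextmanager
    async def acquired_resource(self) -> AsyncIterator[AcquiredResource]:
        resource = await self.acquire()
        try:
            yield resource
        finally:
            # Always release on exit, per the suggestion above. A caller that
            # needs to keep the resource across continue-as-new would call a
            # separate detach()-style helper instead of suppressing this.
            await self.release(resource)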

@nagl-temporal nagl-temporal requested a review from cretz April 11, 2025 01:16
@nagl-temporal (Contributor, Author) commented on Apr 11, 2025

@cretz LMK what you think about the async with abstraction and how it's rough around the edges re: continue-as-new. I also didn't code .lock() and .unlock()... could do if needed.

async def acquire_resource(
    cls,
    *,
    already_acquired_resource: Optional[AcquiredResource] = None,
@nagl-temporal (Contributor, Author) commented on Apr 11, 2025:

I don't like this, but... (thought finished after ⭐)

@workflow.run
async def run(self, input: ResourceLockingWorkflowInput):
    async with ResourceAllocator.acquire_resource(
        already_acquired_resource=input.already_acquired_resource
nagl-temporal (Contributor, Author):

⭐ ...I had to have some way of handling the case where my CAN-predecessor already locked the resource. It was either an optional param to acquire_resource or branching around the async with, which felt much more awkward.

@cretz (Member) commented on Apr 11, 2025:

Not sure this sample needs the complication of continue-as-new support in the workflow that uses the resource manager. This workflow isn't large enough to ever need to continue as new, so including support for it is a bit confusing (people aren't going to understand why it is here, but will think they need to do the same).

However, if you do keep this, I would call this reattach= and have it accept a typed object that came from detach in the pre-CAN workflow.

@cretz (Member) left a review comment:

Added a bunch of comments; take 'em or leave 'em, and some are self-contradictory. Now that I've seen the full use case, I would think of this as a ResourcePool (with a ResourcePoolWorkflow and a ResourcePoolClient).

Comment on lines 40 to 41
    start_signal="acquire_resource",
    start_signal_args=[AcquireRequest(info.workflow_id)],
cretz (Member):

This would usually be a good opportunity to use our new update-with-start functionality, but if you must span continue-as-new on the lock manager workflow, you have to stay with signals for now

nagl-temporal (Contributor, Author):

I do think it's important to keep CAN in. Without CAN, the sample can't demonstrate how to detach/reattach so it seems incomplete to me.
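(For reference, a minimal sketch of the signal-with-start call that the two kwargs quoted above come from, assuming a hypothetical activity that owns a client connection; the workflow name, workflow ID, and task queue strings are placeholders, not this PR's actual values.)

from dataclasses import dataclass

from temporalio import activity
from temporalio.client import Client


@dataclass
class AcquireRequest:
    workflow_id: str


@activity.defn
async def signal_with_start_pool(requester_workflow_id: str) -> None:
    # Signal-with-start: starts the pool workflow if it is not already running
    # and delivers the acquire_resource signal either way. Because the pool
    # workflow may continue-as-new, signals (not update-with-start) are used.
    client = await Client.connect("localhost:7233")  # in practice, reuse a shared client
    await client.start_workflow(
        "ResourcePoolWorkflow",                 # assumed workflow name
        id="resource-pool",                     # hypothetical well-known workflow ID
        task_queue="resource-pool-task-queue",  # hypothetical task queue
        start_signal="acquire_resource",
        start_signal_args=[AcquireRequest(requester_workflow_id)],
    )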


@nagl-temporal (Contributor, Author) commented on Apr 11, 2025

Thank you - I like the new names. I'll take most of the comments. I want to leave CAN in, because that's the only thing that makes this tricky; the sample feels incomplete without it. However, if I add .lock() and .unlock() and use those, the awkwardness goes away. reattach seems fairly natural now, under the new names.
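(A rough sketch of what .lock()/.unlock() plus detach/reattach could look like under the new names. All identifiers here are assumptions rather than the sample's final API, and the two underscore-prefixed stubs stand in for the signal exchange with the pool workflow.)

from dataclasses import dataclass
from typing import Optional


@dataclass
class DetachedResource:
    """Hypothetical token a workflow carries through its continue-as-new input."""

    resource: str
    release_key: str


class ResourcePoolClient:
    """Hypothetical caller-side helper with explicit lock/unlock semantics."""

    def __init__(self, reattach: Optional[DetachedResource] = None) -> None:
        # A post-continue-as-new run passes back the token its predecessor detached.
        self._held: Optional[DetachedResource] = reattach

    async def lock(self) -> str:
        if self._held is None:
            self._held = await self._acquire_from_pool()
        return self._held.resource

    async def unlock(self) -> None:
        if self._held is not None:
            await self._release_to_pool(self._held)
            self._held = None

    def detach(self) -> DetachedResource:
        # Keep the resource across continue-as-new: hand ownership to the next
        # run instead of releasing it back to the pool.
        assert self._held is not None, "nothing is locked"
        held, self._held = self._held, None
        return held

    async def _acquire_from_pool(self) -> DetachedResource:
        # Would signal-with-start the pool workflow and wait for an assignment.
        ...

    async def _release_to_pool(self, held: DetachedResource) -> None:
        # Would signal the pool workflow with the release key.
        ...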

@nagl-temporal nagl-temporal requested a review from cretz April 11, 2025 19:04
@cretz (Member) left a review comment:

Looks great, only a few minor things

Comment on lines 23 to 25
await handle.signal(
    "acquire_resource", AcquireRequest(workflow.info().workflow_id)
)
cretz (Member):

Suggested change:

-await handle.signal(
-    "acquire_resource", AcquireRequest(workflow.info().workflow_id)
-)
+await handle.signal(
+    ResourcePoolWorkflow.acquire_resource, AcquireRequest(workflow.info().workflow_id)
+)

To be more type-safe. Same everywhere.

nagl-temporal (Contributor, Author):

Almost everywhere - the pool workflow should signal the requester in an un-typesafe way (since it can't know what kind of workflow the requester is).
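(A minimal sketch of the two directions, assuming the AcquireRequest/AcquireResponse dataclasses and signal names from this PR; ResourcePoolWorkflow and the "resource-pool" workflow ID are stand-ins, and both functions are meant to run inside workflow code.)

from dataclasses import dataclass

from temporalio import workflow


@dataclass
class AcquireRequest:
    workflow_id: str


@dataclass
class AcquireResponse:
    release_key: str
    resource: str


async def request_from_pool() -> None:
    # Requester side: the pool workflow's class is importable, so the signal
    # can be referenced by method and the argument type-checked.
    handle = workflow.get_external_workflow_handle("resource-pool")  # hypothetical ID
    await handle.signal(
        ResourcePoolWorkflow.acquire_resource,
        AcquireRequest(workflow.info().workflow_id),
    )


async def reply_to_requester(request: AcquireRequest, resource: str, release_signal: str) -> None:
    # Pool side: the requester's workflow class is unknown, so the reply has to
    # use a dynamically built string signal name.
    handle = workflow.get_external_workflow_handle(request.workflow_id)
    await handle.signal(
        f"assign_resource_{request.workflow_id}",
        AcquireResponse(release_key=release_signal, resource=resource),
    )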

f"assign_resource_{workflow.info().workflow_id}",
AcquireResponse(release_key=release_signal, resource=resource),
)

nagl-temporal (Contributor, Author):

There's no race here, right? Imagine the release signal arrives immediately. The fix is obvious if there is a race, but the code smells nicer this way.

Why I think there's no race: I believe that this handler must run to completion before the handler for the release signal can start.

cretz (Member):

Correct, there is no race here because this is only called in one place, serially, in the primary loop. However, this can show the dangers of overly extracting/modularizing single-use methods - it can be hard to see the constraints they expect of the callers. Can technically add a "Not safe for concurrent use" docstring if concerned.
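(To make the serial-execution argument concrete, a hypothetical sketch of the pool workflow's shape; this is not the PR's actual code, and release handling plus continue-as-new are omitted.)

from collections import deque
from dataclasses import dataclass
from typing import Deque, List

from temporalio import workflow


@dataclass
class AcquireRequest:
    workflow_id: str


@workflow.defn
class ResourcePoolWorkflowSketch:
    """Hypothetical shape of the pool workflow's primary loop."""

    def __init__(self) -> None:
        self.waiters: Deque[AcquireRequest] = deque()
        self.free_resources: List[str] = []

    @workflow.signal
    async def acquire_resource(self, request: AcquireRequest) -> None:
        # Signal handlers only enqueue; all assignment happens in the run loop.
        self.waiters.append(request)

    @workflow.run
    async def run(self, resources: List[str]) -> None:
        self.free_resources = list(resources)
        while True:
            await workflow.wait_condition(
                lambda: bool(self.waiters) and bool(self.free_resources)
            )
            request = self.waiters.popleft()
            resource = self.free_resources.pop()
            # The only call site of the assignment step, reached serially from
            # this loop, so it cannot race with itself.
            await self._assign(request, resource)

    async def _assign(self, request: AcquireRequest, resource: str) -> None:
        handle = workflow.get_external_workflow_handle(request.workflow_id)
        await handle.signal(f"assign_resource_{request.workflow_id}", resource)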

@cretz (Member) left a review comment:

Only minor, non-blocking things, LGTM. Merge at will or let me know when done and I can.

f"assign_resource_{workflow.info().workflow_id}",
AcquireResponse(release_key=release_signal, resource=resource),
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, there is no race here because this is only called in one place, serially, in the primary loop. However, this can show the dangers of overly extracting/modularizing single-use methods - it can be hard to see the constraints they expect of the callers. Can technically add a "Not safe for concurrent use" docstring if concerned.

@nagl-temporal merged commit 1b6145a into temporalio:main on Apr 16, 2025
11 checks passed