-
Notifications
You must be signed in to change notification settings - Fork 549
WIP: Add JobManager
#4287
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: series/3.x
Are you sure you want to change the base?
WIP: Add JobManager
#4287
Conversation
21ea992
to
1e7138f
Compare
* Creates and launches the given `Job` in the background. If another Job with the same id was | ||
* already running, it will be cancelled before starting this one. | ||
*/ | ||
def startJob(id: Id, job: Resource[F, JobManager.Job[F, S]]): F[Unit] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On one had, I like the idea of users being able to use their own Ids
.
On the other, I think most users would benefit from a default using automatically generated UUIDs
.
Should we provide such default in some way?
/** | ||
* Gets the status of the `Job` associated with the given `id`. If `id` doesn't exists or the | ||
* `Job` already finished then the returned value will be a `None`. | ||
*/ | ||
def getJobStatus(id: Id): F[Option[S]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I somewhat dislike the idea that both bad id and already finished return in a None
.
But, otherwise, the map will grow indefinitely.
Any ideas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By fully controlling IDs, we could have startJob
return a fresh ID, so "bad ID" would never happen. (But then we'd lose the ability of users to have their own IDs...)
/** | ||
* Signals cancellation of the `Job` associated with the given `id`, and waits for its | ||
* completion. | ||
*/ | ||
def cancelJob(id: Id): F[Unit] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we provide a cacelAndForget
variant where we don't wait on cancellation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would that be essentially cancelJob(...).start
? If yes, I don't think we should add it (everyone can .start
for themselves).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually yes, but also, since we already have a Dispatcher
in place, it could be used to perform that rather than raw start
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so I think my point is: if it's somehow better (performance, safety, whatever) than just .start
-ing, then yeah, maybe add it. If it's not better, then definitely not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK, it is safer since its life cycle is attached to the Supervisor
.
But I am not sure what canceling a cancel does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does "nothing", as a cancel
is uncancelable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I guess as you said, it provides no value and users who don't want to wait on the cancellation can just start
.
trait Job[F[_], S] { | ||
|
||
/** | ||
* Starts the logic of this `Job`. | ||
*/ | ||
def run: F[Unit] | ||
|
||
/** | ||
* Gets the status of this `Job`. | ||
*/ | ||
def getStatus: F[S] | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users would then implement this trait
for their own Jobs
.
} | ||
} | ||
|
||
supervisor.supervise(runJob).void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't wait for the Job
to start before returning to the user.
But, that means there is a brief delay between starting the Job
and users being able to query its status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that's okay. I think the point of std
is to solve tricky race conditions like this for users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair point.
What do you think would be the best way to solve that? A Deferred
a CountdownLatch
? Other thing?
Or maybe, simply run the runJob
logic there rather that sending it to the Supervisor
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm a little confused by why it's sent to the Supervisor... (I'm sure there is a reason, I just don't know it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main thing is that if the same id
was already found we cancel it, which is costly.
But, I think I could just send that to the Supervisor
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, looking at the code, I realized that we also need to run the job setup (Resource.acquire
) before being able to run the job itself.
So that was also part of what was being done in the background, but it also means that the short delay could be bigger.
Thus, I decided to use a Deferred
to wait until the job has been properly registered before returning. But that may take a while.
Another idea that I just had would be to have an Initializing
status that could be used meanwhile. However, that would complicate the logic quite a bit.
cancel = fiber.cancel | ||
).some | ||
) | ||
.flatMap(_.traverse_(_.cancel)) >> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case the same id
was already used, we cancel the previous Job
.
Implements the
JobManager
idea proposed in #1345Roadmap