Document the Quarkus concurrency model #52638
|
Thanks for your pull request! Your pull request does not follow our editorial rules. Could you have a look?
This message is automatically generated by a bot. |
|
This is work in progress. I'm submitting the PR to let some people review the first part (multithreading and event loops). The rest is probably not interesting for anyone, but feel free to have a look :-) I expect it will take me at least a week or two to document the Quarkus concurrency model. |
|
🎊 PR Preview 39854fc has been successfully built and deployed to https://quarkus-pr-main-52638-preview.surge.sh/version/main/guides/
|
|
/cc @vietj @tsegismont as there is Vert.x-related content |
9805f69 to
aa552f6
Compare
| As shown above, requests are processed concurrently on a small number of threads; these are called _event loop threads_ (or sometimes _I/O threads_). | ||
| Each request is assigned to an event loop thread and processing of requests is interleaved on that thread. | ||
| (Yes, it is a form of cooperative multitasking.) | ||
| Since Vert.x is ultimately a Java toolkit, this interleaving is expressed in Java code using _callbacks_ and _futures_; other programming languages often use the `async` / `await` keywords. |
There was a problem hiding this comment.
Vert.x started before Java 8 was out, so callbacks (and our own futures) were the only option. Later came CompletionStage/CompletableFuture. They have been considered as an alternative but we came to the conclusion they wouldn't bring any improvement to our users so we made it easy to convert Vert.x futures to and from CompletionStage.
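To make the interleaving described in the quoted paragraph concrete, here is a minimal JDK-only sketch. It does not use Vert.x: a single-thread `ExecutorService` stands in for one event loop thread, and the names `EventLoopSketch`, `run` and `step` are made up for illustration. Each "request" is split into two steps; a step ends by enqueueing its continuation (the callback) and a `CompletableFuture` signals when the request is done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class EventLoopSketch {

    /** Runs two two-step "requests" on one thread and returns the order in which the steps ran. */
    static List<String> run() {
        // Assumption: one single-threaded executor models one event loop thread.
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        List<String> order = new ArrayList<>(); // only modified on the event loop thread
        CompletableFuture<Void> a = new CompletableFuture<>();
        CompletableFuture<Void> b = new CompletableFuture<>();
        try {
            // Accept both requests from the event loop itself, so the order is deterministic.
            eventLoop.execute(() -> {
                step("A", 1, eventLoop, order, a);
                step("B", 1, eventLoop, order, b);
            });
            CompletableFuture.allOf(a, b).join();
            return List.copyOf(order);
        } finally {
            eventLoop.shutdown();
        }
    }

    private static void step(String request, int n, ExecutorService loop,
                             List<String> order, CompletableFuture<Void> done) {
        loop.execute(() -> {
            order.add(request + n);
            if (n < 2) {
                // The "callback": give the thread back and enqueue the next step.
                step(request, n + 1, loop, order, done);
            } else {
                done.complete(null);
            }
        });
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [A1, B1, A2, B2]
    }
}
```

Neither request ever blocks the thread, which is why two requests can make progress on one thread. This is the cooperative multitasking the paragraph refers to.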
|
|
||
| On all 3 places, `Vertx.currentContext().isEventLoopContext()` returns `true`. | ||
|
|
||
| This introduces more complexity into integrating with third-party libraries, because it is possible to call into a library from an event-loop `Context` that currently executes on a worker thread. |
There was a problem hiding this comment.
It's a good point you're making from here to the end of the section. Nevertheless, I'm not sure I'd say that "often" you want to jump back to the type of thread that made the call. I think it is the responsibility of the library integration maintainer to define the behavior of the method call (e.g. in Javadoc) and the responsibility of the user to understand that behavior.
When you invoke a synchronous API, by design you will continue on the same thread (or virtual thread, but that's another story).
I believe that, as a user, when you work with asynchronous APIs, you must build the habit of checking the API's behavior.
There was a problem hiding this comment.
Good point, though I'm gonna say most APIs don't actually bother documenting such things, regardless of how important they are. I'll try to expand on this, because "often" doesn't really do it justice.
There was a problem hiding this comment.
I think I improved this slightly, but I should be able to expand on the current wording. I didn't want to dive too deep into this, but there are multiple cases and each warrants different behavior, some of them may have more freedom to choose than others. I should be able to add some examples, too.
dd2f6f5 to
4df170b
Compare
|
Hi @tsegismont, sorry I didn't reply sooner, I was busy completing at least the Mutiny part of the document. Now that is done (ping @jponge :-) ), I'll reply to your comments. Thanks! |
4df170b to
778de32
Compare
| } | ||
| ---- | ||
|
|
||
| Here, the worker thread does some work (in our example, just sleeping), and the event loop at the same time finishes the response to the request. |
There was a problem hiding this comment.
From a logical scoping perspective, having this fire and forget operation that outlives the http request processing is not correct, but it's still valid from the perspective of asynchronous tasks being dispatched on an event-loop. Things could go wrong if the blocking task tried accessing the request / response objects, but other than that it's not that incorrect.
You could imagine a case where you fire and forget sending a message over the event-bus, and you can afford to lose it, and it is fine if the http response is closed before the message has been dispatched over the event-bus.
I'm not saying this is a great way to structure async code logic, but it's not necessarily incorrect 😄
There was a problem hiding this comment.
> From a logical scoping perspective, having this fire and forget operation that outlives the http request processing is not correct, but it's still valid from the perspective of asynchronous tasks being dispatched on an event-loop. Things could go wrong if the blocking task tried accessing the request / response objects, but other than that it's not that incorrect.
That's not necessarily true. The fire and forget task also wouldn't be able to load data from/store data to the Context, or at least make sure it doesn't step on the request/response handling code. Being able to store data in Context is the reason why in-request concurrency is not allowed in Vert.x, or at least that's my understanding.
> You could imagine a case where you fire and forget sending a message over the event-bus, and you can afford to lose it, and it is fine if the http response is closed before the message has been dispatched over the event-bus.
Yeah, agree. In principle, it's OK, provided the user upholds certain rules. I actually don't think anyone has ever bothered thinking about this too deeply, so the rules have never been specified. I'm erring on the side of caution here. For example, Vert.x stores tracing data in the Context, which you cannot affect, so if you run 2 actions concurrently on a single Context, your tracing data become garbage. I don't think Vert.x on its own stores anything else in the Context, so maybe this should be an exception instead of a rule, but what is the rule then? :-)
There was a problem hiding this comment.
> I don't think Vert.x on its own stores anything else in the `Context`
I'm actually wrong here, at least assuming https://reactiverse.io/reactiverse-contextual-logging/ is part of Vert.x, which seems fair :-) This project also stores data in the Context and those data can easily become garbage when accessed concurrently.
There was a problem hiding this comment.
All good points 😃
Like @jponge, I can see the case of POSTing a long-running blocking task, the server replying with 202 immediately and a location to poll with GET (see example)
A simple and valid implementation would look like what this doc suggests as forbidden.
In fact, it's perfectly valid, but it messes with contextual data (logging, tracing) because of the hierarchy of contexts.
A better pattern would be to send a message to the event bus, with a blocking consumer picking up the message. Then the consumer would run on a context that is not the duplicated context running the request (in pseudo-code):
    start {
        registerBlockingConsumer("foo", fooHandler)
        setupHttpServer(requestHandler)
    }
    requestHandler {
        eventBus.send("foo", msg)
        reply(202, taskId)
    }
    fooHandler {
        // Process task on worker thread
    }

There was a problem hiding this comment.
Yes!
The event bus here introduces an "asynchronous boundary" through which the Context is not "propagated" (way too many way too overloaded words).
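The "asynchronous boundary" idea can be sketched without Vert.x. This is a rough JDK-only model under loud assumptions: a `ThreadLocal` stands in for request-scoped contextual data (it is not how the Vert.x `Context` is implemented), a single-thread executor stands in for the blocking event bus consumer, and all names (`AsyncBoundarySketch`, `REQUEST_DATA`) are hypothetical. The point it illustrates is only that the message crosses the boundary while the request's contextual data does not.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class AsyncBoundarySketch {

    // Hypothetical stand-in for contextual data (logging, tracing) tied to a request.
    static final ThreadLocal<String> REQUEST_DATA = new ThreadLocal<>();

    static String run() {
        // Plays the role of the blocking consumer registered on the event bus.
        ExecutorService consumer = Executors.newSingleThreadExecutor();
        try {
            // requestHandler: runs with request-scoped data in place.
            REQUEST_DATA.set("trace-of-request-1");
            String taskId = "task-1";
            CompletableFuture<String> processed = new CompletableFuture<>();

            // eventBus.send("foo", msg): only the message (here, the task id) crosses the boundary.
            consumer.execute(() -> {
                // fooHandler: runs on a worker thread and does not see the request data.
                processed.complete(taskId + " saw " + REQUEST_DATA.get());
            });

            String response = "202 " + taskId; // reply(202, taskId)
            REQUEST_DATA.remove();             // the request is over

            // Only for the demo: wait for the background task so its result can be shown.
            return response + " | " + processed.join();
        } finally {
            consumer.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 202 task-1 | task-1 saw null
    }
}
```

Because the consumer never sees the request's data, it cannot corrupt logging or tracing state that belongs to the request, which is the property the pseudo-code above relies on.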
| When a context-bound promise is completed, it automatically enqueues the callback on the context from which it was created. | ||
|
|
||
| WARNING: As mentioned above, `ContextInternal` is an _internal_ API. | ||
| In Vert.x 5, it moves to a different package, so prepare to deal with the fallout. |
There was a problem hiding this comment.
You should say what package it is, and say "it has moved"
| In Vert.x, as mentioned above, request processing must be entirely serial, so we can actually use the duplicated `Context` to store data that live for a shorter duration than the entire request. | ||
| Typical example is a logging MDC (mapped diagnostic context), or tracing data (for example for OpenTelemetry). | ||
|
|
||
| In Quarkus, however, concurrency inside the request is allowed, so *duplicated `Context` must not contain data that are shorter than the whole request*. |
There was a problem hiding this comment.
I think we need Vert.x and Quarkus examples here because what shorter-lived data is and why it is a problem is not clear to me as I'm reading this section.
|
|
||
| == Mutiny | ||
|
|
||
| Mutiny is a functional reactive programming library, similar to RxJava and others. |
|
|
||
| Mutiny is a functional reactive programming library, similar to RxJava and others. | ||
| It exposes two types: `Uni` and `Multi`. | ||
| `Uni` represents a single result (of one action), while `Multi` represents a stream of results (of one or more actions). |
There was a problem hiding this comment.
A stream of asynchronous events rather than results is more accurate.
| executor("origin").execute(() -> { | ||
| em.complete("foobar"); | ||
| }); | ||
| }).map(item -> { |
| executor("origin").execute(() -> { // <2> | ||
| em.complete("foobar"); | ||
| }); | ||
| }).flatMap(item -> { // <3> |
There was a problem hiding this comment.
I'd use onItem().transformToUni()
|
|
||
| @Override | ||
| public void onItem(String item) { // <6> | ||
| executor.execute(() -> { |
There was a problem hiding this comment.
In those examples what is the purpose of doing one more dispatch through an executor? The request signal is meant to be non-blocking per Reactive Streams (RS) semantics.
| In Quarkus, the captured `Context` is typically a duplicated `Context` already, but in our example above, it is not. | ||
| So outside Quarkus, you might see the `Context` changing in the middle of the pipeline. | ||
|
|
||
| In my personal opinion, this is a bug. |
19ccf87 to
341fa28
Compare
|
Marked this as ready for review, because it feels reasonably complete now. I'm not entirely happy about the Quarkus part, but I couldn't figure out a better shape at the moment. The content is quite dense there and contains no examples. I guess I need at least some time off from this guide, I've been working on it for too long :-) Of course, any help would be welcome too. |
| Calling `isBlockingAllowed()` is pretty much the same as calling `!Context.isOnEventLoopThread()`. | ||
| There are 2 reasons for why `BlockingOperationControl` exists: | ||
|
|
||
| . It may be called even when Vert.x is not present. |
There was a problem hiding this comment.
I don't understand, could you elaborate?
There was a problem hiding this comment.
It is an API in Quarkus, and in Quarkus applications, Vert.x doesn't necessarily have to be present (although it is in like 99.9% of cases).
It is a fancy way of asking !Context.isOnEventLoopThread(), and the more I think about it, the more I don't see why we have it, but we do :-)
There was a problem hiding this comment.
OK, thanks for the clarification.
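For readers unfamiliar with this kind of check, the idea of "blocking is allowed unless we are on an event loop thread" can be modeled roughly as follows. This is a hypothetical JDK-only sketch, not the actual implementation of `BlockingOperationControl` or of Vert.x: the marker class `EventLoopThread` and both methods are invented here purely for illustration.

```java
import java.util.concurrent.CompletableFuture;

final class BlockingCheckSketch {

    /** Hypothetical marker type for event loop threads. */
    static final class EventLoopThread extends Thread {
        EventLoopThread(Runnable task) {
            super(task, "event-loop-0");
        }
    }

    /** Blocking is allowed on any thread that is not an event loop thread. */
    static boolean isBlockingAllowed() {
        return !(Thread.currentThread() instanceof EventLoopThread);
    }

    /** Runs the check on a freshly started event loop thread and returns its answer. */
    static boolean checkOnEventLoop() {
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        new EventLoopThread(() -> result.complete(isBlockingAllowed())).start();
        return result.join();
    }

    public static void main(String[] args) {
        System.out.println(isBlockingAllowed()); // on the main thread: true
        System.out.println(checkOnEventLoop());  // on the event loop thread: false
    }
}
```

A check of this shape needs no Vert.x classes at the call site, which matches the stated reason that the Quarkus API may be called even when Vert.x is not present.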
341fa28 to
690a73b
Compare
|
Added a section on request boundaries and how to break out of them. Especially the first section is not finished, but its existence actually makes me a lot happier about the Quarkus chapter. The second section also requires more work (on wording: "request" vs "duplicated |
| This works fine, but only when no in-request concurrency exists, and there is non-trivial cost to performance. | ||
|
|
||
| Lately, Quarkus has shifted to storing data in a duplicated `Context`. | ||
| This doesn't have such performance overhead, but once again, it works fine only when no in-request concurrency exists. |
There was a problem hiding this comment.
Most REST requests delegate execution to a worker thread but security is implemented with mutiny on the event-loop... Concurrency will always be possible in these requests.
There was a problem hiding this comment.
I'm no expert on most things Quarkus, including security :-), but I think that what security does executes prior to the application code. It executes on a different thread than the application code, but it does not execute concurrently. This is the classic case of "serial/linear execution with thread switches", in other words.
There was a problem hiding this comment.
This is one example of activity in the event-loop interfering with requests offloaded to a worker thread:
#43134 (comment)
I found a similar issue I was able to track to the security layer using Uni and the io.quarkus.opentelemetry.runtime.propagation.OpenTelemetryMpContextPropagationProvider. They were interfering with the worker thread DuplicatedContext. I couldn't find that comment.
There was a problem hiding this comment.
So the reproducer you link has nothing to do with Quarkus security, but it does one thing: in-request concurrency. Your request handling code basically does:
    public String hello() {
        log.info(...);
        managedExecutor.execute(() -> {
            log.info(...);
        });
        log.info(...);
        return "Hello";
    }

There's more in the callback offloaded to an executor, but even this simplified code shows that you're explicitly doing in-request concurrency. At the moment, frankly, all bets are off. We cannot guarantee anything.
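Why "all bets are off" can be shown with a small JDK-only model. The assumptions are stated loudly: a plain `Map` stands in for data stored in a duplicated `Context`, the key `"span"` and the class name `SharedContextSketch` are hypothetical, and the `join()` call forces one particular interleaving so the outcome is repeatable (in a real application, the interleaving depends on timing).

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class SharedContextSketch {

    /** Returns the "current span" the request code observes after offloading a task. */
    static String run() {
        // Hypothetical stand-in for data stored in the request's duplicated Context.
        Map<String, String> context = new ConcurrentHashMap<>();
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // Request handling code records what it is currently doing.
            context.put("span", "handle-request");

            // In-request concurrency: the offloaded task shares the same context...
            CompletableFuture<Void> offloaded = CompletableFuture.runAsync(
                    () -> context.put("span", "background-task"), executor);

            // ...and here we force the interleaving in which it runs before the next read.
            offloaded.join();

            // The request code now sees data written by the other task.
            return context.get("span");
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints background-task
    }
}
```

The map itself is thread-safe, so nothing crashes; the problem is logical. The request code reads a value that describes somebody else's work, which is how logging or tracing data can turn into garbage.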
There was a problem hiding this comment.
I couldn't find the one about security... This was an independent example.
There was a problem hiding this comment.
If you could find the security issue, that would be good. It is entirely possible we're doing the same thing, but I'd say we really shouldn't.
There was a problem hiding this comment.
I finally found the comment containing the security related stack trace. It's here: #49468 (comment)
The comments before it explain what is going on.
690a73b to
82faae6
Compare
|
Added a section about OpenTelemetry. |
|
Converting to draft as we don't need to run CI for each iteration. |
|
MDC Context section added |
918d343 to
27cb192
Compare
27cb192 to
e54ac73
Compare
