-
Notifications
You must be signed in to change notification settings - Fork 41
Pooled annotation like in EJB @Stateless, but for CDI beans #810
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Co-authored-by: Kyle Aure <[email protected]>
Co-authored-by: Kyle Aure <[email protected]>
Co-authored-by: Kyle Aure <[email protected]>
Co-authored-by: Kyle Aure <[email protected]>
Co-authored-by: Kyle Aure <[email protected]>
njr-11
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Pooled works only when bean instances are stateless, otherwise non-deterministic behavior results from reusing dirty instances. However, I don't see anything here that limits this annotation to beans that are stateless.
Also, when I look at what this annotation does, it doesn't seem to belong in the Concurrency spec and seems like it should be in the CDI spec instead. Although there is some locking involved, the locking aspect is more of a background detail needed to accomplish the main purpose, which is to pool and reuse instances. I don't think it is correct to consider the involvement of locking as automatically positioning it under the Concurrency spec.
Fix other fields and methods after rename Add example for usage
njr-11
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some additional comments
Co-authored-by: Nathan Rauh <[email protected]>
You'd think, but that's not how it works. This is the same behaviour as @stateless, which despite the name is actually not stateless. The state however is typically an expensive to create or non-threadsafe object that is initialized once. The bean is stateless in the sense of external observable behaviour; e.g. do two methods calls to a pooled bean, and you can't expect anything like a counter or whatever to be consistent. However, the bean can initialize itself and keep its own dependencies internally. This is all quite well understood from the EJB specification. It belongs in the Concurrency spec, because this is how "max concurrency" is done in EJB (see also the linked at #136) and how guarding of non threadsafe objects is done without using a globally locked Singleton. |
Co-authored-by: Nathan Rauh <[email protected]>
Sure, I don't mean to get caught up in whether stateless is truly stateless. My point is that there is a particular usage pattern that applications need to abide by in order for pooling to be valid. But we haven't said that anywhere. We need to. Otherwise applications are going to end up with dirty state across reuse.
I'm trying to review this from the perspective of a user not having prior knowledge of EJB. A new user should be able to read the Javadoc of the annotation to understand what it is for and how to use it correctly.
I'm not convinced. We are presenting this as a solution for pooling and reusing bean instances. It is named |
Of course. For the spec this is just an early draft taken from Concurro, which purely as an implementation doesn't need to be so clear on the one hand, and on the other hand it's used to power the CDI based EJB implementation we're using in Piranha (which on its turn references the EJB spec).
Well, it does what #136 asks for, and this particular approach was described there. We also have accepted it in Concurro, which is a library for Concurrency. I do agree that Pooled is more of the mechanism used to achieve that goal. The higher level concept is that each method call on a bean is internally dispatched to a bean instance. Only X of such method dispatches can happen concurrently. Pooling for the sake of pooling beans is not the goal there, though the pooling concept is good to specify as it allows for the guard for thread-safe objects. Maybe we should rename it to MaxConcurrency, and move the pooling aspect somewhat down in the javadoc? p.s. in the original issue Adam already said this actually: "Currently proprietary max-pool-size settings are used to control the max concurrency of an application and so to prevent overloading the application server. We could abstract the intended behavior into an annotation:" |
Co-authored-by: Nathan Rauh <[email protected]>
Co-authored-by: Nathan Rauh <[email protected]>
Co-authored-by: Nathan Rauh <[email protected]>
I wouldn't recommend renaming to MaxConcurrency to try to be a better fit for the Concurrency spec. Emphasizing the pooling aspect is very important to the user's proper understanding and correct usage and should be kept at the forefront. I think you were right to do that in the first place. It is important for the user to be very aware that bean instances are pooled and reused so they only use the annotation on beans with state that will not be dirtied. The name MaxConcurrency does not communicate that. The whole concept of having a bean instance upon which the user invokes operations that are then delegated to other bean instances (which in this case is also subject to a maximum constraint) is an approach that the CDI ought to weigh in on and consider whether they might want to do more efficiently. I spoke with an EJB spec member at my company who pointed out other design aspects of pooling stateless beans such as the possibility of having a minimum constraint for pre-population and the possibility of exceeding a maximum pooled instances with temporary non-pooled instances. He also suggested that the destroyOn and keepOn could be replaced/combined into a single keep list, with all other exceptions otherwise causing a discard. More ideas to consider in the design of this given what users have requested over the years. |
|
p.s. specifically about providing the proxy implementation:
In a CDI-based Jakarta Concurrency model, To recap: The integration is really semantic.
The requested and proposed annotation is complementary to context propagation: context propagation preserves correctness across threads; pooling preserves safety and capacity under parallelism. So, this puts We can draw a comparison to Java SE; |
It doesn't matter where someone initially filed an issue. The spec that it belongs under should be determined by the solution architected.
That is partly true (except the characterization as "weaker"), but unrelated. Concurrency spec has a context propagation model that allows for other Jakarta EE specs to tie into. Several other specs do tie into it. CDI chose not to. I understand why they made that choice - because CDI context isn't really compatible with being moved around to different threads. There would have needed to be a significant amount of architectural rework in CDI, some confusing scenarios for applications, and probably some breaking changes in order to make it work properly. The outcome of CDI not being receptive to that proposal should in no way imply CDI automatically won't be receptive to a completely different, and far less intrusive, proposal of an entirely different nature. I do see that you rewrote some of the Javadoc to focus more on the maximum pooled instances constraint, framing it as a maximum concurrency concept. But Poolable doesn't necessarily need to behave that way. As pointed out by our EJB spec participant based on how some customers use stateless enterprise beans in our product, a pool could have a maximum number of instances to hand out, and after none are left, temporarily create additional bean instances upon request so that applications are not blocked, discarding any extra instances in excess of the maximum rather than pooling them. There is nothing Concurrency-spec related about that. The concept is a pool of reusable bean instances. Throttling it to a maximum is only one aspect. There could be many other aspects, like whether there should be a minimum number of bean instances pre-initialized or kept available, and whether over time the number pooled should drop from the maximum down to the minimum (I believe our implementation for EJB does something like that). The more I think about this, the more certain I am becoming that it doesn't belong in the Concurrency spec. The fact that this pool of CDI bean instances involves injecting one bean instance, intercepting its usage and redirecting to another CDI bean instance from a pool, is really all about CDI beans and deserves to be addressed by their specification. Maybe they don't agree this is a very efficient way to do things, or isn't a recommend usage of CDI. Maybe they have better ideas that can only be designed with the ability to write requirements into the CDI spec itself. I'm pretty sure CDI is the right place for this to be considered and designed, taking what you have written as a start. |
|
I am just reading up on the proposal now but to be fair this seems like a textbook example of using CDI as a framework to provide your custom scope. It's essentially the same as with transactional scope which is based on CDI but resides in the transactions spec. I am not familiar enough with all the intended usages to say if this is the best place but I suppose it might be as you (as in concurrency impl) want to be in control of the scope activation and cross-thread propagation? That's something pure CDI implementation wouldn't really serve you well in. Other than that, you could push it into an EE "umbrella" level where a lot of previously CDI-owned EE integrations are now housed. However, in such case it is not clear who owns and implements it so I'd rather opt for a specification that will claim direct ownership.
You will get a client proxy "out of the box" through that custom (normal scoped) context - any normal scoped bean instance will be injected as a proxy object that then delegates into its respective context to obtain the contextual instance. For example, take a look at the Narayana impl of the transactional context - https://github.com/jbosstm/narayana/blob/main/ArjunaJTA/cdi/classes/com/arjuna/ats/jta/cdi/TransactionContext.java |
I don't know of any need for the concurrency implementation to be in control of either here. It could be done by an implementation of any spec and a concurrency implementation would not need to be involved.
If CDI has already moved integrations there, that is probably the place for it. As I see it, this isn't even an integration between CDI and Concurrency. It is just a value-add built on CDI. |
I think it matters a lot, as clearly that person too (in this case Payara founder Steve, clearly not a random inexperienced person) thought that a max concurrency mechanism would best fit in the concurrency spec. Various people who responded, not the most inexperienced people either, seemingly also thought this was the best place. |
I'm really of the exact opposite opinion. Your initial response made me doubt myself, but I've been talking to people about this, reading several articles, and I'm really quite sure that a mechanism to throttle concurrency only fits the concurrency spec. Platform is absolutely not the place for this, as this is a clear feature (service) and not some integration between specs. We just saw, as I expected, core CDI is not the place either.
That is a different policy. This proposal intentionally standardizes the bounded-acquisition policy because it provides portable max-concurrency semantics. We only define observable concurrency behavior; lifecycle tuning (prefill, shrink) remains an unspecified implementation detail.
The characterization that intercepting a CDI bean and redirecting the call is really all about CDI beans and therefore must be addressed by the CDI specification does not reflect how Jakarta Concurrency already operates. Since 3.0, Jakarta Concurrency has defined the In other words, Jakarta Concurrency already standardizes CDI-level interception and redirection semantics for CDI beans and is infact already relying on CDI for proxy/interceptor mechanics, while defining the concurrency contract in the Concurrency spec. Also, "CDI might not recommend this pattern" is not an argument against standardization here: |
Co-authored-by: Nathan Rauh <[email protected]>
I mean, feel free to bring it up in the CDI spec if you want to. |
The first thing to come to mind when I think a pool of bean instances is a performance optimization to avoid costly initialization by reusing existing instances, not using beans to parallelize work. |
The cost of init can be solved by an eagerly init bean even today, no custom scope needed. |
The original issue was copied from EJB, and meant for use across the platform, so presumably in both EJBs and CDI beans. Now that it's just CDI, I think it should be in the CDI spec, like the original issue for EJB was in the EJB spec. |
I'm pretty sure the original issue was targeted at CDI. In 2012 already @m-reza-rahman defined a number of features, at the time specified in EJB, to be ported to CDI compatible features. The goal was to have a common component model around CDI. To quote him directly; "The original intent was for a way to provide an equivalent to how EJB pooling effectively causes bandwidth throttling at the component level. This should be applicable to any CDI bean that can be concurrently accessed. This is an important missing piece for EJB users that see this type of functionality as valuable." It basically can't be more clear than that. |
|
Setting other container-provided services aside, there are two concerns that might look orthogonal to some and less so to others: concurrency control and, in case of stateless beans, pooling. I'm gonna claim concurrency control is very much in scope of the Jakarta Concurrency project, even if it decides it doesn't have to do anything in the Jakarta Concurrency specification. Concurrency control is well specified for session beans (see Serializing Session Bean Methods and Singleton Session Bean Concurrency), pooling is in fact not specified at all (although the specification mentions it's a common implementation strategy). Singleton session beansThey are basically equivalent to There's a big difference in concurrency control. Singleton EJBs with container-managed concurrency (the default) are associated with a read/write lock and all method invocations have to obtain either a read lock or a write lock. There's no equivalent to this for CDI beans; it is as if CDI beans always had bean-managed concurrency, so to speak. It is relatively simple to implement this using an interceptor (that's what we did in Quarkus). If someone wanted to specify this, I think that would make sense. Stateful session beansThey are basically equivalent to Again, there's a big difference in concurrency control. Method invocations on a stateful bean are serialized, so there is never more than 1 method being invoked at the same time on one instance of the bean. There's no equivalent for this in CDI. If we specified read/write locking, like I suggest in previous section, using similar defaults, then adding serialization would be a question of adding a single annotation on the class. Stateless session beansFinding an equivalent is tricky, because it depends on a viewpoint. It can be seen as equivalent of It is also very much unspecified what should happen if the pool reaches its maximum capacity and all instances are in use and a new request comes. Common approaches include: throw an exception, providing basically a concurrency limiter, or create a new instance anyway. Both of these can be done eagerly or after a timeout (so that the implementation first waits if an instance becomes available). But those are implementation details, let's not concern with those for now. In any case, I don't think this behavior can be implemented solely by creating a new CDI context. The context is asked for an instance of the bean, which can easily pick one up from a pool, but the context is not told that the instance has finished executing the method and can be put back to the pool. It might be possible to implement this using an interceptor, but it's not immediately obvious to me how to forward the method to a pooled instance except using reflection, so let's discard this idea as well. It feels like this would require serious work in the CDI specification. There's essentially no concurrency control for stateless beans, because each method invocation is routed to a different instance. ConclusionIt's relatively straightforward to specify and implement concurrency control using a read/write lock interceptor. Jakarta Concurrency would be a natural home for this, but so would CDI. I don't care which, myself. It would not be exactly straightforward to specify pooling. (No wonder the EJB specification doesn't bother specifying it :-) ) It feels like this would require altering the specification of client proxies and probably a few more core parts of CDI. This would definitely belong to CDI. Personally, I don't understand the appeal of bean pooling, but if you feel strongly about it, feel free to file an issue. If you do, please (please!) include more details than just "I want bean instance pooling", because that's really not helpful at all. If we decide to do both, it feels like concurrency control should also belong to CDI, because there are interactions between the two. I personally think concurrency control on pooled beans should be a definition error. |
@arjantijms Could you please point me to this? I couldn't find anything in https://github.com/eclipse-ee4j/glassfish-concurro and I'm interested in what Concurro does, because to the best of my knowledge, pooling cannot be implemented as a CDI normal scope, at least not yet. |
| /** | ||
| * {@return the maximum number of instances in the pool.} | ||
| */ | ||
| int value() default 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest that default should be some magic constant, e.g. -1, so that implementations can distinguish between the default value and value that's explicitly set.
In case of the default, implementations would be able to change the default in an external configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the accessTimeout parameter - it should be possible to detect whether the value is the default or explicitly set.
| @Documented | ||
| @Target({TYPE, METHOD}) | ||
| @Retention(RUNTIME) | ||
| public @interface Pooled { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would add "name" parameter to give the bean a name, which could be used in configuration to specify the values externally (via vendor-specific means, until Jakarta Config is available).
Stateless beans are
But you have good point about the implementation, I didn't realize it would be that hard to track which instances are currently "free" to take up the task. Of note is also that the pooling might often be more costly to perform than just initiating a new bean for each task, then destroying it. |
|
@mkouba just had a very good remark on a chat - this cannot be implemented by a
Not using normal scope means there would be no client proxy which wouldn't work here either 🤔 |
|
My view is that we should only focus on limiting maximum concurrency and ensuring thread-safety, not pooling. Pooling is an old concept and is not needed in general nowadays, and it's also not defined in the EJB spec. Limiting concurrency is IMHO in scope of Jakarta Concurrency, not plain CDI. Adding it to CDI would be possible, it already contains Though, I'm not sure how to limit the number of concurrent accesses to a bean. For a method it's simple - a standard interceptor would be enough, as #136 suggests, but it wouldn't prevent concurrent access to the whole bean. A normal scope, as @arjantijms suggests in this PR, would be better, it would create a separate instance of the bean per thread but it's not clear to me when such a bean would be destroyed - would it be at and of HTTP request, like P.S. I would rather avoid specifying any pooling mechanism, in any spec. Pooling is not defined even for EJBs, it's implementation specific. It's not needed nowadays, and we had several discussions how to do this in CDI and neve found a good solution that could be reasonably implemented. |
Great point!
I think we could possibly define a special kind of a scope that is not a normal scope but requires a special kind of client proxies that perform pooling 😆 But I'd rather not. |
Concurrency limiter is actually already defined in MicroProfile Fault Tolerance, under the name By ensuring thread safety, I assume you mean things like
Technically, there are many kinds of requests, not just an HTTP request; one might argue that
You can do that in an interceptor just like for methods, you just store the EDIT: let me clarify the last point, because it's actually important. You cannot use the class name of the declaring class of the invoked method. Your interceptor has to inject the intercepted bean (
+1 |
|
In the end, on second though, what we want seems basically as:
Although the pooling concept seems relevant to avoid creating instances for each individual method call, which makes sense in case of multiple frequent calls, but at the same time, it complicates things. I would rather follow standard CDI concepts and create 2 things - Method scope (in CDI spec) and a MaxConcurrency interceptor (in Jakarta Concurrency, similar to Bulkhead in MicroProfile). And then maybe stereotype that combines both and simulates That would allow limiting concurrent calls just with the interceptor, in case the bean is thread-safe and can be shared by multiple threads (and so can be e.g. `@ApplicationScoped), while additional method scope would guarantee thread-safety (each thread would get its own instance). I believe that in most cases, it would be safe to use Application scope for truly stateless beans and method scope is required only in specific cases. On the other hand, maybe to support the majority of cases, we could just have the interceptor and clarify that method calls need to be thread-safe. That wouldn't completely replace Stateless EJB behavior, but we don't need that in this case. |
|
I looked up the exact wording in the EJB spec again, and there, too, the focus is completely on concurrency and reentrantcy: From the EJB spec:
|
Unless I misunderstood you, that is still not implementable in the current CDI. And IIRC back when I inspected how WFLY did pooling of stateless beans years ago, it basically boiled down to EJB providing their own proxy that got injected into user bean. This proxy then had a pool of CDI Just thinking aloud but perhaps instead of a scope and directly injecting the bean, you could have something like |
|
@manovotn For the OmniServices implementation we handed out pool keys, to work more or less as the permits in a semaphore. You can see what we did here: https://github.com/omnifaces/omniservices/blob/develop/src/main/java/org/omnifaces/services/pooled/PooledContext.java |
Addressing #136
A pooled annotation in the spirit of one of the properties of the EJB @stateless, but specifically for CDI beans. This is directly based on the similarly named annotation that already exists in Concurro and has functionality implemented for it.