Skip to content

Fix synchronization for JWT increasing performance for token validation#726

Closed
zyclonite wants to merge 5 commits intoeclipse-vertx:masterfrom
zyclonite:fix_sync_auth_common_jws_jwe
Closed

Fix synchronization for JWT increasing performance for token validation#726
zyclonite wants to merge 5 commits intoeclipse-vertx:masterfrom
zyclonite:fix_sync_auth_common_jws_jwe

Conversation

@zyclonite
Copy link
Contributor

Motivation:

The observation was that reusing objects like Signature, Cipher and Mac and running it within synchronization blocks has a big impact on application performance. Used in a handler it serializes all eventloop threads and potentially introduces a lot of wait for all other threads.

Using the Signature.getInstance​(String algorithm) does not add a lot of overhead and is then safe to use within the same context for the operation.

Doing some performance tests on EC256 signatures got me nearly the same performance multiplicator as CPU cores available.

@zyclonite zyclonite force-pushed the fix_sync_auth_common_jws_jwe branch from 73ea166 to 0ead3a2 Compare June 12, 2025 18:08
… in JWE and JWS

Signed-off-by: Lukas Prettenthaler <rdx@wsn.at>
@zyclonite zyclonite force-pushed the fix_sync_auth_common_jws_jwe branch from 0ead3a2 to b3d397e Compare June 12, 2025 18:23
@vietj
Copy link
Member

vietj commented Jun 13, 2025

I'm wondering if we could not store those in a thread local when the thread is a vertx thread (excluding virtual threads) otherwise create it as we go like you are doing

@zyclonite
Copy link
Contributor Author

Even calling getInstance on the same Thread will get you different instances of the object. So the idea is, not to re-use them. So i am not sure if we would win a lot with that except getting it off the eventloop thread.
This might be done best on the authentication handler level i guess?

If your thought was in a different direction, let me know

@tsegismont
Copy link
Member

I think that what @vietj suggests is to create a something like ThreadLocalRandom for the Ciper or Signature instance, that would use a FastThreadLocal storage for reuse of cipher and signature when running on an event loop thread, otherwise creating objects on demand.

If you have time to do this it would be great.

@zyclonite
Copy link
Contributor Author

Understood but the Random use-case is a different one as this "should" be shared by definition, sharing the Signature object does not really add a lot of benefits from what i read so far.

I can do some further tests if it adds any performance gain.

@zyclonite
Copy link
Contributor Author

Did conduct some tests and the results are:

Using FastThreadLocal on a 12 core system had around 52ms versus 227ms with a plain getInstance for Signature
and 36ms over 2651ms for Cipher (20 million cycles)

I could not find a vert.x way of checking for VertxThread and/or isVirtual as the impl package is no longer accessible in vert.x 5.x with modules so i used FastThreadLocalThread.currentThreadHasFastThreadLocal() from netty for now

@zyclonite
Copy link
Contributor Author

have a look at my last commit if this is in the direction or your idea and let me know

@tsegismont
Copy link
Member

@zyclonite sorry for the late reply. I did an update for JWE in zyclonite#1 that you can replicate in JWK and JWS.

tsegismont and others added 2 commits July 6, 2025 09:37
Signed-off-by: Thomas Segismont <tsegismont@gmail.com>
@zyclonite
Copy link
Contributor Author

took me a while to find some time...

did a quick apply of your approach on the JWS impl as well but this approach would only work if the algorithm would always be the same for the same thread... that's the reason why the test fails now

Store algorithms in an array for fast index search.
Then use this index to retrieve the corresponding signature

Signed-off-by: Thomas Segismont <tsegismont@gmail.com>
@tsegismont
Copy link
Member

@zyclonite I wanted to create a pull request to your repo with c7d5d59 but incidentally pushed directly to your branch.

Anyway, that should fix the problem you mentioned in your previous comment.

Do we have some evidence (e.g. microbenchmarks) that creating signatures and ciphers on the fly is more expensive than reusing them?

@zyclonite
Copy link
Contributor Author

@tsegismont looks great, thanks for that solution.

I fear your example has the same issue -> JWE
I can fix it there as well but i found out that the JWE implementation does actually not work properly, is nearly not implemented at all and does not have tests. I tried to think ahead and start implementing at least the recommended things but this starts to break with the JWK approach as it gets very complicated and would require a lot of refactoring.

@tsegismont
Copy link
Member

I fear your example has the same issue -> JWE

I'm not sure what you mean

@zyclonite
Copy link
Contributor Author

initially i was referring to your commit 4126825 , that needs the same change with a selector on the algorithm, there is just no test that could fail in the same way.

the other statement was to the JWE implementation itself. not only that no test exist, it could have never worked based on that line

if (jwk.use() == null || "enc".equals(jwk.use())) {
(only if use=sig and that makes no sense)
so this class needs a proper implementation first, then tests but in a different PR.
i looked a bit into it but this will require a lot of refactoring of the JWK implementation as well...

@tsegismont
Copy link
Member

tsegismont commented Jul 17, 2025

@zyclonite thanks for clarifying. I think our immediate goal must be to fix the performance issue where the implementation is trusted and tested.

the other statement was to the JWE implementation itself. not only that no test exist, it could have never worked based on that line

@pmlopes any comment on this?

@tsegismont
Copy link
Member

@zyclonite is there an opportunity for you to compare the performance of single-use ciphers/signatures vs fast thread local caching when running on a FastThreadLocalThread? In this case, could you please report the results?

@vietj vietj added this to the 5.1.0 milestone Sep 4, 2025
@zyclonite
Copy link
Contributor Author

sorry, i lost a bit track of that issue and did not find time to do some extended testing but i am pretty sure it adds performance

shall i discard that PR as @vietj has started refactoring the whole crypto part of the implementation?

@vietj
Copy link
Member

vietj commented Sep 30, 2025

sorry, i lost a bit track of that issue and did not find time to do some extended testing but i am pretty sure it adds performance

shall i discard that PR as @vietj has started refactoring the whole crypto part of the implementation?

yes, you can, this code will be moved to vertx-core, you can help review the other PR

later we will introduce fast thread local in the code moved to vertx-core that will be easier to implement

@zyclonite zyclonite closed this Sep 30, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants