[JENKINS-75675] Refactor class loading logic in order to reduce memory consumption#10659
basil merged 21 commits into jenkinsci:master
|
Yay, your first pull request towards Jenkins core was created successfully! Thank you so much! |
|
Would this not be better reported/patched against OpenJDK's |
|
@basil, thank you for the quick review. I understand your concern, but I don’t believe OpenJDK is the right place for this optimization. This isn’t an issue with URLClassLoader itself; it’s designed to work correctly in typical Java applications, where an extra 10MB per class loader is a reasonable overhead in exchange for a simpler and faster implementation. The problem becomes significant in pluggable systems that use separate class loading. So, in my view, this should be addressed within the Jenkins plugin framework. I’ve implemented a 10-line optimization that is easy to maintain or revert if it causes any issues. This is feasible because I used Guava’s MapMaker. Guava cannot be used in the OpenJDK codebase, however. If you consider the ~5–10MB additional memory overhead per plugin acceptable, then let’s just reject the PR. In my production Jenkins cluster, the heap usage is around 4GB when idle, 2GB of which is consumed by about 50 million unnecessary lock objects. It works fine, but the purist in me feels that this is not how it should be. |
|
Alternative: It is possible to make jenkins.util.URLClassLoader2 not capable for parallel class loading by removing In this case, class loading will become serial — URLClassLoader will use a single lock instead of a dedicated lock per class name. |
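To illustrate the two locking modes this alternative trades between, here is a minimal sketch. It assumes the registration in question is the standard `ClassLoader.registerAsParallelCapable()` call; the classes `SerialLoader`, `ParallelLoader`, and `LockModeDemo`, and the helper `lockFor`, are hypothetical and are not Jenkins code.

```java
// Hypothetical loader that never registers as parallel capable:
// ClassLoader.getClassLoadingLock(name) then returns the loader itself.
class SerialLoader extends ClassLoader {
    SerialLoader(ClassLoader parent) {
        super(parent);
    }

    // Exposes the protected method for demonstration purposes.
    Object lockFor(String className) {
        return getClassLoadingLock(className);
    }
}

// Hypothetical loader that registers as parallel capable:
// the JDK then keeps one dedicated lock object per class name.
class ParallelLoader extends ClassLoader {
    static {
        registerAsParallelCapable();
    }

    ParallelLoader(ClassLoader parent) {
        super(parent);
    }

    Object lockFor(String className) {
        return getClassLoadingLock(className);
    }
}

class LockModeDemo {
    public static void main(String[] args) {
        SerialLoader serial = new SerialLoader(null);
        ParallelLoader parallel = new ParallelLoader(null);

        // Serial mode: a single lock, the loader instance itself.
        System.out.println("serial lock is loader: " + (serial.lockFor("a.B") == serial));

        // Parallel mode: a per-name lock that is returned again on later calls.
        Object first = parallel.lockFor("a.B");
        System.out.println("parallel lock is loader: " + (first == parallel));
        System.out.println("same name, same lock: " + (first == parallel.lockFor("a.B")));
    }
}
```

The serial variant allocates nothing per class name, at the cost of serializing all loads through one monitor.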
I don't see anything in the I hate wasting memory and I am not against this PR as a short-term solution, but only after we have a discussion with the OpenJDK folks about whether or not this is an upstream problem and, if so, what the long-term plan would be. |
Right, this is just my understanding. I’ve reviewed the existing standard ClassLoader implementation and I don’t have strong arguments to ask the OpenJDK team to change a standard mechanism that already meets the needs of 99% of Java applications — just to benefit Jenkins and perhaps a few other (~10) Java-based pluggable platforms. That said, talking to the OpenJDK folks definitely makes sense if there’s a feasible path forward. |
jtnord left a comment
Sorry, I forgot to submit this earlier.
I agree with Basil; this appears to be something for the JDK.
Anything here should be a temporary workaround until we can pick up a JDK with a fix (should the JDK team agree this is an issue).
25e29b6 to b9045b6
fc1fa28 to feb64f2
Thanks very much! Sorry for the delay in reviewing this. I just took a look and added some commits with small improvements. Feel free to revert any of them if you think it makes your PR worse.
GPT-5 suggested using a per-instance monotonically increasing ID in getClassLoadingLock to avoid identityHashCode collision, as well as precomputing a per-instance prefix once to avoid rebuilding it each time. I'll leave it up to you as to whether you think one or both of those suggestions is valid.
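A minimal sketch of the two suggestions, under stated assumptions: the class name `InternedLockLoader`, the `NEXT_ID` counter, the prefix format, and the `lockFor` helper are all hypothetical, and this is not the code that was committed to `URLClassLoader2`.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical loader showing an interned-String lock keyed by (loader, class name).
class InternedLockLoader extends ClassLoader {
    static {
        registerAsParallelCapable();
    }

    // Monotonically increasing ID: unlike identityHashCode, two live loaders never share a value.
    private static final AtomicLong NEXT_ID = new AtomicLong();

    // Built once per instance so each lock lookup only appends the class name.
    private final String lockPrefix = "InternedLockLoader#" + NEXT_ID.getAndIncrement() + ":";

    InternedLockLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Object getClassLoadingLock(String className) {
        Objects.requireNonNull(className);
        // intern() returns one canonical String per distinct content, so all callers
        // asking for the same (loader, name) pair synchronize on the same object.
        return (lockPrefix + className).intern();
    }

    // Exposes the protected method for demonstration purposes.
    Object lockFor(String className) {
        return getClassLoadingLock(className);
    }
}
```

Because the override never calls `super.getClassLoadingLock`, the JDK's per-name lock map is not populated by this loader in the sketch.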
Can you please test the ExistenceCheckingClassLoader portion with both Tomcat and Winstone? We only officially support Winstone, but a small percentage of Jenkins users still use Tomcat, and we try to have no regressions there. If this is asking too much, I can look into this myself.
I will also run a wider set of automated tests on this PR and report the results here.
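For readers unfamiliar with the existence-checking idea under test here, a rough sketch follows. It assumes the approach is to probe for the `.class` resource on the parent before delegating; the class `ExistenceCheckingSketch` is hypothetical and much simpler than the real Jenkins class.

```java
import java.util.Objects;

// Hypothetical delegating loader: it defines no classes itself.
class ExistenceCheckingSketch extends ClassLoader {
    ExistenceCheckingSketch(ClassLoader parent) {
        super(Objects.requireNonNull(parent));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        // Probe for the class file first; a miss is answered without entering
        // the default loadClass path, so this loader takes no per-name lock.
        String resource = name.replace('.', '/') + ".class";
        if (getParent().getResource(resource) == null) {
            throw new ClassNotFoundException(name);
        }
        // Delegate the actual load; any locking happens in the parent.
        Class<?> loaded = getParent().loadClass(name);
        if (resolve) {
            resolveClass(loaded);
        }
        return loaded;
    }
}
```

How resource lookup behaves depends on the parent loader, which is presumably why testing on more than one servlet container matters.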
This reverts commit 6ad9f3b.
Thank you for the review! I added one more commit with the improvements you mentioned. I will perform the Tomcat testing... |
basil left a comment
Very nice! I tested this interactively and verified the memory savings as in the Testing Done section. I also ran PCT and ATH successfully. The former isn't a very realistic test since it does not use these class loaders, and the latter is realistic but plugin coverage is low. The author of the PR says it has been deployed in production successfully, so that gives me confidence these changes are correct. I see no regressions, either functional or performance-wise.
timja left a comment
This looks like a great solution, thanks for bearing with us throughout the process.
Thanks Basil for doing the additional testing and reproducing the results
/label ready-for-merge
This PR is now ready for merge. After ~24 hours, we will merge it if there is no negative feedback.
Thanks!
/**
 * Replace the JDK's per-name lock map with a GC-collectable lock object.
 *
 * <p>Parallel-capable {@link ClassLoader} implementations keep a distinct lock object per class
 * name indefinitely, which can retain huge maps when there are many misses. Returning an
 * interned {@link String} keyed by this loader and the class name preserves mutual exclusion
 * for a given (loader, name) pair but allows the JVM to reclaim the lock when no longer
 * referenced. Interned Strings are heap objects and GC-eligible on modern JDKs (7+).
 *
 * @param className the binary name of the class being loaded (must not be null)
 * @return a lock object unique to this classloader/class pair
 */
I almost forgot: can you please update this comment to refer to the upstream JDK issue? Any deviation from upstream in URLClassLoader2 is a temporary workaround, but based on the discussion in this thread there is a sound justification. So please explain that this code should be removed once the upstream issue (with a link) is fixed.
|
(I plan to squash-merge this with @jtnord listed as a co-author as well, because he came up with the important idea of the string interning lock.) |
|
I ran the Tomcat tests from https://github.com/jenkinsci/packaging/tree/master/molecule/servlet with the WAR from this PR, and the test passed, so I think this PR is ready for merge. |
|
Very impressive change. Thanks again @dukhlov! I saw that you had removed the cleanup to |
See JENKINS-75675.
Background
Our Jenkins installation relies heavily on Groovy pipelines, shared libraries, and similar mechanisms. Groovy uses dynamic invocation via MetaClass, CallSite, etc. During introspection, it attempts to load many classes that don’t exist, simply to determine whether a given token is a class, a variable, or something else.
My investigation shows that this behavior leads to memory leakage.
Problem
The base ClassLoader implementation in Java (which all other class loaders inherit from) supports two modes:
parallelCapable (currently used by all Jenkins core class loaders, including WebAppClassLoader and PlatformClassLoader)
non-parallel (legacy mode)
Parallel-capable class loaders create and retain a lock object per class name, indefinitely. So, if we have 1,000 class loading misses, every parallel-capable class loader in the hierarchy keeps 1,000 unused lock objects forever.
In the case of Jenkins's UberClassLoader, a typical setup with ~200 plugins results in around 500 class loaders. In our Jenkins instance, about 2,000 classes are loaded successfully, while we observe over 200,000 class loading misses.
As a result, the class loaders retain:
500 class loaders × 200,000 lock objects each
Plus the internal ConcurrentHashMap.Node objects used to store them
This results in roughly 2 GB of unnecessary memory consumption.
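The figures above can be checked with a few lines of arithmetic. This is a sketch only: it assumes every class loader sees every miss (so the pair count is an upper bound) and reads "2 GB" as 2 GiB; the class and method names are hypothetical.

```java
// Hypothetical helper that reproduces the estimate from the numbers in this description.
class LockRetentionEstimate {
    // Upper bound on retained (loader, class name) lock entries.
    static long retainedPairs(long classLoaders, long missesPerLoader) {
        return Math.multiplyExact(classLoaders, missesPerLoader);
    }

    // Average bytes per entry implied by a measured total.
    static double impliedBytesPerPair(long totalBytes, long pairs) {
        return (double) totalBytes / pairs;
    }

    public static void main(String[] args) {
        long pairs = retainedPairs(500, 200_000);
        long twoGiB = 2L * 1024 * 1024 * 1024;
        System.out.println(pairs + " (loader, name) pairs at most");
        System.out.printf("%.1f bytes per pair implied by 2 GiB%n", impliedBytesPerPair(twoGiB, pairs));
    }
}
```

If fewer loaders see each miss, the real pair count is lower and the cost per retained entry is correspondingly higher than the implied average.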
Solution
DelegatingClassLoader implementation, which overrides the base ClassLoader's loadClass to remove the unneeded base locking mechanism (it does not load classes itself; it just delegates loading to another class loader, and locking is done there)
DelegatingClassLoader instead of ClassLoader where possible
getClassLoadingLock() for URLClassLoader2, to use a GC-collectable lock object that is not kept forever
Existence check via a getResource() call instead of a loadClass() call, so that no lock object is created at all
WebAppClassLoader with FilteringClassLoader, to avoid creating unneeded locks
Testing done
This PR was tested by running 2-hour-long proprietary automation tests based on Jenkins scripted pipeline jobs.
Class loading logic is broadly used, so it is definitely covered by this testing.
I also ran jcmd GC.class_histogram. The number of java.lang.Object and java.util.concurrent.ConcurrentHashMap$Node instances decreased dramatically.
Before fix:

After fix:

Proposed changelog entries
Proposed changelog category
/label internal
Proposed upgrade guidelines
N/A
Submitter checklist
@Restricted or have @since TODO Javadocs, as appropriate.
@Deprecated(since = "TODO") or @Deprecated(forRemoval = true, since = "TODO"), if applicable.
eval to ease future introduction of Content Security Policy (CSP) directives (see documentation).
Desired reviewers
@jenkinsci/core-pr-reviewers
Before the changes are marked as ready-for-merge:
Maintainer checklist
upgrade-guide-needed label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
lts-candidate to be considered (see query).