Fix provider update race condition and Fury thread-local instance churn#1541

Draft
Copilot wants to merge 2 commits into master from copilot/review-recent-issues

Copilot AI commented Mar 15, 2026

Two independent bugs: a race condition causing unavailableProviderException under load when new providers register, and unbounded Fury instance creation causing full GC at high throughput.

AbstractCluster: connection-before-address ordering (#1490)

updateAllProviders and updateProviders were registering addresses in AddressHolder before connections were established in ConnectionHolder. This opened a window where load balancers could select a provider whose transport wasn't ready yet.

Fix: establish connections first, then make addresses visible.

// Before — address visible before connection exists
addressHolder.updateAllProviders(providerGroups);
connectionHolder.updateAllProviders(providerGroups);

// After — connection ready before address is discoverable
connectionHolder.updateAllProviders(providerGroups);
addressHolder.updateAllProviders(providerGroups);
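
The effect of the ordering can be illustrated with a small single-threaded model. This is only a sketch: `OrderingSketch`, `publish`, `connect`, and the two sets below are hypothetical stand-ins for AddressHolder and ConnectionHolder, not the actual sofa-rpc classes. The check between the two steps represents what a concurrent load balancer could observe at that moment.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of the two holders; names are illustrative only.
class OrderingSketch {
    // Addresses a load balancer is allowed to select (stand-in for AddressHolder).
    static final Set<String> visibleAddresses = ConcurrentHashMap.newKeySet();
    // Addresses whose transport is ready (stand-in for ConnectionHolder).
    static final Map<String, Boolean> readyConnections = new ConcurrentHashMap<>();

    static void publish(String address) { visibleAddresses.add(address); }
    static void connect(String address) { readyConnections.put(address, Boolean.TRUE); }

    // True if a selector could currently pick an address with no ready connection.
    static boolean hasUnreadyVisibleAddress() {
        for (String address : visibleAddresses) {
            if (!readyConnections.containsKey(address)) {
                return true;
            }
        }
        return false;
    }

    // Old order: address first. Returns what a selector could see between the two steps.
    static boolean updateAddressFirst(String address) {
        publish(address);
        boolean exposed = hasUnreadyVisibleAddress();
        connect(address);
        return exposed;
    }

    // New order: connection first. Returns what a selector could see between the two steps.
    static boolean updateConnectionFirst(String address) {
        connect(address);
        boolean exposed = hasUnreadyVisibleAddress();
        publish(address);
        return exposed;
    }

    static void reset() {
        visibleAddresses.clear();
        readyConnections.clear();
    }

    public static void main(String[] args) {
        System.out.println("address first exposes unready provider: "
                + updateAddressFirst("10.0.0.1:12200"));
        reset();
        System.out.println("connection first exposes unready provider: "
                + updateConnectionFirst("10.0.0.2:12200"));
    }
}
```

In this model the address-first order reports an exposed, unready provider between the two steps, while the connection-first order does not. The model ignores connection failures and provider removal, which the real holders also have to handle.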

FurySerializer: stop destroying thread-local Fury instances (#1424)

Every encode/decode call ended with fury.clearClassLoader(contextClassLoader) in a finally block, which unconditionally destroyed the thread-local Fury instance, so the next call had to rebuild it from scratch. At 20k TPS this produced ~4,500 Fury instances and triggered full GC.

Fix: remove clearClassLoader from the hot path. fury.setClassLoader(contextClassLoader) at the start of each call already handles classloader changes correctly (it creates a new Fury instance only when the classloader actually differs), so removing the teardown allows thread-local instances to be reused across calls.
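
The difference in instance churn can be sketched with a simplified model of the behavior described above. `ReuseSketch` and `Codec` are hypothetical stand-ins, not Fury's API; the `setClassLoader` and `clearClassLoader` methods here only mimic the semantics stated in this description (rebuild when the classloader differs, discard on clear).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of a thread-local serializer cache; not the Fury implementation.
class ReuseSketch {
    // Counts how many expensive instances were constructed.
    static final AtomicInteger created = new AtomicInteger();

    // Stand-in for an expensive serializer instance bound to one class loader.
    static final class Codec {
        final ClassLoader loader;

        Codec(ClassLoader loader) {
            this.loader = loader;
            created.incrementAndGet();
        }
    }

    static final ThreadLocal<Codec> CURRENT = new ThreadLocal<>();

    // Reuses the thread-local instance unless the class loader actually changed.
    static Codec setClassLoader(ClassLoader loader) {
        Codec codec = CURRENT.get();
        if (codec == null || codec.loader != loader) {
            codec = new Codec(loader);
            CURRENT.set(codec);
        }
        return codec;
    }

    // Discards the thread-local instance, forcing a rebuild on the next call.
    static void clearClassLoader() {
        CURRENT.remove();
    }

    // Simulates `calls` encode/decode calls on the current thread and returns
    // the number of instances constructed.
    static int run(int calls, boolean clearAfterEachCall) {
        created.set(0);
        CURRENT.remove();
        ClassLoader loader = ReuseSketch.class.getClassLoader();
        for (int i = 0; i < calls; i++) {
            try {
                setClassLoader(loader);
                // encode or decode would happen here
            } finally {
                if (clearAfterEachCall) {
                    clearClassLoader();
                }
            }
        }
        return created.get();
    }

    public static void main(String[] args) {
        System.out.println("with clear in finally: " + run(1000, true));
        System.out.println("without clear:         " + run(1000, false));
    }
}
```

With the teardown in the finally block, the model constructs one instance per call; without it, a single instance serves all calls on the thread as long as the classloader stays the same.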

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • repository.jboss.org
    • Triggering command: /usr/lib/jvm/temurin-17-jdk-amd64/bin/java /usr/lib/jvm/temurin-17-jdk-amd64/bin/java -classpath /home/REDACTED/work/sofa-rpc/sofa-rpc/.mvn/wrapper/maven-wrapper.jar -Dmaven.home=/home/REDACTED/work/sofa-rpc -Dmaven.multiModuleProjectDirectory=/home/REDACTED/work/sofa-rpc/sofa-rpc org.apache.maven.wrapper.MavenWrapperMain -f pom.xml -B -V -e -Dfindbugs.skip -Dcheckstyle.skip -Dpmd.skip=true -Dspotbugs.skip -Denforcer.skip -Dmaven.javadoc.skip -DskipTests -Dmaven.test.skip.exec -Dlicense.skip=true (dns block)

…ance creation overhead (#1424)

Co-authored-by: EvenLjj <15122299+EvenLjj@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Review recent project issues" to "Fix provider update race condition and Fury thread-local instance churn" on Mar 15, 2026
Copilot AI requested a review from EvenLjj March 15, 2026 15:15