[FEA] Create a mutable URLClassLoader in the ShimLoader if no URLClassLoader is available #14506

@rishic3

Description

Credit goes to @eordentlich for the idea.

[gerashegalov] Context: this idea lets us resolve an issue that has existed in spark-rapids ever since ShimLoader with parallel worlds was introduced. Our ScalaTests in the test module run against a plain single-shim aggregator jar instead of depending on the production dist jar, which is the main reason for having a separate test module.

Is your feature request related to a problem? Please describe.

Certain JVMs (e.g., a forked JVM created by the Maven Surefire plugin for testing) have only AppClassLoader in the classloader chain. Since JDK 9, AppClassLoader is no longer a URLClassLoader (JEP 261).

In these cases ShimLoader.findURLClassLoader() walks the entire chain, finds nothing mutable, and cannot inject shim directory URLs via addURL(). This causes ClassNotFoundException for shim classes like AwsStoragePlugin during plugin initialization:

java.lang.ExceptionInInitializerError
        at com.nvidia.spark.rapids.RapidsDriverPlugin.init(Plugin.scala:475)
        at org.apache.spark.internal.plugin.DriverPluginContainer.$anonfun$driverPlugins$1(PluginContainer.scala:53)
        ...
        at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:1093)
        at com.udf.CudfComparisonTest.setUp(CudfComparisonTest.java:36)
        ...
        at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:495)
Caused by: java.lang.ClassNotFoundException: com.nvidia.spark.rapids.AwsStoragePlugin
        at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
        at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
        at java.base/java.lang.Class.forName0(Native Method)
        at java.base/java.lang.Class.forName(Class.java:469)
        at org.apache.spark.util.SparkClassUtils.classForName(SparkClassUtils.scala:41)
        at org.apache.spark.util.Utils$.classForName(Utils.scala:94)
        at org.apache.spark.sql.rapids.execution.TrampolineUtil$.classForName(TrampolineUtil.scala:181)
        at com.nvidia.spark.rapids.RapidsPluginUtils$.$anonfun$loadExtensions$1(Plugin.scala:321)
        at com.nvidia.spark.rapids.RapidsPluginUtils$.loadExtensions(Plugin.scala:319)
        at com.nvidia.spark.rapids.RapidsPluginUtils$.$anonfun$getExtraPlugins$3(Plugin.scala:379)
        ...
        at com.nvidia.spark.rapids.RapidsPluginUtils$.<clinit>(Plugin.scala)
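
To make the failure mode concrete, here is a simplified Java sketch of a parent-chain walk that looks for a URLClassLoader. It is not the actual ShimLoader.findURLClassLoader() code; the class name ClassLoaderChain and the method findUrlClassLoader are hypothetical names used only for illustration. Running main in a forked JVM on JDK 9 or later is expected to report that no URLClassLoader is present, which is the situation described above.

```java
import java.net.URLClassLoader;
import java.util.Optional;

// Hypothetical helper, not part of spark-rapids.
final class ClassLoaderChain {

    /** Walks from `start` up through its parents and returns the first URLClassLoader, if any. */
    static Optional<URLClassLoader> findUrlClassLoader(ClassLoader start) {
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            if (cl instanceof URLClassLoader) {
                return Optional.of((URLClassLoader) cl);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ClassLoader start = Thread.currentThread().getContextClassLoader();
        // Print every loader in the chain so the missing URLClassLoader is visible.
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            System.out.println(cl.getClass().getName());
        }
        System.out.println("URLClassLoader present: " + findUrlClassLoader(start).isPresent());
    }
}
```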

There appear to be two code paths for loading shim classes, but only one of them handles the missing URLClassLoader case.

  1. ShimReflectionUtils (works)

ShimReflectionUtils.loadClass() calls ShimLoader.getShimClassLoader(). When findURLClassLoader() returns None, it falls back to creating a MutableURLClassLoader pre-populated with the shim URLs. All internal RAPIDS class loading goes through this path and works fine.

  2. loadExtensions (breaks)

loadExtensions() is called from getExtraPlugins() to load the extension classes listed in spark-rapids-extra-plugins. It calls TrampolineUtil.classForName(), which in turn calls Spark's Utils.classForName(). That call resolves classes through the thread context classloader, which in the affected environments is AppClassLoader rather than a URLClassLoader, so the class lookup fails.

Describe the solution you'd like

loadExtensions can use the same workaround and create a mutable URLClassLoader when none is found. This can be achieved either by routing it through getShimClassLoader(), which already has the fallback, or by modifying updateSparkClassLoader() to implement the same fallback mechanism (creating a URLClassLoader with the sparkCL as the parent).
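
The find-or-create fallback proposed here could look roughly like the following Java sketch. It is only an illustration under stated assumptions: the real ShimLoader is written in Scala, and MutableLoaderFallback, MutableLoader, findOrCreate, and dirUrl are hypothetical names that do not exist in spark-rapids. The sketch relies on the fact that URLClassLoader.addURL is protected, so a small subclass is needed to expose it.

```java
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Paths;

// Hypothetical helper, not part of spark-rapids.
final class MutableLoaderFallback {

    /** URLClassLoader that widens the protected addURL to public. */
    static final class MutableLoader extends URLClassLoader {
        MutableLoader(URL[] urls, ClassLoader parent) {
            super(urls, parent);
        }

        @Override
        public void addURL(URL url) {
            super.addURL(url);
        }
    }

    /**
     * Reuses the first MutableLoader in the chain, appending the shim URLs to it.
     * If the chain has none, creates a new one with `start` as its parent.
     */
    static MutableLoader findOrCreate(ClassLoader start, URL[] shimUrls) {
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            if (cl instanceof MutableLoader) {
                MutableLoader found = (MutableLoader) cl;
                for (URL url : shimUrls) {
                    found.addURL(url);
                }
                return found;
            }
        }
        return new MutableLoader(shimUrls, start);
    }

    /** Converts a directory path to a URL without forcing callers to handle a checked exception. */
    static URL dirUrl(String dir) {
        try {
            return Paths.get(dir).toUri().toURL();
        } catch (MalformedURLException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the new loader delegates to its parent first, classes already visible to AppClassLoader keep resolving as before; only lookups that the parent cannot satisfy reach the added shim URLs.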

Describe alternatives you've considered

Currently, to work around this issue, the end user has the following options:

  1. Manually create a child URLClassLoader of AppClassLoader themselves and set it as the thread context classloader before SparkSession.getOrCreate().

  2. Use a JVM in which a URLClassLoader is present. For example, for Java Surefire tests, disable forking so that Surefire's IsolatedClassLoader is created in the Maven JVM and can be used by the ShimLoader. Disabling forking is not always desirable, however, because it can limit test throughput.

  3. Manually go through these steps to build a single-shim jar.
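
For option 1, a minimal Java sketch of the user-side workaround might look like this. ContextLoaderWorkaround and installUrlClassLoader are hypothetical names, and falling back to the system classloader when the thread has no context classloader is an assumption made for the sketch. Whether an empty URLClassLoader is sufficient for the ShimLoader depends on how it injects the shim URLs, so treat this as a starting point rather than a verified fix.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical test helper, not part of spark-rapids.
final class ContextLoaderWorkaround {

    /**
     * Installs an empty URLClassLoader as the thread context classloader,
     * parented on the previous context classloader (AppClassLoader in a forked JVM).
     * Returns the previous context classloader so the caller can restore it later.
     */
    static ClassLoader installUrlClassLoader() {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        // Assumption: use the system classloader if no context classloader is set.
        ClassLoader parent = previous != null ? previous : ClassLoader.getSystemClassLoader();
        thread.setContextClassLoader(new URLClassLoader(new URL[0], parent));
        return previous;
    }
}
```

In a test such as the one in the stack trace, the call would go at the top of setUp(), before SparkSession.builder()...getOrCreate(), and the returned loader would be passed to Thread.currentThread().setContextClassLoader(...) in the teardown method.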

Labels: test (Only impacts tests)
