From Jar package conflicts to the class loading mechanism

I recently took over a legacy system and plan to write a series of articles on refactoring it and the problems encountered along the way. This first installment, [Reconstruction 01], covers Jar package conflicts and the principles behind them.

Background

Most projects today manage their dependencies with Maven or Gradle, but I recently took over a set of projects whose jar packages are added purely by hand.

Adding jar packages by hand was common practice years ago, and engineers with only three to five years of experience may never have seen it. You find each jar package the project needs, copy it into a lib directory, and then add the jar dependencies manually in the IDE.

Managing dependencies this way is not only labor-intensive but also prone to jar package conflicts, and diagnosing those conflicts relies largely on experience.

I recently ran into exactly this situation: the project starts normally in developer A's environment but fails to start in developer B's, and the exception reports that a class cannot be found.

Developers with some experience will quickly suspect a jar package conflict. Let's look at how to solve it and then extend into the underlying knowledge points.

Temporary solution

Since a large-scale refactoring of the project is not possible for now, I do not dare to casually replace or upgrade jar packages, so only a temporary fix is available.

Here are a few emergency steps, which are also general tips for solving jar dependency problems.

First, look up the class named in the exception in the IDE. In IntelliJ IDEA on macOS, for example, the shortcut I use is command + shift + n.

[Figure: Find conflicts]

Taking the Assert class as an example, several jar packages contain a class named Assert, yet at startup the program reports that a method of this class cannot be found. That combination almost always points to a jar package conflict.
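When no IDE is at hand, the same search can be approximated with a few lines of code. The following is only a sketch under assumptions: the class name DuplicateClassFinder and the default directory name "lib" are made up for illustration, and the scan simply compares entry names inside each jar, without looking at versions or class contents.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: lists class files that appear in more than one jar of a directory.
final class DuplicateClassFinder {

    /** Maps each duplicated .class entry name to the jar files that contain it. */
    static Map<String, List<String>> findDuplicates(Path libDir) {
        Map<String, List<String>> owners = new TreeMap<>();
        try (Stream<Path> files = Files.list(libDir)) {
            // Sort the jars so the report is the same on every file system.
            List<Path> jars = files
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .sorted()
                    .collect(Collectors.toList());
            for (Path jar : jars) {
                try (JarFile jarFile = new JarFile(jar.toFile())) {
                    jarFile.stream()
                            .map(entry -> entry.getName())
                            .filter(name -> name.endsWith(".class"))
                            .forEach(name -> owners
                                    .computeIfAbsent(name, k -> new ArrayList<>())
                                    .add(jar.getFileName().toString()));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Keep only the entries that are present in at least two jars.
        owners.values().removeIf(jarNames -> jarNames.size() < 2);
        return owners;
    }

    public static void main(String[] args) {
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        findDuplicates(libDir)
                .forEach((cls, jars) -> System.out.println(cls + " -> " + jars));
    }
}
```

Run against the project's lib directory, each printed line names one class file and the jars that all claim to provide it, which narrows down the candidates for the conflict.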

Second, once the conflict has been located, determine which jar package the system should actually use.

Here, for example, the classes in spring-core are the ones needed, not the classes in spring.jar. We can then take advantage of the JVM's class loading order so that the spring-core jar package is loaded first.

Knowledge point: a class loader searches the jar packages on its class path in order. Once a class with a given fully qualified name has been loaded from one jar, a class with the same name in a later jar is never loaded by that class loader.
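The "first entry wins" behaviour can be observed with java.net.URLClassLoader, whose documentation states that its URLs are searched in the order specified for both classes and resources. The sketch below is an illustration under assumptions: it uses a plain resource file as a stand-in for a class, and the names FirstEntryWins, legacy-lib, wanted-lib and conflict-demo.txt are invented for the demo.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo: the same resource name exists in two directories on the search path.
final class FirstEntryWins {

    /** Resolves a resource through the given directories, searched in the order supplied. */
    static URL resolve(String resource, Path... searchPath) {
        try {
            URL[] urls = new URL[searchPath.length];
            for (int i = 0; i < searchPath.length; i++) {
                // For an existing directory the URL ends with '/', which is how
                // URLClassLoader recognises a directory rather than a jar file.
                urls[i] = searchPath[i].toUri().toURL();
            }
            // Parent is null, so only the bootstrap loader is consulted before these URLs.
            try (URLClassLoader loader = new URLClassLoader(urls, null)) {
                return loader.getResource(resource);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path legacy = Files.createTempDirectory("legacy-lib");
        Path wanted = Files.createTempDirectory("wanted-lib");
        byte[] content = "marker".getBytes(StandardCharsets.UTF_8);
        Files.write(legacy.resolve("conflict-demo.txt"), content);
        Files.write(wanted.resolve("conflict-demo.txt"), content);

        // Same two directories, different order: the first one on the path is used.
        System.out.println(resolve("conflict-demo.txt", legacy, wanted));
        System.out.println(resolve("conflict-demo.txt", wanted, legacy));
    }
}
```

The two printed URLs should point into different directories, which is the effect that reordering jar packages in the IDE relies on.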

Therefore, a temporary solution is to adjust the order in which the jar packages appear on the class path, and with it the order in which they are loaded. Both Eclipse and IDEA support adjusting this manually.

Adjustment method in Eclipse:

[Figure: Eclipse adjust order]

Adjustment method in IDEA:

[Figure: IDEA adjust order]

Move the jar package that must take precedence ahead of the conflicting one so that it is loaded first. This resolves the jar package conflict for the time being.

An extension of the class loading mechanism

The above is only a temporary fix dictated by the current state of the project. Eventually the project must be migrated to Maven or Gradle for jar package management, which is also the right place to resolve jar package conflicts properly.

This temporary fix touches on a key piece of JVM knowledge: class loader isolation and the parent delegation mechanism. Without some understanding of the JVM class loading mechanism, the fix above might not even come to mind.

Classloader isolation issues

Each class loader has its own namespace for storing the classes it has loaded. When a class loader loads a class, it looks up the fully qualified class name in that namespace to check whether the class has already been loaded.

The JVM identifies a class by ClassLoader id + PackageName + ClassName, so a running program may contain two classes with exactly the same package name and class name. If these two classes were not loaded by the same ClassLoader, an instance of one cannot be cast to the other. This is ClassLoader isolation.
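The isolation described above can be reproduced in a small experiment. This is a sketch under assumptions, not a recommended pattern: IsolationDemo, Payload and IsolatingLoader are hypothetical names, the loader deliberately skips parent delegation for one class, and it assumes the compiled class file can be read as a resource from the parent loader (true for a normal javac-then-java run from a directory or jar).

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo: one class file, defined by two different class loaders.
final class IsolationDemo {

    /** A trivial class that ends up being defined twice. */
    public static class Payload {
        public Payload() {}
    }

    /** Defines the target class itself; every other class goes through normal delegation. */
    static final class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve); // normal parent delegation
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                for (int n; (n = in.read(buf)) != -1; ) {
                    out.write(buf, 0, n);
                }
                return out.toByteArray();
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads a second copy of Payload through a fresh IsolatingLoader. */
    static Class<?> loadIsolatedCopy() {
        String name = Payload.class.getName();
        ClassLoader parent = IsolationDemo.class.getClassLoader();
        try {
            return new IsolatingLoader(name, parent).loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Class<?> copy = loadIsolatedCopy();
        System.out.println("same name:  " + copy.getName().equals(Payload.class.getName()));
        System.out.println("same class: " + (copy == Payload.class));
        Object instance = copy.getDeclaredConstructor().newInstance();
        try {
            Payload p = (Payload) instance;
            System.out.println("cast succeeded: " + p);
        } catch (ClassCastException e) {
            System.out.println("cast failed: " + e.getMessage());
        }
    }
}
```

The two Class objects share a name but are distinct, so the cast in main is expected to fail with a ClassCastException, which is exactly the situation the paragraph above describes.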

To keep this isolation from causing the same class to be loaded separately by several class loaders, the JVM uses a parent delegation mechanism.

Parental delegation mechanism

The core of the parent delegation mechanism comes down to two points: first, check from the bottom up whether the class has already been loaded; second, try to load the class from the top down.

[Figure: class loader]

There are generally four types of class loaders: the bootstrap class loader, the extension class loader, the application class loader, and custom class loaders.

Setting custom class loaders aside for now, the class loaders that ship with the JDK work as follows:

First: when AppClassLoader is asked to load a class, it delegates the request to its parent class loader, ExtClassLoader;

Second: when ExtClassLoader is asked to load a class, it in turn delegates the request to the bootstrap class loader;

Third: if the bootstrap class loader fails to load the class (for example, the class is not found in %JAVA_HOME%/jre/lib), ExtClassLoader tries to load it;

Fourth: if ExtClassLoader also fails, AppClassLoader tries to load the class, and if AppClassLoader fails as well, a ClassNotFoundException is thrown.
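The chain in the four steps above can be inspected at runtime by walking getParent() from the system class loader. This is a minimal sketch; LoaderChain is a hypothetical name, and the concrete loader class names that get printed depend on the JDK version (newer JDKs, for instance, have a platform class loader in the position of ExtClassLoader), so no particular output is assumed here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the delegation chain above a given class loader.
final class LoaderChain {

    static final String BOOTSTRAP = "bootstrap (represented as null)";

    /** Returns the class names of the loader and all its parents, ending with the bootstrap marker. */
    static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = loader; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The bootstrap loader has no ClassLoader object; getParent() returns null for it.
        names.add(BOOTSTRAP);
        return names;
    }

    public static void main(String[] args) {
        chainOf(ClassLoader.getSystemClassLoader()).forEach(System.out::println);
        // Core classes such as String report a null loader, meaning the bootstrap loader.
        System.out.println("String loaded by: " + String.class.getClassLoader());
    }
}
```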

Parent delegation implementation of ClassLoader

ClassLoader implements the parent delegation mechanism in its loadClass() method, which is the entry point for loading classes dynamically.

The source code of this method is as follows:

    protected Class<?> loadClass(String name, boolean resolve)
        throws ClassNotFoundException
    {
        synchronized (getClassLoadingLock(name)) {
            // First, check if the class has already been loaded
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                long t0 = System.nanoTime();
                try {
                    if (parent != null) {
                        c = parent.loadClass(name, false);
                    } else {
                        c = findBootstrapClassOrNull(name);
                    }
                } catch (ClassNotFoundException e) {
                    // ClassNotFoundException thrown if class not found
                    // from the non-null parent class loader
                }

                if (c == null) {
                    // If still not found, then invoke findClass in order
                    // to find the class.
                    long t1 = System.nanoTime();
                    c = findClass(name);

                    // this is the defining class loader; record the stats
                    sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                    sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                    sun.misc.PerfCounter.getFindClasses().increment();
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

The loadClass method is a recursive upward call: as the parent.loadClass call in the code above shows, each loader first hands the request to its parent before trying anything itself.
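One way to watch this delegation happen is a loader that overrides only findClass and records when it is called. The sketch below is illustrative: DelegationDemo and RecordingLoader are hypothetical names, and com.example.DoesNotExist is assumed to be a class name that is absent from the class path.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: findClass is reached only after every parent has failed.
final class DelegationDemo {

    /** Inherits loadClass unchanged and records each findClass call. */
    static final class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);
            // This loader has no classes of its own, so it always gives up here.
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns the loaded class, or null if no loader in the chain could find it. */
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());

        // Found by a parent, so findClass on this loader is never called.
        System.out.println(tryLoad(loader, "java.lang.String"));
        System.out.println("findClass calls so far: " + loader.findClassCalls);

        // Unknown to every parent, so the request finally falls through to findClass.
        System.out.println(tryLoad(loader, "com.example.DoesNotExist"));
        System.out.println("findClass calls so far: " + loader.findClassCalls);
    }
}
```

Overriding findClass rather than loadClass keeps the delegation logic of the base class intact, which is why the parent's answer for java.lang.String is used without this loader being consulted.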

Before doing anything else, the method calls findLoadedClass, starting from the bottom class loader, to check whether the specified class has already been loaded. If it has, the method decides whether to link (resolve) the class according to the resolve parameter and returns the Class object.

Jar package conflicts often surface here. Once the first class with a given name has been loaded, this check returns it directly, and the class that is really needed is never loaded. When the program then uses the class, it throws an exception saying that the class, or a method of the class, cannot be found.

Load order of Jar packages

As shown above, once a class has been loaded, another class with the same fully qualified name will not be loaded by the same class loader. The order in which jar packages are loaded therefore directly determines which version of a class is loaded.

The following factors are usually used to determine the loading order of Jar packages:

  • First, the load path where the jar package is located, that is, the level in the JVM class loader tree of the class loader that loads the jar package. The paths served by the four kinds of class loaders mentioned above have different priorities.
  • Second, the order in which the file system returns files. The class loaders of containers such as Tomcat and Resin obtain the list of files under a load path without sorting it, so the order depends on what the underlying file system returns. When the file systems of different environments behave differently, some environments work fine while others run into conflicts.

The problem I encountered is a variant of the second factor: the jar packages in the same directory were loaded in a different order in the two environments. That is why adjusting the loading order of the jar packages solved the problem temporarily.
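For a hand-managed lib directory, one way to remove the dependence on file system order is to build the class path explicitly: list the jars, sort them, and move the ones that must win to the front. The following is a sketch under assumptions; ClasspathOrder and its methods are hypothetical helpers, and the jar names used in the usage note are examples only.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: produces the same jar order on every machine.
final class ClasspathOrder {

    /** Lists the jars in libDir: preferred names first (in the given order), the rest alphabetically. */
    static List<Path> orderedJars(Path libDir, List<String> preferredFirst) {
        Comparator<Path> byPreference = Comparator.comparingInt(p -> {
            int index = preferredFirst.indexOf(p.getFileName().toString());
            return index < 0 ? Integer.MAX_VALUE : index;
        });
        // Files.list makes no ordering guarantee, hence the explicit sort.
        try (Stream<Path> files = Files.list(libDir)) {
            return files
                    .filter(p -> p.getFileName().toString().endsWith(".jar"))
                    .sorted(byPreference.thenComparing(p -> p.getFileName().toString()))
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Joins the jars into a value usable with java -classpath. */
    static String toClassPath(List<Path> jars) {
        return jars.stream()
                .map(Path::toString)
                .collect(Collectors.joining(File.pathSeparator));
    }

    public static void main(String[] args) {
        Path libDir = Paths.get(args.length > 0 ? args[0] : "lib");
        // Assumed example: the spring-core jar must come before any other jar.
        List<Path> jars = orderedJars(libDir, Arrays.asList("spring-core.jar"));
        System.out.println(toClassPath(jars));
    }
}
```

The printed value could be passed to java -classpath in a start script, so that developers A and B launch the project with an identical jar order.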

Common manifestations of Jar package conflicts

Jar package conflicts are often very strange and difficult to troubleshoot, but there are also some common manifestations.

  • Throws java.lang.ClassNotFoundException: the typical exception, raised because the class does not exist in the loaded dependencies. There are two possible reasons: first, the class really was never introduced; second, because of a jar package conflict, the wrong version was selected (in a Maven project, by Maven's arbitration mechanism), and the jar package that was loaded does not contain the class.
  • Throws java.lang.NoSuchMethodError: a specific method could not be found. A jar package conflict caused the wrong dependency version to be selected, and the class in that version either does not have the method or has a changed version of it.
  • Throws java.lang.NoClassDefFoundError, java.lang.LinkageError, etc., for the same reasons as above.
  • No exception, but results differ from what is expected: the wrong version is loaded, and because different versions have different underlying implementations, the behavior is inconsistent.
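When one of these symptoms appears, it often helps to ask the running JVM which jar a suspicious class was actually loaded from. The sketch below uses the standard Class.getProtectionDomain() and CodeSource.getLocation() calls; WhereLoadedFrom is a hypothetical name, and the location may be unavailable for JDK core classes or under restrictive security settings.

```java
import java.security.CodeSource;
import java.security.ProtectionDomain;

// Hypothetical helper: reports the jar or directory a class was loaded from.
final class WhereLoadedFrom {

    static final String UNKNOWN = "(no code source; typically a JDK class from the bootstrap loader)";

    /** Returns the location of the class's code source, or a marker if there is none. */
    static String locationOf(Class<?> type) {
        ProtectionDomain domain = type.getProtectionDomain();
        CodeSource source = (domain == null) ? null : domain.getCodeSource();
        if (source == null || source.getLocation() == null) {
            return UNKNOWN;
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments, e.g. the class from the stack trace.
        for (String className : args) {
            Class<?> type = Class.forName(className);
            System.out.println(className + " <- " + locationOf(type)
                    + " via " + type.getClassLoader());
        }
    }
}
```

Running this inside the failing environment with the class from the exception shows whether it came from the expected jar package or from a conflicting one.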

Loading order of Jar packages and classes when Tomcat starts

Finally, sort out the loading order of Jar packages and classes when Tomcat starts, including the directories loaded by default by the different types of class loaders mentioned above:

  • The Java core API in the $JAVA_HOME/lib directory;
  • The Java extension jar packages in the $JAVA_HOME/lib/ext directory;
  • The classes and jar packages in the directories pointed to by java -classpath/-Djava.class.path;
  • The $CATALINA_HOME/common directory is loaded from top to bottom in the order of the folders;
  • The $CATALINA_HOME/server directory is loaded from top to bottom in the order of the folders;
  • The $CATALINA_BASE/shared directory is loaded from top to bottom in the order of the folders;
  • The class file under the project path /WEB-INF/classes;
  • The jar file under the project path /WEB-INF/lib;

Within each of the directories above, the jar packages in the same folder are loaded in order from top to bottom. Once a class file has been loaded into the JVM, a later class file with the same name will not be loaded.

Summary

Jar package conflicts are a very common problem in daily development. Understanding their causes and the underlying mechanism greatly improves our ability to solve such problems, as well as our influence within the team, which is why questions on this topic come up in many interviews.

In this article, we focused on the causes of and solutions to jar package conflicts when dependencies are added manually. Solving this problem properly often involves Maven's strategies for managing jar package conflicts, such as the transitive dependency principle, the shortest path first principle, and the first declaration principle. We will cover them in detail in the next article.

Author: Program New Horizons

Original link:
https://mp.weixin.qq.com/s/IfaxCCSt1WTboFAP3RsG9w
