JVM, you are too much

It goes without saying how important the JVM is to Java and how important it is to programmer interviews.

If you haven't yet realized why you need to learn the JVM, or don't know how to learn it, this article should give you the answer.

I used to look down on learning the JVM, but later found that I couldn't get by without it. It's like not wanting to apologize after quarreling with my wife: not apologizing is impossible. The apology is only a matter of time, and there is no escape.

Then I understood:

The later you confess, the worse the outcome.

But only once I started learning did I realize: JVM, you are too much, you are too hard to learn!

My learning process was very bumpy, but after getting through the bumps I found that there are many ways to learn the JVM.

Based on my experience and conversations with my peers, I think the best way to learn the JVM is:

Learn precisely, according to your level as a programmer.

Precise learning means learning the knowledge points that help most with your current work. Let your work drive your learning, wait until you have accumulated enough, then conquer the remaining JVM knowledge points in one go and finally master the underlying principles of the JVM.

Let me explain, step by step, how beginner, advanced and senior programmers should learn.

How beginner programmers should learn

Programmers who have just entered the industry generally fix simple bugs and develop simple features. How to write code that avoids bugs is the core issue at this stage.

To address it, there are two pieces of JVM knowledge you must grasp deeply.

1. Class initialization

You need a deep understanding of how classes are initialized. Otherwise, you will inadvertently introduce initialization bugs into the project.

For example, take a look at the following code:

public class ParentClass {
    private int parentX;
    public ParentClass() {
        setX(100);
    }
    public void setX(int x) {
        parentX = x;
    }
}

public class ChildClass extends ParentClass{
    private int childX = 1;
    public ChildClass() {}
    @Override
    public void setX(int x) {
        super.setX(x);
        childX = x;
        System.out.println("ChildX is assigned " + x);
    }
    public void printX() {
        System.out.println("ChildX = " + childX);
    }

}

public class TryInitMain {
    public static void main(String[] args) {
        ChildClass cc = new ChildClass();
        cc.printX();
    }
}

If you are interested, run it and look at the result. Once code like this reaches production, the resulting bug is very hard to track down.
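For readers who would rather not run it, here is a condensed sketch of the same pattern. The class names and the EVENTS list are hypothetical additions that record each step. The parent constructor calls the overridden setX, so the child field is set to 100 first. Only after the parent constructor returns does the child's field initializer run and overwrite it with 1.

```java
import java.util.ArrayList;
import java.util.List;

// Condensed, hypothetical variant of the example above: same pattern,
// but the steps are recorded in a list so the order is easy to inspect.
class InitOrderSketch {
    static final List<String> EVENTS = new ArrayList<>();

    static class Parent {
        Parent() {
            // Dynamic dispatch: this calls Child.setX when a Child is being built.
            setX(100);
        }

        void setX(int x) {
            EVENTS.add("Parent.setX(" + x + ")");
        }
    }

    static class Child extends Parent {
        // This initializer runs AFTER Parent() returns, overwriting the 100.
        int childX = 1;

        @Override
        void setX(int x) {
            super.setX(x);
            childX = x;
            EVENTS.add("Child.setX(" + x + ")");
        }
    }

    public static void main(String[] args) {
        Child c = new Child();
        System.out.println(EVENTS);   // [Parent.setX(100), Child.setX(100)]
        System.out.println(c.childX); // 1, not 100
    }
}
```

The usual way to avoid this trap is to keep overridable methods out of constructors.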

2. Java memory structure and object allocation

The second is the basics of the Java memory structure and object allocation, especially how the heap layout in JVM memory relates to object allocation.

For example, the layout of heap memory: a young generation (Eden plus two Survivor spaces) and an old generation.

Of course, the layout changed after Java 7: Java 8 replaced the permanent generation with Metaspace.

Knowing the layout, you also need to know the basic principles of Java object allocation:

  • Objects are allocated in the Eden space first
  • If an object is too large, it is allocated directly in the old generation
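One way to see the layout on your own machine is to ask the running JVM for its memory pools. This is a minimal sketch using the standard java.lang.management API; the class name is hypothetical. The pool names it prints depend on the JVM version and the garbage collector in use, so no output is shown here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class HeapLayoutSketch {
    /** Returns one "type: name" line per memory pool of the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() distinguishes heap from non-heap pools;
            // the pool names vary with the collector (Eden, Survivor, Old Gen, Metaspace, ...).
            lines.add(pool.getType() + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```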

Only with this knowledge will you avoid writing bugs like the following:

// Read all lines of the file into memory at once
List<String> lines = FileUtils.readLines(new File("temp/test.txt"), Charset.defaultCharset());
for (String line : lines) {
    // pass
}

Once the code above reads a large file, it is likely to crash the production environment.
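A safer pattern is to process the file one line at a time, so that only the current line is held on the heap. Below is a minimal sketch with the JDK's own BufferedReader. The class name, the UTF-8 charset and the "count non-empty lines" work are assumptions for illustration, not the original project's code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class StreamingReadSketch {
    /** Processes the file line by line; only the current line is kept in memory. */
    static long countNonEmptyLines(Path file) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    count++; // stand-in for the real per-line work
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("lines", ".txt");
        Files.write(tmp, Arrays.asList("a", "", "b"), StandardCharsets.UTF_8);
        System.out.println(countNonEmptyLines(tmp)); // 2
        Files.delete(tmp);
    }
}
```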

Therefore, a solid understanding of these two knowledge points helps newcomers a great deal in improving their code quality. Only when your code quality goes up can your career develop.

For these two knowledge points, I think it is best to learn from online articles. If you go straight to books, there are two major disadvantages:

  • You have not accumulated enough knowledge yet, so much of the book will be incomprehensible
  • Books contain many extra knowledge points that are intertwined with each other, so reading them costs too much energy for too little return

So I recommend searching for articles on each knowledge point rather than looking for books on principles.

How advanced programmers should learn

Programmers at this stage are already proficient at writing robust code. They often develop a large functional module on their own, and some develop a complete small project independently.

At this point, they may face two situations:

1. Writing utility classes for the whole team

In this case you will most likely need Java's syntactic sugar, because it lets you write very flexible and concise code. This includes generics, autoboxing and unboxing, varargs and for-each loops.

However, if you use these features without knowing how the JVM implements them, it is very easy to get into big trouble.

For example:

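import java.util.ArrayList;
import java.util.List;

public class GenericPitfall {
    public static void main(String[] args) {
        List list = new ArrayList();
        list.add("123");
        // Compiles with only an unchecked warning, because generics are erased
        List<Integer> list2 = list;
        // Throws ClassCastException at runtime: the element is a String
        System.out.println(list2.get(0).intValue());
    }
}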
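The snippet above compiles but fails at runtime, because generic type arguments are erased and the compiler inserts a cast where the element is read. The following minimal sketch (class and method names are hypothetical) makes both facts observable.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureSketch {
    /** Returns true if reading the mis-typed list fails with ClassCastException. */
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean failsOnRead() {
        List raw = new ArrayList();
        raw.add("123");
        List<Integer> ints = raw; // accepted by the compiler with an unchecked warning
        try {
            Integer first = ints.get(0); // the compiler-inserted cast to Integer fails here
            return first == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Both lists share one runtime class: the type argument is erased.
        System.out.println(new ArrayList<String>().getClass() == new ArrayList<Integer>().getClass()); // true
        System.out.println(failsOnRead()); // true
    }
}
```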

2. Writing high-performance code

When do you need high-performance code? The most common case is converting a slow synchronous implementation into an asynchronous one.

For this kind of requirement, the developer needs to be very familiar with multi-threaded development in Java and must deeply understand how the JVM implements multi-threading.

If not, take a look at the following code:

class IncompletedSynchronization {
    int x;

    // Not synchronized: a reader is not guaranteed to see the latest value written by setX
    public int getX() {
        return x;
    }

    public synchronized void setX(int x) {
        this.x = x;
    }
}

And at this one:

Object lock = new Object();
synchronized (lock) {
    // Reassigning the lock: later threads synchronize on a different object
    lock = new Object();
}

If code like the above reaches production, you are doomed to stay up all night troubleshooting...
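Both snippets fail for related reasons. In the first, only the writer takes the lock, so nothing guarantees that a reader sees the written value. In the second, the lock object is replaced while it is held, so later threads synchronize on a different object and no longer exclude each other. A minimal sketch of one possible fix follows, assuming a plain guarded int is all that is needed; the class name is hypothetical.

```java
class CompleteSynchronization {
    // A final lock can never be swapped out from under waiting threads.
    private final Object lock = new Object();
    private int x;

    public int getX() {
        synchronized (lock) { // reads take the same lock as writes
            return x;
        }
    }

    public void setX(int x) {
        synchronized (lock) {
            this.x = x;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        CompleteSynchronization c = new CompleteSynchronization();
        Thread writer = new Thread(() -> c.setX(42));
        writer.start();
        writer.join();
        System.out.println(c.getX()); // 42
    }
}
```

For a single int, declaring the field volatile or using AtomicInteger would also work; the lock is shown because it mirrors the original code.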

For these knowledge points I recommend reading online articles. Because concurrency is involved, I also recommend "Chapter 9. Threading and Synchronization Performance" in the second edition of "Java Performance".

If you still have energy to spare, go on to chapters 12-13 in the third edition of Zhou Zhiming's "In-depth Understanding of JAVA Virtual Machine". Zhou Zhiming's book goes very deep, which brings one disadvantage: the barrier to entry is high. If you cannot follow it at this point, set it aside for now.

Note that I am talking about the principles of concurrency here, not the practice of concurrency. If you want to learn concurrent programming, I consider "Java Concurrency in Practice" required reading, so I won't go into details.

How senior programmers should learn

By now you have begun to take on important responsibilities in project development, and the best of you have begun to lead teams. At this point, you are likely to do the following:

1. Planning the project's resource usage sensibly

Planning a project's resource usage sensibly requires a very deep understanding of garbage collection.

If you formed a general picture of memory allocation and the memory usage of Java objects as a novice, garbage collection is a further extension of that knowledge.

Only by understanding the principles of the various garbage collectors, combined with the basics of the Java memory layout, can you choose the right garbage collection algorithm for the project and get the best performance at a reasonable level of resource usage.

Examples are the appropriate ratio between the young generation and the old generation, and the ratio between the Eden and Survivor spaces within the young generation.
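In HotSpot these two ratios are controlled by the flags -XX:NewRatio (old:young) and -XX:SurvivorRatio (Eden to one Survivor space). The arithmetic behind them can be sketched as follows. The 3000 MB heap and the class name are hypothetical. A real JVM also aligns the sizes and may resize generations adaptively, so treat the numbers as an approximation.

```java
class GenerationSizingSketch {
    /** Young generation size when old:young = newRatio:1. */
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Size of ONE survivor space when eden:survivor = survivorRatio:1 (there are two survivors). */
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    public static void main(String[] args) {
        long heap = 3000;                     // hypothetical heap size in MB
        long young = youngMb(heap, 2);        // as with -XX:NewRatio=2
        long survivor = survivorMb(young, 8); // as with -XX:SurvivorRatio=8
        long eden = young - 2 * survivor;
        // prints: old=2000 young=1000 eden=800 survivor=100
        System.out.println("old=" + (heap - young) + " young=" + young
                + " eden=" + eden + " survivor=" + survivor);
    }
}
```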

2. Troubleshooting various production issues

Troubleshooting requires a solid command of the troubleshooting tools that the JVM provides.

These tools fall into two categories:

  • Basic command-line tools, such as jps and jstack
  • Visual tools, such as VisualVM

However, mastering the tools alone is not enough. For garbage collection problems, you need to analyze the GC log first and then use the tools to locate the source of the problem.

So it is best to be very proficient with both the troubleshooting tools and GC logs.

For example:

2021-05-26T14:45:37.987-0200: 151.126:
[GC (Allocation Failure) 151.126: [DefNew: 629119K->69888K(629120K), 0.0584157 secs] 1619346K->1273247K(2027264K), 0.0585007 secs]
[Times: user=0.06 sys=0.00, real=0.06 secs]

2021-05-26T14:45:59.690-0200: 172.829:
[GC (Allocation Failure) 172.829: [DefNew: 629120K->629120K(629120K), 0.0000372 secs]172.829: [Tenured: 1203359K->755802K(1398144K), 0.1855567 secs] 1832479K->755802K(2027264K), [Metaspace: 6741K->6741K(1056768K)], 0.1856954 secs]
[Times: user=0.18 sys=0.00, real=0.18 secs]

From the log above, you should see at a glance that the Serial collector is in use (DefNew and Tenured), and that the young generation is badly sized, so its size may need to be adjusted.
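One way to back that reading with numbers is to compare how much the young generation freed with how much the whole heap freed. The difference is roughly what was promoted into the old generation. The sketch below parses the first log entry with a regular expression. The class name is hypothetical, and the pattern only covers this log format.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GcLogSketch {
    // Matches "before->after(capacity)" triples such as "629119K->69888K(629120K)".
    private static final Pattern SIZES = Pattern.compile("(\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Estimates the KB promoted to the old generation by one young collection. */
    static long promotedKb(String line) {
        Matcher m = SIZES.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no young generation sizes in: " + line);
        }
        long youngFreed = Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
        if (!m.find()) {
            throw new IllegalArgumentException("no heap sizes in: " + line);
        }
        long heapFreed = Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
        // Memory that left the young generation without leaving the heap was promoted.
        return youngFreed - heapFreed;
    }

    public static void main(String[] args) {
        String first = "[GC (Allocation Failure) 151.126: [DefNew: 629119K->69888K(629120K), 0.0584157 secs] "
                + "1619346K->1273247K(2027264K), 0.0585007 secs]";
        System.out.println(promotedKb(first) + "K promoted"); // 213132K promoted
    }
}
```

More than 200 MB promoted by a single young collection suggests that short-lived objects are spilling into the old generation, which fits the full collection of Tenured in the second entry.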

For these knowledge points, I strongly advise against relying on online articles. Many of the details found online are wrong or incomplete, so I recommend reading books instead.

In the second edition of "Java Performance", "Chapter 5. An Introduction to Garbage Collection" and "Chapter 6. Garbage Collection Algorithms" are enough.

Some people turn to Chapter 3 in the third edition of "In-depth understanding of the JAVA virtual machine", which covers garbage collectors and memory allocation strategies. The same old problem applies: it is too detailed. I suggest skipping Section 3.4, which covers the details of HotSpot's algorithm implementation.

The safepoint concept covered there is very important, but it is hard to understand at this stage. I think that later, once you work on low-level frameworks and come across the checkpoint idea used in crash recovery, you can come back to it, and only then will you truly understand and master it.

How technical experts should learn

At this level, you need a very deep understanding of the entire JVM, because you are the last line of defense for technical problems. Sometimes you even need to develop your own tools for specific problems.

Once, a project kept reporting this error from time to time:

java.lang.OutOfMemoryError: GC overhead limit exceeded

Several colleagues could not solve it, so they came to me. I took a look and remembered reading about it in the official "HotSpot Virtual Machine Garbage Collection Tuning Guide".

The JVM runs the GC when it runs short of memory, but if each GC reclaims too little memory, the next GC starts soon after.

The JVM has a default protection mechanism. If, within a statistical period, more than 98% of the time is spent in GC and less than 2% of the heap is recovered, this error is reported.

What caused it? Troubleshooting this from the code alone is really difficult. First, there is no stack trace to help locate the problem. Second, the code base had grown large and the relevant code was written a long time ago.

In a case like this, you have to reason backwards from a deep understanding of the JVM as a whole. My reasoning went like this:
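To get a rough feel for the first of those two numbers on a running application, you can read the cumulative GC time from the standard management beans. This is only a sketch with a hypothetical class name. It computes a lifetime average, whereas the JVM's own check uses its internal statistics over a recent period, so the two values are not the same thing.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcTimeShareSketch {
    /** Fraction of JVM uptime spent in GC so far, clamped to the range [0, 1]. */
    static double gcTimeShare() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptime = Math.max(1, ManagementFactory.getRuntimeMXBean().getUptime());
        return Math.min(1.0, (double) gcMillis / uptime);
    }

    public static void main(String[] args) {
        System.out.printf("GC time share so far: %.2f%%%n", gcTimeShare() * 100);
    }
}
```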

A memory overflow combined with a GC that cannot reclaim memory points to two things:

  1. There is not enough memory in the heap
  2. The objects occupying memory are either resources that should have been closed but were not, or objects temporarily held in large numbers

So if I dump the heap to a file and analyze it, I will know which objects are occupying the memory.

The analysis showed that a large number of strings were occupying the memory.

Following my earlier reasoning, a string is not a database connection, so it certainly cannot be a resource that should have been closed but was not. That leaves only one possibility: a large number of strings were being held temporarily, so the GC could not reclaim them.

Then a new question arises: what could be holding a large number of strings?

My first guess was a cache. Following this clue, I searched the source code for the keyword Cache and read all the code related to it. The problem was found immediately.
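The article does not say how the dump was taken; tools such as jmap are the common choice. As an alternative sketch, assuming a HotSpot-based JVM, a dump can also be triggered from code through HotSpotDiagnosticMXBean. The class name and the temporary path are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class HeapDumpSketch {
    /** Writes a heap dump of the current JVM to the given .hprof file, which must not exist yet. */
    static void dump(Path target) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // live = true dumps only reachable objects, which is usually what a leak hunt needs.
        bean.dumpHeap(target.toString(), true);
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempDirectory("dump").resolve("heap.hprof");
        dump(target);
        System.out.println(target + " (" + Files.size(target) + " bytes)");
    }
}
```

The resulting file can then be opened in an analyzer such as VisualVM to see which object types dominate the heap.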

It turns out that we have a function that parses a very large file. The format of the file is as follows:

Each line of this file has to be stored in the database, column by column.

The person who wrote the code was lazy and wanted to push everything into the database in one go after a single parse. So he built a Map for each line, whose keys are the column names (the same for every line) and whose values are the parsed contents of that line.

The result of writing the code this way is that each line corresponds to a HashMap with three entries. If the file has hundreds of thousands of lines, there are hundreds of thousands of HashMaps. These HashMaps are stored in a list, and the list is then put into a HashMap called xxxCache.

The schematic code is as follows:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParseFile4OOM {
    public static void main(String[] args) {
        List<Map<String, String>> lst = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            // One HashMap per parsed line
            Map<String, String> map = new HashMap<>();
            map.put("Column1", "Content1");
            map.put("Column2", "Content2");
            map.put("Column3", "Content3");
            lst.add(map);
        }

        // The cache keeps every map reachable, so the GC cannot reclaim them
        Map<String, List<Map<String, String>>> contentCache = new HashMap<>();
        contentCache.put("contents", lst);
    }
}

So what could be done? The structure of the code could not be changed; it could only be optimized.

At that time we were already using JDK 8, in which the String constant pool lives on the heap (it was moved out of the permanent generation in JDK 7), so duplicate strings can be shared through intern(). Also, in this business scenario the number of entries in each HashMap is fixed, so there is no need to allocate extra space; the initial capacity is set to 3.

After optimization, the code looks like this:

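import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ParseFile4OOM {
    public static void main(String[] args) {
        List<Map<String, String>> lst = new ArrayList<>();
        for (int i = 0; i < 100000; i++) {
            // Fixed number of entries, so request a small initial capacity
            Map<String, String> map = new HashMap<>(3);
            // Schematic: in the real code these strings come from parsing the file.
            // Literals like the ones here are already interned.
            map.put("Column1".intern(), "Content1".intern());
            map.put("Column2".intern(), "Content2".intern());
            map.put("Column3".intern(), "Content3".intern());
            lst.add(map);
        }

        Map<String, List<Map<String, String>>> contentCache = new HashMap<>();
        contentCache.put("contents".intern(), lst);
    }
}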

We deployed the optimized code, and the error was fixed!
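Why intern() helps is easier to see when the strings do not come from literals. In the sketch below, new String(...) stands in for a value parsed from the file; this stand-in and the class name are assumptions for illustration. Each parse produces a separate object, while the interned versions all point to one shared instance.

```java
class InternSketch {
    /** True if both references point to the very same String object. */
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        // new String(...) stands in for a value parsed from a file: a fresh object each time.
        String a = new String("Column1");
        String b = new String("Column1");
        System.out.println(sameObject(a, b));                   // false: two separate objects
        System.out.println(sameObject(a.intern(), b.intern())); // true: one shared pooled instance
    }
}
```

With hundreds of thousands of lines repeating the same three column names, sharing one instance per distinct string removes most of the duplicate copies.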

Therefore, at this stage you have to understand the JVM thoroughly. To understand the principles, you must rely on books.

Zhou Zhiming's "In-depth understanding of JAVA virtual machine" is a must, but it is not enough.

I also recommend "Oracle JRockit: The Definitive Guide". Although it is old, much of its content, especially the first four chapters, really explains the principles of the JVM. Its discussion of how to flexibly scale the JVM to balance resources against performance broadened my programming horizons a great deal.

That covers the learning methods for the different stages.

In general, JVM knowledge is broad and complex, and you cannot master it overnight. Besides, life is not easy for us programmers: there is too much to learn and our energy is limited.

Therefore, when some JVM knowledge is both hard to understand and of no use to you right now, set it aside for a while and learn precisely. Spending the saved energy on other knowledge, or even on your own life, is more worthwhile.

If you find it useful after reading it, I hope you can give it a thumbs up.


Hello, I am Siyuanwai, the technical director of a listed company, and I manage a technical team of more than 100 people.

I went from a non-computer graduate to a programmer, working hard and growing all the way.

I turn my own growth story into articles, and boring technical articles into stories.

You are welcome to follow my official account. After following, reply [666] to receive some technical collections I have put together.


Origin http://10.200.1.11:23101/article/api/json?id=324081625&siteId=291194637