A colleague wrote a shocking bug that is very hard to spot



Author: Shudong Jun

Link: https://juejin.cn/post/7064376361334358046

Incident description

Starting at 6:32, a small number of users hit errors when loading the app homepage. By 7:20 the homepage service was unavailable on a large scale, and the problem was resolved at 7:36.

Overall process

At 6:58 an alert fired, and at the same time users in the feedback group reported a "network busy" message on the homepage. Because the store list service had been released a few nights earlier, we decided to roll back that release as an emergency fix.

At 7:07, we started contacting XXX one after another to investigate and fix the problem.

At 7:36, the code was rolled back and the service returned to normal.

Incident Root Cause - Incident Code Simulation

public static void test() throws InterruptedException, ExecutionException {
    Executor executor = Executors.newFixedThreadPool(3);
    CompletionService<String> service = new ExecutorCompletionService<>(executor);
    service.submit(new Callable<String>() {
        @Override
        public String call() throws Exception {
            return "HelloWorld--" + Thread.currentThread().getName();
        }
    });
    // BUG: neither take() nor poll() is called, so the completed Future
    // is never removed from the service's completion queue.
}

The root cause is that the code submitted tasks to the ExecutorCompletionService but never called its take or poll methods.

The correct way to write it is as follows:

public static void test() throws InterruptedException, ExecutionException {
    Executor executor = Executors.newFixedThreadPool(3);
    CompletionService<String> service = new ExecutorCompletionService<>(executor);
    service.submit(new Callable<String>() {
        @Override
        public String call() throws Exception {
            return "HelloWorld--" + Thread.currentThread().getName();
        }
    });
    // Fix: take() removes the completed Future from the completion queue,
    // and get() returns its result.
    service.take().get();
}

This disaster, caused by a single missing line of code, is hard to spot: an OOM (OutOfMemoryError) builds up through slow memory growth, and a little carelessness is enough to overlook it until it finally blows up.

Rolling back or restarting the servers is indeed the fastest remedy for the operations team. But if the OOM-causing code is not analyzed promptly afterwards, and the version you rolled back to unfortunately contains the same code, things get much worse. When traffic is low, a rollback or restart is enough to free the memory. Under heavy traffic, unless you roll back to a version without the bug, it is game over.

Get to the root of the problem

To understand how ExecutorCompletionService behaves, we compare it with ExecutorService. The comparison also shows which scenarios call for ExecutorCompletionService.

First, look at the ExecutorService code (it is worth copying it and running it yourself):

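// Requires: import java.util.*; import java.util.concurrent.*;
public static void test1() throws Exception {
    ExecutorService executorService = Executors.newCachedThreadPool();
    ArrayList<Future<String>> futureArrayList = new ArrayList<>();
    System.out.println("The company asks you to invite everyone to a team dinner; you drive around to pick people up");
    // Slowest task, submitted first ("1 hour" is simulated with 10 seconds).
    Future<String> future10 = executorService.submit(() -> {
        System.out.println("President: I'm at home on the toilet. My stomach is upset, so I need 1 hour. Pick me up later.");
        TimeUnit.SECONDS.sleep(10);
        System.out.println("President: the hour is up, I'm done. Come and get me.");
        return "The president is done";
    });
    futureArrayList.add(future10);
    // Fastest task ("3 minutes" is simulated with 3 seconds).
    Future<String> future3 = executorService.submit(() -> {
        System.out.println("R&D: I'm at home on the toilet. I'm quick, 3 minutes will do. Pick me up later.");
        TimeUnit.SECONDS.sleep(3);
        System.out.println("R&D: 3 minutes are up, I'm done. Come and get me.");
        return "R&D is done";
    });
    futureArrayList.add(future3);
    // Medium task ("10 minutes" is simulated with 6 seconds).
    Future<String> future6 = executorService.submit(() -> {
        System.out.println("Middle management: I'm at home on the toilet. I need 10 minutes. Pick me up later.");
        TimeUnit.SECONDS.sleep(6);
        System.out.println("Middle management: 10 minutes are up, I'm done. Come and get me.");
        return "Middle management is done";
    });
    futureArrayList.add(future6);
    TimeUnit.SECONDS.sleep(1);
    System.out.println("Everyone has been notified; now wait to pick them up.");
    try {
        // get() blocks in submission order, so the 10-second task holds up the others.
        for (Future<String> future : futureArrayList) {
            String returnStr = future.get();
            System.out.println(returnStr + ", go pick them up");
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        // The original called Thread.currentThread().join(), which never returns.
        executorService.shutdown();
    }
}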

There are three tasks, taking 10s, 3s, and 6s respectively. All three Callable tasks are submitted through the submit method of a JDK thread pool.

  • Step 1: The main thread submits the three tasks to the thread pool, stores each returned Future in the List, and then prints the "everyone has been notified" message.

  • Step 2: The loop calls future.get() on each Future in turn, blocking until that task has finished.

The result: the president is notified first, so you wait an hour and pick up the president first, and only then pick up R&D and middle management. Although they finished long ago, they still have to wait for the president~~

The longest task (10s) is added to the list first, so when the loop calls get() on it, the call blocks until the 10s task has finished. Even though the 3s and 6s tasks completed long before, their results cannot be read until the 10s task is done.

Readers who work on gateway services will probably recognize this. A gateway typically makes RPC calls to N or more downstream interfaces for a single request.

If every call is handled the ExecutorService way, and the first few tasks happen to call slow interfaces, everything behind them blocks and waits. ExecutorCompletionService was created for exactly this situation. It hands back tasks in the order they finish, so it lives up to the title of "task planner".

The same scenario with ExecutorCompletionService

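// Requires: import java.util.concurrent.*;
public static void test2() throws Exception {
    ExecutorService executorService = Executors.newCachedThreadPool();
    ExecutorCompletionService<String> completionService = new ExecutorCompletionService<>(executorService);
    System.out.println("The company asks you to invite everyone to a team dinner; you drive around to pick people up");
    // Slowest task, submitted first ("1 hour" is simulated with 10 seconds).
    completionService.submit(() -> {
        System.out.println("President: I'm at home on the toilet. My stomach is upset, so I need 1 hour. Pick me up later.");
        TimeUnit.SECONDS.sleep(10);
        System.out.println("President: the hour is up, I'm done. Come and get me.");
        return "The president is done";
    });
    // Fastest task ("3 minutes" is simulated with 3 seconds).
    completionService.submit(() -> {
        System.out.println("R&D: I'm at home on the toilet. I'm quick, 3 minutes will do. Pick me up later.");
        TimeUnit.SECONDS.sleep(3);
        System.out.println("R&D: 3 minutes are up, I'm done. Come and get me.");
        return "R&D is done";
    });
    // Medium task ("10 minutes" is simulated with 6 seconds).
    completionService.submit(() -> {
        System.out.println("Middle management: I'm at home on the toilet. I need 10 minutes. Pick me up later.");
        TimeUnit.SECONDS.sleep(6);
        System.out.println("Middle management: 10 minutes are up, I'm done. Come and get me.");
        return "Middle management is done";
    });
    TimeUnit.SECONDS.sleep(1);
    System.out.println("Everyone has been notified; now wait to pick them up.");
    // Three asynchronous tasks were submitted, so take() exactly three results.
    for (int i = 0; i < 3; i++) {
        // take() blocks until the next task finishes, whichever one that is.
        String returnStr = completionService.take().get();
        System.out.println(returnStr + ", go pick them up");
    }
    // The original called Thread.currentThread().join(), which never returns.
    executorService.shutdown();
}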

Running it gives the following result: R&D (3s) is picked up first, then middle management (6s), and finally the president (10s).

This time it is more efficient. Although the president is still notified first, people are picked up in the order they finish, so nobody has to wait for the president, who takes the longest (in real life, picking up the president first is still recommended; the consequences of not waiting for the president... emmm, hahaha).

Comparing the two outputs side by side: with ExecutorService the pickup order is president, R&D, middle management (submission order); with ExecutorCompletionService it is R&D, middle management, president (completion order).

The two pieces of code differ very little. The key difference is how the result is fetched with ExecutorCompletionService:

completionService.take().get();

Why call take() first and then get()? Let's look at the source code.

CompletionService interface and the implementation class of the interface

1. ExecutorCompletionService is the implementation class of the CompletionService interface.

2. Next, look at the constructors of ExecutorCompletionService. A thread pool object must be passed in. By default the completion queue is a LinkedBlockingQueue, but a second constructor lets you specify the queue yourself.

The constructor that uses the default LinkedBlockingQueue: ExecutorCompletionService(Executor executor)

The constructor that takes a caller-specified queue: ExecutorCompletionService(Executor executor, BlockingQueue<Future<V>> completionQueue)

3. Both submit methods return a Future: submit(Callable<V> task) and submit(Runnable task, V result). Our example uses the first one, which takes a Callable.

4. Comparing the submit methods of ExecutorService and ExecutorCompletionService shows the difference:

(1) ExecutorService: submit wraps the task in a FutureTask and passes it to execute.

(2) ExecutorCompletionService: submit also creates the FutureTask, but wraps it in a QueueingFuture before passing it to the executor.

5. The difference lies in QueueingFuture. What does it do? Let's keep reading.

  • QueueingFuture inherits from FutureTask and overrides the done() method.

  • done() puts the task into the completionQueue. In other words, a task is added to the queue at the moment it finishes executing.

  • Therefore every task in the completionQueue has already completed, and each one is a Future whose result is ready.

  • Calling take() on the completionQueue blocks until a task is available. Whatever it returns is a completed Future, so calling .get() on it returns the result immediately.
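As a rough illustration of steps 2 and 3 above, the following sketch exercises both constructors and both submit overloads. The class name ConstructorsAndSubmit, the pool size, and the strings are made up for this example; only the java.util.concurrent calls come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class, not part of the incident code.
class ConstructorsAndSubmit {

    static List<String> run() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<String> results = new ArrayList<>();
        try {
            // Step 2, first constructor: the service creates its own LinkedBlockingQueue.
            CompletionService<String> defaultQueue = new ExecutorCompletionService<>(pool);
            // Step 3, first overload: a Callable whose return value becomes the result.
            defaultQueue.submit(() -> "callable result");
            results.add(defaultQueue.take().get());

            // Step 2, second constructor: the caller supplies the completion queue.
            BlockingQueue<Future<String>> completed = new LinkedBlockingQueue<>();
            CompletionService<String> ownQueue = new ExecutorCompletionService<>(pool, completed);
            // Step 3, second overload: a Runnable plus the fixed result to report when it finishes.
            ownQueue.submit(() -> System.out.println("runnable ran"), "runnable result");
            // Finished futures land in the queue we passed in, so it can be read directly.
            results.add(completed.take().get());
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // [callable result, runnable result]
    }
}
```

Holding a reference to the queue, as in the second half, is mainly useful for inspection or testing; in normal use take() and poll() on the service are enough.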

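The done() mechanism described in step 5 can be sketched with a small FutureTask subclass. This is a simplified stand-in written for illustration, not the JDK's QueueingFuture source; the names EnqueueOnDone and EnqueueOnDoneDemo are hypothetical, and a latch is used instead of sleeps so that the completion order is deterministic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified stand-in for the idea behind QueueingFuture (not the JDK source).
class EnqueueOnDone<V> extends FutureTask<V> {
    private final BlockingQueue<Future<V>> completionQueue;

    EnqueueOnDone(Callable<V> task, BlockingQueue<Future<V>> completionQueue) {
        super(task);
        this.completionQueue = completionQueue;
    }

    // FutureTask invokes done() once the task has finished,
    // whether normally, with an exception, or through cancellation.
    @Override
    protected void done() {
        completionQueue.add(this);
    }
}

class EnqueueOnDoneDemo {

    static List<String> completionOrder() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        BlockingQueue<Future<String>> completionQueue = new LinkedBlockingQueue<>();
        CountDownLatch releaseSlow = new CountDownLatch(1);
        List<String> order = new ArrayList<>();
        try {
            // Submitted first, but it cannot finish until the latch is released.
            pool.execute(new EnqueueOnDone<String>(() -> {
                releaseSlow.await();
                return "slow";
            }, completionQueue));
            pool.execute(new EnqueueOnDone<String>(() -> "fast", completionQueue));

            // Only "fast" can have finished here, so it is the first future in the queue.
            order.add(completionQueue.take().get());
            releaseSlow.countDown();
            order.add(completionQueue.take().get());
        } finally {
            pool.shutdown();
        }
        return order;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(completionOrder()); // [fast, slow]
    }
}
```

Because a future is only enqueued from done(), every get() on a future taken from the queue returns without waiting.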

At this point, the picture should be mostly clear.

  • With ExecutorService.submit, we have to keep track of the Future returned for each task ourselves. CompletionService tracks these Futures for us: by overriding the done method, it guarantees that anything you receive from the completionQueue is a completed task.

  • In a gateway RPC layer, one slow interface should not drag down the whole request. CompletionService fits business scenarios that should act on whichever response arrives first.
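The "fastest response wins" scenario from the last bullet could look roughly like the sketch below. It assumes the downstream calls can be modeled as Callable<String>; the class FastestResponse, the method firstOf, and the sleep durations are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only.
class FastestResponse {

    // Submits all calls and returns the result of whichever finishes first.
    static String firstOf(List<Callable<String>> calls, ExecutorService pool) throws Exception {
        CompletionService<String> service = new ExecutorCompletionService<>(pool);
        List<Future<String>> futures = new ArrayList<>();
        for (Callable<String> call : calls) {
            futures.add(service.submit(call));
        }
        try {
            // take() blocks until the first task completes; get() then returns at once.
            // If that first task failed, get() throws ExecutionException.
            return service.take().get();
        } finally {
            // The remaining calls are no longer needed; cancelling a finished future is a no-op.
            for (Future<String> future : futures) {
                future.cancel(true);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        List<Callable<String>> calls = new ArrayList<>();
        calls.add(() -> { TimeUnit.SECONDS.sleep(2); return "slow downstream"; });
        calls.add(() -> { TimeUnit.MILLISECONDS.sleep(100); return "fast downstream"; });
        try {
            System.out.println(firstOf(calls, pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

A real gateway would probably also want a timeout, for which poll(long timeout, TimeUnit unit) can replace take().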

But attention, attention, attention: the following point is the core of this incident.

A task's execution result is removed from the blocking queue, so that its heap memory can be released, only when one of the three retrieval methods of ExecutorCompletionService is called: take(), poll(), or poll(long timeout, TimeUnit unit). Because the business code did not need the tasks' return values, it never called take or poll. As a result, the completed results stayed in the queue, and heap usage kept growing with the number of calls.

Therefore, if you don’t need to use the task return value in the business scenario, don’t use CompletionService. If you use it, remember to remove the task execution result from the blocking queue to avoid OOM!
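The growth can be observed with a small experiment. The sketch below assumes a single long-lived CompletionService (as a shared field would be) and passes in its own queue purely so that the queue size can be read; the class name, the task count, and the 1 KB payload are arbitrary choices for this example.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical experiment, not the incident code.
class CompletionQueueGrowth {

    // Runs `tasks` callables through one CompletionService and reports how many
    // finished futures are still sitting in its completion queue afterwards.
    static int leftInQueue(int tasks, boolean takeResults) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        // Supplying the queue ourselves lets us observe its size.
        BlockingQueue<Future<byte[]>> completed = new LinkedBlockingQueue<>();
        CompletionService<byte[]> service = new ExecutorCompletionService<>(pool, completed);

        for (int i = 0; i < tasks; i++) {
            service.submit(() -> new byte[1024]); // each queued result keeps 1 KB reachable
        }
        if (takeResults) {
            for (int i = 0; i < tasks; i++) {
                service.take().get(); // removes one completed future from the queue
            }
        }
        pool.shutdown();
        if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
            throw new IllegalStateException("tasks did not finish in time");
        }
        return completed.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without take(): " + leftInQueue(1000, false)); // 1000
        System.out.println("with take():    " + leftInQueue(1000, true));  // 0
    }
}
```

When the return value really is not needed, handing a Runnable to a plain ExecutorService through execute() avoids the completion queue altogether.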

Summary

Now that we know the cause of the incident, let's summarize the lessons. After all, as Confucius said: examine yourself, reflect on your mistakes, and cultivate yourself!

Before going live:

  • Keep a strict code review habit: always hand the code to a second person to read. After all, you cannot see the problems in code you wrote yourself; every programmer has that kind of confidence (this may come up again in later incident write-ups. Very important!)

  • Release record: note the last package version you can roll back to (leave yourself a way out).

  • Before going live, confirm whether the business can be degraded after a rollback. If it cannot, strictly extend the monitoring period for this release.

After go-live:

  • Keep watching memory growth (this is very easy to overlook; people pay less attention to memory than to CPU usage).

  • Keep watching the growth of CPU usage.

  • Watch GC behavior: whether the number of threads is growing, whether full GCs are frequent, and so on.

  • Watch service performance alarms: whether TP99, TP999, and max latency have increased significantly.
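For the memory, GC, and thread checks in the list above, the JDK's java.lang.management beans give a quick in-process view. The sketch below is one possible way to log such numbers, assuming no external monitoring is at hand; the class name RuntimeSnapshot, the output format, and the one-second interval are made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration only.
class RuntimeSnapshot {

    // Builds a one-line summary of heap usage, GC activity, and live thread count.
    static String snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long gcCount = 0;
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            gcCount += Math.max(0, gc.getCollectionCount());
            gcMillis += Math.max(0, gc.getCollectionTime());
        }
        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        return String.format("heapUsedMB=%d heapCommittedMB=%d gcCount=%d gcTimeMs=%d threads=%d",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, gcCount, gcMillis, threads);
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few snapshots; a steadily rising heapUsedMB after each GC is the warning sign.
        for (int i = 0; i < 3; i++) {
            System.out.println(snapshot());
            Thread.sleep(1000);
        }
    }
}
```

In production these values would normally be exported to the existing monitoring system rather than printed.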



Origin blog.csdn.net/youanyyou/article/details/132094844