Another head-scratcher: can threads cause OOM too?

This article was first published on Juejin (掘金): Can threads cause OOM too?
Author: Ultimate catch shrimp households

OOM is a fairly common error, but I wonder whether you have seen this particular one:

java.lang.OutOfMemoryError: pthread_create (1040KB stack) failed: Try again
	at java.lang.Thread.nativeCreate(Thread.java)
	at java.lang.Thread.start(Thread.java:1076)
	at java.util.concurrent.ThreadPoolExecutor.addWorker(ThreadPoolExecutor.java:920)
	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1338)
...

Because of the unusual customizations made by Chinese phone manufacturers, Huawei in particular, some devices place very strict limits on thread creation. Once the total number of threads in a process reaches a certain level, creating another thread fails with an OOM.

Others have already analyzed this issue in detail. I don't like copying other people's articles wholesale, but, as the saying goes, when a scholar borrows a book, how can that be called stealing? The analysis I am borrowing from is titled "Incredible OOM".

The OOM occurs on Huawei phones running Android 7.0 and above (EmotionUI_5.0 and above). The thread limit on these phones is very small (presumably a limit specially modified in the Huawei ROM): each process may have at most 500 threads alive at the same time, so the crash is easy to reproduce.

    for (i in 0 until 3000) {
        Thread {
            while (true) {
                Thread.sleep(1000)
            }
        }.start()
    }

This is the experiment from that article: on the Huawei phone tested there, the process crashes once more than 500 threads have been created. I wrote a demo myself and found that not all Huawei phones behave this way. On my NOVA7, the crash came at roughly 3000 threads.

Will a production app really have more than 500 threads?

How to check the current number of threads?

The Android Profiler tool is very powerful, and it contains the number of threads started by the current process and its CPU scheduling situation.

As the figure shows, the number after THREADS is the current thread count. Even an Android project with very little code runs about 30 threads. Once OkHttp, Glide, other third-party frameworks, sockets, and startup tasks are added, the thread count grows explosively.
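Besides the Profiler, the thread count can also be read from code, which is handy for logging it in test builds. The following is a minimal sketch using only standard `java.lang.Thread` calls; the class name `ThreadCountProbe` and the idea of grouping by name prefix are my own illustration, not something from the project described here.

```java
import java.util.Set;

final class ThreadCountProbe {
    /** Number of live threads in the current process, as visible from Java. */
    static int liveThreadCount() {
        return Thread.getAllStackTraces().size();
    }

    /** Counts live threads whose name starts with the given prefix (e.g. a pool's naming prefix). */
    static long countByPrefix(String prefix) {
        Set<Thread> threads = Thread.getAllStackTraces().keySet();
        return threads.stream()
                .filter(t -> t.getName().startsWith(prefix))
                .count();
    }
}
```

Grouping by name prefix makes it easier to see which library owns most of the threads, provided its thread factory names them consistently.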

Analyzing the cause of the production problem

I observed thread usage in our project and found that about 300 threads exist once the app finishes a simple initialization, which is quite something. Production usage is far more complicated, and the stack in the error log is not the real cause of the OOM; it is only the last straw that broke the camel's back.

I actually ran into this problem at my previous company. We traced through the source code and found that it was caused by RxJava's Schedulers.io().

  static final class CachedWorkerPool implements Runnable {
        private final long keepAliveTime;
        private final ConcurrentLinkedQueue<ThreadWorker> expiringWorkerQueue;
        final CompositeDisposable allWorkers;
        private final ScheduledExecutorService evictorService;
        private final Future<?> evictorTask;
        private final ThreadFactory threadFactory;

        CachedWorkerPool(long keepAliveTime, TimeUnit unit, ThreadFactory threadFactory) {
            this.keepAliveTime = unit != null ? unit.toNanos(keepAliveTime) : 0L;
            this.expiringWorkerQueue = new ConcurrentLinkedQueue<ThreadWorker>();
            this.allWorkers = new CompositeDisposable();
            this.threadFactory = threadFactory;

            ScheduledExecutorService evictor = null;
            Future<?> task = null;
            if (unit != null) {
                evictor = Executors.newScheduledThreadPool(1, EVICTOR_THREAD_FACTORY);
                task = evictor.scheduleWithFixedDelay(this, this.keepAliveTime, this.keepAliveTime, TimeUnit.NANOSECONDS);
            }
            evictorService = evictor;
            evictorTask = task;
        }

        @Override
        public void run() {
            evictExpiredWorkers();
        }

        ThreadWorker get() {
            if (allWorkers.isDisposed()) {
                return SHUTDOWN_THREAD_WORKER;
            }
            while (!expiringWorkerQueue.isEmpty()) {
                ThreadWorker threadWorker = expiringWorkerQueue.poll();
                if (threadWorker != null) {
                    return threadWorker;
                }
            }

            // No cached worker found, so create a new one.
            ThreadWorker w = new ThreadWorker(threadFactory);
            allWorkers.add(w);
            return w;
        }

        void release(ThreadWorker threadWorker) {
            // Refresh expire time before putting worker back in pool
            threadWorker.setExpirationTime(now() + keepAliveTime);

            expiringWorkerQueue.offer(threadWorker);
        }

        void evictExpiredWorkers() {
            if (!expiringWorkerQueue.isEmpty()) {
                long currentTimestamp = now();

                for (ThreadWorker threadWorker : expiringWorkerQueue) {
                    if (threadWorker.getExpirationTime() <= currentTimestamp) {
                        if (expiringWorkerQueue.remove(threadWorker)) {
                            allWorkers.remove(threadWorker);
                        }
                    } else {
                        // Queue is ordered with the worker that will expire first in the beginning, so when we
                        // find a non-expired worker we can stop evicting.
                        break;
                    }
                }
            }
        }

        long now() {
            return System.nanoTime();
        }

        void shutdown() {
            allWorkers.dispose();
            if (evictorTask != null) {
                evictorTask.cancel(true);
            }
            if (evictorService != null) {
                evictorService.shutdownNow();
            }
        }
    }

    public ScheduledThreadPoolExecutor(int corePoolSize,
                                       ThreadFactory threadFactory) {
        super(corePoolSize, Integer.MAX_VALUE,
              DEFAULT_KEEPALIVE_MILLIS, MILLISECONDS,
              new DelayedWorkQueue(), threadFactory);
    }

From the code above, the IO scheduler behaves like a thread pool with a core size of 1, a maximum of Integer.MAX_VALUE threads, and a thread keep-alive time of 60s: get() creates a new ThreadWorker whenever no cached worker is idle, and nothing caps how many it creates. Many articles cover this, and replacing the pool is a routine change. After we replaced this thread pool, thread OOMs in the project did decrease.
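The difference between an unbounded and a bounded pool can be seen with plain `java.util.concurrent`, without RxJava. This is only a sketch under my own assumptions: `PoolGrowthDemo` and its methods are hypothetical names, and `Executors.newCachedThreadPool()` stands in for the "create a worker whenever none is idle" behaviour rather than reproducing RxJava's implementation.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {
    /** Submits `tasks` blocking jobs and returns the largest number of threads the pool created. */
    static int peakThreads(ThreadPoolExecutor pool, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    release.await(); // keep every task busy so no worker becomes idle
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        int peak = pool.getLargestPoolSize();
        release.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak;
    }

    /** No upper bound: one new thread per task while all existing threads are busy. */
    static ThreadPoolExecutor unbounded() {
        return (ThreadPoolExecutor) Executors.newCachedThreadPool();
    }

    /** Bounded: extra threads beyond `core` appear only after the queue is full, up to `max`. */
    static ThreadPoolExecutor bounded(int core, int max, int queueCapacity) {
        return new ThreadPoolExecutor(core, max, 1, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(queueCapacity));
    }
}
```

With 20 blocking tasks the cached pool ends up with 20 threads, while `bounded(2, 4, 100)` stays at its 2 core threads because the remaining tasks wait in the queue.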

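    RxJavaPlugins.setInitIoSchedulerHandler {
        val processors = Runtime.getRuntime().availableProcessors()
        // Bounded pool: threads beyond the core size are created only once the queue is full.
        val executor = ThreadPoolExecutor(
            processors * 2,                                  // core pool size
            processors * 10,                                 // maximum pool size
            1, TimeUnit.SECONDS,                             // keep-alive for idle non-core threads
            LinkedBlockingQueue<Runnable>(processors * 10),  // bounded work queue
            ThreadPoolExecutor.DiscardPolicy()               // note: silently drops tasks when saturated
        )
        Schedulers.from(executor)
    }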

Tip: RxJavaPlugins.setInitIoSchedulerHandler must be called before RxJava is used for the first time; otherwise the handler has no effect.

Kotlin coroutines also implement their IO dispatcher with a thread pool. As mentioned in a previous article, the coroutine's internal scheduler works much like RxJava's: it is a thread pool. I took a careful look at the DefaultScheduler.IO implementation.

open class ExperimentalCoroutineDispatcher(
    private val corePoolSize: Int,
    private val maxPoolSize: Int,
    private val idleWorkerKeepAliveNs: Long,
    private val schedulerName: String = "CoroutineScheduler"
) : ExecutorCoroutineDispatcher() {
    constructor(
        corePoolSize: Int = CORE_POOL_SIZE,
        maxPoolSize: Int = MAX_POOL_SIZE,
        schedulerName: String = DEFAULT_SCHEDULER_NAME
    ) : this(corePoolSize, maxPoolSize, IDLE_WORKER_KEEP_ALIVE_NS, schedulerName)

@JvmField
internal val IDLE_WORKER_KEEP_ALIVE_NS = TimeUnit.SECONDS.toNanos(
    systemProp("kotlinx.coroutines.scheduler.keep.alive.sec", 60L)
)

The thread keep-alive time is 60s, and the maximum number of threads is derived from system configuration. I checked Stack Overflow and found that the default is 64 (or the number of CPU cores, if that is larger). So the coroutine IO dispatcher is well behaved and will not cause the thread OOM problem. The value can also be changed or restricted by the developer.

Now for some real technique

If what you have seen so far were all I had, I would not have written this article to show off.

The changes above only fix thread pools that we can modify in our own project. Is there a way to directly change how third-party libraries construct their thread pools? For example, third-party chat SDKs, some libraries from Alibaba, and so on.

Suppose we set up one big shared pool for the project with a capped total thread count, and then, apart from whitelisted libraries such as OkHttp and Glide, replace every thread pool construction in the project with it.

It is a little exciting just to think about. Let's first work out what to do, and then decide how to do it.

  1. Define a whitelist of classes that should not be replaced.
  2. Traverse all classes and find the calls that construct a thread pool.
  3. Replace each of those calls with our shared thread pool.
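The first two steps boil down to a decision function: given the class being scanned and a method call found in it, should the call be rewritten? Here is a rough sketch of that decision. The class `PoolCallMatcher`, the whitelist prefixes, and the choice of target calls are all hypothetical examples; only the JVM internal names and descriptors of the `Executors` factory methods are real.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PoolCallMatcher {
    // Hypothetical whitelist: classes under these internal-name prefixes keep their own pools.
    private static final List<String> WHITELIST =
            Arrays.asList("okhttp3/", "com/bumptech/glide/");

    // Calls treated as "thread pool construction", written as owner.name + descriptor.
    private static final Set<String> TARGETS = new HashSet<>(Arrays.asList(
            "java/util/concurrent/Executors.newCachedThreadPool()Ljava/util/concurrent/ExecutorService;",
            "java/util/concurrent/Executors.newFixedThreadPool(I)Ljava/util/concurrent/ExecutorService;"));

    /** Step 1: is the class that contains the call exempt from replacement? */
    static boolean isWhitelisted(String callerClass) {
        for (String prefix : WHITELIST) {
            if (callerClass.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Step 2: does this call construct a thread pool in a class we are allowed to touch? */
    static boolean shouldReplace(String callerClass, String owner, String name, String desc) {
        return !isWhitelisted(callerClass) && TARGETS.contains(owner + "." + name + desc);
    }
}
```

Step 3, the actual rewrite, needs a bytecode tool and is not covered by this sketch.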

It's Transform again. Why is it always me?

I added a small feature to my earlier double-click-optimization demo that does exactly what is listed above: it searches the classes and replaces the calls that construct thread pools.


public class ThreadPoolMethodVisitor extends MethodVisitor {

   public ThreadPoolMethodVisitor(MethodVisitor mv) {
       super(Opcodes.ASM5, mv);
   }

   @Override
   public void visitMethodInsn(int opcode, String owner, String name, String desc, boolean itf) {
       boolean isThreadPool = isThreadPool(opcode, owner, name, desc);
       if (isThreadPool) {
           JLog.info("owner:" + owner + " name:" + name + " desc:" + desc);
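           // POP discards the top single-slot value on the stack, i.e. the last argument pushed
           // for the matched call; this is only correct for calls taking exactly one such argument.
           mv.visitInsn(Opcodes.POP);
           // Call the static getter of the shared pool in place of the original call.
           mv.visitMethodInsn(Opcodes.INVOKESTATIC, "com/wallstreetcn/sample/utils/TestIOThreadExecutor",
                   "getTHREAD_POOL_SHARE",
                   "()Lcom/wallstreetcn/sample/utils/TestIOThreadExecutor;", itf);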
       } else {
           super.visitMethodInsn(opcode, owner, name, desc, itf);
       }
   }

   @Override
   public void visitInsn(int opcode) {
       super.visitInsn(opcode);
   }

   @Override
   public void visitLineNumber(int line, Label start) {
       super.visitLineNumber(line, start);
   }

   boolean isThreadPool(int opcode, String owner, String name, String desc) {
       List<PoolEntity> list = ThreadPoolCreator.INSTANCE.getPoolList();
       for (PoolEntity poolEntity : list) {
           if (opcode != poolEntity.getCode()) {
               continue;
           }
           if (!owner.equals(poolEntity.getOwner())) {
               continue;
           }
           if (!name.equals(poolEntity.getName())) {
               continue;
           }
           if (!desc.equals(poolEntity.getDesc())) {
               continue;
           }
           return true;
       }
       return false;
   }

}

The class above is a MethodVisitor. Every method body is visited by this class, so we can rewrite a method body based on what we see there, using key information such as the method name and the owner class name.

I built a list that holds an entry for every call that constructs a thread pool, and each visited method call is matched against it. When a call turns out to construct a thread pool, we rewrite the code at that point and substitute our shared thread pool. In this way the thread pool constructions are replaced at compile time, and all thread pool creation in the project is constrained.
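The shared pool that the rewritten calls point to is not shown in this article, so here is one way it might look. Everything in this sketch is an assumption of mine: the name `SharedIoExecutor`, the cap of 16 threads, and the decision to ignore `shutdown()` so that one library cannot stop a pool that others still use.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SharedIoExecutor extends ThreadPoolExecutor {
    private static final int MAX_THREADS = 16; // hypothetical cap for the whole process
    private static final SharedIoExecutor INSTANCE = new SharedIoExecutor();

    private SharedIoExecutor() {
        // core == max with an unbounded queue: at most MAX_THREADS threads, extra tasks wait.
        super(MAX_THREADS, MAX_THREADS, 30, TimeUnit.SECONDS,
                new LinkedBlockingQueue<>(), new NamedDaemonFactory());
        allowCoreThreadTimeOut(true); // let idle threads exit after the keep-alive time
    }

    static SharedIoExecutor get() {
        return INSTANCE;
    }

    @Override
    public void shutdown() {
        // Ignored: callers that used to own a private pool must not stop the shared one.
    }

    @Override
    public List<Runnable> shutdownNow() {
        return Collections.emptyList(); // ignored for the same reason
    }

    private static final class NamedDaemonFactory implements ThreadFactory {
        private final AtomicInteger counter = new AtomicInteger();

        @Override
        public Thread newThread(Runnable r) {
            Thread t = new Thread(r, "shared-io-" + counter.incrementAndGet());
            t.setDaemon(true);
            return t;
        }
    }
}
```

Extending `ThreadPoolExecutor` keeps the instance usable wherever the replaced code expects an `ExecutorService`. The unbounded queue trades memory for never rejecting a task, which may or may not suit a given app.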

Anything besides this?

You can also constrain developers with static analysis: forbid creating a thread directly with new Thread. Writing your own lint rule is another way to keep this kind of OOM under control.

I have written a lint demo. Take a look if you have time, and skip it if you don't: https://github.com/Leifzhang/AndroidLint
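To make the idea concrete without depending on the Android lint API, the check can be approximated with a plain text scan. This is a deliberately naive sketch of my own (the class `RawThreadScan` is hypothetical): it works line by line with a regular expression, so unlike a real lint rule it knows nothing about types, string literals, or block comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class RawThreadScan {
    // Matches "new Thread(" but not "new ThreadPoolExecutor(" or other longer class names.
    private static final Pattern NEW_THREAD = Pattern.compile("\\bnew\\s+Thread\\s*\\(");

    /** Returns the 1-based line numbers that contain a direct `new Thread(...)` construction. */
    static List<Integer> findRawThreads(String source) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            String code = lines[i];
            int comment = code.indexOf("//"); // crude: drop everything after a line comment
            if (comment >= 0) {
                code = code.substring(0, comment);
            }
            if (NEW_THREAD.matcher(code).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }
}
```

A real lint rule inspects the parsed syntax tree instead of raw text, which avoids the false positives and negatives this scan can produce.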

Closing words

Thank you all for following me. I share practical Android content and enjoy discussing Android technology.
If you have any thoughts on the article or any technical questions, leave a message in the comment area and I will answer carefully.
You are also welcome to visit my Bilibili channel, where video explanations of advanced Android architecture topics may help you get promoted and earn a raise.
Bilibili link: https://space.bilibili.com/544650554

Origin blog.csdn.net/Androiddddd/article/details/109495540