Android Component Series: Talk About Handler Mechanism Again (Native)

I previously wrote an article about the Handler mechanism of the Java layer, which introduced the design background of Handler in detail from an application developer's perspective and showed how to write a Handler from scratch.

In this article, we will go deep into the Native layer and explore why Looper#loop() does not freeze the main thread.

Enjoy:

1. Opening

Starting from Android 2.3, Google changed the blocking/wake-up scheme of Handler from Object#wait() / notify() to an implementation based on Linux epoll.

The reason is that the Native layer also introduced a message management mechanism for C/C++ developers, while the existing blocking/wake-up solution was built for the Java layer and only supports Java.

Native wants to behave like Java: the main thread enters the blocked state when there are no messages, and wakes up in time to process a message once one is due. How to do that? There were two options:

Either keep using Object#wait() / notify(), and when Native adds a new message to the message queue, notify the Java layer whenever a wake-up is needed;

Or re-implement a blocking/wake-up scheme in the Native layer, deprecate Object#wait() / notify(), and have Java call into Native through JNI to enter the blocked state.

We all know the ending: Google chose the latter.

In fact, if you only wanted to port the Java layer's blocking/wake-up to the Native layer, you would not need a heavyweight tool like epoll; calling pthread_cond_wait in Native achieves the same effect.

Another reason for choosing epoll is that the Native layer supports monitoring custom fds (for example, Input events are forwarded to the app process through a socketfd that epoll monitors), and once there is a need to monitor multiple stream events, the only option is Linux I/O multiplexing.

Understanding I/O multiplexing and epoll

Having said all that, what exactly is epoll?

The full name of epoll is eventpoll, and it is one of the implementations of Linux I/O multiplexing, alongside select and poll. Here we only discuss epoll.

To understand epoll, we first need to understand what a "stream" is.

In Linux, any object that can perform I/O operations can be regarded as a stream: a file, a socket, a pipe, we can regard them all as streams.

Next, let's discuss I/O operations on a stream. By calling read(), we can read data from a stream; by calling write(), we can write data to a stream.

Now assume a situation where we need to read data from a stream, but there is no data in it yet:

int socketfd = socket();
connect(socketfd, serverAddr);
int n = send(socketfd, "hi");
n = recv(socketfd);  // wait to receive the server's reply
...                  // process the data returned by the server

A typical example: the client wants to read the server's reply from the socket, but the server has not sent the data back yet. What should happen at this point?

  • Blocking: the thread blocks in the recv() method until data is read, then continues executing
  • Non-blocking: recv() returns -1 immediately without reading any data, and the user thread keeps polling recv() until data is returned
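The two modes above can be demonstrated directly with the socket API. Below is a minimal Python sketch; a connected socketpair stands in for the client/server connection, which is an assumption made for the demo, not part of the original example:

```python
import socket

# A connected socketpair stands in for the client/server socket (assumption:
# its blocking behavior matches a real network connection for this demo).
client, server = socket.socketpair()

# Non-blocking mode: recv() fails immediately instead of suspending the thread.
client.setblocking(False)
try:
    client.recv(1024)
    got_data = True
except BlockingIOError:  # the "returns -1 immediately" case from the bullet above
    got_data = False
print(got_data)          # False: no data yet, the call returned at once

# Once the server sends data, the next poll of recv() succeeds.
server.send(b"hi")
data = client.recv(1024)
print(data)              # b'hi'
```

In blocking mode (the default), that first recv() would instead have suspended the thread until the server replied.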

OK, now we have two solutions, blocking and non-blocking. Next, let's initiate 100 network requests at the same time and see how each of the two solutions handles them.

Let's talk about blocking mode first. In blocking mode, a thread can only process I/O events of one stream at a time. If you want to process multiple streams at the same time, the only option is multithreading + blocking I/O. However, one thread per socket causes heavy resource usage, especially for long-lived connections whose threads are never released; with many concurrent connections, the machine soon runs out of memory.

In non-blocking mode, we find that a single thread can process multiple streams at the same time: as long as you keep visiting every stream from beginning to end, you can know which streams have data (the return value is greater than -1). But this approach is not very efficient, because if none of the streams has data, the polling only wastes CPU.

See the problem? With only the blocking and non-blocking options, once there is a need to monitor multiple stream events, the user program can only choose between wasting thread resources (blocking I/O) and wasting CPU resources (non-blocking I/O); there is no more efficient alternative.

Moreover, this problem cannot be solved on the user-program side. The kernel has to provide a mechanism that takes over monitoring of these streams' events, because every event must be read and forwarded by the kernel anyway, so the kernel is always the first to know when an event occurs.

This mechanism that allows user programs to "listen to multiple streams' read and write events at the same time" is called I/O multiplexing!

Now let's look at the three functions epoll provides:

int epoll_create(int size);
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
  • epoll_create() creates an epoll instance (the "epoll pool")

  • epoll_ctl() performs add/remove/modify operations on fds; the last parameter, event, tells the kernel which events to monitor. Taking a network request as an example, socketfd is monitored for the readable event: once the data returned by the server arrives, the object watching socketfd receives a callback notification indicating that the socket has data available to read.

  • epoll_wait() is the call that blocks the user thread. Its second parameter, events, receives a collection object: if multiple events occur at the same time, the events object obtains the set of occurred events from the kernel.
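To make the three calls concrete, here is a minimal Python sketch using the select.epoll binding (a pipe plays the role of the stream being watched; this is an illustrative demo, not AOSP code):

```python
import os
import select

r, w = os.pipe()                 # the "stream" we want to monitor

ep = select.epoll()              # epoll_create(): create the epoll pool
ep.register(r, select.EPOLLIN)   # epoll_ctl(ADD): watch r for readability

before = ep.poll(timeout=0)      # epoll_wait(): nothing readable yet
print(before)                    # []

os.write(w, b"x")                # data arrives on the stream
after = ep.poll(timeout=1)       # the kernel reports (fd, event-mask) pairs
print(after)                     # [(r, select.EPOLLIN)]

ep.close()
os.close(r)
os.close(w)
```

A zero timeout makes epoll_wait return immediately; with a positive timeout the call blocks until an event arrives or the timeout elapses.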

Understanding Linux eventfd

With epoll understood, let's look at Linux eventfd. An eventfd is an fd used specifically to deliver events, and the function it provides is very simple: a cumulative count.

int efd = eventfd();
write(efd, 1);    // write the number 1
write(efd, 2);    // then write the number 2
int res = read(efd);
printf(res);      // prints 3

With the write() function, we can write an int value into the eventfd, and as long as no read happens, the value stored in the eventfd keeps accumulating.

The stored value can be read out through the read() function, and until a new value is written, calling read() again will block until someone writes to the eventfd again.

What eventfd implements is counting: as long as the eventfd's count is non-zero, the fd is readable. Combined with epoll's features, we can easily build a producer/consumer model.

epoll + eventfd acts as the consumer: most of the time it blocks and sleeps, and once a request is enqueued (a value is written to the eventfd), the consumer wakes up immediately to process it. The underlying logic of the Handler mechanism is exactly epoll + eventfd.

Well, with the epoll and eventfd foundations in place, we can officially enter the Native world of the Handler mechanism.

2. Enter Native Handler

Most Android engineers have at least some understanding of the Handler mechanism, so we will not repeat the basic usage and implementation principles of Handler, and will go straight to the topic.

Let's focus on a few JNI methods in the MessageQueue class: nativeInit(), nativePollOnce() and nativeWake().

They correspond to the three parts of the Native message queue: initialization of the message queue, message looping and blocking, and message sending and wake-up.

/frameworks/base/core/java/android/os/MessageQueue.java
class MessageQueue {

    private native static long nativeInit();
    private native void nativePollOnce(long ptr, int timeoutMillis); /*non-static for callbacks*/
    private native static void nativeWake(long ptr);

}

Initialization of message queue

Let's first look at the first step, the initialization process of the message queue

The nativeInit() method is called in the constructor of the Java MessageQueue, and it synchronously creates a NativeMessageQueue object in the Native layer to hold the messages sent by Native developers.

/frameworks/base/core/java/android/os/MessageQueue.java
MessageQueue(boolean quitAllowed) {
    mQuitAllowed = quitAllowed;
    mPtr = nativeInit();
}

Looking at the code, the constructor of NativeMessageQueue triggers the creation of a Looper object (in the Native layer):

/frameworks/base/core/jni/android_os_MessageQueue.cpp
class android_os_MessageQueue {

    void android_os_MessageQueue_nativeInit() {
        NativeMessageQueue* nativeMessageQueue = new NativeMessageQueue();
    }

    NativeMessageQueue() {
        mLooper = Looper::getForThread();
        if (mLooper == NULL) {
            mLooper = new Looper(false);
            Looper::setForThread(mLooper);
        }
    }
}

The logic for creating a Looper object in Native is the same as in Java: first fetch the Looper object from thread-local storage, and if it is empty, create a new Looper object and save it to thread-local storage.

Let's continue and look at the Native Looper initialization process:

/system/core/libutils/Looper.cpp
class looper {

    Looper::Looper() {
        int mWakeEventFd = eventfd();
        rebuildEpollLocked();
    }

    void rebuildEpollLocked(){
        int mEpollFd = epoll_create(); // very important: the epoll instance is created when Looper is initialized
        epoll_ctl(mEpollFd, EPOLL_CTL_ADD, mWakeEventFd, &eventItem); // register the eventfd that wakes the message queue with the epoll pool
    }

}

Here comes the key part!

The Looper constructor first creates an eventfd object, mWakeEventFd, which is used to signal that new messages have been added to the MessageQueue. This object is very important, so be sure to remember it!

In the rebuildEpollLocked() method, the epoll object mEpollFd is created, and the freshly created mWakeEventFd is registered with the epoll pool.

At this point, the two core objects, mEpollFd and mWakeEventFd, have both been successfully initialized!

Let's sort out the steps of message queue initialization:

  1. When the Java layer initializes its message queue, the nativeInit() method is called and a NativeMessageQueue object is created in the Native layer
  2. When the Native layer's message queue is created, a Native Looper object is also created
  3. In the Native Looper constructor, eventfd() is called to generate mWakeEventFd, the core of the later queue wake-up
  4. Finally, the rebuildEpollLocked() method initializes an epoll instance, mEpollFd, and registers mWakeEventFd with the epoll pool

At this point, initialization of the Native layer's message queue is complete, and the Looper object holds mEpollFd and mWakeEventFd.

Message looping and blocking

After the Java and Native message queues are created, the whole thread blocks in the Looper#loop() method. The call chain in the Java layer looks roughly like this:

Looper#loop()
    -> MessageQueue#next()
        -> MessageQueue#nativePollOnce()

The last step, MessageQueue calling nativePollOnce(), is a JNI method whose concrete implementation lives in the Native layer.

Let's go down and see what happens in Native:

/frameworks/base/core/jni/android_os_MessageQueue.cpp
class android_os_MessageQueue {

    // JNI method, forwards to NativeMessageQueue#pollOnce()
    void android_os_MessageQueue_nativePollOnce(){
        nativeMessageQueue->pollOnce(env, obj, timeoutMillis);
    }
    class NativeMessageQueue : MessageQueue {

        // forwards to the Looper#pollOnce() method
        void pollOnce(){
            mLooper->pollOnce(timeoutMillis);
        }
    }
}

After receiving the request, nativePollOnce() forwards it to NativeMessageQueue's pollOnce() method.

NativeMessageQueue#pollOnce() does nothing itself and just forwards the request to Looper#pollOnce().

It seems the main logic is in Looper, so let's look further down:

//system/core/libutils/Looper.cpp
class looper {

    int pollOnce(int timeoutMillis){
        int result = 0;
        for (;;) {
            if (result != 0) {
                return result;
            }
            result = pollInner(timeoutMillis); // may time out
        }
    }

    int pollInner(int timeoutMillis){
        int eventCount = epoll_wait(mEpollFd, eventItems, EPOLL_MAX_EVENTS, timeoutMillis); // call epoll_wait() and wait for events to occur
    }
}

See it? The execution logic of thread blocking and wake-up is all here!

pollOnce() keeps polling the pollInner() method and checks its return value, result.

The result type is an enumeration declared in the Looper.h file, with 4 possible values:

  • -1 (POLL_WAKE) means the loop was woken by wake(), usually because a new message that needs to run immediately was added to the queue
  • -2 (POLL_CALLBACK) means multiple events occurred at the same time: a new message may have been added, or a monitored custom fd may have had an I/O event
  • -3 (POLL_TIMEOUT) means the configured timeout expired
  • -4 (POLL_ERROR) means an error occurred; it is not clear where this one is used

If there is no message in the message queue, the configured timeout has not expired, and no custom-fd event occurs, the thread blocks inside the pollInner() call.

Inside pollInner(), the epoll_wait() system call waits for events to occur.

The title of this section is message looping and blocking. Now that the thread is blocked in pollInner(), we can sort out the logic before and after the blocking happens:

After the message queue is successfully initialized, the Java layer's Looper#loop() starts polling infinitely, repeatedly fetching the next message. If the message queue is empty, epoll_wait is called, and the thread enters the blocked state and gives up CPU scheduling.

The whole calling process from Java to Native is roughly like this:

Looper#loop()
    -> MessageQueue#next()
        -> MessageQueue#nativePollOnce()
            -> NativeMessageQueue#pollOnce() // note: entering the Native layer
                -> Looper#pollOnce()
                    -> Looper#pollInner()
                        -> epoll_wait()

Message sending/wake-up mechanism

Well, now the message queue is empty, and from the previous section's analysis we know the user thread is blocked in the Native layer's Looper#pollInner() method. Let's send a message to the message queue to wake it up.

As we said earlier, Java and Native each maintain their own message queue, so their entry points for sending messages also differ.

Java developers use Handler#sendMessage() / post(), while C/C++ developers use Looper#sendMessage().

Let's look at Java first

/frameworks/base/core/java/android/os/Handler.java
class Handler {

    boolean enqueueMessage(MessageQueue queue, Message msg, long uptimeMillis) {
        msg.target = this;
        return queue.enqueueMessage(msg, uptimeMillis);
    }
}

/frameworks/base/core/java/android/os/MessageQueue.java
class MessageQueue {

    boolean enqueueMessage(Message msg, long when) {
        // ... insert the message into the queue ordered by due time
        if (needWake) {
            nativeWake(mPtr);
        }
    }

}

When sending a message with a Handler, whether via sendMessage or post, the MessageQueue#enqueueMessage() method enqueues the message, ordered by execution time.

If the message we send needs to run immediately, the needWake variable is set to true and nativeWake() is called to perform the wake-up.

Note: nativeWake() is also a JNI call. After layer-by-layer forwarding it finally reaches the Native Looper#wake() method. The call chain of this forwarding is clear and very simple, so I won't analyze it here.

That covers sending messages from Java; next let's see how the Native layer sends messages:

/system/core/libutils/Looper.cpp
class looper {

    void Looper::sendMessageAtTime(uptime, handler,message) {
        int i = 0;
        int messageCount = mMessageEnvelopes.size();
        while (i < messageCount && uptime >= mMessageEnvelopes.itemAt(i).uptime) {
            i += 1;
        }
        mMessageEnvelopes.insertAt(messageEnvelope(uptime, handler, message), i, 1);
        // Wake the poll loop only when we enqueue a new message at the head.
        if (i == 0) {
            wake();
        }
    }
}

Looking at the code above, the Native layer sends messages to its queue through the sendMessageAtTime() method, and the logic for adding messages is similar to Java's:

Messages are added to the mMessageEnvelopes collection ordered by time, with the message whose execution time is nearest placed at the front. If it turns out the thread needs to be woken up (the new message became the head), the wake() method is called.
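The insertion rule can be sketched in a few lines of Python (a hypothetical re-creation of the order-by-uptime logic, not the AOSP source; the return value mirrors the i == 0 wake condition):

```python
def send_message_at_time(envelopes, uptime, message):
    """Insert (uptime, message) keeping the list sorted by due time.
    Returns True when the new message became the head, i.e. wake() is needed."""
    i = 0
    # Walk past every message that is due at or before the new one.
    while i < len(envelopes) and uptime >= envelopes[i][0]:
        i += 1
    envelopes.insert(i, (uptime, message))
    return i == 0

msgs = []
print(send_message_at_time(msgs, 100, "a"))  # True: queue was empty, "a" is the head
print(send_message_at_time(msgs, 200, "b"))  # False: later deadline, no wake needed
print(send_message_at_time(msgs, 50, "c"))   # True: earliest deadline, new head
print([m for _, m in msgs])                  # ['c', 'a', 'b']
```

Only an insert at position 0 changes the earliest deadline the sleeping loop is waiting for, which is why that is the only case that needs a wake-up.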

Well, the ways Java and Native send messages have both been introduced.

We found that although their ways of sending messages, their message types, and the queues they deliver to are different, when the thread needs to be woken up, both Java and Native end up executing the Native Looper#wake() method.

Earlier we said that "the bottom layer of the Handler mechanism is epoll + eventfd".

Readers may wish to guess: how is the thread woken up here?

/system/core/libutils/Looper.cpp
class looper {

    void Looper::wake() {
        int inc = 1;
        write(mWakeEventFd, &inc);
    }
}

The answer is very simple: a one-line write() call writes a 1 into mWakeEventFd (small hint: mWakeEventFd is an eventfd).

Why does writing a 1 into mWakeEventFd wake the thread?

After a value is written, mWakeEventFd's state changes from unreadable to readable. The kernel, which listens for changes in the fd's readable/writable state, returns the event to the blocked epoll_wait() call.

Once epoll_wait() returns, the blocking ends and the thread continues executing.

Well, let's summarize the key steps of message sending and wake-up:

  1. The Java layer sends a message through the MessageQueue#enqueueMessage() method; if the message needs to run immediately, nativeWake() is called to perform the wake-up
  2. The Native layer sends messages through the Looper#sendMessageAtTime() method; the logic is similar to Java's, and if the thread needs waking, Looper#wake() is called
  3. Looper#wake() is very simple: it writes a 1 into mWakeEventFd
  4. During queue initialization, mWakeEventFd was registered with epoll, so once mWakeEventFd has new content, the blocked epoll_wait() call returns, and the queue is woken up

That basically wraps up message sending and wake-up, and next comes the highlight of Handler: message dispatch after the thread wakes up.

Message dispatch after wake-up

The thread blocks in the pollInner() method, and after it wakes up it continues executing inside pollInner().

After waking, the thread first determines why it woke up, then executes different logic according to the wake-up type.

The pollInner() method is a little long and can be roughly divided into 5 steps. I have marked the steps, and we will walk through them one by one.

/system/core/libutils/Looper.cpp
class looper {

    int pollInner(int timeoutMillis){
        int result = POLL_WAKE;
        // step 1: epoll_wait() returns
        int eventCount = epoll_wait(mEpollFd, eventItems, timeoutMillis);
        if (eventCount == 0) { // an event count of 0 means the configured timeout was reached
            result = POLL_TIMEOUT;
        }
        for (int i = 0; i < eventCount; i++) {
            if (eventItems[i] == mWakeEventFd) {
                // step 2: drain the eventfd so it can be monitored as readable again
                awoken();
            } else {
                // step 3: save the events triggered by custom fds
                mResponses.push(eventItems[i]);
            }
        }
        // step 4: dispatch native messages
        while (mMessageEnvelopes.size() != 0) {
            if (messageEnvelope.uptime <= now) { // check whether the message is due
                messageEnvelope.handler->handleMessage(message);
            }
        }
        // step 5: run the custom-fd callbacks
        for (size_t i = 0; i < mResponses.size(); i++) {
            response.request.callback->handleEvent(fd, events, data);
        }
        return result;
    }

    void awoken() {
        read(mWakeEventFd); // drain the counter so the fd can signal readable again
    }

}

Step 1: epoll_wait() returning means events occurred, and the return value eventCount is the number of events. If it is 0, the configured timeout was reached and none of the following branches run. If it is non-zero, we traverse the event set eventItems returned by the kernel and execute different logic by type.

Step 2: if the event belongs to the message queue's eventfd, it means someone submitted a message that needs to run immediately. We just read out the eventfd's data so the fd can trigger the readable event again, then let the loop finish.

Step 3: the event does not belong to the message queue's eventfd, which means some other place registered an fd for monitoring. We save the event into the mResponses collection; later we need to respond to this event and notify the registered object.

Step 4: traverse the native message collection mMessageEnvelopes, check each message's due time, and if a message is due, hand it to its handler for dispatch. The dispatch logic is similar to the Java Handler's.

Step 5: traverse the mResponses collection, consume the custom fds registered elsewhere, and invoke their callback methods.

The logic executed after waking up is still fairly involved. Let's summarize:

After the user thread wakes up, the Native layer's messages are dispatched first, followed by notifications for custom-fd events (if any). Finally the pollInner() method ends and control returns to the Java layer's Looper#loop(), which performs the Java-layer message dispatch. Only when the Java Handler finishes dispatching its message is one loop() cycle complete.

After that, because Looper#loop() is an infinite loop, it immediately enters the next iteration: call next() to fetch a message, block in pollInner(), wake up from pollInner() and dispatch, then start over, in endless reincarnation.

This process repeats for the entire life of the main thread, until the app process exits...

3. Conclusion

The above is the entire content of this Handler Native article, which mainly introduced how several key JNI methods in the Java MessageQueue are implemented underneath.

After analyzing all the code logic, we find that the Native Handler implementation is not complicated; the key blocking and wake-up parts are realized with the help of the Linux epoll mechanism.

Therefore, as long as we understand the epoll mechanism and then read the internals of Looper#pollInner() in the source code, we can understand what the whole Handler mechanism is about.

Author: Yi Baoshan
Link: https://juejin.cn/post/7146239048191836190
Source: Rare Earth Nuggets
