Seeing the Essence of Java AIO Through Its Phenomena | Dewu Technology

1. Introduction

Plenty of articles explain the differences between Java BIO, NIO, and AIO and how they work, but most of them focus on BIO and NIO. Articles about AIO are rare, and most of those stop at an introduction to the concepts and a code sample.

While studying AIO, I noticed the following phenomena:

1. Java 7, released in 2011, added an asynchronous IO programming model called AIO. Nearly 12 years later, however, the frameworks and middleware in everyday use are still dominated by NIO, for example the network frameworks Netty and Mina and the web containers Tomcat and Undertow.

2. Java AIO is also called NIO 2.0. Is it also based on NIO?

3. Netty dropped the support of AIO. https://github.com/netty/netty/issues/2515

4. AIO looks like a solution that found no audience: it was released, and almost nobody uses it.

These phenomena are bound to confuse many people. So in this article I do not want to simply repeat the concepts of AIO; instead I want to analyze and reason from these phenomena to the essence of Java AIO.

2. What is asynchronous?

2.1 Asynchrony as we know it

The A in AIO stands for Asynchronous. Before looking at how AIO works, let's clarify what "asynchronous" actually means.

Asynchronous programming is quite common in everyday development, as in the following code:

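@Async
public void create() {
    // TODO
}

public void build() {
    // Hand the real work to a pool thread; the caller returns immediately.
    executor.execute(() -> doBuild());
}

private void doBuild() {
    // TODO
}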

Whether you annotate a method with @Async or submit a task to a thread pool, the result is the same: the task is handed over to another thread for execution.

At this point, "asynchronous" can roughly be understood as executing tasks on multiple threads.
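A minimal runnable sketch of this hand-off. The class name `AsyncHandOff` and the single-thread pool are assumptions chosen for illustration; the task simply reports which thread executed it.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncHandOff {

    /** Submits a task to a pool and returns the name of the thread that ran it. */
    static String runOnPool() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            // The caller only submits the task; a pool thread executes it.
            Future<String> worker = executor.submit(() -> Thread.currentThread().getName());
            return worker.get(); // wait here only so the demo can report the result
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("caller thread: " + Thread.currentThread().getName());
        System.out.println("worker thread: " + runOnPool());
    }
}
```

The two names printed by `main` differ, which is all that "asynchronous" means under this rough, everyday understanding.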

2.2 Are Java BIO and NIO synchronous or asynchronous?

To decide whether Java BIO and NIO are synchronous or asynchronous, let's first write some "asynchronous" code with them, following the idea of asynchrony above.

2.2.1 BIO example

byte [] data = new byte[1024];
InputStream in = socket.getInputStream();
in.read(data);
// Data received; hand it off for asynchronous processing
executor.execute(() -> handle(data));
​
public void handle(byte [] data) {
    // TODO
}

With BIO, read() blocks the calling thread, but once data has been received, another thread can be used to process it asynchronously.

2.2.2 NIO example

selector.select();
Set<SelectionKey> keys = selector.selectedKeys();
Iterator<SelectionKey> iterator = keys.iterator();
while (iterator.hasNext()) {
    SelectionKey key = iterator.next();
    iterator.remove(); // selected keys are not cleared automatically
    if (key.isReadable()) {
        SocketChannel channel = (SocketChannel) key.channel();
        ByteBuffer byteBuffer = (ByteBuffer) key.attachment();
        executor.execute(() -> {
            try {
                channel.read(byteBuffer);
                handle(byteBuffer);
            } catch (Exception e) {
                // Do not swallow the failure silently
                e.printStackTrace();
            }
        });
​
    }
}
​
public static void handle(ByteBuffer buffer) {
    // TODO
}

Likewise, although read() in NIO is non-blocking, select() can be used to block until data is available. Once there is data to read, a thread is started to read and process it asynchronously.

2.2.3 Deviations in Understanding

At this point we might confidently claim that whether Java BIO and NIO are asynchronous or synchronous depends on your mood: if you feel like giving them an extra thread, they are asynchronous.

But that cannot be right. Virtually every blog article on the subject states clearly that BIO and NIO are synchronous.

So where is the problem? What caused the deviation in our understanding?

The problem is the frame of reference. In physics class, whether a passenger on a bus is moving or stationary depends on the frame of reference: relative to the ground the passenger is moving, and relative to the bus the passenger is stationary.

The same is true for Java IO. A frame of reference is needed to decide whether it is synchronous or asynchronous. Since we are discussing IO modes, the reference must be the IO read and write operations themselves. Starting another thread to process the data afterwards is outside the scope of IO reading and writing and should not be taken into account.

2.2.4 Trying to define async

Therefore, taking the IO read/write operation itself as the reference, let's try a definition. Compare the thread that initiates the IO read or write (the thread that calls read or write) with the thread that actually performs it. If they are the same thread, call it synchronous; otherwise, call it asynchronous.

  • Obviously, BIO can only be synchronous. Calling in.read() blocks the current thread, and when the data arrives it is that same thread that receives it.

  • NIO is also synchronous, for the same reason. Calling channel.read() does not block the thread, but it is still the current thread that reads the data.

By this reasoning, with AIO the thread that initiates the IO read or write and the thread that actually receives the data should be able to differ.
Is that the case? Let's look at some Java AIO code.
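Before moving on to AIO, the "synchronous" half of this definition can be checked with a small sketch. It uses a `java.nio.channels.Pipe` instead of a socket so that it runs without a network; the class name `SyncReadDemo` and the helper `readOnce` are assumptions made for illustration. The calling thread blocks in select(), and the same thread then copies the bytes in read().

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

public class SyncReadDemo {

    /** Returns {name of the thread that called read(), data that was read}. */
    static String[] readOnce(String message) throws Exception {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            // Another thread produces the data, playing the role of the remote peer.
            Thread writer = new Thread(() -> {
                try {
                    sink.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }, "writer");
            writer.start();

            selector.select(); // blocks until the pipe becomes readable
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            source.read(buffer); // non-blocking, but the bytes are copied by THIS thread
            String reader = Thread.currentThread().getName();
            writer.join();

            buffer.flip();
            return new String[] {reader, StandardCharsets.UTF_8.decode(buffer).toString()};
        }
    }

    public static void main(String[] args) throws Exception {
        String[] result = readOnce("Java NIO");
        System.out.println(result[0] + " read: " + result[1]);
    }
}
```

No thread other than the caller ever touches the read, so by the definition above this is synchronous IO, however many threads process the data afterwards.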

2.3 Java AIO program example

2.3.1 AIO server program

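import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

public class AioServer {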
​
    public static void main(String[] args) throws IOException {
        System.out.println(Thread.currentThread().getName() + " AioServer start");
        AsynchronousServerSocketChannel serverChannel = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress("127.0.0.1", 8080));
        serverChannel.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
​
            @Override
            public void completed(AsynchronousSocketChannel clientChannel, Void attachment) {
                System.out.println(Thread.currentThread().getName() + " client is connected");
                ByteBuffer buffer = ByteBuffer.allocate(1024);
                clientChannel.read(buffer, buffer, new ClientHandler());
            }
​
            @Override
            public void failed(Throwable exc, Void attachment) {
                System.out.println("accept fail");
            }
        });
        System.in.read();
    }
}
​
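// ClientHandler.java (a separate file, since it is a second public class)
import java.nio.ByteBuffer;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;

public class ClientHandler implements CompletionHandler<Integer, ByteBuffer> {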
    @Override
    public void completed(Integer result, ByteBuffer buffer) {
        buffer.flip();
        byte [] data = new byte[buffer.remaining()];
        buffer.get(data);
        System.out.println(Thread.currentThread().getName() + " received:"  + new String(data, StandardCharsets.UTF_8));
    }
​
    @Override
    public void failed(Throwable exc, ByteBuffer buffer) {
        System.out.println("read fail: " + exc);
    }
}

2.3.2 AIO client program

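import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.charset.StandardCharsets;

public class AioClient {

    public static void main(String[] args) throws Exception {
        AsynchronousSocketChannel channel = AsynchronousSocketChannel.open();
        // connect() returns a Future; wait until the connection is established
        channel.connect(new InetSocketAddress("127.0.0.1", 8080)).get();
        ByteBuffer buffer = ByteBuffer.allocate(1024);
        buffer.put("Java AIO".getBytes(StandardCharsets.UTF_8));
        buffer.flip();
        Thread.sleep(1000L);
        // write() is asynchronous as well; wait until the bytes have been sent
        channel.write(buffer).get();
        channel.close();
    }
}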

2.3.3 Asynchronous Definition Conjecture Conclusion

Run the server and client programs separately

640.png

In the server's output:

The main thread calls serverChannel.accept and registers a CompletionHandler as the callback. When a client connects, thread Thread-5 executes the completed callback of accept.

Thread-5 then calls clientChannel.read and registers another CompletionHandler. When data is received, thread Thread-1 executes the completed callback of read.

This matches the conjecture above: the thread that initiates an IO operation (such as accept, read, or write) is not the thread that finally completes it. We call this IO mode AIO.

Of course, this definition of AIO is only an aid to understanding. In practice, asynchronous IO may be defined more abstractly.
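The observation can also be checked programmatically rather than by reading console output. The following is a sketch under a few assumptions: the class name `AioThreadCheck` is hypothetical, port 0 asks the OS for any free loopback port, and the client lives in the same process. It records the thread that calls accept and the thread that runs completed().

```java
import java.net.InetSocketAddress;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class AioThreadCheck {

    /** Returns {thread that called accept(), thread that ran completed()}. */
    static String[] acceptOnce() throws Exception {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress("127.0.0.1", 0))) { // port 0: any free port
            CompletableFuture<String> callbackThread = new CompletableFuture<>();
            String initiator = Thread.currentThread().getName();

            server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
                @Override
                public void completed(AsynchronousSocketChannel accepted, Void attachment) {
                    // Record which thread the JDK chose to run this callback on.
                    callbackThread.complete(Thread.currentThread().getName());
                    try {
                        accepted.close();
                    } catch (Exception ignored) {
                        // nothing useful to do in a demo
                    }
                }

                @Override
                public void failed(Throwable exc, Void attachment) {
                    callbackThread.completeExceptionally(exc);
                }
            });

            try (AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
                client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);
                return new String[] {initiator, callbackThread.get(5, TimeUnit.SECONDS)};
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String[] threads = acceptOnce();
        System.out.println("accept() called on:  " + threads[0]);
        System.out.println("completed() ran on:  " + threads[1]);
    }
}
```

The exact thread names depend on the JDK and platform; the point is only that the two names are not the same.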

3. Questions raised by the AIO example

1. Who created the thread that executes the completed() method, and when was it created?

2. How does AIO implement event registration, listening, and callback execution?

3. What is the essence of a listener callback?

3.1 Question 1: Who created the thread that executes the completed() method, and when was it created?

Normally a question like this is answered by tracing from the program's entry point. Since it concerns threads, though, we can find out how the threads run from a thread dump.

Run only the AIO server program, without the client, and print the thread stacks. (Note: the program was run on Linux; other platforms differ slightly.)

6401.png

The thread dump shows that the program has started the following threads:

1. Thread-0 is blocked in the EPoll.wait() method.

2. Thread-1, Thread-2, ..., Thread-n (n equals the number of CPU cores) call take() on a blocking queue and block until a task becomes available.

A tentative conclusion can be drawn at this point:

These threads are created when the AIO server program starts, and all of them are blocked, waiting.

The dump also shows that these threads are related to epoll. Java NIO is known to be implemented on top of epoll on Linux. Is Java AIO implemented with epoll as well? To confirm this, we move on to the next question.
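A dump similar to the screenshot can be produced from inside the JVM with `Thread.getAllStackTraces()`. The sketch below is an illustration only: the class name `AioThreadDump`, the loopback port 0, and the 200 ms pause are assumptions, and which threads appear (and what their top frames are) depends on the platform and JDK version.

```java
import java.net.InetSocketAddress;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.util.Map;
import java.util.TreeMap;

public class AioThreadDump {

    /** Opens an AIO server channel and returns each live thread's name with its top stack frame. */
    static Map<String, String> dump() throws Exception {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress("127.0.0.1", 0))) {
            Thread.sleep(200); // give the channel group's threads time to start and block
            Map<String, String> result = new TreeMap<>();
            for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
                StackTraceElement[] stack = entry.getValue();
                String top = stack.length > 0 ? stack[0].toString() : "(no frames)";
                result.put(entry.getKey().getName(), top);
            }
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        // On Linux, expect one thread waiting in epoll and several waiting on a queue.
        dump().forEach((name, frame) -> System.out.println(name + " -> " + frame));
    }
}
```

No client is involved here, which matches the experiment in the text: the threads exist and are waiting before any connection arrives.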

3.2 Question 2: How does AIO implement event registration, listening, and callback execution?

Reading the source code with this question in mind, I found it very long. Walking through source code line by line is tedious and quickly drives readers away.

To understand code with a long flow and complex logic, it is enough to pick out its main threads and identify the core steps.

Take registering a read listener, clientChannel.read(…), as an example. Its core flow is:

1. Register the event -> 2. Listen for the event -> 3. Handle the event

3.2.1 Step 1: Register the event

6402.png

Registering the event calls the EPoll.ctl(…) function. The last parameter of this function specifies whether the registration is one-shot or permanent. In the code above, events | EPOLLONESHOT means, as the name suggests, that the registration is one-shot.

3.2.2 Step 2: Listen for the event

6408.png

3.2.3 Step 3: Handle the event

6409.png

64010.png

64011.png

3.2.4 Summary of core processes

64012.png

The code flow above shows that the three steps every IO read or write goes through are one-shot: once the event has been handled, the flow is over. To perform the next IO read or write, you have to start again from the beginning. This leads to so-called callback hell ("death callback": the next callback has to be registered inside the current callback), which greatly increases programming complexity.
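At the API level, the one-shot nature shows up as follows: one call to read produces exactly one callback, so a handler that wants more data has to issue the next read itself. The sketch below illustrates this with hypothetical names (`RearmingReader`, `ReadLoop`, `roundTrip`) and a client in the same process on a loopback port picked by the OS. It decodes each chunk separately, which is only safe for ASCII test messages.

```java
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class RearmingReader {

    /** Reads until end-of-stream; every completed() has to issue the next read itself. */
    static final class ReadLoop implements CompletionHandler<Integer, ByteBuffer> {
        private final AsynchronousSocketChannel channel;
        private final StringBuilder received = new StringBuilder();
        final CompletableFuture<String> done = new CompletableFuture<>();

        ReadLoop(AsynchronousSocketChannel channel) {
            this.channel = channel;
        }

        @Override
        public void completed(Integer bytesRead, ByteBuffer buffer) {
            if (bytesRead < 0) { // the peer closed its side of the connection
                done.complete(received.toString());
                return;
            }
            buffer.flip();
            received.append(StandardCharsets.UTF_8.decode(buffer));
            buffer.clear();
            channel.read(buffer, buffer, this); // one read, one callback: register again
        }

        @Override
        public void failed(Throwable exc, ByteBuffer buffer) {
            done.completeExceptionally(exc);
        }
    }

    /** Sends the messages over a loopback connection and returns what the read loop collected. */
    static String roundTrip(String... messages) throws Exception {
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress("127.0.0.1", 0));
             AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
            client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);
            try (AsynchronousSocketChannel accepted = server.accept().get(5, TimeUnit.SECONDS)) {
                ReadLoop loop = new ReadLoop(accepted);
                ByteBuffer buffer = ByteBuffer.allocate(1024);
                accepted.read(buffer, buffer, loop); // the first registration
                for (String message : messages) {
                    ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                    while (out.hasRemaining()) {
                        client.write(out).get(5, TimeUnit.SECONDS);
                    }
                }
                client.shutdownOutput(); // signals end-of-stream to the reader
                return loop.done.get(5, TimeUnit.SECONDS);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("Java ", "AIO ", "again"));
    }
}
```

If the `channel.read(buffer, buffer, this)` line inside completed() is removed, only the first chunk is ever delivered and the loop never sees end-of-stream. That re-registration inside the callback is the nesting the text refers to.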

3.3 Question 3: What is the essence of monitoring callbacks?

Conclusion first: the essence of the so-called listener callback is that a user-mode thread calls a kernel-mode function (more precisely an API, such as read, write, or epollWait). Until the function returns, the user thread is blocked. When the function returns, the blocked thread is woken up and executes the so-called callback function.

To understand this conclusion, we first need to introduce a few concepts.

3.3.1 System calls and function calls

Function call:

Locate a function and execute the instructions inside it.

System call:

A programming interface, the so-called API, that the operating system provides to user applications.

System call execution process:

1. Pass the system call parameters

2. Execute the trap instruction to switch from user mode to kernel mode, because system calls generally have to run in kernel mode

3. Execute the system call routine

4. Return to user mode

3.3.2 Communication between user mode and kernel mode

User mode -> kernel mode: simply through system calls.

Kernel mode -> user mode: the kernel does not know which functions a user-mode program has, what their parameters are, or where they are located. The kernel therefore cannot call user-mode functions; it can only notify the program by sending signals. The kill command is an example: it sends a signal so that the user program can exit gracefully.

If the kernel cannot actively call user-mode functions, where does the callback come from? The so-called callback is staged entirely by user mode itself: user mode both does the listening and executes the callback function.
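This "staged by user mode" callback can be sketched in a few lines. The example below uses a `java.nio.channels.Pipe` as the data source; the class `SelfMadeCallback` and its methods are hypothetical names used for illustration. A user-mode thread blocks in read(), and when the call returns, that same thread invokes the callback.

```java
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class SelfMadeCallback {

    /** Starts a user-mode thread that blocks in read() and then calls the callback itself. */
    static Thread onData(Pipe.SourceChannel source, Consumer<String> callback) {
        Thread listener = new Thread(() -> {
            ByteBuffer buffer = ByteBuffer.allocate(1024);
            try {
                source.read(buffer); // blocking call: this thread sleeps until data arrives
                buffer.flip();
                // The kernel never calls into our code; the woken-up thread does.
                callback.accept(StandardCharsets.UTF_8.decode(buffer).toString());
            } catch (Exception e) {
                e.printStackTrace();
            }
        }, "listener");
        listener.setDaemon(true);
        listener.start();
        return listener;
    }

    /** Registers a callback, writes one message, and returns what the callback observed. */
    static String demo(String message) throws Exception {
        Pipe pipe = Pipe.open();
        try (Pipe.SourceChannel source = pipe.source(); Pipe.SinkChannel sink = pipe.sink()) {
            CompletableFuture<String> seen = new CompletableFuture<>();
            Thread listener = onData(source,
                    data -> seen.complete(Thread.currentThread().getName() + " got " + data));
            sink.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
            String result = seen.get(5, TimeUnit.SECONDS);
            listener.join();
            return result;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo("hello")); // prints "listener got hello"
    }
}
```

From the caller's point of view, `onData` looks like registering a callback that "someone" will invoke later. In reality that someone is an ordinary user-mode thread that was blocked the whole time.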

3.3.3 Verify the conclusion with practical examples

To check whether this conclusion holds up, take IntelliJ IDEA, the IDE we use every day to write code, as an example: it listens for mouse and keyboard events and handles them.

As before, first print the thread stacks. The "AWT-XAWT" thread is responsible for listening for events such as mouse and keyboard input, and the "AWT-EventQueue" thread is responsible for handling them.

64013.png

Looking at the code itself, "AWT-XAWT" runs a while loop that calls the waitForEvents function and waits for it to return an event. As long as there is no event, the thread stays blocked there.

64014.png

4. What is the essence of Java AIO?

1. Since kernel mode cannot call user-mode functions directly, Java AIO in essence implements asynchrony only in user mode. It does not achieve asynchrony in the ideal sense.

Ideal asynchrony

What is asynchrony in the ideal sense? Take online shopping as an example.

There are two roles: consumer A and courier B.

  • A shops online, fills in a home address, pays, and submits the order. This is equivalent to registering an event listener.

  • The merchant ships the goods, and B delivers them to A's door. This is equivalent to the callback.

After placing the order, A does not need to care about the delivery process and can carry on with other things. B does not care whether A is at home when delivering; the parcel is simply left at the door. The two do not depend on each other and do not interfere with each other.

Suppose A's shopping happens in user mode and B's delivery happens in kernel mode. This way of running a program is too ideal; it cannot be realized in practice.

Asynchrony in reality

A lives in a high-end residential compound that outsiders cannot enter freely, so the courier can only deliver to the compound's gate.

A has bought something heavy, say a TV. Because A has to go to work and will not be at home, A asks a friend, C, to help carry the TV into the house.
Before leaving for work, A tells security guard D at the gate that a TV will be delivered today, and asks D to call C to come and pick it up when it arrives.

  • A placing the order and notifying D is equivalent to registering the event. In AIO, this is the EPoll.ctl(...) call that registers the event.

  • The security guard keeping watch at the gate is equivalent to listening for the event. In AIO, this is the Thread-0 thread calling EPoll.wait(…).

  • The courier delivering the TV to the gate is equivalent to the arrival of an IO event.

  • The security guard telling C that the TV has arrived, and C coming to carry it in, is equivalent to handling the event. In AIO, Thread-0 submits a task to the task queue, and one of Thread-1 to Thread-n takes the task and executes the callback method.

During the whole process, security guard D has to stay at his post and cannot step away for a moment; otherwise the TV could be stolen once it is left at the gate.

Friend C also has to stay at A's house. Having accepted the favor, it would be letting A down if he were not there when the TV arrived.

Real-world asynchrony therefore contradicts both properties of ideal asynchrony: the parties are neither independent of each other nor free from interfering with each other. The security guard plays the biggest part here; this is the high point of his career.

Registering events, listening for events, handling events, and running multiple threads are all initiated and carried out in user mode. Java AIO therefore implements asynchrony only in user mode. In essence it is the same as what we did with BIO and NIO: block first, and after the blocked thread is woken up, hand the processing to another thread.
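The division of labor between the security guard (Thread-0) and the friend (Thread-1 to Thread-n) can be simulated in plain Java. This is a toy model, not the JDK's implementation: a `BlockingQueue` stands in for epoll, and the names `GuardAndFriend`, `poller`, and `worker-N` are invented for illustration.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class GuardAndFriend {

    /** Delivers the events through a poller thread and a pool of workers; returns the handling log. */
    static List<String> run(List<String> events, int workers) throws InterruptedException {
        BlockingQueue<String> kernelEvents = new LinkedBlockingQueue<>(); // stand-in for epoll
        BlockingQueue<Runnable> taskQueue = new LinkedBlockingQueue<>();
        List<String> log = new CopyOnWriteArrayList<>();
        CountDownLatch handled = new CountDownLatch(events.size());

        // Guard D: blocks until an event arrives, then queues a task for the workers.
        Thread poller = new Thread(() -> {
            try {
                while (true) {
                    String event = kernelEvents.take(); // plays the role of EPoll.wait(...)
                    taskQueue.put(() -> {
                        log.add(Thread.currentThread().getName() + " handled " + event);
                        handled.countDown();
                    });
                }
            } catch (InterruptedException ignored) {
                // interrupted: shut down
            }
        }, "poller");
        poller.setDaemon(true);
        poller.start();

        // Friend C: each worker blocks on the task queue and runs the callback it receives.
        Thread[] pool = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            pool[i] = new Thread(() -> {
                try {
                    while (true) {
                        taskQueue.take().run();
                    }
                } catch (InterruptedException ignored) {
                    // interrupted: shut down
                }
            }, "worker-" + (i + 1));
            pool[i].setDaemon(true);
            pool[i].start();
        }

        for (String event : events) {
            kernelEvents.put(event); // the "courier" delivers
        }
        boolean allHandled = handled.await(5, TimeUnit.SECONDS);
        poller.interrupt();
        for (Thread worker : pool) {
            worker.interrupt();
        }
        if (!allHandled) {
            throw new IllegalStateException("events were not handled in time");
        }
        return log;
    }

    public static void main(String[] args) throws InterruptedException {
        run(List.of("read-ready-1", "read-ready-2", "read-ready-3"), 2).forEach(System.out::println);
    }
}
```

Every thread in this model is either blocked or running a callback, and all of them live in user mode. Which worker handles which event varies from run to run.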

2. Like NIO, Java AIO is implemented differently on each platform: epoll on Linux, IOCP on Windows, and kqueue on macOS. The principle is the same everywhere: a user thread blocks waiting for IO events, and a thread pool takes the events from a queue and handles them.

3. Netty removed AIO because AIO does not outperform NIO. Linux does have its own native AIO implementation (similar to IOCP on Windows), but Java AIO does not use it on Linux; it is implemented with epoll instead.

4. Java AIO does not support UDP.

5. The AIO programming model is more complicated, for example because of callback hell ("death callback").


Origin my.oschina.net/u/5783135/blog/8570287