Understanding the Linux network I/O model from the operating system level

I/O (input/output) includes file I/O and network I/O. The computer world has its own pecking order of speed:

  • Reading data from memory: nanosecond level.

  • Reading data over Gigabit Ethernet: microsecond level. 1 microsecond = 1,000 ns, so the network is about a thousand times slower than memory.

  • Reading data from disk: millisecond level. 1 ms = 1,000,000 ns, so a hard disk is about a million times slower than memory.

  • A CPU clock cycle is around 1 ns, so memory is relatively close to the CPU in speed; nothing else can keep up.
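The unit arithmetic behind these ratios can be checked with a few lines of Java. This is a minimal sketch: the class name `LatencyScale` and the one-unit latencies are assumptions taken from the order-of-magnitude figures in the list, not measurements.

```java
import java.util.concurrent.TimeUnit;

public class LatencyScale {
    // Order-of-magnitude latencies from the list above (illustrative, not measured).
    static final long MEMORY_NS = 1;
    static final long NETWORK_NS = TimeUnit.MICROSECONDS.toNanos(1);
    static final long DISK_NS = TimeUnit.MILLISECONDS.toNanos(1);

    // How many times slower a device is than a memory read.
    static long slowdown(long deviceNs) {
        return deviceNs / MEMORY_NS;
    }

    public static void main(String[] args) {
        System.out.println("network vs memory: " + slowdown(NETWORK_NS) + "x");
        System.out.println("disk vs memory:    " + slowdown(DISK_NS) + "x");
    }
}
```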

The CPU processes data far faster than I/O can get the data ready.

Every programming language runs into this mismatch between CPU speed and I/O speed.

So how do we optimize network I/O in network programming, that is, how do we use the CPU efficiently to process network data?

Related concepts

How can we understand network I/O at the operating system level? The computer world has its own set of defined concepts. Without understanding these concepts, we cannot really understand the nature of a technology or the ideas behind its design. In my view, these concepts are the foundation for understanding technology and the computer world.

1.1 Synchronous and asynchronous, blocking and non-blocking

Any discussion of network I/O has to cover two pairs of concepts: synchronous and asynchronous, blocking and non-blocking. Take Sanji boiling water as an example (Sanji's behavior is like the user program, and boiling the water is like a system call provided by the kernel). In plain language, the two pairs of concepts can be understood as follows.

  • Synchronous/asynchronous is about whether I still have to deal with the water myself once it has boiled.
  • Blocking/non-blocking is about whether I do other things while the water is heating.

1.1.1 Synchronous blocking

After lighting the stove, wait idly and do nothing else until the water boils (blocking); once it boils, turn off the heat yourself (synchronous).

1.1.2 Synchronous non-blocking

After lighting the stove, go watch TV (non-blocking) and check from time to time whether the water has boiled; once it boils, turn off the heat yourself (synchronous).

1.1.3 Asynchronous blocking

Press the switch and wait idly for the water to boil (blocking); the power cuts off automatically once it boils (asynchronous).

This model does not exist in network programming.

1.1.4 Asynchronous non-blocking

Press the switch and go do whatever you like (non-blocking); the power cuts off automatically once the water boils (asynchronous).
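The three models that do occur in practice can be sketched in Java. This is only an illustration of the analogy: `boil()` is a hypothetical stand-in for a kernel operation, simulated with a delayed `CompletableFuture`, and the class and method names are assumptions made for this sketch.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class KettleModels {
    // Hypothetical stand-in for a kernel call: the water "boils" after 50 ms.
    static CompletableFuture<String> boil() {
        return CompletableFuture.supplyAsync(
                () -> "boiled",
                CompletableFuture.delayedExecutor(50, TimeUnit.MILLISECONDS));
    }

    // Synchronous blocking: wait idly, then deal with the result ourselves.
    static String syncBlocking() {
        String water = boil().join();
        return water + ", heat turned off by the caller";
    }

    // Synchronous non-blocking: do something else, checking back from time to time.
    static String syncNonBlocking() {
        CompletableFuture<String> pending = boil();
        while (!pending.isDone()) {
            // "watch TV" for 5 ms between checks
            LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(5));
        }
        return pending.join() + ", heat turned off by the caller";
    }

    // Asynchronous non-blocking: register a callback and walk away.
    static CompletableFuture<String> asyncNonBlocking() {
        return boil().thenApply(water -> water + ", power cut off by the callback");
    }

    public static void main(String[] args) {
        System.out.println(syncBlocking());
        System.out.println(syncNonBlocking());
        CompletableFuture<String> done = asyncNonBlocking();
        System.out.println("caller is free to do other work");
        System.out.println(done.join()); // joined here only so the demo prints the result
    }
}
```

In the first two methods the caller finishes the job itself, which is what makes them synchronous; in the third, the follow-up step runs in the callback without the caller's involvement.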

1.2 Kernel space and user space

  • The kernel is responsible for reading and writing network and file data.
  • User programs obtain network and file data through system calls.

1.2.1 User mode and kernel mode

  • A program has to make a system call to read or write data.
  • Through the system call interface, the thread switches from user mode to kernel mode, the kernel reads or writes the data, and the thread then switches back.
  • User mode and kernel mode are the different states of a process or thread in the two spaces.

1.2.2 Thread switching

Switching between user mode and kernel mode takes time and costs resources (memory, CPU). Optimization suggestions:

  • Switch less often.
  • Share space.
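The "switch less often" suggestion can be illustrated with buffering. In this minimal sketch, `CountingStream` is a hypothetical stand-in for a file or socket stream in which every `write` call represents one system call (one round trip into the kernel); the names and sizes are assumptions for illustration only.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class FewerSwitches {
    // Hypothetical sink: each write() call stands for one system call.
    static class CountingStream extends OutputStream {
        int calls = 0;
        @Override public void write(int b) { calls++; }
        @Override public void write(byte[] b, int off, int len) { calls++; }
    }

    // Writes n bytes one at a time directly to the sink.
    static int unbuffered(int n) {
        CountingStream sink = new CountingStream();
        for (int i = 0; i < n; i++) {
            sink.write('x');
        }
        return sink.calls;
    }

    // Writes n bytes one at a time through a user-space buffer of bufSize bytes.
    static int buffered(int n, int bufSize) {
        CountingStream sink = new CountingStream();
        try (OutputStream out = new BufferedOutputStream(sink, bufSize)) {
            for (int i = 0; i < n; i++) {
                out.write('x');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    public static void main(String[] args) {
        System.out.println("unbuffered calls: " + unbuffered(4096));
        System.out.println("buffered calls:   " + buffered(4096, 1024));
    }
}
```

Batching in user space means the sink sees far fewer calls for the same data. The "share space" suggestion usually refers to techniques such as memory-mapped files (`FileChannel.map` in Java), which this sketch does not cover.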

1.3 Sockets

  • With sockets, you can do network programming.
  • Through the socket() system call, an application establishes connections and receives and sends data (I/O).
  • Sockets support non-blocking mode, so applications can make non-blocking calls; they also support asynchronous operation, so applications can make asynchronous calls.

1.4 File descriptors (FD)

Why does network programming require knowing about FDs, and what exactly is an FD? In Linux everything is a file, and an FD is a reference to a file. Doesn't that resemble Java, where everything is an object and a program operates on references to objects? Just as the number of objects Java can create is limited by memory, the number of FDs is also limited.

When Linux handles files and network connections, it has to open and close FDs. Every process has these default FDs:

  • 0: standard input (stdin)
  • 1: standard output (stdout)
  • 2: standard error (stderr)
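Java exposes these three descriptors as `FileDescriptor.in`, `FileDescriptor.out`, and `FileDescriptor.err`. The following sketch writes to the standard descriptors directly instead of going through `System.out`; the class and helper names are assumptions for illustration.

```java
import java.io.FileDescriptor;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class StandardFds {
    // Writes text to an already-open descriptor. The stream is deliberately
    // not closed: closing it would close the process's standard descriptor.
    static void writeTo(FileDescriptor fd, String text) {
        try {
            FileOutputStream out = new FileOutputStream(fd);
            out.write(text.getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if the JVM holds valid handles for fd 0, 1 and 2.
    static boolean allValid() {
        return FileDescriptor.in.valid()
                && FileDescriptor.out.valid()
                && FileDescriptor.err.valid();
    }

    public static void main(String[] args) {
        writeTo(FileDescriptor.out, "written via fd 1 (stdout)\n");
        writeTo(FileDescriptor.err, "written via fd 2 (stderr)\n");
        System.out.println("standard descriptors valid: " + allValid());
    }
}
```

Every socket and file a server opens gets a descriptor of the same kind, which is why the per-process FD limit matters for servers with many connections.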

1.5 How a server handles a network request

  • A connection is established.
  • Wait for the data to be ready (CPU idle).
  • Copy the data from the kernel to the process (CPU idle).

How can this be optimized? For a single I/O access (a read, for example), the data is first copied into an operating system kernel buffer and is then copied from that buffer into the application's address space. So a read operation goes through two stages:

  • Waiting for the data to be ready.
  • Copying the data from the kernel to the process.
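The two stages can be made visible with a small experiment. This sketch uses a `java.nio.channels.Pipe` instead of a network socket so that it is self-contained, and puts the read side in non-blocking mode so that stage one shows up as a read that returns 0 bytes; the class name `TwoPhaseRead` is an assumption for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

public class TwoPhaseRead {
    // Returns {bytes read before data arrived, bytes read after data arrived}.
    static int[] demo() {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                source.configureBlocking(false);
                ByteBuffer userBuffer = ByteBuffer.allocate(64);

                // Stage 1: no data is ready yet, so nothing can be copied.
                int before = source.read(userBuffer);

                // The "sender" delivers 5 bytes into the kernel's buffer.
                sink.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.UTF_8)));

                // Stage 2: the data is ready; read() copies it into userBuffer.
                int after;
                while ((after = source.read(userBuffer)) == 0) {
                    Thread.onSpinWait();
                }
                return new int[] {before, after};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println("before data arrived: " + result[0] + " bytes");
        System.out.println("after data arrived:  " + result[1] + " bytes");
    }
}
```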

Because of these two stages, successive Linux releases introduced the following three network I/O models as solutions.

I/O models

2.1 Blocking I/O

Description: The most basic network I/O model. The process blocks until the data has been copied. Disadvantages: under high concurrency the server needs one connection, and therefore one thread, per client, and all those threads cause problems:

  • CPU resources are consumed by context switching.
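  • Memory cost rises sharply with the number of threads; each JVM thread costs about 1 MB.

import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Constant.HOST and Constant.PORT come from the author's project (not shown).
public class BioServer {
    public static void main(String[] args) throws IOException {
        ServerSocket ss = new ServerSocket();
        ss.bind(new InetSocketAddress(Constant.HOST, Constant.PORT));
        int idx = 0;
        while (true) {
            final Socket socket = ss.accept(); // blocking call: waits for a client
            // One thread per connection; idx gives each thread a distinct name.
            new Thread(() -> handle(socket), "thread[" + (idx++) + "]").start();
        }
    }

    static void handle(Socket socket) {
        // try-with-resources closes the socket once the reply has been sent
        try (Socket s = socket; OutputStream out = s.getOutputStream()) {
            String serverMsg = "  server sss[ thread:" + Thread.currentThread().getName() + "]";
            out.write(serverMsg.getBytes()); // blocking call
            out.flush();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}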

2.2 Non-blocking I/O

Description: The process issues system calls repeatedly, and each call returns its result immediately. Disadvantage: when the process has 1,000 FDs, every round of polling means the user process makes 1,000 system calls into the kernel, and the back-and-forth switching between user mode and kernel mode multiplies the cost.

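import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Constant.HOST and Constant.PORT come from the author's project (not shown).
public class NioServer {
    public static void main(String[] args) throws IOException {
        ServerSocketChannel ss = ServerSocketChannel.open();
        ss.bind(new InetSocketAddress(Constant.HOST, Constant.PORT));
        System.out.println(" NIO server started ... ");
        ss.configureBlocking(false);
        int idx = 0;
        while (true) {
            // Non-blocking: returns null immediately if no connection is pending.
            final SocketChannel socket = ss.accept();
            if (socket == null) {
                continue; // nothing to accept yet, poll again
            }
            new Thread(() -> handle(socket), "thread[" + (idx++) + "]").start();
        }
    }

    static void handle(SocketChannel socket) {
        try (SocketChannel ch = socket) {
            ch.configureBlocking(false);
            ByteBuffer byteBuffer = ByteBuffer.allocate(1024);
            int len;
            // A non-blocking read returns 0 until data arrives, so keep polling.
            while ((len = ch.read(byteBuffer)) == 0) {
                Thread.onSpinWait();
            }
            if (len == -1) {
                return; // the client closed the connection
            }
            byteBuffer.flip();
            System.out.println("request: "
                    + new String(byteBuffer.array(), 0, byteBuffer.limit(), StandardCharsets.UTF_8));
            ByteBuffer resp = ByteBuffer.wrap("server response".getBytes(StandardCharsets.UTF_8));
            while (resp.hasRemaining()) {
                ch.write(resp); // a non-blocking write may be partial
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}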

2.3 I/O multiplexing

Description: A single thread can handle multiple network connections at the same time. The kernel is responsible for polling all the sockets, and when data arrives on one of them it notifies the user process. As the Linux kernel evolved, multiplexing gained three successive system calls: select, poll, and epoll. They are explained below with Java code.

2.3.1 I/O multiplexing - select

Description: Connections are checked and handled only once a request has arrived. Disadvantages:

  • Handle limit: the number of FDs that can be opened is limited, 1024 by default.
  • Repeated initialization: on every call to select(), the fd set has to be copied from user mode to kernel mode, and the kernel then traverses it.
  • Checking the state of every FD one by one is inefficient.

On the server side, select works like a power strip full of sockets: each client connection plugs into one socket and establishes a channel, and read and write events are then registered on that channel in turn. When a ready, read, or write event has been handled, remember to remove it; otherwise it will be handled again next time.

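import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;
import java.util.Set;

// Constant.HOST and Constant.PORT come from the author's project (not shown).
public class SelectorServer {
    public static void main(String[] args) throws IOException {
        ServerSocketChannel ssc = ServerSocketChannel.open(); // channel-based ServerSocket
        ssc.socket().bind(new InetSocketAddress(Constant.HOST, Constant.PORT));
        ssc.configureBlocking(false); // non-blocking mode
        System.out.println(" NIO single server started, listening on :" + ssc.getLocalAddress());
        Selector selector = Selector.open();
        ssc.register(selector, SelectionKey.OP_ACCEPT); // register interest in "connection ready" events
        while (true) {
            selector.select(); // blocks until at least one registered event is ready
            Set<SelectionKey> keys = selector.selectedKeys();
            Iterator<SelectionKey> it = keys.iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove(); // a handled event must be removed, or it shows up again next round
                handle(key);
            }
        }
    }

    private static void handle(SelectionKey key) throws IOException {
        if (key.isAcceptable()) {
            ServerSocketChannel ssc = (ServerSocketChannel) key.channel();
            SocketChannel sc = ssc.accept();
            if (sc == null) {
                return; // the pending connection went away before it was accepted
            }
            sc.configureBlocking(false); // non-blocking mode
            sc.register(key.selector(), SelectionKey.OP_READ); // register interest in "readable" events
        } else if (key.isReadable()) {
            SocketChannel sc = (SocketChannel) key.channel();
            ByteBuffer buffer = ByteBuffer.allocate(512);
            int len = sc.read(buffer);
            if (len == -1) {
                sc.close(); // peer closed the connection; closing also cancels the key
                return;
            }
            System.out.println("[" + Thread.currentThread().getName() + "] recv :"
                    + new String(buffer.array(), 0, len));
            ByteBuffer bufferToWrite = ByteBuffer.wrap("HelloClient".getBytes());
            sc.write(bufferToWrite);
        }
    }
}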

2.3.2 I/O multiplexing - poll

Description: poll introduces a new data structure (a linked list) to improve efficiency. Compared with select, poll changes little in essence, but it does not have select's limit on the maximum number of file descriptors. Disadvantage: checking the state of every FD one by one is still inefficient.
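The cost of checking every FD one by one can be shown with a toy model. The sketch below is a conceptual illustration in plain Java, not kernel code: `Fd`, `scanAll`, and `drainReady` are hypothetical names. It contrasts a select/poll-style full scan with the ready-queue idea behind the event notification approach of epoll covered in the next section.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

public class ReadinessModels {
    // Hypothetical model of a descriptor; not a kernel structure.
    static final class Fd {
        final int number;
        boolean ready;
        Fd(int number) { this.number = number; }
    }

    // select/poll style: every call inspects every registered descriptor.
    static int scanAll(List<Fd> registered, List<Fd> readyOut) {
        int inspected = 0;
        for (Fd fd : registered) {
            inspected++;
            if (fd.ready) {
                readyOut.add(fd);
            }
        }
        return inspected;
    }

    // Event-notification style: descriptors are queued when they become ready,
    // so the wait call only drains that queue.
    static int drainReady(Queue<Fd> readyQueue, List<Fd> readyOut) {
        int inspected = 0;
        Fd fd;
        while ((fd = readyQueue.poll()) != null) {
            inspected++;
            readyOut.add(fd);
        }
        return inspected;
    }

    // Registers `total` descriptors, marks the first `readyCount` as ready,
    // and returns {descriptors inspected by scanAll, inspected by drainReady}.
    static int[] compare(int total, int readyCount) {
        List<Fd> registered = new ArrayList<>();
        Queue<Fd> readyQueue = new ArrayDeque<>();
        for (int i = 0; i < total; i++) {
            Fd fd = new Fd(i);
            registered.add(fd);
            if (i < readyCount) {
                fd.ready = true;
                readyQueue.add(fd); // readiness enqueues the descriptor
            }
        }
        int scanned = scanAll(registered, new ArrayList<>());
        int drained = drainReady(readyQueue, new ArrayList<>());
        return new int[] {scanned, drained};
    }

    public static void main(String[] args) {
        int[] r = compare(1000, 2);
        System.out.println("full scan inspected " + r[0] + ", ready queue inspected " + r[1]);
    }
}
```

With many idle connections and only a few active ones, the full scan does work proportional to the number of registered descriptors, while the ready queue does work proportional to the number of active ones.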

2.3.3 I/O multiplexing - epoll

Description: There is no limit on the number of FDs, an FD needs to be copied from user mode to kernel mode only once, and readiness is delivered through an event notification mechanism. An FD is registered with epoll_ctl; once the FD is ready, a callback mechanism activates it so that the related I/O operation can proceed. Disadvantages:

  • Cross-platform support is limited; Linux supports it best.
  • The underlying implementation is complex.
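  • It is still synchronous.

import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;

// Constant.HOST and Constant.PORT come from the author's project (not shown).
public class AioServer {
    public static void main(String[] args) throws Exception {
        final AsynchronousServerSocketChannel serverChannel = AsynchronousServerSocketChannel.open()
                .bind(new InetSocketAddress(Constant.HOST, Constant.PORT));
        serverChannel.accept(null, new CompletionHandler<AsynchronousSocketChannel, Object>() {
            @Override
            public void completed(final AsynchronousSocketChannel client, Object attachment) {
                serverChannel.accept(null, this); // re-arm accept for the next connection
                ByteBuffer buffer = ByteBuffer.allocate(1024);
                client.read(buffer, buffer, new CompletionHandler<Integer, ByteBuffer>() {
                    @Override
                    public void completed(Integer result, ByteBuffer attachment) {
                        attachment.flip();
                        client.write(ByteBuffer.wrap("HelloClient".getBytes())); // business logic
                    }
                    @Override
                    public void failed(Throwable exc, ByteBuffer attachment) {
                        System.out.println(exc.getMessage()); // failure handling
                    }
                });
            }

            @Override
            public void failed(Throwable exc, Object attachment) {
                exc.printStackTrace(); // failure handling
            }
        });
        // The callbacks run on other threads. Without this, main would return at once;
        // joining the current thread blocks forever without burning CPU.
        Thread.currentThread().join();
    }
}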

Of course, compared with its advantages, the disadvantages above can be ignored. The JDK provides an asynchronous API, but on Linux its underlying implementation is still epoll with an extra layer of looping, so it is not true asynchronous non-blocking I/O. Moreover, as the code above shows, the code that handles network connections is not well decoupled from the business code. Netty provides a concise, decoupled, clearly structured API.

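import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelFutureListener;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

// Constant.HOST and Constant.PORT come from the author's project (not shown).
public class NettyServer {
    public static void main(String[] args) {
        new NettyServer().serverStart();
    }

    public void serverStart() {
        EventLoopGroup bossGroup = new NioEventLoopGroup();   // accepts connections
        EventLoopGroup workerGroup = new NioEventLoopGroup(); // handles I/O on accepted connections
        ServerBootstrap b = new ServerBootstrap();
        b.group(bossGroup, workerGroup)
                .channel(NioServerSocketChannel.class)
                .childHandler(new ChannelInitializer<SocketChannel>() {
                    @Override
                    protected void initChannel(SocketChannel ch) throws Exception {
                        ch.pipeline().addLast(new Handler());
                    }
                });
        try {
            ChannelFuture f = b.localAddress(Constant.HOST, Constant.PORT).bind().sync();
            System.out.println("Netty server started !"); // printed once the bind has succeeded
            f.channel().closeFuture().sync(); // block until the server channel is closed
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            workerGroup.shutdownGracefully();
            bossGroup.shutdownGracefully();
        }
    }
}

class Handler extends ChannelInboundHandlerAdapter {
    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
        // Echo the message back, then close the connection once the write has completed.
        ctx.writeAndFlush(msg).addListener(ChannelFutureListener.CLOSE);
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) throws Exception {
        cause.printStackTrace();
        ctx.close();
    }
}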

bossGroup is the steward (or stewards) in charge of incoming network requests; when a network connection is ready, it is handed over to workerGroup, the worker (or workers) that do the actual work.

Summary

Review

  • Synchronous/asynchronous: after the connection is established, the user program reads and writes. If the user program ultimately still has to call the system read() itself to read the data, the model is synchronous; otherwise it is asynchronous. Windows implements true asynchronous I/O; the kernel code is very complicated, but that is transparent to the user program.

  • Blocking/non-blocking: after the connection is established, can the user program do something else while it waits to read or write? If it can, the model is non-blocking; otherwise it is blocking. Most operating systems support both.

Why are Redis, Nginx, Netty, and Node.js so popular?

These technologies emerged alongside the successive Linux kernel system calls that made efficient handling of network requests possible. Only by understanding the underlying computer fundamentals can we understand I/O more deeply, knowing not just how it works but why. Let us encourage each other to keep learning!


Origin juejin.im/post/5e020b53e51d455824271f53