[Java network programming] IO models

1. Synchronous blocking IO (BIO)

BIO stands for blocking I/O; the model is synchronous and blocking.
For each client connection request, the server starts a thread to read and write data on that connection. If the client sends nothing, the thread simply waits, blocked.

1. TCP code

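Server:

        // Imports: java.net.InetSocketAddress, java.net.ServerSocket, java.net.Socket, java.nio.charset.StandardCharsets
        // 1. Create the server-side socket
        ServerSocket serverSocket = new ServerSocket();
        // 2. Bind to port 9000
        serverSocket.bind(new InetSocketAddress(9000));
        // 3. Listen on the port (blocks while no client is connecting)
        Socket socket = serverSocket.accept();

        // 4. Read the message sent by the client (blocks while there is no message)
        byte[] bytes = new byte[1024];
        int len = socket.getInputStream().read(bytes);
        // Decode only the bytes actually read; len is -1 if the client closed without sending
        if (len > 0) {
            System.out.println("Received from client: " + new String(bytes, 0, len, StandardCharsets.UTF_8));
        }

        // 5. Reply to the client
        socket.getOutputStream().write("I am the server".getBytes(StandardCharsets.UTF_8));

        // 6. Close the sockets
        socket.close();
        serverSocket.close();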

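Client:

        // Imports: java.net.InetSocketAddress, java.net.Socket, java.nio.charset.StandardCharsets
        // 1. Create the socket
        Socket socket = new Socket();
        // 2. Connect to the server socket (name the host explicitly; a port-only address is the wildcard address)
        socket.connect(new InetSocketAddress("localhost", 9000));
        // 3. Send a message
        socket.getOutputStream().write("I am the client".getBytes(StandardCharsets.UTF_8));

        // 4. Receive the reply (decode only the bytes actually read)
        byte[] bytes = new byte[1024];
        int len = socket.getInputStream().read(bytes);
        if (len > 0) {
            System.out.println("Received from server: " + new String(bytes, 0, len, StandardCharsets.UTF_8));
        }

        // 5. Close the socket
        socket.close();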

2. UDP code

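Server:

        // Imports: java.net.DatagramPacket, java.net.DatagramSocket, java.nio.charset.StandardCharsets
        // 1. Create the socket, specifying the port to listen on
        DatagramSocket socket = new DatagramSocket(9999);

        byte[] buf = new byte[1024];
        // 2. Create a packet backed by buf to receive data; at most buf.length bytes are received
        DatagramPacket packet = new DatagramPacket(buf, 0, buf.length);

        // 3. Receive data into the packet's buf array (receive() blocks the thread until a client sends data)
        socket.receive(packet);
        System.out.println("Server received: " + new String(buf, 0, packet.getLength(), StandardCharsets.UTF_8));

        // 4. Close the socket
        socket.close();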

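Client:

        // Imports: java.net.DatagramPacket, java.net.DatagramSocket, java.net.InetAddress, java.nio.charset.StandardCharsets
        // 1. Create the socket that sends the data
        DatagramSocket socket = new DatagramSocket();

        byte[] buf = "hello world".getBytes(StandardCharsets.UTF_8);
        // 2. Build the datagram, specifying the destination IP and port
        DatagramPacket packet = new DatagramPacket(buf, 0, buf.length, InetAddress.getByName("localhost"), 9999);

        // 3. Send the data to the server
        socket.send(packet);

        // 4. Close the socket
        socket.close();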

Shortcomings:

  1. Every client connection needs its own thread to read and write data and to run the business logic for that client.
  2. Under high concurrency, a large number of threads are created to handle connections, which puts heavy overhead on system resources.
  3. Once a connection is established, if there is no data to read, the serving thread stays blocked in the read operation until data arrives, which wastes thread resources.
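The thread-per-connection pattern behind these shortcomings can be sketched as follows. This is a minimal illustration rather than the article's own code: the class name BioEchoServer, the echo behaviour, and the use of an ephemeral port (port 0) are assumptions made for the example, and readNBytes(int) requires Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: one thread per accepted connection (classic BIO).
class BioEchoServer {

    // Accept loop: blocks in accept(), then hands each connection to its own thread.
    static void serve(ServerSocket serverSocket) {
        while (!serverSocket.isClosed()) {
            try {
                Socket socket = serverSocket.accept();    // blocks until a client connects
                new Thread(() -> handle(socket)).start(); // one new thread per client
            } catch (IOException e) {
                return; // the server socket was closed, stop accepting
            }
        }
    }

    // Echoes bytes back until the client closes; the thread sits in read() while idle.
    static void handle(Socket socket) {
        try (Socket s = socket) {
            InputStream in = s.getInputStream();
            OutputStream out = s.getOutputStream();
            byte[] buf = new byte[1024];
            int len;
            while ((len = in.read(buf)) != -1) { // blocks while no data is available
                out.write(buf, 0, len);
                out.flush();
            }
        } catch (IOException ignored) {
            // connection reset by the peer; nothing more to do in this sketch
        }
    }

    // Client side: sends one message and returns the echoed reply.
    static String roundTrip(int port, String message) throws IOException {
        try (Socket socket = new Socket(InetAddress.getLoopbackAddress(), port)) {
            byte[] data = message.getBytes(StandardCharsets.UTF_8);
            socket.getOutputStream().write(data);
            socket.getOutputStream().flush();
            byte[] reply = socket.getInputStream().readNBytes(data.length);
            return new String(reply, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket serverSocket = new ServerSocket(0)) { // port 0 = any free port
            Thread acceptor = new Thread(() -> serve(serverSocket));
            acceptor.setDaemon(true);
            acceptor.start();
            System.out.println(roundTrip(serverSocket.getLocalPort(), "hello"));
        }
    }
}
```

Each connected client pins one thread inside handle() for as long as the connection stays open, whether or not it sends anything, which is exactly the cost described in points 1 to 3.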

2. Synchronous non-blocking IO (NIO)

When a user process issues a read and the data in the kernel is not ready, the call does not block the process; it returns an error immediately. From the user's point of view, a read never has to wait: it gets a result right away. When the process sees the error, it knows the data is not ready yet and issues the read again. Once the data in the kernel is ready and the process calls read again, the kernel copies the data into user memory and then returns.

Sockets are blocking by default; non-blocking IO requires setting the socket to NONBLOCK. Note that the NIO (non-blocking IO) model described here is not the same thing as Java's NIO (New IO) library.
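The "return immediately, then retry" behaviour can be observed from Java with a small sketch. Note one difference from the description above: Java does not surface the kernel's "not ready" error; a non-blocking SocketChannel.read() simply returns 0 when nothing is available. The class name NonBlockingReadDemo and the 1 ms back-off between retries are assumptions made for this example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that a non-blocking read returns at once.
class NonBlockingReadDemo {

    // Returns {result of a read with no data pending, bytes read after retrying}.
    static int[] run() throws IOException, InterruptedException {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            InetSocketAddress address = (InetSocketAddress) server.getLocalAddress();

            try (SocketChannel client = SocketChannel.open(address);
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false); // switch the accepted socket to non-blocking
                ByteBuffer buffer = ByteBuffer.allocate(64);

                // Nothing has been sent yet, so this returns 0 instead of blocking.
                int first = accepted.read(buffer);

                client.write(ByteBuffer.wrap("hi".getBytes(StandardCharsets.UTF_8)));

                // Keep re-issuing the read until both bytes have arrived.
                int total = 0;
                while (total < 2) {
                    int n = accepted.read(buffer);
                    if (n == -1) {
                        break;           // peer closed the connection
                    }
                    if (n == 0) {
                        Thread.sleep(1); // not ready yet, try again shortly
                        continue;
                    }
                    total += n;
                }
                return new int[] {first, total};
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] result = run();
        System.out.println("first read = " + result[0] + ", bytes after retrying = " + result[1]);
    }
}
```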

The core of the Java NIO library is the channel (Channel) and the buffer (Buffer). A channel represents an open connection to an IO device (e.g. a file or a socket). To use NIO, you obtain a channel connected to the IO device and a buffer to hold the data, then operate on the buffer to process the data. In short, the Channel is responsible for transmission and the Buffer is responsible for storage.
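A minimal sketch of this division of labour, using a FileChannel on a temporary file so that it runs without a network peer. The class name ChannelBufferDemo, the temp-file prefix, and the 1024-byte read buffer are assumptions made for this example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class: the channel moves bytes, the buffer holds them.
class ChannelBufferDemo {

    // Writes text through a channel, reads it back through the same channel.
    // Assumes the text fits into the 1024-byte read buffer.
    static String writeThenRead(String text) throws IOException {
        Path file = Files.createTempFile("channel-buffer-demo", ".txt");
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {

            // Buffer -> channel: the buffer stores the bytes, the channel transmits them.
            ByteBuffer out = ByteBuffer.wrap(text.getBytes(StandardCharsets.UTF_8));
            while (out.hasRemaining()) {
                channel.write(out);
            }

            // Channel -> buffer: rewind the file and read the bytes back.
            channel.position(0);
            ByteBuffer in = ByteBuffer.allocate(1024);
            while (channel.read(in) > 0) {
                // keep reading until end of file (-1) or the buffer is full (0)
            }
            in.flip(); // switch the buffer from being filled to being drained
            return StandardCharsets.UTF_8.decode(in).toString();
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(writeThenRead("hello NIO"));
    }
}
```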

1. Channel

The Channel interface lives in the java.nio.channels package; its main implementation classes are:

  • FileChannel
  • SocketChannel
  • ServerSocketChannel
  • DatagramChannel

2. Buffer

(1) Buffer type

ByteBuffer is the most commonly used type.
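A ByteBuffer tracks a position and a limit, and flip() switches it from being filled to being drained. The sketch below illustrates that cycle; the class name ByteBufferFlipDemo and the 64-byte capacity are assumptions made for this example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: the put / flip / get cycle of a ByteBuffer.
class ByteBufferFlipDemo {

    // Writes the text into a buffer, flips it, and reads the same bytes back.
    static String roundTrip(String text) {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.put(text.getBytes(StandardCharsets.UTF_8)); // filling: position advances
        buffer.flip();                                      // limit = position, position = 0
        byte[] out = new byte[buffer.remaining()];          // only the bytes actually written
        buffer.get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    // Returns {position after put, position after flip, limit after flip}.
    static int[] positions() {
        ByteBuffer buffer = ByteBuffer.allocate(64);
        buffer.put(new byte[] {1, 2, 3});
        int positionAfterPut = buffer.position();
        buffer.flip();
        return new int[] {positionAfterPut, buffer.position(), buffer.limit()};
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("abc"));
        int[] p = positions();
        System.out.println(p[0] + " " + p[1] + " " + p[2]);
    }
}
```

Reading only remaining() bytes after flip() matters: decoding the whole backing array instead would also pick up the unused zero bytes at the end of the buffer.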

(2) Buffer allocation location

Heap

A buffer allocated with the allocate() method lives in JVM heap memory.

Off-heap

A buffer allocated with the allocateDirect() method lives in native memory outside the JVM heap.
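The two allocation methods can be told apart at runtime, as in this small sketch. The class name BufferAllocationDemo and the 1024-byte size are assumptions made for this example.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class: heap buffer versus direct (off-heap) buffer.
class BufferAllocationDemo {

    // Returns {heap.isDirect(), heap.hasArray(), direct.isDirect()}.
    static boolean[] describe() {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a byte[] on the JVM heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // backed by native memory
        return new boolean[] {heap.isDirect(), heap.hasArray(), direct.isDirect()};
    }

    public static void main(String[] args) {
        boolean[] d = describe();
        System.out.println("heap.isDirect()   = " + d[0]);
        System.out.println("heap.hasArray()   = " + d[1]);
        System.out.println("direct.isDirect() = " + d[2]);
    }
}
```

Because a heap buffer has a backing array, array() works on it; code that calls array() should check hasArray() first when the buffer may be direct.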

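Code

        // Imports: java.net.InetSocketAddress, java.nio.ByteBuffer, java.nio.channels.*, java.nio.charset.StandardCharsets, java.util.*
        // List that holds the SocketChannels of established connections
        List<SocketChannel> channelList = new LinkedList<>();

        // Open a server socket channel
        ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
        serverSocketChannel.bind(new InetSocketAddress(9000));
        // Set it to non-blocking
        serverSocketChannel.configureBlocking(false);

        while (true) {
            // Accept a connection (returns null at once if none is pending)
            SocketChannel socketChannel = serverSocketChannel.accept();
            if (socketChannel != null) {
                System.out.println("Connection established");
                socketChannel.configureBlocking(false);
                // Add it to channelList
                channelList.add(socketChannel);
            }

            // Walk the list and try to read from every channel
            Iterator<SocketChannel> iterator = channelList.iterator();
            while (iterator.hasNext()) {
                SocketChannel channel = iterator.next();

                // In non-blocking mode read() does not block: 0 means no data yet, -1 means the peer closed
                ByteBuffer byteBuffer = ByteBuffer.allocate(1024);
                int len = channel.read(byteBuffer);
                if (len > 0) {
                    System.out.println("Received message: " + new String(byteBuffer.array(), 0, len, StandardCharsets.UTF_8));
                } else if (len == -1) {
                    iterator.remove();
                    channel.close();
                    System.out.println("Client disconnected");
                }
            }

        }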

Shortcoming:

Polling continuously consumes a lot of CPU, even when no connection has any data.

3. IO multiplexing

Also loosely known as asynchronous blocking IO; Selector in Java and epoll in Linux both follow this model. "Multiplexing" here means reusing threads: one thread, or a small group of threads, handles the IO of many connections, which reduces system overhead and avoids creating and maintaining too many processes/threads.

Code

        // Imports: java.net.InetSocketAddress, java.nio.ByteBuffer, java.nio.channels.*, java.nio.charset.StandardCharsets, java.util.*
        // Open a server socket channel
        ServerSocketChannel serverSocketChannel = ServerSocketChannel.open();
        serverSocketChannel.bind(new InetSocketAddress(9000));
        // Set it to non-blocking
        serverSocketChannel.configureBlocking(false);

        // Create the epoll instance (open a selector to handle the channels)
        Selector selector = Selector.open();

        // Register the ServerSocketChannel with the selector
        serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            // Block until an event that needs handling occurs
            selector.select();

            // Get the SelectionKeys of the events that are ready
            Set<SelectionKey> selectionKeys = selector.selectedKeys();
            Iterator<SelectionKey> iterator = selectionKeys.iterator();
            while (iterator.hasNext()) {
                SelectionKey selectionKey = iterator.next();
                // Remove every handled key; otherwise it stays in the selected set for the next round
                iterator.remove();

                // Accept (new connection) event
                if (selectionKey.isAcceptable()) {
                    ServerSocketChannel serverSocketChannel1 = (ServerSocketChannel) selectionKey.channel();
                    SocketChannel socketChannel = serverSocketChannel1.accept();
                    socketChannel.configureBlocking(false);
                    // Only read events are registered here; register write events too if you need to send data to the client
                    socketChannel.register(selector, SelectionKey.OP_READ);
                    System.out.println("Client connected");

                } else if (selectionKey.isReadable()) { // Read event: read the data
                    SocketChannel socketChannel1 = (SocketChannel) selectionKey.channel();
                    ByteBuffer byteBuffer = ByteBuffer.allocate(1024);
                    int len = socketChannel1.read(byteBuffer);
                    if (len > 0) {
                        System.out.println("Received message: " + new String(byteBuffer.array(), 0, len, StandardCharsets.UTF_8));
                    } else if (len == -1) {
                        // Peer closed the connection; closing the channel also cancels its key
                        socketChannel1.close();
                        System.out.println("Client disconnected");
                    }
                }

            }
        }

Shortcoming:

Events are handled on the same thread that calls selector.select(), so while a large batch of events is being processed the loop cannot get back to selector.select(). For example, with 100,000 connections of which 90,000 have read/write events, time-consuming business processing holds up the loop, and new connections cannot be established in the meantime.

Underlying principle

IO multiplexing uses two system calls (select/poll/epoll, then recvfrom), whereas blocking IO calls only recvfrom. The strength of select/poll/epoll is that a single call can watch many connections at once, not that it is faster; so when the number of connections is not high, performance is not necessarily better than multithreading + blocking IO. In the multiplexing model each socket is set to non-blocking, and the thread blocks in the select call rather than on an individual socket.

(1) select mechanism

select watches three sets of file descriptors (fd for short): readfds (read), writefds (write), and exceptfds (exceptional conditions). The call blocks while monitoring these three sets and returns when a descriptor becomes readable or writable, an exceptional condition occurs, or the timeout expires. After it returns, the program finds the ready descriptors by traversing the entire fd_set and then performs the corresponding IO operations.

Advantages:
  Supported on almost all platforms, so it is highly portable.
Disadvantages:

  1. Ready descriptors are found by linearly scanning the whole fd set, so performance drops as the number of file descriptors grows.
  2. Every call to select() copies the fd sets from user space into the kernel, where they are traversed, and the results are then copied back to user space.
  3. By default a single process can monitor at most 1024 descriptors this way (FD_SETSIZE). The macro definition can be changed, but the scan remains slow.

(2) poll mechanism

The basic principle is the same as select, except that there is no fixed limit on the number of file descriptors, because they are passed as a variable-length array of pollfd structures instead of a fixed-size bitmap.

(3) epoll mechanism

epoll owes its high performance to three functions:

  1. epoll_create() creates an epoll instance inside the Linux kernel, which keeps the registered descriptors in a red-black tree, and returns a handle to it; that handle is itself an fd.
  2. epoll_ctl() is called whenever a connection is added, changed, or closed; it adds, modifies, or deletes the corresponding connection fd in the epoll instance and binds a callback function to it.
  3. epoll_wait() blocks until the callbacks have reported ready descriptors and returns only those; the application then performs the corresponding IO operations.

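Advantages:

  1. No built-in fd limit: the ceiling is the operating system's maximum number of open file handles, roughly 100,000 on a machine with 1 GB of memory.
  2. Better efficiency: readiness is reported through callbacks instead of polling, so performance does not degrade as the number of FDs grows.
  3. Less copying: descriptors are registered once through epoll_ctl() and stay in the kernel, so only the ready events are copied to user space on each wait. (The common claim that the kernel and user space share this data through mmap does not match the Linux implementation.)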
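The difference between scanning every descriptor (select/poll) and draining a ready list filled by callbacks (epoll) can be shown with a toy model in plain Java. It imitates only the bookkeeping and makes no system calls; the class ReadinessModel, its checks counter, and the fd numbers are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Queue;

// Toy model only: counts how many descriptors each style has to look at.
class ReadinessModel {
    int checks = 0; // number of descriptors inspected so far

    // select-style: every watched descriptor is inspected on every call.
    List<Integer> selectStyle(BitSet ready, int watched) {
        List<Integer> result = new ArrayList<>();
        for (int fd = 0; fd < watched; fd++) {
            checks++;
            if (ready.get(fd)) {
                result.add(fd);
            }
        }
        return result;
    }

    // epoll-style: a callback has already queued the ready descriptors,
    // so the wait only drains that queue.
    List<Integer> epollStyle(Queue<Integer> readyList) {
        List<Integer> result = new ArrayList<>();
        Integer fd;
        while ((fd = readyList.poll()) != null) {
            checks++;
            result.add(fd);
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet ready = new BitSet(1024);
        ready.set(7);
        ready.set(900);

        ReadinessModel select = new ReadinessModel();
        select.selectStyle(ready, 1024);

        ReadinessModel epoll = new ReadinessModel();
        epoll.epollStyle(new ArrayDeque<>(List.of(7, 900)));

        System.out.println("select-style checks: " + select.checks);
        System.out.println("epoll-style checks:  " + epoll.checks);
    }
}
```

With 1024 watched descriptors and two of them ready, the scan inspects all 1024 entries while the ready-list drain touches only the two ready ones, which is why the epoll approach does not slow down as the number of idle connections grows.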


4. Asynchronous IO (AIO)

Asynchronous IO is built on events and callbacks: the application's IO call returns immediately instead of blocking, and when the operation completes in the background, the operating system notifies the corresponding thread to carry on with the follow-up work.
AIO is not widely used at present. Netty experimented with AIO but saw little performance gain and dropped it.

Java 7 added IO based on asynchronous channels: several channel interfaces and classes whose names begin with Asynchronous were added to the java.nio.channels package for AIO communication. Java 7 calls this NIO.2.
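A minimal sketch of the callback style with the NIO.2 classes AsynchronousServerSocketChannel, AsynchronousSocketChannel, and CompletionHandler. The class name AioDemo, the loopback address with an ephemeral port, the 5-second timeouts, and the assumption that a short message arrives in a single read are all choices made for this example.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class: accept and read complete through callbacks.
class AioDemo {

    // Starts a server, sends it one short message, returns what the server received.
    static String receiveOneMessage(String message) throws Exception {
        CompletableFuture<String> received = new CompletableFuture<>();
        try (AsynchronousServerSocketChannel server = AsynchronousServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));

            // accept() returns immediately; the handler runs later on a pool thread.
            server.accept(null, new CompletionHandler<AsynchronousSocketChannel, Void>() {
                @Override
                public void completed(AsynchronousSocketChannel channel, Void attachment) {
                    ByteBuffer buffer = ByteBuffer.allocate(1024);
                    // read() also returns immediately; this handler fires once data has arrived.
                    channel.read(buffer, null, new CompletionHandler<Integer, Void>() {
                        @Override
                        public void completed(Integer len, Void ignored) {
                            buffer.flip();
                            received.complete(StandardCharsets.UTF_8.decode(buffer).toString());
                            close(channel);
                        }

                        @Override
                        public void failed(Throwable exc, Void ignored) {
                            received.completeExceptionally(exc);
                            close(channel);
                        }
                    });
                }

                @Override
                public void failed(Throwable exc, Void attachment) {
                    received.completeExceptionally(exc);
                }
            });

            // Client side: the Future-returning variants are used here for brevity.
            try (AsynchronousSocketChannel client = AsynchronousSocketChannel.open()) {
                client.connect(server.getLocalAddress()).get(5, TimeUnit.SECONDS);
                client.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)))
                      .get(5, TimeUnit.SECONDS);
                return received.get(5, TimeUnit.SECONDS);
            }
        }
    }

    private static void close(AsynchronousSocketChannel channel) {
        try {
            channel.close();
        } catch (IOException ignored) {
            // nothing more to do in this sketch
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(receiveOneMessage("hello"));
    }
}
```

Unlike the Selector loop, no application thread waits for readiness here: the calls are issued and the handlers are invoked when the operations have already completed.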

Origin blog.csdn.net/sumengnan/article/details/125077198