Learning remote communication protocols (2): using the protocols to communicate in Java

The previous article introduced the basic concepts of network communication. This article uses Java to put them into practice.

Using the protocols to communicate

After a TCP connection is established, messages can be sent and received over that connection. TCP and UDP are transport protocols aimed at different application scenarios, and applications use both of them through the socket abstraction. So what is a socket? A socket is an abstraction layer through which an application sends and receives data, much as an application opens a file handle to read and write data on disk. Sockets let an application join a network and communicate with other applications on the same network. Different socket types correspond to different underlying protocol families. The two main types are the stream socket and the datagram socket.

The stream socket uses TCP as an end-to-end protocol (the bottom layer uses the IP protocol) to provide a reliable byte stream service.
The datagram socket uses the UDP protocol (the bottom layer also uses the IP protocol) to provide a "best effort" data message service.
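
To make the datagram socket concrete, here is a minimal sketch that sends one UDP datagram over the loopback interface and reads it back. The class name `DatagramDemo`, the method `roundTrip` and the two-second timeout are assumptions made for illustration; only the `java.net` classes come from the JDK.

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

class DatagramDemo {

    // Sends one datagram over loopback and returns what the receiver got.
    static String roundTrip(String message) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            // "Best effort" means a datagram may be lost, so do not wait forever.
            receiver.setSoTimeout(2000);

            byte[] payload = message.getBytes(StandardCharsets.UTF_8);
            // No connection is set up: every packet carries its own destination.
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket received = new DatagramPacket(buffer, buffer.length);
            receiver.receive(received); // blocks until a datagram arrives or the timeout fires
            return new String(received.getData(), 0, received.getLength(),
                    StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("Hello over UDP"));
    }
}
```

Unlike the stream socket examples in this article, there is no `accept` step and no byte stream: each `DatagramPacket` is an independent message.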

Next, we use the API provided by Java to walk through client and server communication over TCP and over UDP, and then look further into the underlying principles.

Communication based on TCP protocol

A simple example: the client sends one message to the server

Receiving end

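import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.ServerSocket;
import java.net.Socket;

class Test {

    public static void main(String[] args) {
        /* A TCP server first has to listen on a port. Normally it calls bind to
           give the socket an IP address and a port. Why a port? Your program is
           one application among many: when a packet arrives, the kernel uses the
           port in the TCP header to find your application and hand it the packet.
           Why an IP address? A machine may have several network cards and
           therefore several IP addresses. You can listen on all of them or on
           just one, in which case only packets sent to that card reach you. */
        // new ServerSocket(port) binds to the port on all local addresses and listens.
        // try-with-resources closes the reader, the socket and the server socket.
        try (ServerSocket serverSocket = new ServerSocket(8081);
             // Block until a client connects
             Socket socket = serverSocket.accept();
             // Once the connection is established, both sides read and write data
             // through read and write calls, much like writing to a file stream.
             BufferedReader in = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
            System.out.println(in.readLine());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}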

Sender

import java.io.IOException;
import java.io.PrintWriter;
import java.net.Socket;

public class Test02 {

    public static void main(String[] args) {
        // Connect to the receiver on the local machine.
        // The second PrintWriter argument enables autoflush, so println sends at once.
        try (Socket socket = new Socket("127.0.0.1", 8081);
             PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
            out.println("Hello, xhc");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

Let's look at another example: a two-way conversation over TCP

TCP is a full-duplex protocol: data can travel in both directions at the same time. Full duplex is in effect a combination of two simplex channels, so both endpoints must be able to send and receive independently. Here is a simple implementation.

Server side

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class Server {

    public static void main(String[] args) {
        // Create a ServerSocket that listens for client requests on port 8081
        try (ServerSocket server = new ServerSocket(8081);
             // accept() blocks until a client request arrives, then returns a Socket
             Socket socket = server.accept();
             // Wrap the socket's input stream in a BufferedReader
             BufferedReader is = new BufferedReader(new InputStreamReader(socket.getInputStream()));
             // Wrap the socket's output stream in a PrintWriter
             PrintWriter os = new PrintWriter(socket.getOutputStream())) {
            // Wrap standard input (the keyboard) in a BufferedReader
            BufferedReader sin = new BufferedReader(new InputStreamReader(System.in));
            // Print the data received from the client
            System.out.println("Client:" + is.readLine());
            // Read a line from the keyboard
            String line = sin.readLine();
            // Stop on "bye", or when standard input is closed (readLine returns null)
            while (line != null && !line.equals("bye")) {
                os.println(line); // send the input to the client
                os.flush();       // flush so the client receives the line immediately
                System.out.println("Server:" + line);
                System.out.println("Client:" + is.readLine());
                line = sin.readLine(); // read the next line from the keyboard
            }
            // The streams, the Socket and the ServerSocket are closed automatically here
        } catch (IOException e) {
            System.out.println("Error:" + e);
        }
    }
}

Client side

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.Socket;

public class Client {

    public static void main(String[] args) {
        // Connect to port 8081 on the local machine
        try (Socket socket = new Socket("127.0.0.1", 8081);
             // Wrap the socket's output stream in a PrintWriter
             PrintWriter os = new PrintWriter(socket.getOutputStream());
             // Wrap the socket's input stream in a BufferedReader
             BufferedReader is = new BufferedReader(new InputStreamReader(socket.getInputStream()))) {
            // Read keyboard input
            BufferedReader sin = new BufferedReader(new InputStreamReader(System.in));
            String readline = sin.readLine();
            // Stop on "bye", or when standard input is closed (readLine returns null)
            while (readline != null && !readline.equals("bye")) {
                os.println(readline);
                os.flush();
                System.out.println("Client:" + readline);
                System.out.println("Server:" + is.readLine());
                readline = sin.readLine();
            }
        } catch (IOException e) {
            System.out.println("ERROR" + e);
        }
    }
}

Summary
A diagram briefly describes how a socket connection is established and how communication proceeds.
[Figure: socket connection establishment and communication model]
The simple examples above should make it clear how a Java application uses sockets to communicate over TCP. Next, we look at what the underlying TCP communication process is like.

Understanding the communication process of the TCP protocol
First of all, every TCP socket has a send buffer and a receive buffer in the kernel. TCP's full-duplex operation and its sliding window both depend on these two independent buffers and on how full they are.
The receive buffer holds incoming data in the kernel. If the application process never calls the socket's read method, the data stays in the receive buffer. Whether or not the process reads from the socket, data sent by the peer is received by the kernel and cached in the socket's kernel receive buffer.
All read does is copy data from the kernel receive buffer into the application's user-space buffer. When a process calls the socket's send, it normally just copies the data from the user-space buffer into the socket's kernel send buffer, and send then returns to the caller. In other words, when send returns, the data has not necessarily reached the peer.
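
The two kernel buffers can be inspected from Java through the SO_SNDBUF and SO_RCVBUF socket options. The sketch below opens a loopback connection and prints both sizes. The class name `BufferSizes` and the method `query` are made up for this example, and the reported values depend on the operating system and its settings.

```java
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BufferSizes {

    // Returns {send buffer size, receive buffer size} of a connected loopback socket.
    static int[] query() throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            // SO_SNDBUF and SO_RCVBUF: sizes of the kernel buffers behind this socket
            return new int[] { client.getSendBufferSize(), client.getReceiveBufferSize() };
        }
    }

    public static void main(String[] args) throws Exception {
        int[] sizes = query();
        System.out.println("send buffer:    " + sizes[0] + " bytes");
        System.out.println("receive buffer: " + sizes[1] + " bytes");
    }
}
```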

As mentioned above, TCP uses the socket's receive buffer to hold data received from the network until the application process reads it. If the application never reads, the buffer eventually fills up. The receiver then advertises a closed (zero-size) window to the peer, which stops the peer from sending. This keeps the receive buffer from overflowing and preserves TCP's reliable delivery. If the peer ignores the window size and sends more data than the window allows, the receiver discards the excess.
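
The effect of a full receive buffer can be observed from the sending side. In the sketch below the receiver accepts the connection but never reads, and the sender writes in non-blocking mode until `write` keeps returning 0, meaning the kernel accepts no more data. The class name `FullBufferDemo`, the 64 KB chunk size and the one-second stall threshold are assumptions chosen for the demonstration. A blocking sender would simply hang in `write` at the same point.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;

class FullBufferDemo {

    // Writes to a peer that never reads. Returns the number of bytes the kernel
    // accepted before non-blocking writes stopped making progress.
    static long bytesUntilStall() throws Exception {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel sender = SocketChannel.open(server.getLocalAddress());
                 SocketChannel receiver = server.accept()) { // accepted, but never read from
                sender.configureBlocking(false);
                ByteBuffer chunk = ByteBuffer.allocate(64 * 1024);
                long total = 0;
                int emptyWrites = 0;
                // Give up after roughly one second of writes that accept nothing.
                while (emptyWrites < 100) {
                    chunk.clear();
                    int n = sender.write(chunk);
                    if (n > 0) {
                        total += n;       // copied into the kernel send buffer
                        emptyWrites = 0;
                    } else {
                        emptyWrites++;    // both buffers are full: nothing was accepted
                        Thread.sleep(10);
                    }
                }
                return total;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("bytes accepted before the sender stalled: " + bytesUntilStall());
    }
}
```

The total is roughly the receiver's receive buffer plus the sender's send buffer, so the exact number varies between systems.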

Sliding Window Protocol
This process involves TCP's sliding window protocol. The sliding window is a flow control technique. In early network communication, both sides sent data directly without considering congestion. Because nobody knew the state of the network and everybody sent at once, intermediate nodes became congested and dropped packets, and nobody could get data through. The sliding window mechanism was introduced to address this: both the sender and the receiver maintain a sequence of data frames, and this sequence is called a window.


Sending window
The list of sequence numbers of the frames the sender is allowed to send continuously. The maximum number of frames the sender can send continuously without waiting for an acknowledgement is called the size of the sending window.
Receiving window
The list of sequence numbers of the frames the receiver is allowed to accept. Every frame that falls inside the receiving window must be processed by the receiver, and frames that fall outside it are discarded. The number of frames the receiver is allowed to accept at a time is called the size of the receiving window.
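
The sending window can be sketched as a small simulation. The class `SendingWindow` below is a toy model written for this article, not TCP's implementation: it counts whole frames and assumes cumulative acknowledgements, meaning an acknowledgement for frame n confirms every frame up to n.

```java
import java.util.ArrayList;
import java.util.List;

class SendingWindow {
    private final int size; // maximum number of unacknowledged frames
    private int base = 0;   // oldest frame not yet acknowledged
    private int next = 0;   // next frame to send

    SendingWindow(int size) {
        this.size = size;
    }

    // Sends as many new frames as the window allows and returns their sequence numbers.
    List<Integer> sendAllowed(int totalFrames) {
        List<Integer> sent = new ArrayList<>();
        while (next < base + size && next < totalFrames) {
            sent.add(next++);
        }
        return sent;
    }

    // Cumulative acknowledgement: every frame up to and including seq has arrived,
    // so the window slides forward past it.
    void acknowledge(int seq) {
        if (seq >= base && seq < next) {
            base = seq + 1;
        }
    }

    public static void main(String[] args) {
        SendingWindow window = new SendingWindow(3);
        System.out.println(window.sendAllowed(10)); // [0, 1, 2]: the window is now full
        System.out.println(window.sendAllowed(10)); // []: must wait for an acknowledgement
        window.acknowledge(1);                      // frames 0 and 1 are confirmed
        System.out.println(window.sendAllowed(10)); // [3, 4]: the window has slid forward
    }
}
```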

https://media.pearsoncmg.com/aw/ecs_kurose_compnetwork_7/cw/content/interactiveanimations/selective-repeat-protocol/index.html

Blocking

With the basic communication principle in place, consider another question. In the code above we used serverSocket.accept() to receive a client connection. accept is a blocking method, and the server handles the accepted connection on the same thread, so this TCP server can serve only one client at a time. When a client connects to a server that is already occupied by another client, it can send data once the connection is established, but the server does not respond to the new request until it has finished with the previous one. A server of this type is called an "iterative server". It processes client requests in order: the previous request must be finished before the next one gets a response. In practice this is not acceptable, so we need a way to handle each connection independently, without the connections interfering with one another. Java's multithreading meets exactly this need and lets the server handle multiple client requests conveniently.

One client corresponds to one thread
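
A minimal sketch of this model, assuming a simple line-echo handler: the accept loop only accepts, and every connection gets its own thread. The class name `ThreadPerClientServer`, the `handle` method and the `echo: ` reply format are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

class ThreadPerClientServer {

    // Accepts connections until the server socket is closed and
    // hands each one to a new thread.
    static void serve(ServerSocket server) {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept();          // blocks only the accept loop
                new Thread(() -> handle(client)).start(); // one thread per connection
            } catch (IOException e) {
                // accept() fails once the server socket is closed: leave the loop
                if (server.isClosed()) {
                    return;
                }
                e.printStackTrace();
            }
        }
    }

    // Hypothetical handler: echoes every line back until the client disconnects.
    static void handle(Socket client) {
        try (Socket c = client;
             BufferedReader in = new BufferedReader(new InputStreamReader(c.getInputStream()));
             PrintWriter out = new PrintWriter(c.getOutputStream(), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println("echo: " + line);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException {
        serve(new ServerSocket(8081));
    }
}
```

A slow or idle client now blocks only its own thread, so other clients are answered right away.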

Creating a thread for each client has drawbacks, because creating a thread costs CPU and memory. As the number of threads grows, system resources become a bottleneck and the system eventually gets out of control. We can therefore serve multiple clients through a thread pool instead, because the size of a thread pool is controllable.
Although the model above improves how IO is handled, the number of threads that can do the work is limited, whether we use a thread pool or individual threads. For the operating system, too many threads means high CPU context-switching overhead. So this approach does not solve the fundamental problem.
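
The thread pool variant mentioned above could look like the following sketch. The class name `PooledServer`, the `greet` handler and the pool size of 4 in `main` are assumptions for illustration. `Executors.newFixedThreadPool` caps the number of worker threads, and connections beyond that number wait in the pool's queue.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class PooledServer {

    // Accepts connections until the server socket is closed and
    // hands each one to a fixed-size pool of worker threads.
    static void serve(ServerSocket server, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            while (!server.isClosed()) {
                Socket client = server.accept();
                pool.execute(() -> greet(client)); // queued if all workers are busy
            }
        } catch (IOException e) {
            if (!server.isClosed()) {
                e.printStackTrace();
            }
        } finally {
            pool.shutdown(); // let running handlers finish, then stop the workers
        }
    }

    // Hypothetical handler: tells the client which pool thread served it.
    static void greet(Socket client) {
        try (Socket c = client; OutputStream out = c.getOutputStream()) {
            String reply = "served by " + Thread.currentThread().getName() + "\n";
            out.write(reply.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) throws IOException {
        serve(new ServerSocket(8081), 4);
    }
}
```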
This is why NIO (New IO) was introduced in Java 1.4.

BIO (Blocking IO)

While client data is being copied from the network card buffer into the kernel buffer, the server stays blocked. Take the socket interface as an example: when the process calls recvfrom, it is blocked from the moment of the call until recvfrom returns, which is why this is called the blocking IO model.

NIO (New IO) non-blocking IO

If we want the server to handle more connections, how can we optimize it? The first idea is to turn the blocking call into a non-blocking one, which leads to the non-blocking IO model. Its principle is simple: the process calls recvfrom, and if there is no data in the kernel buffer at that moment, the call returns an EWOULDBLOCK error immediately. The application then polls repeatedly to check whether data has arrived in the kernel.
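
In Java the same behaviour is available through a `SocketChannel` in non-blocking mode, where `read` returns 0 instead of blocking when the kernel buffer is empty. The sketch below polls a loopback connection until data shows up. The class name `PollingRead` and the 1 ms sleep between polls are assumptions for the demonstration.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;

class PollingRead {
    final int emptyPolls; // number of polls that found no data
    final String message; // data that was finally read

    PollingRead(int emptyPolls, String message) {
        this.emptyPolls = emptyPolls;
        this.message = message;
    }

    static PollingRead run() throws Exception {
        byte[] payload = "data".getBytes(StandardCharsets.UTF_8);
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            try (SocketChannel client = SocketChannel.open(server.getLocalAddress());
                 SocketChannel accepted = server.accept()) {
                accepted.configureBlocking(false); // read() now returns at once
                ByteBuffer buffer = ByteBuffer.allocate(payload.length);
                int emptyPolls = 0;
                boolean sent = false;
                while (buffer.hasRemaining()) {
                    int n = accepted.read(buffer);
                    if (n < 0) {
                        break; // peer closed the connection
                    }
                    if (n == 0) {
                        emptyPolls++; // nothing in the kernel buffer yet
                        if (!sent) {
                            // Send only after the first empty poll has been observed
                            client.write(ByteBuffer.wrap(payload));
                            sent = true;
                        }
                        Thread.sleep(1); // poll again shortly
                    }
                }
                buffer.flip();
                return new PollingRead(emptyPolls, StandardCharsets.UTF_8.decode(buffer).toString());
            }
        }
    }

    public static void main(String[] args) throws Exception {
        PollingRead result = run();
        System.out.println("empty polls: " + result.emptyPolls + ", message: " + result.message);
    }
}
```

The empty polls are the cost of this model: the process stays busy asking even when there is nothing to read.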

I/O multiplexing model

Non-blocking IO still requires the process to poll and retry continuously. Could the program instead be notified when data becomes readable? This is what the IO multiplexing model provides. The essence of I/O multiplexing is a mechanism (the system kernel buffers I/O data) that lets a single process monitor multiple file descriptors. Once a descriptor is ready (usually for reading or writing), the program is notified and performs the corresponding read or write.

What is an fd? In Linux, the kernel treats every external device as a file. Reading or writing a file goes through system calls provided by the kernel, and opening one returns an fd (file descriptor). Reading and writing a socket likewise uses a corresponding file descriptor, called a socketfd.

Common multiplexing methods

The common IO multiplexing mechanisms are select, poll and epoll, all of which are provided by the Linux API.

select
A process can pass one or more fds to the select system call and then blocks on select, so select lets us detect whether any of several fds is ready.
Disadvantages:

Because select can monitor many file descriptors at once, say 1,000, when just one of them becomes ready the process still has to scan all fds linearly. The more fds are monitored, the higher the performance overhead.
The number of fds select can handle in a single process is also limited, 1024 by default, which is too few for a single machine that needs to support tens of thousands of TCP connections.

epoll
Linux also provides the epoll system call. epoll is event driven rather than based on sequential scanning, so it performs better. The main idea is that when one of the monitored fds becomes ready, the kernel tells the process which fd is ready, and the process then only has to read from that fd.
In addition, the upper limit on fds supported by epoll is the operating system's maximum number of file handles, which is far greater than 1024.

Because epoll tells the application process through events which fd is readable, this kind of IO is also called asynchronous non-blocking IO. It is of course pseudo-asynchronous, because the data still has to be copied synchronously from the kernel into user space. With truly asynchronous non-blocking IO the data would already be fully in place, and the process would only need to read it from user space.
poll: not covered here.

Benefits of multiplexing
I/O multiplexing folds the blocking of many I/O operations into a single select call, so the system can handle multiple client requests at the same time in a single thread. Its biggest advantage is low system overhead: no new processes or threads need to be created, which reduces resource consumption.
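
In Java, multiplexing is exposed through `java.nio.channels.Selector`, which on Linux is typically backed by epoll. The sketch below is a single-threaded echo server: one `select` call waits on the listening channel and on every client channel at once. The class name `SelectorEchoServer`, the 1024-byte buffer and the echo behaviour are assumptions for illustration, and error handling is kept to a minimum.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

class SelectorEchoServer {

    // Runs the event loop on the calling thread; all clients share this one thread.
    static void serve(ServerSocketChannel server, Selector selector) throws IOException {
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
        ByteBuffer buffer = ByteBuffer.allocate(1024);

        while (selector.isOpen()) {
            selector.select(); // the only blocking call: waits until some channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (!key.isValid()) {
                    continue;
                }
                if (key.isAcceptable()) {
                    SocketChannel client = server.accept();
                    if (client != null) {
                        client.configureBlocking(false);
                        client.register(selector, SelectionKey.OP_READ);
                    }
                } else if (key.isReadable()) {
                    SocketChannel client = (SocketChannel) key.channel();
                    buffer.clear();
                    int n = client.read(buffer);
                    if (n < 0) {
                        client.close(); // peer disconnected; closing cancels the key
                        continue;
                    }
                    buffer.flip();
                    // Simplification: retries until the echo is written out
                    while (buffer.hasRemaining()) {
                        client.write(buffer);
                    }
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocketChannel server = ServerSocketChannel.open();
             Selector selector = Selector.open()) {
            server.bind(new InetSocketAddress(8081));
            serve(server, selector);
        }
    }
}
```

Compared with the thread-based servers, the number of threads stays at one no matter how many clients connect. The trade-off is that a slow handler delays every other client, so the work done per event has to stay short.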