Java Source Code Analysis and Interview Questions: Socket Source Code and Interview Questions

This series of blog posts draws on the Mu class (imooc) column on Java source code and interview questions from interviewers at major tech companies.
The column's GitHub repositories are listed below:
Source code analysis: https://github.com/luanqiu/java8
Article demos: https://github.com/luanqiu/java8_demo
(Take a look if you need them.)

Introduction

Socket is translated into Chinese as 套接字. Many developers with four or five years of experience have never used this API, but wherever it is used, it tends to sit in the core code of an important project.

Most of us rely on open source RPC frameworks such as Dubbo, gRPC, and Spring Cloud, and rarely need to write network calls by hand. The following three articles fill in that gap, and you can use them as worked examples when you really need the API.

This article and "ServerSocket Source Code and Interview Questions" cover the source code of Socket and ServerSocket. The article "Working in Practice: Socket Combined with the Use of Thread Pool" covers how the two APIs are used in real work.

1 Socket overall structure

The structure of Socket is very simple. Socket is like a shell: it wraps operations such as socket initialization and connection creation, while the underlying work is done by SocketImpl. Socket itself contains very little logic.

Socket has only a few fields: the socket state, the SocketImpl, the read/write state, and so on.
[Image in the original: source code of the Socket fields]
Each state change corresponds to a method call. For example, creating the socket (the createImpl method) sets created = true, and connecting (connect) sets connected = true.
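
The state fields themselves are private, but they can be observed through public getters. Here is a minimal sketch, assuming a throwaway local ServerSocket as the peer; the class and method names are illustrative and do not come from the JDK source:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class SocketStateDemo {
    // Formats the public state getters as "bound/connected/closed".
    static String flags(Socket s) {
        return s.isBound() + "/" + s.isConnected() + "/" + s.isClosed();
    }

    // Walks one socket through create, connect and close, recording the flags at each step.
    static String[] lifecycle() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        String[] seen = new String[3];
        // Port 0 asks the OS for any free port; loopback keeps the demo local.
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Socket socket = new Socket();
            seen[0] = flags(socket);   // freshly created, nothing else done yet
            socket.connect(new InetSocketAddress(loopback, server.getLocalPort()));
            seen[1] = flags(socket);   // connect also binds a local address
            socket.close();
            seen[2] = flags(socket);   // closed
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        for (String step : lifecycle()) {
            System.out.println(step);
        }
    }
}
```

Note that isConnected and isBound keep returning true after close: closing a socket does not clear its connection state, so isClosed has to be checked separately.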

2 Initialization

There are many constructors of Socket, which can be divided into two categories:

  1. Create a socket with a specified proxy (Proxy). There are three proxy types: DIRECT (direct connection), HTTP (a proxy for high-level protocols such as HTTP and FTP), and SOCKS (a SOCKS proxy). The three types map to different SocketImpl classes: PlainSocketImpl, HttpConnectSocketImpl, and SocksSocketImpl, respectively. Besides the type, a Proxy also specifies the proxy's address and port;
  2. Create the default SocksSocketImpl, with the address and port passed to the constructor. The source code is as follows:
// address is the IP address; port is the socket's port
// For addresses we usually use InetSocketAddress, which can be built from ip + port, host name + port, an InetAddress, and so on
public Socket(InetAddress address, int port) throws IOException {
    this(address != null ? new InetSocketAddress(address, port) : null,
         (SocketAddress) null, true);
}

The address here can be given as an IP address or as a domain name, such as 127.0.0.1 or www.wenhe.com.
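
The proxy-based constructor family can be sketched as follows. This is only an illustration: proxy.example.com:1080 is a made-up proxy address, left unresolved so that no DNS lookup happens, and the class and method names are hypothetical. Constructing a Socket with a Proxy does not connect anywhere yet.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.Socket;

class ProxySocketDemo {
    // Builds an unconnected socket for the given proxy type.
    static Socket viaProxy(Proxy.Type type) {
        if (type == Proxy.Type.DIRECT) {
            // Proxy.NO_PROXY is the predefined DIRECT proxy.
            return new Socket(Proxy.NO_PROXY);
        }
        // Placeholder proxy address (an assumption for this sketch).
        InetSocketAddress proxyAddr =
                InetSocketAddress.createUnresolved("proxy.example.com", 1080);
        return new Socket(new Proxy(type, proxyAddr));
    }

    // Returns true if the socket was created but is not connected yet.
    static boolean isUnconnected(Proxy.Type type) {
        try (Socket socket = viaProxy(type)) {
            return !socket.isConnected();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Proxy.Type type : Proxy.Type.values()) {
            System.out.println(type + " unconnected: " + isUnconnected(type));
        }
    }
}
```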

Now let's look at the private, lower-level constructor that Socket(InetAddress, int) delegates to:

// stream == true: a stream socket using TCP; stable and reliable, but uses more resources
// stream == false: a datagram socket using UDP; unreliable, but uses fewer resources
private Socket(SocketAddress address, SocketAddress localAddr,
               boolean stream) throws IOException {
    setImpl();
 
    // backward compatibility
    if (address == null)
        throw new NullPointerException();
 
    try {
        // create the socket
        createImpl(stream);
        // if a local address was given, bind to it
        if (localAddr != null)
            // create, bind and connect are ultimately native methods
            bind(localAddr);
        connect(address);
    } catch (IOException | IllegalArgumentException | SecurityException e) {
        try {
            close();
        } catch (IOException ce) {
            e.addSuppressed(ce);
        }
        throw e;
    }
}

It can be seen from the source code:

  1. When constructing a Socket you can choose TCP or UDP, and the default is TCP (the public constructors that expose the stream flag are deprecated; DatagramSocket is the usual choice for UDP);
  2. If an address and port are passed to the constructor, the constructor itself tries to connect to that address and port;
  3. The no-argument constructor only initializes the SocksSocketImpl and does not connect anywhere. We have to call connect ourselves with the target address and port;
  4. Socket can be understood as a language-level abstraction of network communication. Creating, connecting, and closing the underlying connection still follow the rules of the TCP or UDP protocol; Socket is a layer of Java encapsulation that makes them more convenient to use.

3 Connect to the server

The connect method connects the client Socket to the server. If the underlying protocol is TCP, this means establishing the connection through the three-way handshake so that client and server can start communicating. The signature in the source code is as follows (body omitted):

public void connect(SocketAddress endpoint, int timeout) throws IOException {
}

The connect method takes two parameters. The first is a SocketAddress representing the server's address; it can be created with InetSocketAddress, for example new InetSocketAddress("www.wenhe.com", 2000).

The second is the timeout in milliseconds, the maximum time the client waits for the connection to be established. If the connection has not been established when the timeout expires, a SocketTimeoutException is thrown. A value of 0 means wait indefinitely.
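
A minimal sketch of connecting with a timeout. To stay runnable without an external host it connects to a throwaway local ServerSocket instead of www.wenhe.com; the 2000 ms timeout and the class and method names are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketAddress;
import java.net.SocketTimeoutException;

class ConnectTimeoutDemo {
    // Tries to connect within timeoutMillis; returns false if the attempt times out.
    static boolean tryConnect(SocketAddress endpoint, int timeoutMillis) throws IOException {
        try (Socket socket = new Socket()) {      // no-arg constructor: not connected yet
            socket.connect(endpoint, timeoutMillis);
            return socket.isConnected();
        } catch (SocketTimeoutException e) {
            return false;                         // the server did not answer in time
        }
    }

    // Connects to a local server so the sketch does not depend on the network.
    static boolean connectToLocalServer() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            SocketAddress endpoint = new InetSocketAddress(loopback, server.getLocalPort());
            return tryConnect(endpoint, 2000);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("connected: " + connectToLocalServer());
    }
}
```

Other connection failures, such as a refused connection, still surface as an IOException; only the timeout case is turned into a false return value here.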

4 Commonly used Socket options

The commonly used Socket options can all be found in the SocketOptions class. Let's go through them one by one. Most of the explanations below come from the class comments and from material found online.

4.1 setTcpNoDelay

This method sets the TCP_NODELAY option. According to its comment, the option applies only to TCP and controls Nagle's algorithm: true disables the algorithm, false enables it, and the default is false.

For Nagle's algorithm, we quote the explanation on Wikipedia:

Nagle's algorithm improves the performance of TCP/IP networks by reducing the number of packets sent. It is named after John Nagle, who defined it while working at Ford Aerospace.
Nagle's document describes what he called the "small-packet problem": an application repeatedly submits small units of data, often only 1 byte. Because a TCP packet carries a 40-byte header (20 bytes for TCP and 20 bytes for IPv4), this produces a 41-byte packet carrying only 1 byte of useful information, a huge waste. The situation is common in Telnet sessions, where most keystrokes generate 1 byte of data that is sent immediately. Worse, on slow links many such packets can be in transit at the same time, which can lead to congestion collapse.
Nagle's algorithm works by coalescing a number of small outgoing messages and sending them all at once. Specifically, as long as there is a sent packet that has not yet been acknowledged, the sender keeps buffering its output until it has a full packet's worth, and then sends it all at once.

To summarize the scenarios with the algorithm turned off and on:

  1. With Nagle's algorithm turned off, small packets such as a mouse move or a click are sent to the server immediately. Responsiveness is very good, but the frequent communication consumes more network resources;
  2. With Nagle's algorithm turned on, small packets are merged automatically and sent once they reach a certain size (the MSS) or once the previously sent data has been acknowledged. The advantage is fewer transmissions; the disadvantage is lower responsiveness.

When a Socket is created, Nagle's algorithm is on by default. Whether to turn it off depends on your real-time requirements.
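
A minimal sketch of turning Nagle's algorithm off for a latency-sensitive client. The class and method names are illustrative; the option is set on an unconnected socket here only to keep the example self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

class NoDelayDemo {
    // Returns {TCP_NODELAY before the change, TCP_NODELAY after setTcpNoDelay(true)}.
    static boolean[] toggle() {
        try (Socket socket = new Socket()) {
            boolean before = socket.getTcpNoDelay();
            socket.setTcpNoDelay(true);   // true = Nagle's algorithm disabled
            return new boolean[] { before, socket.getTcpNoDelay() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] values = toggle();
        System.out.println("before=" + values[0] + ", after=" + values[1]);
    }
}
```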

4.2 setSoLinger

The setSoLinger method is mainly used to set the SO_LINGER attribute value.

The comment roughly says: by default, close returns immediately. If SO_LINGER is enabled with a non-zero time, close blocks for up to that time while the data already written is transmitted to the peer and acknowledged. If the linger time expires before that finishes, the connection is closed forcibly with a TCP RST.

Let's take a look at the setSoLinger source code:

// on == false disables lingering on close; on == true enables it
// linger is the linger time, in seconds
public void setSoLinger(boolean on, int linger) throws SocketException {
    // check whether the socket has already been closed
    if (isClosed())
        throw new SocketException("Socket is closed");
    // lingering disabled
    if (!on) {
        getImpl().setOption(SocketOptions.SO_LINGER, new Boolean(on));
    // lingering enabled; if linger is 0, the socket is closed immediately
    // linger is capped at 65535 seconds, about 18 hours
    } else {
        if (linger < 0) {
            throw new IllegalArgumentException("invalid value for SO_LINGER");
        }
        if (linger > 65535)
            linger = 65535;
        getImpl().setOption(SocketOptions.SO_LINGER, new Integer(linger));
    }
}
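
A short usage sketch of the method above. The 5-second linger time is an arbitrary value chosen for illustration, and the class name is hypothetical. getSoLinger returns -1 when the option is disabled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

class LingerDemo {
    // Returns {getSoLinger() while enabled, getSoLinger() after disabling}.
    static int[] toggle() {
        try (Socket socket = new Socket()) {
            socket.setSoLinger(true, 5);      // close() may block for up to 5 seconds
            int enabled = socket.getSoLinger();
            socket.setSoLinger(false, 0);     // back to the default: close() returns at once
            int disabled = socket.getSoLinger();
            return new int[] { enabled, disabled };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] values = toggle();
        System.out.println("enabled=" + values[0] + ", disabled=" + values[1]);
    }
}
```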

4.3 setOOBInline

The setOOBInline method is mainly used to set the SO_OOBINLINE attribute.

The comment says: turn this option on if you want to receive TCP urgent data. It is off by default. Urgent data can be sent with the Socket#sendUrgentData method.

Most of the material I found recommends leaving this option off and avoiding TCP urgent data altogether.

4.4 setSoTimeout

The setSoTimeout method is mainly used to set the SO_TIMEOUT attribute.

The comment says it sets the timeout, in milliseconds, for blocking operations, which mainly include:

  1. ServerSocket.accept(): the server waits for a client connection;
  2. SocketInputStream.read(): the client or the server reads input;
  3. DatagramSocket.receive().

The option must be set before entering the blocking operation. If the operation is still blocked when the timeout expires, an InterruptedIOException is thrown (for Socket it is SocketTimeoutException, a subclass of InterruptedIOException; other socket types may throw different exceptions).

For Socket, a timeout of 0 means no timeout: the blocking call waits indefinitely.
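
A minimal sketch of a read timeout. It assumes a local server that never writes anything, so the client's read has nothing to return; the 200 ms value and the class and method names are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {
    // Returns true if read() gave up with SocketTimeoutException after timeoutMillis.
    static boolean readTimesOut(int timeoutMillis) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // The server only listens; it never accepts or writes.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {
            client.setSoTimeout(timeoutMillis);   // set before the blocking read
            try {
                client.getInputStream().read();   // blocks: no data will ever arrive
                return false;
            } catch (SocketTimeoutException e) {
                return true;                      // the socket itself is still valid
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("timed out: " + readTimesOut(200));
    }
}
```

After a timeout the socket stays open, so the caller can decide whether to retry the read or close the connection.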

4.5 setSendBufferSize

The setSendBufferSize method sets the SO_SNDBUF option. The parameter is an int giving the size, in bytes, of the send (output) buffer.

The size must be greater than 0; otherwise an IllegalArgumentException is thrown.

In general, keep the default. A buffer that is too small can force many small network exchanges; one that is too large means fewer exchanges but can reduce responsiveness. The value is only a hint to the operating system, so getSendBufferSize may report a different size from the one requested.

4.6 setReceiveBufferSize

The setReceiveBufferSize method sets the SO_RCVBUF option. The parameter is an int giving the size, in bytes, of the receive buffer.

The size must be greater than 0; otherwise an IllegalArgumentException is thrown.

In general the buffer size can be changed at any time after the socket is created, but to allow a receive window larger than 64 KB the value has to be set early:

  1. For a client Socket, set it before connecting to the server;
  2. For sockets accepted from a ServerSocket, set it on the ServerSocket before the ServerSocket is bound to a local address.
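
The ordering described above can be sketched as follows. The 128 KB request is an arbitrary example value, and the class and method names are hypothetical. Because the sizes are hints, the sketch reads them back instead of assuming the request was honored.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

class BufferSizeDemo {
    // Returns {client SO_RCVBUF, client SO_SNDBUF, server SO_RCVBUF} as reported by the OS.
    static int[] configure(int requestedBytes) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket();   // unbound on purpose
             Socket client = new Socket()) {             // unconnected on purpose
            server.setReceiveBufferSize(requestedBytes); // before bind()
            server.bind(new InetSocketAddress(loopback, 0));

            client.setReceiveBufferSize(requestedBytes); // before connect()
            client.setSendBufferSize(requestedBytes);
            client.connect(new InetSocketAddress(loopback, server.getLocalPort()));

            return new int[] {
                client.getReceiveBufferSize(),
                client.getSendBufferSize(),
                server.getReceiveBufferSize()
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = configure(128 * 1024);
        System.out.println("client rcv=" + sizes[0] + ", client snd=" + sizes[1]
                + ", server rcv=" + sizes[2]);
    }
}
```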

4.7 setKeepAlive

The setKeepAlive method sets the SO_KEEPALIVE option, which is used to detect whether the peer's socket is still alive. The default is false, so the feature is off.

If SO_KEEPALIVE is enabled and no data has been exchanged between client and server for two hours (the actual interval depends on the implementation), TCP automatically sends a keepalive probe to the peer, which must respond. Assuming the client sends the probe to the server, there are three possible outcomes:

  1. The server replies with the expected ACK: everything is fine;
  2. The server replies with RST: the server host has crashed and rebooted, and the socket is closed;
  3. The server does not respond (after several retries): the socket is closed.

4.8 setReuseAddress

The setReuseAddress method sets the SO_REUSEADDR option. The parameter is a boolean, and the default is false.

After a TCP connection is closed, it can remain in a timeout state (usually called TIME_WAIT) for a while before it is really released. If a new socket tries to bind to the same address and port during that time, the bind succeeds only if setReuseAddress(true) was called on it before binding; otherwise the bind can fail.
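
A minimal sketch of enabling the option before binding. It binds to an OS-chosen loopback port purely to show the call order; the class and method names are illustrative. The behavior of changing SO_REUSEADDR after the socket is bound is not defined, which is why the no-argument constructor is used here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

class ReuseAddressDemo {
    // Enables SO_REUSEADDR, then binds; returns true if the option is on and the bind succeeded.
    static boolean bindWithReuse() {
        try (Socket socket = new Socket()) {      // unbound and unconnected
            socket.setReuseAddress(true);         // must come before bind()
            socket.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0));
            return socket.getReuseAddress() && socket.isBound();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("bound with SO_REUSEADDR: " + bindWithReuse());
    }
}
```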

5 Summary

If you mostly write business code, you may rarely use Socket. But it is likely to come up when an interview turns to network protocols, or when you get the chance to work on middleware, so it is worth learning as background knowledge.

Origin blog.csdn.net/aha_jasper/article/details/105609521