Hundred Battles C++ (Network)

Miscellaneous notes

  1. The lower layer provides services to the upper layer, and the lower-layer protocol is transparent to the upper-layer entity.
  2. MAC addressing and bridges work at the data link layer (a switch can be seen as a multi-port bridge). Routers work at the network layer.
  3. The network layer provides only a simple, flexible, connectionless, best-effort datagram service to the layer above.
  4. ARP: resolves an IP address to a MAC address by broadcasting an ARP request on the LAN; each result is cached with a lifetime.

 

The MTU (maximum transmission unit) of Ethernet is 1500 bytes (IP header + data). In the IP header, MF=1 means more fragments follow and MF=0 means this is the last fragment. DF=0 means the datagram may be fragmented, and DF=1 means it must not be fragmented.

The fragment offset is the position of the fragment's data relative to the start of the original datagram's data, divided by 8, because the offset is expressed in units of 8 bytes.
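The offset and MF rules above can be sketched in a few lines of C++. This is a minimal illustration, not a real IP stack: the `Fragment` struct and `fragment_datagram` function are hypothetical names, and a fixed 20-byte IP header without options is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One IP fragment (illustrative fields, not a real header layout).
struct Fragment {
    std::uint32_t offset_units;  // fragment offset field, in 8-byte units
    std::uint32_t data_len;      // payload bytes carried by this fragment
    bool mf;                     // MF flag: true if more fragments follow
};

// Split payload_len bytes of IP payload for a link with the given MTU.
// Assumption: a fixed 20-byte IP header with no options.
std::vector<Fragment> fragment_datagram(std::uint32_t payload_len,
                                        std::uint32_t mtu = 1500,
                                        std::uint32_t header_len = 20) {
    assert(mtu >= header_len + 8);
    // Every fragment except the last must carry a multiple of 8 bytes.
    const std::uint32_t max_data = (mtu - header_len) / 8 * 8;
    std::vector<Fragment> out;
    std::uint32_t sent = 0;
    do {
        const std::uint32_t len = std::min(max_data, payload_len - sent);
        const bool more = sent + len < payload_len;
        out.push_back(Fragment{sent / 8, len, more});
        sent += len;
    } while (sent < payload_len);
    return out;
}
```

For a 3800-byte payload and an MTU of 1500, this yields three fragments: offset 0 with 1480 bytes (MF=1), offset 185 with 1480 bytes (MF=1), and offset 370 with 840 bytes (MF=0).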

  1. Subnet mask: splits a network into subnets so that fewer IP addresses are wasted (a network rarely needs as many hosts as a full classful block provides).
  2. ICMP carries error reports and queries and works at the IP layer. IGMP is the protocol used to establish and maintain multicast group membership between an IP host and its directly adjacent multicast routers.
  3. CIDR address aggregation: routes are selected by the longest-prefix-match rule.

Subnetted IP address ::= {<network number>, <subnet number>, <host number>}

Classful IP address ::= {<network number>, <host number>}

CIDR IP address ::= {<network prefix>, <host number>}
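To make the mask and the longest-prefix-match rule concrete, here is a small C++ sketch. The `Route` struct, the helper names, and the next-hop labels are hypothetical; a real router would use a trie rather than a linear scan.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Pack a dotted quad a.b.c.d into a 32-bit value.
constexpr std::uint32_t ipv4(std::uint32_t a, std::uint32_t b,
                             std::uint32_t c, std::uint32_t d) {
    return (a << 24) | (b << 16) | (c << 8) | d;
}

// Mask with the top `length` bits set, e.g. 24 -> 255.255.255.0.
std::uint32_t prefix_mask(int length) {
    assert(length >= 0 && length <= 32);
    return length == 0 ? 0u : ~std::uint32_t{0} << (32 - length);
}

struct Route {
    std::uint32_t prefix;  // network prefix (host bits are zero)
    int length;            // prefix length in bits
    std::string next_hop;  // hypothetical label for the outgoing route
};

// Return the matching route with the longest prefix, or nullptr if none matches.
const Route* longest_prefix_match(const std::vector<Route>& table, std::uint32_t dst) {
    const Route* best = nullptr;
    for (const Route& r : table) {
        const bool matches = (dst & prefix_mask(r.length)) == r.prefix;
        if (matches && (best == nullptr || r.length > best->length)) best = &r;
    }
    return best;
}
```

With routes 192.168.0.0/16, 192.168.1.0/24 and 0.0.0.0/0 in the table, the address 192.168.1.7 matches all three, and the /24 entry wins because it has the longest prefix.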

OSI and the TCP/IP five-layer model: which hardware works at each layer

 

Physical layer: repeaters, hubs.

Data link layer: bridges, switches.

Network layer: routers.

Above the network layer: gateways.

An upper layer calls the services of the layer below it, and the lower layer's implementation is transparent to the upper layer.

What happens after you enter an address in the browser and press Enter?

After the URL is entered, the browser first has to resolve the host name in the URL to an IP address. It checks whether the local hosts file contains the mapping; if not, DNS is used. The host first consults its DNS cache and, on a miss, sends a query to the local DNS server.

A DNS query can be recursive or iterative. In an iterative query, the local DNS server sends a query to a root name server, which replies with the address of the top-level domain name server for the domain; the local server then queries that top-level server, and so on, until it obtains the IP address of the domain name. DNS queries normally run over UDP, so UDP is used at this step.

In a recursive query, if the local name server that the host asked does not know the IP address of the queried name, the local name server itself acts as a DNS client and keeps querying other name servers (starting from a root name server) on the host's behalf, instead of letting the host perform the next query itself.

After obtaining the IP address, the browser establishes a TCP connection to the server and speaks HTTP over it. HTTP is a stateless application-layer protocol that runs on top of TCP (the format of the HTTP message has been mentioned above). The browser generates a GET request message and passes it to the TCP layer. If HTTPS is used, the HTTP data is encrypted first. If necessary, TCP splits the HTTP message into segments, with the segment size determined by the MSS and the path MTU. The TCP segments are handed to the IP layer, which selects a route and forwards the packets hop by hop toward the destination address. Within a single network segment, delivery is handled by Ethernet (or another data link layer protocol such as PPP or SLIP). Ethernet needs the MAC address that corresponds to the next-hop IP address, which is what ARP provides.

The server processes the request and returns an HTML response. The browser receives the page source and renders the HTML, requests the objects embedded in the HTML, sends any asynchronous requests, and finally closes the TCP connection.

What is the difference between TCP and UDP?

TCP is connection-oriented (three-way handshake), and a connection needs to be established before communication; UDP is connectionless, and no connection is required before communication.

TCP achieves reliable transmission through acknowledgments, sequence numbers, retransmission, flow control, and congestion control; UDP does not guarantee reliable transmission, and delivery is best-effort.

TCP is byte-stream oriented, so the data can be split into segments and reassembled at the receiving end; UDP is datagram oriented and preserves message boundaries.

TCP is point-to-point only; UDP also supports one-to-many communication, and a single UDP socket can send and receive datagrams to and from many peers.

The TCP header is at least 20 bytes; the UDP header is 8 bytes.

TCP is a comparatively slow, heavyweight protocol; UDP is a fast, lightweight one.

Three-way handshake

SYN

Synchronize sequence numbers; indicates that this segment is a connection request or a connection acceptance

ACK

Acknowledgment bit; acknowledges a received segment

FIN

Finish bit; indicates that the sender has finished sending data, used to release a connection

RST

Reset the connection; indicates that a serious error occurred on the TCP connection

PSH

Push bit; asks the receiver to deliver the data to the receiving process as soon as possible

Sequence number (seq): 4 bytes, used to mark the order of the data. TCP numbers every data byte sent on a connection; the number of the first byte is generated randomly by the local host. The seq field of a segment is the number of the first data byte in that segment.

Acknowledgment number (ack): 4 bytes; the sequence number of the first data byte expected in the next segment from the peer. The sequence number says which byte a segment starts with, and the acknowledgment number says which byte is expected next, so the acknowledgment number is the number of the last byte received in order plus 1.

Acknowledgment flag ACK: 1 bit. The acknowledgment number field is valid only when ACK=1; when ACK=0 it is ignored.

Synchronize flag SYN: used to synchronize sequence numbers when a connection is established. SYN=1 with ACK=0 marks a connection request segment. If the peer accepts the connection, its response carries SYN=1 and ACK=1. SYN=1 therefore means a connection request or a connection acceptance. SYN is set only during connection establishment and is 0 once the handshake is complete.

Finish flag FIN: used to release a connection. FIN=1 means the sender of this segment has finished sending data and asks to release the transport connection.

Note: uppercase words such as ACK, SYN, and FIN are flag bits whose value is either 1 or 0; lowercase words such as ack and seq are sequence and acknowledgment numbers.

 

First handshake: to open the connection, the client sends a SYN segment (SYN=1, seq=j) to the server and enters the SYN_SENT state, waiting for the server's confirmation.

Second handshake: the server receives the SYN segment and must acknowledge it (ACK=1, ack=j+1) while sending its own SYN (SYN=1, seq=k) in the same SYN+ACK segment; the server then enters the SYN_RECV state.

Third handshake: the client receives the SYN+ACK segment and sends an acknowledgment (ACK=1, ack=k+1) to the server. Once this segment has been sent and received, the client and the server are both in the ESTABLISHED state (TCP connection successful) and the three-way handshake is complete.
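The three steps can be traced with a toy model in C++. This is only a sketch of the seq/ack bookkeeping and state changes described above: the `Endpoint` and `Segment` types and the function names are hypothetical, and no real packets are sent.

```cpp
#include <cassert>
#include <cstdint>

enum class State { CLOSED, LISTEN, SYN_SENT, SYN_RECV, ESTABLISHED };

struct Segment {
    bool SYN;           // flag bits
    bool ACK;
    std::uint32_t seq;  // sequence number
    std::uint32_t ack;  // acknowledgment number (meaningful only if ACK is set)
};

struct Endpoint {
    State state;
    std::uint32_t isn;  // initial sequence number (j for the client, k for the server)
};

// First handshake: client sends SYN, seq=j.
Segment send_syn(Endpoint& client) {
    client.state = State::SYN_SENT;
    return Segment{true, false, client.isn, 0};
}

// Second handshake: server answers with SYN+ACK, seq=k, ack=j+1.
Segment on_syn(Endpoint& server, const Segment& in) {
    assert(server.state == State::LISTEN && in.SYN && !in.ACK);
    server.state = State::SYN_RECV;
    return Segment{true, true, server.isn, in.seq + 1};
}

// Third handshake: client answers with ACK, ack=k+1.
Segment on_syn_ack(Endpoint& client, const Segment& in) {
    assert(client.state == State::SYN_SENT && in.SYN && in.ACK);
    assert(in.ack == client.isn + 1);
    client.state = State::ESTABLISHED;
    return Segment{false, true, client.isn + 1, in.seq + 1};
}

// Server side of the third handshake.
void on_ack(Endpoint& server, const Segment& in) {
    assert(server.state == State::SYN_RECV && in.ACK && in.ack == server.isn + 1);
    server.state = State::ESTABLISHED;
}
```

Calling the four functions in order with a client ISN of 100 and a server ISN of 300 produces ack=101 in the second segment and ack=301 in the third, and leaves both endpoints in ESTABLISHED.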

Why does TCP use a three-way handshake to establish a connection? Would a two-way handshake be enough? Why does releasing the connection take four segments (the four-way wave)?

No, two handshakes are not enough. The third handshake prevents a stale connection request segment that suddenly reaches the server from causing an error. Suppose a connection request sent by the client is delayed in the network. The client times out, sends a new connection request, establishes the connection, transfers its data, and closes the connection. Only then does the first, delayed request arrive at the server. With a two-way handshake the server would reply and consider a new connection established, but the client would ignore it, so the server would waste resources on a connection that does not exist.

When closing, receiving a FIN from the peer only means that the peer has finished sending its data; it does not mean that all of your own data has been sent. So you first acknowledge the FIN, finish sending your own data, and only then send your own FIN. Each FIN needs its own ACK, so four segments are required in total.

How is a connection closed after it has been established? The four-way wave

 

1) The client process sends a connection release segment and stops sending data. In the segment header FIN=1 and seq=u (the sequence number of the last byte of previously transmitted data plus 1). The client enters the FIN-WAIT-1 state. TCP stipulates that a FIN segment consumes one sequence number even if it carries no data.

2) The server receives the release segment and replies with an acknowledgment, ACK=1, ack=u+1, with its own sequence number seq=v. The server enters the CLOSE-WAIT state and notifies the upper-layer application process. The client-to-server direction is now released and the connection is half-closed: the client has no more data to send, but if the server sends data the client must still accept it. This lasts for the whole duration of the CLOSE-WAIT state.

3) When the client receives the server's acknowledgment, it enters the FIN-WAIT-2 state and waits for the server's connection release segment (until then it must still accept any remaining data the server sends).

4) After the server has sent its last data, it sends a connection release segment to the client with FIN=1, ACK=1, ack=u+1. Because the server may have sent more data during the half-closed period, assume its sequence number is now seq=w. The server enters the LAST-ACK state and waits for the client's acknowledgment.

5) When the client receives the server's release segment, it must acknowledge it with ACK=1, ack=w+1, and its own sequence number seq=u+1. The client then enters the TIME-WAIT state. The TCP connection is not yet released: the client enters the CLOSED state and releases the corresponding TCB (transmission control block) only after 2*MSL (maximum segment lifetime) has elapsed.

6) As soon as the server receives the client's acknowledgment, it enters the CLOSED state and releases its TCB, which ends the TCP connection on its side. The server therefore finishes the TCP connection earlier than the client.
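The six steps amount to a small state machine, which can be sketched in C++ as follows. The `Event` names and the `next_state` function are hypothetical, and the sketch covers only the ordinary close sequence described above (simultaneous close is left out).

```cpp
#include <cassert>

enum class State {
    ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT,  // active closer (client above)
    CLOSE_WAIT, LAST_ACK,                            // passive closer (server above)
    CLOSED
};

enum class Event { APP_CLOSE, RECV_ACK, RECV_FIN, TIMEOUT_2MSL };

// Return the state reached from `s` when event `e` occurs.
// Events that do not apply in a state leave it unchanged.
State next_state(State s, Event e) {
    // A CLOSED connection has released its TCB, so no further events are expected.
    assert(s != State::CLOSED);
    switch (s) {
        case State::ESTABLISHED:
            if (e == Event::APP_CLOSE) return State::FIN_WAIT_1;  // send FIN
            if (e == Event::RECV_FIN) return State::CLOSE_WAIT;   // send ACK
            break;
        case State::FIN_WAIT_1:
            if (e == Event::RECV_ACK) return State::FIN_WAIT_2;
            break;
        case State::FIN_WAIT_2:
            if (e == Event::RECV_FIN) return State::TIME_WAIT;    // send last ACK
            break;
        case State::TIME_WAIT:
            if (e == Event::TIMEOUT_2MSL) return State::CLOSED;
            break;
        case State::CLOSE_WAIT:
            if (e == Event::APP_CLOSE) return State::LAST_ACK;    // send FIN
            break;
        case State::LAST_ACK:
            if (e == Event::RECV_ACK) return State::CLOSED;
            break;
        default:
            break;
    }
    return s;
}
```

Feeding the client the events APP_CLOSE, RECV_ACK, RECV_FIN, TIMEOUT_2MSL walks it through FIN_WAIT_1, FIN_WAIT_2, TIME_WAIT and CLOSED, while the server needs only RECV_FIN, APP_CLOSE, RECV_ACK to reach CLOSED, which is why it finishes first.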

What causes a large number of TIME_WAIT connections, and how can it be solved?

Every actively closed connection passes through TIME_WAIT and stays there for 2MSL, so the usual cause is a large number of short-lived connections (HTTP/1.0 uses short connections by default). Loss or timeout of the ACK in the fourth wave prolongs the state, because the FIN is retransmitted and has to be acknowledged again.

Set the SO_LINGER option of the socket in the client program;

Increase the value of net.ipv4.tcp_max_tw_buckets on the client machine;

Modify net.ipv4.ip_local_port_range to enlarge the range of available ports; this only alleviates the problem and does not solve it fundamentally;

Change short connections to long-lived connections (the root cause is too many short connections).
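As an illustration of the first remedy, the following C++ sketch sets SO_LINGER with the POSIX socket API. The helper names are hypothetical. With a linger time of 0, close() aborts the connection with an RST instead of a FIN, so the socket skips TIME_WAIT, but any unsent data is discarded; whether that trade-off is acceptable depends on the application.

```cpp
#include <cassert>
#include <sys/socket.h>

// Enable SO_LINGER on an open socket with the given timeout in seconds.
// Returns true on success.
bool set_linger(int fd, int seconds) {
    assert(fd >= 0 && seconds >= 0);
    struct linger lg;
    lg.l_onoff = 1;         // turn lingering on
    lg.l_linger = seconds;  // 0 means: abort the connection on close()
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) == 0;
}

// Read the option back; returns -1 if lingering is off or on error.
int get_linger_seconds(int fd) {
    struct linger lg;
    socklen_t len = sizeof lg;
    if (getsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, &len) != 0 || lg.l_onoff == 0) {
        return -1;
    }
    return lg.l_linger;
}
```

A client would call `set_linger(fd, 0)` on its connected socket before `close(fd)`.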

Socket API: close() and shutdown()

 For a TCP connection, the party that first calls close() will enter the TIME_WAIT state. In addition, there are some details about close() that need to be explained.
       The default action of close() on a TCP socket is to mark the socket as closed and return immediately to the calling process. From the application's point of view, the socket fd can no longer be used, that is, it can no longer be passed to read or write. TCP then starts the four-way wave to close the connection completely.
       Calling close() is the normal way to close a TCP connection, but it has two limitations, which is why shutdown() was introduced:
       1) close() only decrements the reference count of the socket fd by 1; only when the reference count drops to 0 does the TCP layer start the four-way wave that actually closes the connection. shutdown() starts the close sequence directly, regardless of the reference count;
       2) close() terminates both directions of the TCP connection. Because TCP is full duplex, there are scenarios where the local end will send no more data while the remote end may still have data to send. If the local end wants to tell the remote end that it will send no more data but will continue to receive, close() cannot do that, and shutdown() can.
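The half-close in point 2 can be observed with a short C++ sketch. To stay self-contained it uses a connected AF_UNIX stream socket pair instead of a real TCP connection (an assumption made for the example); the shutdown(SHUT_WR) semantics seen by the application are the same. The `HalfCloseResult` struct and `half_close_demo` function are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

struct HalfCloseResult {
    bool remote_saw_eof;  // remote read() returned 0 after the shutdown
    std::string reply;    // data the local end still received afterwards
};

HalfCloseResult half_close_demo() {
    int sv[2];
    const int rc = socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
    assert(rc == 0);
    (void)rc;
    const int local = sv[0], remote = sv[1];
    char buf[16];

    ssize_t n = write(local, "ping", 4);
    shutdown(local, SHUT_WR);           // local end: "I will send no more data"

    n = read(remote, buf, sizeof buf);  // remote drains "ping"
    n = read(remote, buf, sizeof buf);  // then sees end-of-stream (returns 0)
    const bool eof = (n == 0);

    n = write(remote, "pong", 4);       // the remote end may still send
    n = read(local, buf, sizeof buf);   // and the local end may still receive
    HalfCloseResult result{eof, std::string(buf, n > 0 ? static_cast<std::size_t>(n) : 0)};

    close(local);
    close(remote);
    return result;
}
```

Had the local end called close() instead of shutdown(), it would have had no descriptor left on which to read the reply.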

What is the TIME_WAIT state?

TIME_WAIT is the state that the party actively closing the TCP connection (the party that sends the first FIN) enters after sending the last ACK. The system has to wait 2MSL (maximum segment lifetime) in TIME_WAIT before releasing the connection (port). According to RFC 793 the MSL is 2 minutes; common TCP implementations use 30 seconds, 1 minute, or 2 minutes. Waiting 2MSL in TIME_WAIT has two main purposes:

First, it makes the release of the full-duplex TCP connection reliable. If the other party does not receive the last ACK, it retransmits its FIN, and the party that closed actively is still there to resend the ACK (the interval between the two FINs it receives is less than 2MSL).

Second, a connection (IP and port combination) in TIME_WAIT cannot be reused, which ensures that a newly created connection on the same addresses is not affected by delayed or retransmitted segments left over from the old one. A segment can survive at most one MSL in the network, and so can its response, hence 2MSL.

What if the connection is established, but the client suddenly fails?

TCP has a keep-alive timer for this case. If the client fails, the server cannot wait forever and waste resources. The server resets the timer every time it receives data from the client. The timer is usually set to 2 hours. If no data arrives from the client within 2 hours, the server sends a probe segment and then another one every 75 seconds. If 10 consecutive probes get no response, the server considers the client failed and closes the connection.
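On Linux, these three values map onto per-socket options. The sketch below is an assumption-laden example rather than a portable recipe: TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are Linux-specific option names, and the helper functions are hypothetical.

```cpp
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

// Turn on keep-alive for a TCP socket and set the idle time before the first
// probe, the interval between probes, and the number of unanswered probes
// after which the connection is dropped. Linux-specific option names.
bool enable_keepalive(int fd, int idle_s, int interval_s, int probes) {
    assert(fd >= 0);
    int on = 1;
    return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof idle_s) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_s, sizeof interval_s) == 0 &&
           setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) == 0;
}

// Read back the configured probe count; returns -1 on error.
int keepalive_probe_count(int fd) {
    int value = 0;
    socklen_t len = sizeof value;
    if (getsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &value, &len) != 0) return -1;
    return value;
}
```

The values from the paragraph above would be passed as `enable_keepalive(fd, 7200, 75, 10)`; servers often choose a much shorter idle time in practice.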

There are only 65535 ports, so how can a server handle millions of concurrent connections?

After a TCP server listens on a port and accepts a client connection, it creates a new socket for exchanging data with that client, but the new socket does not bind a new port; it shares the listening port. So the TCP server never runs short of ports.

TCP connection identifier (the socket fd refers to this five-tuple):

src-ip src-port dest-ip dest-port protocol
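A toy connection table keyed by the five-tuple shows why one listening port is enough. The `ConnKey` struct, the table type, and the sample addresses are hypothetical, and a kernel would use a hash table rather than `std::map`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// The five-tuple that identifies a connection.
struct ConnKey {
    std::string src_ip;
    std::uint16_t src_port;
    std::string dst_ip;
    std::uint16_t dst_port;
    std::string protocol;

    bool operator<(const ConnKey& o) const {
        return std::tie(src_ip, src_port, dst_ip, dst_port, protocol) <
               std::tie(o.src_ip, o.src_port, o.dst_ip, o.dst_port, o.protocol);
    }
};

// Hypothetical connection table: five-tuple -> socket fd.
using ConnTable = std::map<ConnKey, int>;

void add_connection(ConnTable& table, const ConnKey& key, int fd) {
    const bool inserted = table.emplace(key, fd).second;
    assert(inserted);  // a five-tuple identifies exactly one connection
    (void)inserted;
}

// Count the connections whose server-side port is `port`.
std::size_t connections_to_port(const ConnTable& table, std::uint16_t port) {
    std::size_t n = 0;
    for (const auto& entry : table) {
        if (entry.first.dst_port == port) ++n;
    }
    return n;
}
```

Three clients that differ only in source address or source port produce three distinct keys, all with destination port 80, so the number of connections is bounded by memory and file descriptors rather than by the server's port count.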

What does TCP rely on for reliable transmission (its error control mechanism)?

Sequence numbers: when data arrives, the receiver sends an acknowledgment to indicate that the segment was received, and the acknowledgment number indicates the sequence number it expects next. The retransmission timeout is about 2*RTT (round-trip time of a segment) plus an offset.

Acknowledgment mechanism (cumulative acknowledgment) and timeout retransmission.

Flow control

Flow control is achieved through a sliding window.

Together with acknowledgment and retransmission, it deals with unreliable delivery in the network, such as lost, duplicated, corrupted, and out-of-order packets, and it improves throughput.

The TCP sliding window is not a fixed size; the receiver advertises its window size to the sender in the window field of the TCP header.

TCP uses the window to increase transmission speed: within one window, the sender does not have to wait for an acknowledgment before sending the next piece of data. The window size is the maximum amount of data that can be sent without waiting for confirmation. Without window control, the sender would have to wait for each acknowledgment before sending more, and any data whose acknowledgment did not arrive would have to be resent.

If no ACK arrives within the time limit, the unacknowledged data is resent (timeout retransmission) and the trailing edge of the send window stays where it is. When an ACK arrives, the trailing edge moves forward. If the advertised window is unchanged, the leading edge moves forward by the same amount; if the advertised window shrinks, the leading edge does not move or moves forward by less.

When the sequence numbers covered by the send and receive buffers are used up, numbering wraps around to the start and counting continues.
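The sender side of the sliding window can be sketched in C++ as below. It is a simplified model with hypothetical names: it tracks byte sequence numbers only, treats a timeout as "resend everything unacknowledged", and relies on unsigned 32-bit arithmetic so that sequence numbers wrap around naturally.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct SendWindow {
    std::uint32_t base;  // trailing edge: oldest unacknowledged byte
    std::uint32_t next;  // next byte to be sent
    std::uint32_t rwnd;  // window advertised by the receiver

    std::uint32_t in_flight() const { return next - base; }

    // Bytes that may still be sent without waiting for an ACK.
    std::uint32_t usable() const {
        return rwnd > in_flight() ? rwnd - in_flight() : 0;
    }

    // Try to send `want` bytes; returns how many the window allowed.
    std::uint32_t send(std::uint32_t want) {
        const std::uint32_t n = std::min(want, usable());
        next += n;
        assert(n == 0 || in_flight() <= rwnd);
        return n;
    }

    // Cumulative ACK: everything before `ack` has been received.
    void on_ack(std::uint32_t ack, std::uint32_t advertised) {
        if (ack - base <= in_flight()) base = ack;  // ignore ACKs outside the window
        rwnd = advertised;
    }

    // Timeout: the trailing edge stays put and unacknowledged data is resent.
    void on_timeout() { next = base; }
};
```

With a 4000-byte advertised window, the sender can put 4000 bytes in flight; an ACK for byte 2000 with the window unchanged frees 2000 bytes again, and an ACK that also shrinks the window to 1000 leaves only that much usable space.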

Congestion control

Slow start, congestion avoidance, fast retransmit, fast recovery.

Maxwindow = min(rwnd, cwnd)

rwnd > cwnd: the amount of data that can be sent is limited by network congestion

rwnd < cwnd: the amount of data that can be sent is limited by the receiver's capacity

Slow start: the congestion window grows from small to large, exponentially. Once it reaches the slow start threshold (ssthresh), the congestion avoidance algorithm takes over and the window grows by one per round. When a transmission timeout occurs, ssthresh is set to half of the current window and cwnd is reset to 1.

At the beginning, the sender sets cwnd (congestion window) = 1 and sends the first segment M1. After receiving the acknowledgment of M1, the sender increases cwnd to 2 and sends M2 and M3. After both are acknowledged, cwnd increases to 4. In slow start, cwnd doubles after every transmission round (a round ends when the sender has received the acknowledgments for everything sent in it).
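The growth rules can be traced with a short C++ sketch that counts cwnd in segments per transmission round, as the description above does. The struct name is hypothetical, the initial ssthresh is an arbitrary value chosen by the caller, and the lower bound of 2 on ssthresh is an assumption taken from common textbook descriptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct CongestionWindow {
    std::uint32_t cwnd;      // congestion window, in segments
    std::uint32_t ssthresh;  // slow start threshold, in segments

    explicit CongestionWindow(std::uint32_t initial_ssthresh)
        : cwnd(1), ssthresh(initial_ssthresh) {
        assert(initial_ssthresh >= 2);
    }

    // Called once per transmission round in which everything was acknowledged.
    void on_round_acked() {
        if (cwnd < ssthresh) {
            cwnd = std::min(cwnd * 2, ssthresh);  // slow start: exponential growth
        } else {
            cwnd += 1;                            // congestion avoidance: additive growth
        }
    }

    // Called when a retransmission timeout occurs.
    void on_timeout() {
        ssthresh = std::max<std::uint32_t>(cwnd / 2, 2);
        cwnd = 1;
    }

    // Maxwindow = min(rwnd, cwnd)
    std::uint32_t send_window(std::uint32_t rwnd) const { return std::min(rwnd, cwnd); }
};
```

Starting from `CongestionWindow(16)`, successive rounds give cwnd = 2, 4, 8, 16, 17, 18; a timeout at 18 sets ssthresh to 9 and cwnd back to 1, after which the window grows 2, 4, 8, 9, 10.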

Fast retransmit and fast recovery

Fast retransmit requires the receiver to send a duplicate acknowledgment immediately when it receives an out-of-order segment, instead of waiting to piggyback the acknowledgment on its own data. Suppose the receiver has received M1 and M2 and acknowledged each. It then misses M3 but receives M4. It cannot acknowledge M4, because M4 is out of order. Under plain reliable transmission the receiver could do nothing, but with fast retransmit it sends a duplicate ACK for M2 each time it receives M4, M5, and so on. When the sender receives three duplicate ACKs, it retransmits the missing segment immediately, without waiting for the retransmission timer to expire.

Fast recovery (followed by congestion avoidance)

  1. When the sender receives three duplicate acknowledgments in a row, it applies multiplicative decrease and halves the slow start threshold (ssthresh), but it does not then run slow start.
  2. Instead, cwnd is set to the new, halved ssthresh, and the congestion avoidance algorithm then increases the congestion window slowly.
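A sender-side sketch in C++ of the duplicate-ACK rule follows. It implements the simplified textbook variant in which cwnd is set to the halved threshold; real TCP implementations following RFC 5681 additionally inflate the window by three segments during recovery, which is omitted here. The struct name and the lower bound of 2 are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct FastRecoverySender {
    std::uint32_t cwnd;      // congestion window, in segments
    std::uint32_t ssthresh;  // slow start threshold, in segments
    std::uint32_t last_ack;  // most recent acknowledgment number seen
    int dup_acks;            // duplicates of last_ack received so far

    FastRecoverySender(std::uint32_t cwnd0, std::uint32_t ssthresh0)
        : cwnd(cwnd0), ssthresh(ssthresh0), last_ack(0), dup_acks(0) {
        assert(cwnd0 >= 1);
    }

    // Process one incoming ACK. Returns true when the segment starting at
    // `ack` should be retransmitted right away (fast retransmit).
    bool on_ack(std::uint32_t ack) {
        if (ack != last_ack) {  // new data acknowledged
            last_ack = ack;
            dup_acks = 0;
            return false;
        }
        if (++dup_acks != 3) return false;
        // Third duplicate: multiplicative decrease, then skip slow start.
        ssthresh = std::max<std::uint32_t>(cwnd / 2, 2);
        cwnd = ssthresh;
        return true;
    }
};
```

With cwnd = 16, one ACK for M2 followed by three duplicates triggers the retransmission on the third duplicate and leaves ssthresh = 8 and cwnd = 8, from where congestion avoidance continues.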

Describe the interaction at the TCP/IP data link layer (ARP)

Communication at the data link layer is addressed by MAC address. When a packet at the network layer is ready to be handed to the data link layer, the host first looks up the target IP in its own ARP cache table (where the IP-to-MAC mappings are stored). If an entry is found, the MAC address of the target IP is written into the header of the link layer frame. If it is not found in the cache, the host broadcasts a request: "who has IP XXX? tell IP YYY". Every machine that receives the broadcast checks whether the IP is its own; the one that owns it replies to the requesting machine with its MAC address by unicast.
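The cache lookup with a lifetime can be sketched in C++ as follows. The `ArpCache` class is hypothetical, time is passed in as plain seconds to keep the example deterministic, and the broadcast itself is left to the caller: an empty result means "send an ARP request".

```cpp
#include <cassert>
#include <map>
#include <string>

class ArpCache {
    struct Entry {
        std::string mac;
        long expires_at;  // absolute time in seconds
    };
    std::map<std::string, Entry> table_;
    long ttl_;

public:
    explicit ArpCache(long ttl_seconds) : ttl_(ttl_seconds) {
        assert(ttl_seconds > 0);
    }

    // Record a mapping learned from an ARP reply received at time `now`.
    void learn(const std::string& ip, const std::string& mac, long now) {
        table_[ip] = Entry{mac, now + ttl_};
    }

    // Return the cached MAC address, or an empty string if the entry is
    // missing or expired (the caller would then broadcast an ARP request).
    std::string lookup(const std::string& ip, long now) {
        auto it = table_.find(ip);
        if (it == table_.end()) return "";
        if (now >= it->second.expires_at) {
            table_.erase(it);
            return "";
        }
        return it->second.mac;
    }
};
```

With a 60-second lifetime, an entry learned at time 0 is returned at time 59 and is gone at time 60, at which point a new broadcast would be needed.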

Origin blog.csdn.net/hebtu666/article/details/127204723