Computer networking fundamentals (very detailed): from the basics to proficiency in one article

1. Network model

1.1 OSI seven-layer reference model

The seven-layer model, also known as the OSI (Open Systems Interconnection) reference model, is the standard model for network communication. It is generally called the OSI reference model or the seven-layer model.

It is an abstract seven-layer model that includes not only a series of abstract terms and concepts but also concrete protocols.

  1. Physical layer: Responsible for transmitting the original bit stream, digital-to-analog conversion, and analog-to-digital conversion.
  2. Data Link Layer: Responsible for reliable transmission of data frames between directly connected nodes, error detection and correction, and defining physical addresses (MAC addresses).
  3. Network layer: Responsible for the transmission and routing of data packets between different nodes in the network, using logical addresses (IP addresses) for addressing.
  4. Transport layer: Responsible for providing end-to-end data transmission services, ensuring reliable data transmission, processing data segmentation and reassembly, and providing error recovery and flow control functions.
  5. Session layer: Responsible for establishing, managing and terminating sessions between applications, and providing functions such as data synchronization and session recovery.
  6. Presentation layer: Responsible for formatting and parsing data, encrypting and decrypting data, and processing data compression and conversion.
  7. Application layer: Responsible for providing communication services between network applications, including file transfer, email, remote login and other applications.

1.2 TCP/IP model

The TCP/IP model is the de facto model of network communication and is divided into four layers from bottom to top:

  1. Data link layer
  2. Network layer
  3. Transport layer
  4. Application layer

The TCP/IP model is the model actually used in engineering. In engineering practice, you usually only need to pay attention to the transport layer and the application layer!

The TCP/IP model architecture is as follows:

Interview question: What is the difference between the OSI model and the TCP/IP model?

OSI (Open Systems Interconnection) is the standard model of network communication, while the TCP/IP model is the de facto model and a simplification of the OSI model.

Compared with the OSI model, the TCP/IP model merges the lower two layers (physical layer and data link layer) into a single data link layer, and merges the upper three layers (session layer, presentation layer and application layer) into a single application layer.

Socket encapsulates the layers below the transport layer and provides a unified interface to the three layers above. This division has the following benefits:

1) The upper three layers handle the business details of the application without caring about communication details, and differ greatly between applications; the lower four layers implement the communication details without caring about business details, and are more general-purpose;

2) The upper three layers are usually within the user process, while the lower four layers are provided as part of the OS kernel;

1.3 TCP/IP protocol suite

Interview question: What protocols does TCP/IP include? What are the functions of each?

(1) Transport layer protocol:

  1. TCP : Transmission Control Protocol. A connection-oriented protocol that provides a reliable full-duplex byte stream (full duplex means data can be transmitted in both directions at the same time, i.e., each end can send and receive simultaneously). It uses stream sockets, which suit scenarios requiring reliable, bidirectional transmission, such as file transfer and HTTP communication. It provides mechanisms such as acknowledgment, timeout, and retransmission;
  2. UDP : User Datagram Protocol. A connectionless protocol that uses datagram sockets, which suit scenarios with high real-time requirements and low requirements on ordering and reliability, such as audio/video transmission and DNS queries. Delivery of messages is not guaranteed;
  3. SCTP : Stream Control Transmission Protocol. A protocol that provides reliable full-duplex associations. SCTP is multi-homed: each end of the communication can involve multiple (IP, port) pairs. It provides a message-oriented service. (The main differences between SCTP and TCP are SCTP's multi-streaming and multi-homing capabilities.)
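As a minimal sketch of the TCP/UDP distinction above (Python used here purely for illustration; the same socket types exist in any Berkeley-sockets API):

```python
import socket

# Stream socket: connection-oriented, used by TCP (reliable byte stream).
tcp_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Datagram socket: connectionless, used by UDP (independent datagrams).
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

print(tcp_sock.type, udp_sock.type)
tcp_sock.close()
udp_sock.close()
```

SCTP sockets can also be created on platforms that support `IPPROTO_SCTP` (e.g., Linux with the SCTP module), but availability varies by OS.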

(2) Network layer protocol:

  1. IPv4 : Internet Protocol version 4, using 32-bit addresses (available addresses are limited);
  2. ICMP : Internet Control Message Protocol, network control message protocol, handles message communication between hosts and routers;
  3. IGMP : Internet Group Management Protocol, used for managing multicast group membership;
  4. ARP : Address Resolution Protocol, which maps an IPv4 address to a hardware address (MAC address) and is used in broadcast networks;
  5. RARP : Reverse Address Resolution Protocol, reverse address resolution protocol, maps the hardware address (MAC address) to an IPv4 address;
  6. IPv6 : Internet Protocol version 6, using 128-bit addresses (a huge available address space);
  7. ICMPv6 : Internet Control Message Protocol version 6, provides address resolution, network group management and other functions for the IPv6 protocol, and has the functions of ICMPv4, IGMP, and ARP protocols;

(3) Data link layer protocols (for awareness only):

  1. BPF : BSD packet filter, BSD packet filter. Provides access capabilities to the data link layer; (Berkeley kernel);
  2. DLPI : datalink provider interface, data link provider interface. Provides access capabilities to the data link layer; (SVR4 kernel);

1.4 TCP/UDP/SCTP protocol

Interview question: What is the difference between TCP and UDP protocols?

(1) UDP protocol : a connectionless transport protocol that encapsulates application-layer data into a UDP datagram, which is then encapsulated into an IP datagram and sent;

1) Each UDP datagram has a length (the length tells the receiver where the datagram starts and ends when reading it), and this length is transmitted to the peer along with the message (unlike TCP's stream-oriented messages);

2) The UDP protocol has no transmission reliability: it provides no acknowledgment, sequence number, RTT estimation, timeout, or retransmission mechanism;

  1. There is no guarantee that UDP datagrams will reach the final destination (no peer message confirmation mechanism, timeout, and retransmission mechanism);
  2. The order in which each UDP datagram reaches the destination is not guaranteed (no message sequence number);
  3. There is no guarantee that UDP datagrams are delivered only once (no message sequence number);

3) The UDP protocol is a connectionless protocol, and the sockets at both ends have no binding relationship;

  1. A UDP client can create a socket and send a datagram to one server, and then immediately use the same socket to send another datagram to another server;
  2. A UDP server can use the same socket to receive datagrams from several different clients;

4) UDP protocol is a full-duplex protocol;
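The points above (no bound peer relationship, and each datagram delivered as a self-contained unit) can be sketched with loopback sockets. This is an illustration in Python using OS-assigned ephemeral ports, not a real network deployment:

```python
import socket

# Two independent "servers" bound to ephemeral loopback ports.
srv_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv_a.bind(("127.0.0.1", 0))
srv_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
srv_b.bind(("127.0.0.1", 0))

# One client socket can send datagrams to several servers in turn,
# because UDP sockets have no bound peer relationship.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.sendto(b"hello A", srv_a.getsockname())
client.sendto(b"hello B", srv_b.getsockname())

data_a, _ = srv_a.recvfrom(1024)   # each recvfrom returns one whole datagram
data_b, _ = srv_b.recvfrom(1024)
print(data_a, data_b)

for s in (srv_a, srv_b, client):
    s.close()
```

On loopback these datagrams almost always arrive, but remember that UDP itself gives no such guarantee over a real network.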

Supplement:

Datagram and Stream are two data transmission methods commonly used in network communications. They have the following differences:

1) Datagram:

- Datagram is a connectionless transmission method, and each datagram is an independent, self-contained data unit.

- Each datagram has its own destination and source addresses and can be transmitted independently across the network.

- The transmission of datagrams is unreliable, and problems such as packet loss, duplication, and out-of-order may occur.

- Example: UDP (User Datagram Protocol) uses datagram transmission.

2) Data stream (Stream):

- Data streaming is a connection-oriented transmission method. Data establishes a persistent, ordered byte stream between the sender and the receiver.

- A data stream is a continuous, unbounded sequence of bytes with no clear division units.

- The transmission of data streams is reliable, ensuring orderly, complete and error-free transmission of data.

- Example: TCP (Transmission Control Protocol) uses data streaming transmission.

Datagrams and data streams are two different data transmission methods. A datagram is an independent, self-contained data unit carrying its own source and destination addresses, and its delivery is unreliable. A data stream is a persistent byte stream established over a connection, with reliable delivery that is ordered, complete and error-free. Which method to choose depends on the specific application requirements and the characteristics of the communication.

(2) TCP protocol : a connection-oriented protocol that, based on a data stream, splits upper-layer messages into multiple segments, each of which is sent as an IP datagram;

1) The TCP protocol is based on data stream transmission and has no message length itself;

2) The TCP protocol ensures the reliability of transmission (more reliable, not 100% reliable);

  1. Provide peer message confirmation mechanism : after the message is sent, it will be considered successful only after it waits for confirmation from the peer;
  2. Provide timeout and retransmission mechanism : after the message is sent, if no confirmation is received after a certain period of time, the message will be resent; until it succeeds or the specified retry time is exceeded;
  3. Provide RTT estimation : dynamically estimate the RTT (round-trip time) between client and server, so that the approximate time of one round trip is known;
  4. Provide message sequence number : Provide a sequence number (byte offset) for each segment so that when the peer receives the message, it can sort and deduplicate the message based on the sequence number;
  5. Provide flow control : advertise a window that tells the peer how much data it can currently receive, ensuring the sender does not overflow the receiver's buffer. The advertised window changes dynamically as data is received; when it reaches 0, the receive buffer is full, and the sender must wait until the application reads data from the buffer before sending again. In other words, TCP's negative-feedback mechanism limits the sender's rate by the receiver's rate;

Note: The TCP protocol does not guarantee 100% reliability, but allows the upper layer to detect abnormalities in message delivery while ensuring reliability as much as possible;

3) The TCP protocol is connection-oriented: both ends must establish a connection before they can communicate;

4) The TCP protocol is a full-duplex protocol, and applications on a given connection can send and receive data at any time;
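The "no message length, based on a data stream" property from 1) above is worth seeing concretely. A minimal sketch using a local connected stream pair (`socketpair`, a stand-in for a TCP connection on one machine; real TCP streams behave the same way with respect to framing):

```python
import socket

# A connected stream pair standing in for a TCP connection.
a, b = socket.socketpair()

# Two separate sends on a stream socket...
a.sendall(b"hello ")
a.sendall(b"world")

# ...carry no message boundaries: the receiver just sees a byte stream,
# and must frame messages itself (length prefix, delimiter, etc.).
received = b""
while len(received) < 11:
    received += b.recv(1024)
print(received)

a.close(); b.close()
```

This is why application protocols over TCP (HTTP, for example) define their own framing, whereas each UDP `recvfrom` returns exactly one datagram.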

(3) SCTP protocol : a connection-oriented protocol that provides reliability, ordering, flow control, and full-duplex data transmission. SCTP represents the relationship between the two ends as an association rather than a connection: a connection involves communication between only two IP addresses, whereas an association can involve multiple IP addresses on each end;

1) SCTP is message-oriented and carries a message length; each message is delivered to the peer in order (TCP has no message length and relies on the upper-layer application to frame the application-layer protocol);

2) Each association contains multiple streams. Loss of a message on one stream does not block the transmission of messages on other streams (TCP has only a single stream and suffers head-of-line blocking);

3) SCTP provides multihoming features, that is, both ends can support multiple IP addresses, and one endpoint can have multiple redundant connections through different infrastructures, thereby enhancing the robustness against network failures;

Supplement :

In SCTP (Stream Control Transmission Protocol), multi-streaming (Multi-Stream) and multi-homing (Multi-Homing) are two of its features, providing more flexible and efficient data transmission capabilities.

1) Multi-Stream:

- Multi-stream means that multiple independent data streams can be transmitted simultaneously in a single SCTP connection.

- SCTP allows applications to create multiple logical streams on the same SCTP connection, each with its own sequence number and transmission control.

- The multi-stream feature allows applications to group different types of data into different streams to achieve parallel transmission and optimize network resource utilization.

- Applications can create, select, and manage multiple streams as needed to meet the needs of different data streams.

2) Multi-Homing:

- Multihoming means that multiple network interfaces or IP addresses can be used simultaneously for communication in an SCTP connection.

- SCTP allows one endpoint (host) to bind multiple IP addresses on multiple network interfaces, and select the appropriate address for data transmission based on network conditions during the communication process.

- The multihoming feature provides redundancy and load balancing capabilities. When a network interface or IP address is unavailable, it can automatically switch to other available interfaces or addresses.

- Applications can take advantage of the multihoming feature to improve connection reliability and availability and adapt to data transmission needs in multi-network environments.

1.5 TCP protocol

1.5.1 TCP link establishment: three-way handshake

Interview question: What is TCP’s three-way handshake? What is the process?

The general process of TCP connection establishment is as follows:

1) Passive open : The server calls the three functions socket, bind, and listen to start listening;

2) Active open : the client calls the connect function to connect to the server. The client sends a SYN segment, telling the server the client's initial sequence number. The SYN segment carries no data, so its IP datagram contains only an IP header, a TCP header, and optional TCP options;

3) The server sends an ACK to acknowledge the client's SYN and, at the same time, sends its own SYN to the client, telling the client the server's initial sequence number;

4) The client sends ACK to confirm the SYN sent by the server;

Note 1: The server sends ACK and SYN in the same segment;

Note 2: The server's initial sequence number and the client's initial sequence number are two independent fields, each representing the sequence numbering of the messages that side sends;

Note 3: Each SYN carries the current sequence number; each ACK carries the sequence number of the next expected message, i.e., the current sequence number + 1;

Scenario example: Before two people start a phone call, they first confirm whether the other party can hear the call.

Client: Hello, can you hear me? (SYN)

Server: I can hear it, can you hear it? (ACK+SYN)

Client: Can be heard (ACK)

… …(begin follow-up conversation)
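The sequence-number flow of the three segments can be modeled as a toy simulation (illustration only, not real TCP; the ISN values are made up):

```python
# A toy model of the three-way handshake sequence numbers.
def three_way_handshake(client_isn, server_isn):
    exchange = []
    # 1) client -> server: SYN carries the client's initial sequence number
    exchange.append(("SYN", {"seq": client_isn}))
    # 2) server -> client: ACK = client_isn + 1, sent together with the
    #    server's own SYN in one segment (Note 1 above)
    exchange.append(("SYN+ACK", {"seq": server_isn, "ack": client_isn + 1}))
    # 3) client -> server: ACK = server_isn + 1
    exchange.append(("ACK", {"ack": server_isn + 1}))
    return exchange

for step in three_way_handshake(client_isn=1000, server_isn=5000):
    print(step)
```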

What are TCP options?

The connection control parameters carried when establishing a TCP link.

TCP_MAXSEG: sets the MSS (Maximum Segment Size). When sending a SYN, it informs the peer of the largest segment this end can accept, i.e., the maximum amount of data per TCP segment on this connection; the peer then sizes its outgoing segments accordingly;

SO_RCVBUF: sets the receive buffer size, which determines the advertised window announced to the peer;

SO_SNDBUF: Set the send buffer size;
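These options map directly to `setsockopt`/`getsockopt` calls. A hedged sketch (the kernel may round the requested buffer sizes up, and Linux in particular doubles them; `TCP_MAXSEG` availability varies by OS):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

# Request buffer sizes; the kernel may grant more than requested.
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 64 * 1024)
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 64 * 1024)

rcv = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
snd = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
print(rcv, snd)   # actual granted sizes

# TCP_MAXSEG can be queried on many platforms (best-effort read here;
# on an unconnected socket this is just the default, not a negotiated MSS).
try:
    print("MSS:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG))
except OSError:
    pass

s.close()
```

Note that SO_RCVBUF must be set before connect/listen for it to affect the window advertised during the handshake.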

1.5.2 TCP closes the connection: four waves

Interview question: What are TCP’s four waves? What is the process?

The process of TCP closing the connection:

1) Active close: end A calls the close function and sends a FIN segment with a sequence number, informing end B that it has finished sending data;

2) Passive close: after receiving the FIN, end B replies with an ACK segment to end A; at the same time, the FIN is passed to the upper-layer application as an EOF (End of File), notifying the application that the peer has finished sending;

3) After receiving the EOF, the upper-layer application on end B calls the close function, which also sends a FIN segment to inform end A that end B has finished sending;

4) After receiving the FIN segment, end A replies with an ACK segment to end B; both directions are now closed;

Note 1: The reason why both ends need to send FIN is because TCP is full-duplex and both ends may send data, so both ends need to send FIN to inform the other end that the sending is complete;

Note 2: Each FIN carries the current sequence number; each ACK carries the sequence number of the next expected message, i.e., the current sequence number + 1;

Note 3: After steps 1) and 2), the passively closing end (end B) can still send data to the actively closing end (end A). This is called half-close;

Note 4: When a Unix process terminates, whether voluntarily (calling exit or returning from main) or involuntarily (receiving a termination signal), all of its open file descriptors are closed, which causes any open TCP connection to send a FIN informing the peer that the connection is closing. (This is kernel behavior: even if the process itself hangs or crashes, as long as the operating system is still up, the peer can detect that the connection is broken.)

Note 5: Both the client and the server can actively close the connection (call the close function);

Scene example: like two people saying goodbye to each other before ending a phone call.

A: I’m fine, I’m hanging up the phone (FIN)

B: OK (ACK)

B: I’m fine now. You can hang up the phone now (FIN)

A: OK (ACK, end of conversation)
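The half-close from Note 3 can be sketched with a local connected stream pair (a stand-in for a TCP link; on real TCP, `shutdown(SHUT_WR)` is exactly what sends the FIN while leaving the read side open):

```python
import socket

a, b = socket.socketpair()      # connected stream pair, standing in for TCP

a.shutdown(socket.SHUT_WR)      # A's "FIN": A has finished sending

eof = b.recv(1024)              # B sees EOF (empty bytes) once data is drained
print("B read:", eof)

b.sendall(b"last words from B") # half-close: B can still send data to A
reply = a.recv(1024)
print("A read:", reply)

a.close(); b.close()
```

Using `close()` on end A instead of `shutdown(SHUT_WR)` would also release A's receive side, so `shutdown` is the call that expresses "I am done sending, but keep listening".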

Question: Why, in the three-way handshake, can the server reply with ACK and SYN in a single segment, while in the four waves the passively closing end must send its ACK and FIN in two separate segments?

Because TCP is full-duplex, the closing of each end during the four waves is an independent action: each end controls when to close its own sending channel. The close is therefore split into two groups of messages (one FIN plus one ACK each);

Supplement :

1) In the second step of the three-way handshake, the server can combine the ACK and SYN responses in one segment and send them. This is because the server already knows the client's initial sequence number (ISN) after receiving the client's SYN segment, and the server also needs to send its own initial sequence number (ISN) to the client. Therefore, the server can combine ACK and SYN responses and send them in one section to reduce the round-trip time (RTT) of communication.

2) After the actively closing side sends its FIN, it can still receive data but can no longer send, while the passively closing side has not yet closed and can both receive and send. The passively closing side may still have data to deliver, so its ACK and FIN must be sent separately.

1.6 TCP state machine

Interview question: What are the states of a TCP connection? What are their respective meanings?

Server-side connection establishment state machine

CLOSED -> LISTEN: The server calls the listen function and opens it passively;

LISTEN -> SYN_RCVD: The server receives SYN and sends SYN + ACK;

SYN_RCVD -> ESTABLISHED: The server receives ACK and completes the link establishment;

Client connection establishment state machine

CLOSED -> SYN_SENT: The client sends SYN;

SYN_SENT -> CLOSED: The client waited for timeout and did not receive ACK + SYN;

SYN_SENT -> ESTABLISHED: The client receives ACK + SYN and sends ACK;

Active close state machine

ESTABLISHED -> FIN_WAIT_1: End A calls the close function, sends FIN, and actively closes the connection;

FIN_WAIT_1 -> FIN_WAIT_2: End A receives ACK;

FIN_WAIT_1 -> CLOSING: end A receives a FIN, meaning end B called close at the same time (a simultaneous close);

FIN_WAIT_1 -> TIME_WAIT: end A receives FIN + ACK together and replies with an ACK, meaning end B acknowledged A's FIN and closed in the same segment;

FIN_WAIT_2 -> TIME_WAIT: End A receives FIN and replies with ACK;

CLOSING->TIME_WAIT: End A receives ACK;

TIME_WAIT-> CLOSED: Side A waits for 2MSL timeout;

Passive close state machine

ESTABLISHED -> CLOSE_WAIT: end A receives a FIN and replies with an ACK, meaning the connection from end B to end A is closed;

CLOSE_WAIT -> LAST_ACK: End A sends FIN, indicating that the connection from end A to end B is closed;

LAST_ACK -> CLOSED: End A receives ACK;
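The transitions above can be expressed as a small lookup table. This is a toy model for following the diagrams (illustration only, covering just the client-open and active-close paths described above, not the complete TCP state machine):

```python
# Transition table: (current state, event) -> next state.
TRANSITIONS = {
    ("CLOSED", "send SYN"): "SYN_SENT",
    ("SYN_SENT", "recv SYN+ACK, send ACK"): "ESTABLISHED",
    ("ESTABLISHED", "close, send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2MSL timeout"): "CLOSED",
}

def run(events, state="CLOSED"):
    for ev in events:
        state = TRANSITIONS[(state, ev)]
    return state

print(run(["send SYN", "recv SYN+ACK, send ACK"]))  # ESTABLISHED
```

Walking the full list of events from open through 2MSL timeout brings the state back to CLOSED, matching the diagrams above.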

The complete TCP data sending and receiving process is as follows:

Note 1: After the connection is established, the server's ACK for the client's request is sent together with the response data to be written. This saves one IO interaction and is called piggybacking. Piggybacking has a time window; if the window is exceeded, the ACK is sent independently;

Note 2: When closing a connection, the actively closed end will experience the TIME_WAIT state, while the passively closed end will experience the CLOSE_WAIT state;

Supplement :

The time window means that if there is no data to send at that moment, the ACK will not wait too long (if it waited too long, the client would retransmit for lack of an ACK) and is instead sent independently;

To send and receive a single packet, the TCP protocol requires at least 8 packets (3 for the handshake + 4 for the waves + 1 for the data), while UDP requires only 2 (1 request packet + 1 reply packet). UDP therefore exchanges less data per interaction and performs better, but sacrifices reliability;

1.7 TIME_WAIT state

Interview question: What is the status of TIME_WAIT in the TCP protocol? Why does this state exist?

The end that actively closes the connection enters the TIME_WAIT state, which lasts for 2 MSL before transitioning to CLOSED. The recommended MSL is 2 minutes, but implementations typically use between 30 seconds and 2 minutes, so TIME_WAIT lasts roughly 1 to 4 minutes;

MSL (Maximum Segment Lifetime): the longest time an IP datagram can survive in the network; datagrams exceeding this lifetime are discarded;

Hop limit : each IP datagram contains a hop limit, the maximum number of routers the datagram may cross. If this number is exceeded, the datagram is discarded. The hop limit field is 8 bits, so its maximum value is 255;

Lost duplicate / wandering duplicate: due to routing anomalies, a TCP segment may wander in the network; the sender times out and retransmits it, and the wandering original can later arrive as a duplicate;

The meaning of the TIME_WAIT state:

1) It enables reliable termination of the full-duplex TCP connection: if the final ACK is lost, it may need to be retransmitted, so the TIME_WAIT state is needed to handle this;

2) It allows old duplicate segments to expire in the network: if a connection is closed and a new connection is immediately established with the same IPs and ports, the new connection might receive segments from the old one. To prevent this, the closing end waits 2 MSL in TIME_WAIT, ensuring all segments of the previous connection have disappeared before a new incarnation of the connection can be created;
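One practical consequence: a server restarted while old connections on its port sit in TIME_WAIT may fail to bind. The SO_REUSEADDR option, set before bind, allows rebinding. A minimal sketch (port 0 asks the OS for any free port; exact SO_REUSEADDR semantics differ slightly between OSs):

```python
import socket

def make_listener(port=0):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow bind() to succeed even if the port still has connections
    # lingering in TIME_WAIT (must be set before bind).
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(5)
    return s

srv = make_listener()
bound_port = srv.getsockname()[1]
print("listening on port", bound_port)
srv.close()
```

Most long-running servers set this option unconditionally so a quick restart does not fail with "address already in use".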

1.8 Connection port number

TCP, UDP, and SCTP protocols all have the concept of port numbers, and all use 16-bit integers to represent the port number range;

There is no conflict between port numbers between different protocols, so the same port number for TCP, UDP, and SCTP can be opened at the same time;

When connecting to a specified port on the server, the client also needs a port of its own to communicate; this ephemeral port is automatically allocated by the protocol stack;

Socket pair : a TCP communication link is identified by a pair of sockets (IP + port), i.e., a four-tuple that defines the two connection endpoints: (local IP address, local TCP port, foreign IP address, foreign TCP port);

Commonly used port number ranges:

1) 0~1023: well-known service ports, e.g., ssh on port 22;

2) 1024~49151: registered ports for listening services, e.g., port 8080 for a web service;

3) 49152~65535: ephemeral ports, used by clients to communicate with servers;

Note: Port number ranges may vary between OSs. Generally, ports below 1024 are reserved and cannot be used freely;

Question: What determines how many connections a server can carry?

For the client, at the physical level, a socket connection is treated as a file and therefore occupies a file descriptor on the client machine;

At the logical level, the five-tuple (protocol, source IP, source port, destination IP, destination port) uniquely identifies a connection, so the number of connections that can be opened depends on how many combinations of the five-tuple exist.

Misunderstanding: since the range of ports a machine can open is below 65536, it is tempting to conclude that the upper limit of connections a client machine can establish to servers equals the number of available ports. In fact, a port is a logical concept, and the number of possible connections depends on the combinations of all five elements of the tuple above.

Question: Suppose there are a fixed number of client machines and want to initiate millions of connections to the same server. How to overcome port restrictions?

Since the number of machines is fixed, in the five-tuple (protocol, source IP, source port, destination IP, destination port) the protocol, source IPs, and destination IP are all fixed. For a single destination port (server port), one client machine can open at most 65535 source ports. Therefore, to initiate millions of connections, the server must expose multiple destination ports; each server port can then carry 60,000-odd connections per client machine, breaking through the port restriction.
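The arithmetic behind this (back-of-envelope figures only; the exact ephemeral range and per-port capacity depend on OS configuration):

```python
# Default ephemeral port range gives each client machine roughly this many
# usable source ports per destination (IP, port); widened, it approaches 64K.
ephemeral_ports = 65535 - 49152 + 1
print("default ephemeral ports:", ephemeral_ports)

# With source IP, destination IP and protocol fixed, combinations come only
# from (source port, destination port). Expose more server ports to multiply:
server_ports = 20
per_port = 60000                       # ~60K usable source ports per machine
max_connections = per_port * server_ports
print("per client machine:", max_connections)  # 1200000
```

So 20 server ports turn a single client machine's ~60K-per-port ceiling into 1.2 million potential connections.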

1.9 TCP data sending process (understanding)

Each TCP socket has a send buffer, set via the SO_SNDBUF option. Data is batched in the buffer rather than sent byte by byte, which would be far less efficient (like filling a pool with water: a scoop at a time beats a drop at a time). The data sending process is as follows:

1) TCP layer: When the application calls write, the kernel copies all data from the buffer of the application process and writes it to the send buffer of the socket;

2) TCP layer: if the send buffer cannot hold all the data, the application process is put to sleep (for blocking IO). The kernel does not return from the write system call until the send buffer frees up space and all of the user process's data has been copied into it;

Note: Only after receiving an ACK from the peer can the acknowledged data be removed from the send buffer;

3) TCP layer: the local TCP passes data to the IP layer in blocks of MSS size or smaller; each block gets a TCP header (20 bytes). (The MSS value is announced by the peer's MSS option.)

4) IP layer: prepends an IP header to the TCP segment to form an IP datagram, looks up the routing table by destination IP address to determine the outgoing interface, and hands the data to the corresponding data link. The IP layer may split the datagram into multiple fragments before sending;

5) Data link layer: each data link maintains an output queue. If the queue is full, new packets are discarded and an error is returned up the protocol stack; the TCP layer catches the error and retransmits the corresponding segments. Upper-layer applications do not perceive this process;

Note: A write system call returns successfully, which only means that the sending buffer can be reused, but does not mean that the peer has received the sent data;

1.10 Timeout retransmission and fast retransmission

Timeout Retransmission and Fast Retransmission are two retransmission mechanisms used in computer networks to handle lost packets and network congestion.

1) Timeout retransmission

Timeout Retransmission means that the sender starts a timer after sending a packet. If no acknowledgment (ACK) is received within the specified time, the sender considers the packet lost and resends it. This mechanism rests on the timeout assumption: the loss may be caused by network congestion, transmission delay, and so on. Its disadvantage is that it must wait out the full timeout before retransmitting, which lowers transmission efficiency.

Behavior : If an acknowledgment is not received within the timeout, the sender assumes that the packet is lost and resends the packet.

Trigger conditions : Timeout retransmission usually occurs when network congestion or transmission delay causes packet loss.

2) Fast retransmission

Fast Retransmission is a retransmission mechanism based on receiver feedback. When the sender sends a packet, the receiver returns an acknowledgment (ACK) indicating successful receipt. If the receiver gets packets out of order (for example, it has received packets 1, 2, 4 and 5 but not packet 3), it keeps acknowledging the last in-order packet, producing duplicate ACKs that tell the sender a packet is missing. When the sender receives three consecutive duplicate ACKs, it immediately retransmits the corresponding packet without waiting for the timeout. This recovers lost packets quickly and improves network transmission efficiency.

Principle : After receiving an out-of-order data packet, the receiver will send a Duplicate ACK to the sender.

Behavior : When the sender receives a certain number of duplicate confirmations in a row (usually 3), it immediately retransmits the corresponding data packet without waiting for the timeout.

Trigger condition : Fast retransmission usually occurs when network packet loss causes data packets to be out of order. The receiver indicates to the sender that a packet has been lost by sending a duplicate acknowledgment.

In summary, timeout retransmission is to resend the data packet after the timer set by the sender expires, while fast retransmission is to retransmit the data packet immediately after the receiver sends a duplicate confirmation. These two retransmission mechanisms are used in different situations to improve the reliability and efficiency of data transmission.
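The duplicate-ACK counting on the sender side can be sketched as a toy function (illustration only; real TCP tracks byte-level cumulative ACKs and also keeps the timeout timer running as a fallback):

```python
# Toy sender logic: retransmit after 3 duplicate ACKs (fast retransmit),
# without waiting for the retransmission timer.
def fast_retransmit(acks, dup_threshold=3):
    retransmitted = []
    dup_count = 0
    last_ack = None
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == dup_threshold:
                retransmitted.append(ack)  # resend the segment the peer expects
        else:
            last_ack, dup_count = ack, 0
    return retransmitted

# Receiver got segments 1, 2 then 4, 5, 6: it keeps ACKing "expect 3".
print(fast_retransmit([1, 2, 3, 3, 3, 3]))  # [3]
```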

1.11 TCP Header structure

Tip: Data Transfer Time Estimation

Use ping to estimate TCP data transmission time; one ping uses an 84-byte IP datagram;

Assume the average RTT measured over 30 pings is 175 ms, and 2000 bytes of data need to be sent at 40 bytes per send; that makes 50 sends in total;

Each time it is sent, in addition to 40 bytes of data, it also contains 20 bytes of IP Header and 20 bytes of TCP Header, a total of 80 bytes; it can be estimated that the approximate time it takes to send these 2000 bytes is: 175ms * 50 times = 8750ms;
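The arithmetic above can be checked with a tiny helper (a back-of-the-envelope model that assumes stop-and-wait style sending, one segment per RTT; `estimate_transfer_ms` is an illustrative name):

```python
# Back-of-the-envelope estimate from the section above: one RTT per
# payload-sized send. Illustrative model, not a real throughput formula.

def estimate_transfer_ms(total_bytes, payload_per_send, rtt_ms):
    # ceil division: a partial final segment still costs one RTT
    sends = (total_bytes + payload_per_send - 1) // payload_per_send
    return sends * rtt_ms

# 2000 bytes, 40 bytes of payload per segment, measured RTT of 175 ms:
print(estimate_transfer_ms(2000, 40, 175))  # 8750
```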

1.12 TCP reliable transmission

(1) What is reliable transmission?

For two nodes in a network to communicate, data must traverse a long path of intermediate links and devices, and many things can go wrong along the way. Transmitting data directly over the network is therefore considered unreliable.

The so-called reliability means that the data can be received by the peer in a normal, complete and orderly manner. The TCP protocol relies on the following mechanisms to achieve reliable communication:

  1. ACK mechanism
  2. Sequence number mechanism
  3. Reordering mechanism
  4. Window mechanism

Supplement :

In TCP's (Transmission Control Protocol) reliable transmission, two mechanisms deserve a closer look: reordering and windowing.

1) Reordering mechanism:

Packets traveling through the network may arrive at the receiver out of order, due to network topology, routing, congestion, and so on. TCP tags each packet it sends with a sequence number, and the receiver uses the sequence numbers to restore the correct order of the data.

When the receiver receives out-of-order packets, it caches these packets and then sorts them by sequence number to ensure that the packets are delivered to the upper-layer application in the correct order. TCP uses receive windows to manage the buffering and ordering of out-of-order packets.

2) Window mechanism:

The window mechanism is the basis of flow control and congestion control in TCP. Both the sender and the receiver have a concept of window size.

The sender's window size represents the amount of data that can be sent but has not yet received an acknowledgment. The sender dynamically adjusts the window size based on the confirmation information sent by the receiver to control the sending rate and avoid sending too much data that the receiver cannot process in time.

The receiver's window size represents the amount of data it is able to receive. The receiver informs the sender of its current receiving capabilities by sending window size information to the sender. The sender controls the sending rate according to the receiver's window size to avoid sending too much data and causing the receiver's buffer to overflow.

The window mechanism enables the sender and receiver to coordinate the data transmission rate to adapt to the network conditions and the receiver's processing capabilities, thereby achieving reliable data transmission.

To sum up, TCP achieves reliable transmission through the reordering mechanism and the window mechanism: the reordering mechanism ensures that the receiver can correctly order out-of-order packets, while the window mechanism provides flow control and congestion control so that transmission between sender and receiver proceeds efficiently and at a sustainable rate.
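The receiver-side reordering idea can be sketched as follows (a simplification: real TCP buffers byte ranges rather than whole numbered segments, and all names here are illustrative):

```python
# Sketch of receiver-side reordering: buffer out-of-order segments and
# deliver to the application in sequence-number order once gaps fill.

def receiver():
    expected = 0
    buffer = {}        # out-of-order segments keyed by sequence number
    delivered = []     # what has been handed to the application, in order

    def on_segment(seq, data):
        nonlocal expected
        if seq != expected:
            buffer[seq] = data        # out of order: cache it
            return delivered
        delivered.append(data)        # in order: deliver immediately
        expected += 1
        while expected in buffer:     # drain any now-contiguous segments
            delivered.append(buffer.pop(expected))
            expected += 1
        return delivered

    return on_segment

rx = receiver()
rx(0, "a")
rx(2, "c")                  # buffered: segment 1 is still missing
print(rx(1, "b"))           # ['a', 'b', 'c']
```

Segment 2 arrives early and sits in the buffer; once segment 1 fills the gap, segments 1 and 2 are delivered together in order.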

(2) What is the ARQ protocol?

ARQ (Automatic Repeat Request) protocol is a protocol used for reliable data transmission and is located at the transport layer. It is a mechanism for handling and correcting transmission errors and lost data in data communications.

The main goal of the ARQ protocol is to ensure reliable transmission of data, which it achieves by introducing ACK confirmation and a timeout retransmission mechanism:

  1. ACK confirmation: the sender waits for confirmation from the receiver after sending a data packet;
  2. Timeout retransmission: if the sender does not receive a confirmation within a certain period of time, or the receiver reports that a data packet is in error, the sender resends the data packet.

There are three main modes of the ARQ protocol :

1) Stop-and-Wait ARQ: The sender sends one data packet and then stops and waits for the receiver's acknowledgment. Only after receiving the acknowledgment does it send the next packet.

2) Continuous ARQ : The sender sends multiple data packets continuously without waiting for the confirmation of each data packet. After the receiver receives the packet, it sends a cumulative acknowledgment indicating that a series of packets have been successfully received. If the sender does not receive an acknowledgment or if the acknowledgment received indicates an error, it will retransmit the corresponding packet. (similar to pipeline)

  1. Go-Back-N : When the receiver detects an error or packet loss, the sender resends the entire window of unacknowledged packets.
  2. Selective-Repeat : When the receiver detects an error or packet loss, the sender only retransmits the packet in question, not all packets in the entire window.

3) Feedback ARQ : The receiver will periodically send confirmation messages to indicate that the data packet has been successfully received. The sender will retransmit based on the received confirmation information.

Note 1: Stop-and-wait ARQ handles the request-response of each message one at a time, which can cause performance problems, so it is rarely used;

Note 2: Kafka also uses implementation mechanisms similar to Continuous ARQ and Feedback ARQ when ensuring message delivery consistency at the application layer;

(3)Stop-and-Wait ARQ

The working principle of stop-and-wait ARQ is as follows:

1) The sender sends a data packet to the receiver, then waits for the receiver to reply with an ACK and starts a timer;

2) While waiting, the sender sends no new data packets;

3) If the data packet is not successfully received, the receiver sends no ACK; after waiting for the timeout period, the sender resends the data packet;

Disadvantages: Long waiting time leads to low data transfer speed;
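The send-wait-retransmit loop above can be sketched like this (illustrative names; the lossy `channel` callback stands in for a real network plus its ACK path):

```python
# Stop-and-wait ARQ sketch: send one packet, wait for the ACK,
# retransmit on timeout. Illustrative only.

def stop_and_wait_send(packets, channel, max_tries=5):
    """Return the total number of transmissions needed to deliver all
    packets. `channel(pkt)` returns True when the ACK comes back."""
    transmissions = 0
    for pkt in packets:
        for _ in range(max_tries):
            transmissions += 1
            if channel(pkt):          # ACK received: next packet
                break
            # no ACK: the timer expires and we retransmit
        else:
            raise TimeoutError(f"gave up on packet {pkt}")
    return transmissions

def make_flaky(drop_once):
    """A channel that loses the first copy of each packet in `drop_once`."""
    dropped = set()
    def channel(pkt):
        if pkt in drop_once and pkt not in dropped:
            dropped.add(pkt)
            return False              # lost: no ACK, sender times out
        return True
    return channel

print(stop_and_wait_send([0, 1, 2], make_flaky({1})))  # 4
```

Packet 1's first copy is lost, so delivering three packets costs four transmissions; each loss adds one full timeout of dead air, which is exactly the inefficiency the text points out.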

(4)Go-Back-N

To overcome the stop-and-wait ARQ protocol's drawback of waiting a long time for each ACK, the continuous ARQ protocol sends a group of data packets back to back and then waits for their ACKs.

What is a sliding window?

Both the sender and the receiver maintain a sequence of data frames called a window. The sender's window size is determined by the receiver; the purpose is to control the sending speed so that the receiver's buffer does not overflow, and limiting the flow in this way also helps avoid network congestion. The protocol stipulates that unacknowledged packets within the window must be retransmitted.

What is Go-Back-N?

The Go-Back-N protocol allows the sender to keep sending packets while waiting for acknowledgments, and every packet sent carries a sequence number. In the Go-Back-N protocol:

The sender needs to respond to the following three events:

1) When the upper-layer business sends data and calls send(), the sender must first check whether the sending window is full;

2) The sender receives an ACK: in this protocol, the acknowledgment of the packet with sequence number n is a cumulative acknowledgment, indicating that the receiver has correctly received all packets up to and including sequence number n;

3) The sender's timer expires: on a timeout, the sender retransmits every packet that has been sent but not yet acknowledged, i.e., it restarts from the packet immediately after the highest cumulatively acknowledged sequence number;

For the receiver, the processing actions are as follows:

1) If a packet with sequence number n is received correctly and conforms to the order, the receiver will reply with an ACK for the packet and hand the packet over to the upper layer for processing; otherwise, the packet will be discarded;

2) If packet n has been received and handed over to the upper-layer business for processing, it means that all packets smaller than n have been correctly received and delivered for processing;

For example: the sender transmits packets with sequence numbers 1, 2, 3, 4, and 5 in one burst. If packet 2 is lost, then even though packets 3, 4, and 5 arrive correctly, they are out of order and cannot be accepted, so the sender still retransmits packets 2, 3, 4, and 5;

The techniques used by Go-Back-N include sequence numbers, cumulative acknowledgment, checksums, timeout retransmission, and so on;

Disadvantages: Go-Back-N fixes the long waiting times of the stop-and-wait protocol, but a performance problem remains: when the window is very large, a single loss forces a whole window of packets to be retransmitted, which greatly reduces efficiency;
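The two sender-side rules above (cumulative ACK slides the base; timeout resends the whole unacknowledged window) can be sketched as follows (illustrative function names, not from any real stack):

```python
# Go-Back-N sketch: a window of in-flight packets; on timeout every
# sent-but-unacknowledged packet is retransmitted. Illustrative only.

def on_ack(base, ack):
    """Cumulative ACK: everything up to and including `ack` is received,
    so the window base slides to ack + 1."""
    return max(base, ack + 1)

def go_back_n_timeout(base, next_seq):
    """Packets to retransmit when the timer fires: the whole span of
    sent-but-unacked sequence numbers [base, next_seq)."""
    return list(range(base, next_seq))

# Sender sent 1..5 and packet 2 was lost: the cumulative ACK stops at 1,
# so base = 2, and on timeout packets 2, 3, 4, 5 all go again.
base = on_ack(0, 1)                 # ACK for packet 1 -> base = 2
print(go_back_n_timeout(base, 6))   # [2, 3, 4, 5]
```

This reproduces the example from the text: losing packet 2 forces 2, 3, 4, and 5 to be resent even though 3-5 arrived intact.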

(5) Selective-Repeat (selective retransmission)

Compared with the Go-Back-N protocol, the Selective-Repeat protocol avoids unnecessary retransmissions and improves efficiency by allowing the sender to only retransmit packets that are lost or damaged at the receiver.

Under the Selective-Repeat protocol, the sender needs to respond to the following three events:

1) Receive data from the upper layer. After receiving data from the upper layer, the sender needs to check the next sequence number available for the packet. If the sequence number is in the window, the data will be sent;

2) Receive ACK;

  1. If an ACK is received and the packet is within the window, the sender marks that packet as received;
  2. If the packet's sequence number equals the window base, the window base moves forward to the unacknowledged packet with the smallest sequence number;
  3. If the window has moved and there are unsent packets whose sequence numbers now fall within the window, those packets are sent;

3) Timeout. If a timeout occurs, the sender will retransmit packets that have been sent but not yet acknowledged.

Note: Unlike GBN, each group in the SR protocol has an independent timer;

Assume the base of the receiving window is 4 and the window length is also 4; the receiver needs to respond to the following three events:

1) A packet with a sequence number in [4,7] is received correctly. In this case:

  1. The receiver sends an ACK for the packet;
  2. If the packet is out of order (not the window base and not previously received), it is buffered;
  3. If the packet's sequence number equals the window base 4, then this packet and any previously buffered packets with consecutive sequence numbers are delivered to the upper layer, and the receiving window moves forward.

2) A packet with a sequence number in [0,3] is received correctly. In this case an ACK must still be generated, even though the packet is one the receiver has already acknowledged; if the receiver did not re-acknowledge it, the sender's window could never move forward.

3) In all other cases, the packet is ignored.

For the receiver, if a packet is received correctly regardless of whether it is in order, the receiver will return an ACK for the packet to the sender. Out-of-order packets will be cached until all lost packets are received, at which time a batch of packets can be delivered to the upper layer in order and the receiving window can be slid forward;

For example: suppose the packet with sequence number 2 is lost. The receiver ACKs each packet it does receive individually and buffers the out-of-order packets 3, 4, and 5; the sender, seeing no ACK for packet 2, retransmits only packet 2. Once packet 2 arrives, the receiver delivers packets 2 through 5 in order and slides the receiving window forward; at this point all messages in [0, 5] have been received normally and can be processed;
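The key contrast with Go-Back-N is which packets go again on a loss. A minimal sketch (illustrative name `sr_retransmit`; real SR tracks per-packet timers rather than a set difference):

```python
# Selective-Repeat sketch: only packets whose individual ACK is missing
# are retransmitted, unlike Go-Back-N's whole-window resend.

def sr_retransmit(sent, acked):
    """Packets to resend: sent but not individually acknowledged."""
    return sorted(set(sent) - set(acked))

# Packets 0..5 sent; packet 2 was lost, every other packet was ACKed
# individually. Only packet 2 is retransmitted:
print(sr_retransmit(range(6), [0, 1, 3, 4, 5]))  # [2]
```

Under the same loss, Go-Back-N would resend 2 through 5; Selective-Repeat resends only packet 2, which is precisely the efficiency gain described above.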

(6) What is the relationship between the TCP protocol and the ARQ protocol?

TCP is based on the ARQ protocol, which uses ARQ's acknowledgment and retransmission mechanism to achieve reliable data transmission. The TCP protocol uses a variant of the ARQ protocol; in TCP, the sender divides the data into small data segments and uses the ARQ mechanism to ensure their reliable transmission. The sender will wait for confirmation from the receiver and retransmit if necessary. The receiver will verify the received data and send a confirmation or request retransmission message.

(7) How to understand RTT and RTO?

RTT (Round-Trip Time): Round trip delay. It represents the total delay experienced from the time the sender sends data to the time the sender receives the confirmation from the receiver (assuming that the receiver sends the confirmation immediately after receiving the data); it consists of three parts:

  1. Link propagation time (propagation delay): the time a signal needs to travel over the physical links from the sending end to the receiving end;
  2. End-system processing time: the time the end systems spend processing the data after it arrives;
  3. Queuing and processing time in router buffers (queuing delay): the time packets wait inside routers before being processed and forwarded;

Among them, the values ​​of the first two parts are relatively fixed for a TCP connection, and the queuing and processing time in the router cache will change as the congestion level of the entire network changes. Therefore, changes in RTT reflect the degree of network congestion to a certain extent.

RTO (Retransmission TimeOut): the retransmission timeout. It is adjusted dynamically based on the RTT and controlled automatically by the protocol stack; as a rule of thumb the RTO should be at least about 1.5 times the RTT. If the RTO is too small, retransmissions fire too frequently; if it is too large, the wait before retransmitting is too long;
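The text says the RTO is derived dynamically from the RTT. One widely documented way to do this is the smoothed SRTT/RTTVAR estimator of RFC 6298, sketched here in simplified form (no clock granularity, no minimum-RTO clamp):

```python
# Simplified RFC 6298-style RTO estimation from RTT samples.
# SRTT smooths the RTT; RTTVAR tracks its variation; RTO = SRTT + 4*RTTVAR.

def make_rto_estimator(alpha=1/8, beta=1/4):
    srtt = None
    rttvar = None

    def update(rtt_sample_ms):
        nonlocal srtt, rttvar
        if srtt is None:                 # first measurement
            srtt = rtt_sample_ms
            rttvar = rtt_sample_ms / 2
        else:
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - rtt_sample_ms)
            srtt = (1 - alpha) * srtt + alpha * rtt_sample_ms
        return srtt + 4 * rttvar         # the RTO, in ms

    return update

rto = make_rto_estimator()
print(rto(100))   # first sample: 100 + 4*50 = 300.0 ms
```

Notice that a stable RTT shrinks RTTVAR and pulls the RTO down toward the RTT, while jittery samples push it back up; this is how the stack keeps the RTO comfortably above the RTT without fixing a constant multiplier.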

1.13 TCP flow control

(1) What is flow control?

When two parties communicate, the sender's rate is not necessarily equal to the receiver's rate. If the sender transmits too fast, the receiver cannot keep up; data it cannot process immediately goes into a receive buffer (out-of-order packets are also stored there). If the buffer fills up and the sender keeps sending, the receiver can only discard the incoming packets, and large-scale packet loss is an enormous waste of network resources. We therefore need to control the sender's rate so that sender and receiver stay in dynamic balance. Controlling the sender's sending rate is called flow control.

(2) How to control flow?

Both parties in the communication have two sliding windows:

  1. One is used to receive data, called the receiving window;
  2. One is used to send data, called the sending window;

The field in the ACK used to feed back the size of the receiving window is called the window advertisement;

Each time the receiver receives a data packet, the ACK it returns can tell the sender how much space is left in its buffer. This remaining buffer space is called the size of the receiving window, represented by the variable win.

After receiving it, the sender adjusts its sending rate, that is, its sending window size. When the advertised receiving window is 0, the sender stops sending data, preventing massive packet loss;

The above process is the negative feedback mechanism of TCP: through the feedback from the receiving end, the rate of the sending end is determined;

Note: the receiving end does not directly dictate the sender's rate; it advertises its receiving window in the segments it sends, and through this dynamic adjustment of the sending and receiving windows it indirectly constrains the sender's sending rate, achieving flow control.

Question 1: If the sender has stopped sending data because the receiving window is 0, how does it judge whether it can send again?

1) When the sender receives an advertised window win = 0, it stops sending data and starts a persist timer; each time the timer fires, it sends a small probe message to the receiver.

2) After receiving the probe, the receiver reports the current size of its receiving window win via an ACK;

3) After receiving the ACK, the sender determines the size of the receiving window win;

  1. If win = 0, restart the persist timer and wait to send the next probe;
  2. If win > 0, send an appropriate amount of data according to the window size;
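The probe loop just described can be sketched in event-driven style (illustrative names; real stacks also back the persist timer off exponentially):

```python
# Sketch of zero-window handling: when the advertised window is 0, the
# sender probes periodically until the receiver advertises win > 0.

def probe_until_open(window_replies):
    """`window_replies` yields the win value from each probe's ACK.
    Returns how many probes were sent before the window opened."""
    probes = 0
    for win in window_replies:
        probes += 1
        if win > 0:        # window reopened: resume normal sending
            return probes
        # win == 0: restart the persist timer and probe again later
    raise TimeoutError("window never opened")

# Two probes come back with win = 0, the third reports 512 bytes free:
print(probe_until_open(iter([0, 0, 512])))  # 3
```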

Question 2: Is the size of the receiving window fixed?

It is not fixed, the size of the receiving window is dynamically adjusted according to a certain algorithm;

Question 3 : Is the larger the receiving window, the better?

No. By the law of diminishing returns, once the receiving window reaches a certain size, enlarging it further no longer reduces the packet loss rate and only consumes more memory. The receiving window size should therefore be adjusted dynamically according to network conditions and the sender's congestion window.

Question 4 : Are the sending window and receiving window equal?

Not necessarily. When the receiver sends an ACK it tells the sender its current receiving window, and the sender sets its sending window accordingly, but the two are not necessarily equal: by the moment the ACK arrives, the receiver has already been draining the data piled up in its buffer, so under normal circumstances the receiving window >= the sending window.

1.14 TCP congestion control

Congestion control and flow control take very similar actions, but they address different problems:

  1. Congestion control deals with congestion in the network;
  2. Flow control deals with the buffer status of the receiver;

(1) What is congestion control?

Assume that host A transmits data to host B. When two hosts transmit data packets, if the sender does not receive an ACK from the receiver for a long time, the sender will think that the data packet it sent is lost, and will retransmit the lost data packet.

However, in actual situations, it is possible that too many hosts are using channel resources at this time, causing network congestion, and the data packet sent by A is blocked halfway and has not reached B for a long time. At this time, A mistakenly believes that packet loss has occurred and will retransmit the data packet.

The result is not only a waste of channel resources, but also making the network more congested. Therefore, congestion control is required.

(2) How to know the congestion situation of the network?

After A establishes a connection with B, it can start sending data. But at this point A knows nothing about the congestion state of the network, so it does not know how many packets it can safely send in one burst.

Congestion window: the number of data packets that A sends continuously in one burst is called the congestion window; let N denote its current size. To probe the network's congestion state, two strategies can be adopted:

1) Linear growth: first send 1 data packet as a probe; if it does not time out (no packet loss), send 2 next time; if again no timeout occurs, send 3 the time after that, and so on, i.e., N = 1, 2, 3, 4, 5…

2) Exponential growth: growing one by one is too slow, so instead start by sending 1; if no timeout occurs, send 2; if there is still no timeout, send 4, then 8, …, doubling each round, i.e., N = 1, 2, 4, 8, 16…

Disadvantages: with either method there is ultimately a bottleneck value. But the first method grows too slowly and may take a very long time to reach the bottleneck, while the second grows too fast and may blow past it almost immediately;

To balance probing that is too slow against probing that is too fast, the two methods can be combined: grow exponentially at first, and once the window reaches a certain threshold (denoted ssthresh), switch to linear growth. The exponential phase is called slow start and the linear phase is called congestion avoidance;

Figure 1 Slow start
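The two growth phases can be simulated per RTT round (a textbook model with illustrative names, not a real stack's byte-based accounting):

```python
# cwnd growth per RTT in the textbook model: double during slow start
# until ssthresh is reached, then grow by 1 per RTT (congestion avoidance).

def cwnd_trajectory(ssthresh, rounds):
    cwnd = 1
    out = [cwnd]
    for _ in range(rounds):
        if cwnd < ssthresh:
            cwnd *= 2        # slow start: exponential growth
        else:
            cwnd += 1        # congestion avoidance: linear growth
        out.append(cwnd)
    return out

# With ssthresh = 16: exponential up to 16, then linear.
print(cwnd_trajectory(16, 7))  # [1, 2, 4, 8, 16, 17, 18, 19]
```

The output shows the knee at ssthresh: doubling 1 → 2 → 4 → 8 → 16, then +1 per round afterward.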

(3) If a bottleneck is reached, how to deal with it?

As the congestion window grows, the sending rate keeps increasing. When TCP experiences a retransmission timeout for a segment, it concludes that network congestion has occurred;

1) ssthresh is then updated to half of the current congestion window; in the figure above it becomes half of 24, i.e., 12;

2) cwnd is reset to 1;

3) Slow start followed by congestion avoidance then runs again, as shown in the figure above;

If instead the TCP sender receives 3 duplicate acknowledgments in a row, it treats this as ordinary packet loss rather than network congestion; this is where the fast recovery algorithm comes in:

1) Retransmit the lost segment;

2) Execute the fast recovery algorithm;

Figure 2 Congestion avoidance

(4) What is quick recovery?

A loss signal does not necessarily mean network congestion; a single data packet may simply have been lost or corrupted. Treating every such event as congestion and restarting from the initial state would make the control overly strict and keep the sending rate unnecessarily low;

For this situation we rely on duplicate ACKs: receiving the same acknowledgment three times in a row indicates that a particular packet was lost, and that packet can then be retransmitted quickly;

Suppose A sends packets M1, M2, M3, M4, M5, … to B. If B receives M1, M2, M4, … but never M3, it keeps acknowledging M2, which tells A that M3 has not arrived and may have been lost.

When A receives three duplicate ACKs for M2 before M3's timer has expired, A knows M3 is probably lost. It does not wait for M3's timer; it retransmits M3 immediately, and sets ssthresh to half of the current congestion window. Crucially, the congestion window N is not reset to 1: it is set directly to N = ssthresh and then grows linearly. This process is called fast recovery, and the TCP version with fast recovery is known as TCP Reno.

Figure 3 Quick recovery

Note: another TCP version resets the congestion window to 1 and restarts from the initial state whether it sees three duplicate ACKs or a timeout event. This version is called TCP Tahoe.
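The contrast between the two variants, as described above, fits in one function (a simplified sketch with illustrative names; real Reno also inflates the window during recovery):

```python
# How the two classic variants react to a loss signal, per the text:
# Tahoe always restarts from cwnd = 1; Reno distinguishes 3 duplicate
# ACKs (fast recovery) from a timeout. Simplified sketch.

def on_loss(variant, cwnd, kind):
    """Return (new_cwnd, new_ssthresh) after a loss event.
    `kind` is 'timeout' or 'dup_acks' (three duplicate ACKs)."""
    ssthresh = max(cwnd // 2, 2)        # halve the threshold
    if variant == "tahoe" or kind == "timeout":
        return 1, ssthresh              # back to slow start
    return ssthresh, ssthresh           # Reno fast recovery: linear growth

print(on_loss("reno", 24, "dup_acks"))   # (12, 12)
print(on_loss("tahoe", 24, "dup_acks"))  # (1, 12)
```

With cwnd = 24, three duplicate ACKs leave Reno sending at 12 and growing linearly, while Tahoe falls all the way back to 1 and repeats slow start.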

1.15 TCP/IP protocol stack report format example

Taking UDP communication as an example, view the message structure of each layer of the TCP/IP model:

(1) Link layer

(2) Network layer (IP protocol)

(3) Transport layer (UDP protocol)

Conclusions:

  1. The MAC address is a product of Ethernet (the link layer);
  2. The IP address is a product of the network layer;
  3. The port is a product of the transport layer;

(4) ARP protocol (network layer)

Address Resolution Protocol: maps an IPv4 address to a hardware address (MAC address); it works by broadcasting on the local network;

(5) ICMP protocol (network layer)

Internet Control Message Protocol: carries control and error messages between hosts and routers;



Origin blog.csdn.net/Python_0011/article/details/132868629