Computer Network (continuously updated...)

1. Let’s talk about the computer network architecture

There are generally three types of computer network architecture: OSI seven-layer model, TCP/IP four-layer model, and five-layer structure.

Simply put, OSI is a theoretical network communication model, TCP/IP is the actual network communication model, and the five-layer structure is a compromise network communication model to introduce network principles.

OSI seven-layer model

The OSI seven-layer model is a standard system for interconnection between computers or communication systems developed by the International Organization for Standardization.

Application layer: Specific network applications are completed through the interaction between application processes. The application layer protocol defines the rules for communication and interaction between application processes. Common protocols include HTTP, FTP, SMTP, SNMP, and DNS.
Presentation layer: data representation, security, and compression. Ensure that information sent by the application layer of one system can be read by the application layer of another system.
Session layer: establishes, manages, and terminates sessions. It is the interface between user applications and the network.
Transport layer: Provides reliable and transparent data transmission between the source and destination. The transport layer protocol provides logical communication for processes running on different hosts.
Network layer: Handles logical addressing and path selection (routing) between different networks. The protocols include ICMP, IGMP, IP, etc.
Data link layer: Based on the physical layer providing bit stream services, it establishes data links between adjacent nodes.
Physical layer: Establish, maintain, and disconnect physical connections.

TCP/IP four-layer model

Application layer: corresponds to the OSI reference model (application layer, presentation layer, session layer).
Transport layer: Corresponding to the transport layer of OSI, it provides end-to-end communication functions for application layer entities, ensuring the sequential transmission of data packets and the integrity of data.
Internet layer: Corresponding to the network layer of the OSI reference model, it mainly solves host-to-host communication problems.
Network interface layer: Corresponds to the data link layer and physical layer of the OSI reference model.

Five-layer structure

Application layer: corresponds to the OSI reference model (application layer, presentation layer, session layer).
Transport layer: The transport layer corresponding to the OSI reference model
Network layer: The network layer corresponding to the OSI reference model
Data link layer: Data link layer corresponding to the OSI reference model
Physical layer: The physical layer corresponding to the OSI reference model.

2. Tell me what network protocols correspond to each layer?

3. So how is data transmitted between layers?

For the sender, it is packed layer by layer from top to bottom, and for the receiver, it is unpacked layer by layer from bottom to top.

The sender's application process (AP) transfers data to the receiver's application process as follows:
The AP first hands the data to the application layer of its own host. The application layer adds its control information H5, and the result becomes the data unit of the next layer down.
After the transport layer receives this data unit, it adds its own control information H4 and passes the result to the network layer, where it becomes the data unit of the network layer.
The network layer likewise adds its control information H3 and passes the result to the data link layer.
At the data link layer, the control information is divided into two parts, which are added as the header (H2) and trailer (T2) of this layer's data unit.
Finally, the physical layer transmits the result as a bit stream.
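
The packing and unpacking described above can be sketched in a few lines of Python. This is only an illustration: the byte strings `H5|`, `H4|`, `H3|`, `H2|` and `|T2` are made-up stand-ins for real control information, H3 is assumed to be the network layer header by analogy with the other layers, and `encapsulate` / `decapsulate` are hypothetical helper names.

```python
def encapsulate(app_data: bytes) -> bytes:
    """Sender side: wrap the data layer by layer, from top to bottom."""
    unit = b"H5|" + app_data          # application layer adds H5
    unit = b"H4|" + unit              # transport layer adds H4
    unit = b"H3|" + unit              # network layer adds H3 (assumed by analogy)
    unit = b"H2|" + unit + b"|T2"     # data link layer adds header H2 and trailer T2
    return unit                       # the physical layer would send this as bits


def _strip_prefix(data: bytes, prefix: bytes) -> bytes:
    """Remove one layer's header, failing loudly if it is not there."""
    if not data.startswith(prefix):
        raise ValueError(f"expected header {prefix!r}")
    return data[len(prefix):]


def decapsulate(frame: bytes) -> bytes:
    """Receiver side: unwrap the data layer by layer, from bottom to top."""
    if not frame.endswith(b"|T2"):
        raise ValueError("expected trailer T2")
    data = _strip_prefix(frame[:-len(b"|T2")], b"H2|")   # data link layer
    for header in (b"H3|", b"H4|", b"H5|"):              # network, transport, application
        data = _strip_prefix(data, header)
    return data


frame = encapsulate(b"hello")
print(frame)               # b'H2|H3|H4|H5|hello|T2'
print(decapsulate(frame))  # b'hello'
```

Each layer only looks at its own header, which is why the receiver can peel them off in the reverse order.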

4. What is the process from entering the URL in the browser address bar to displaying the homepage?

The general process is relatively simple, but many points can be explored in detail: DNS resolution, the TCP three-way handshake, the HTTP message format, the TCP four-way wave (connection termination), etc.

DNS resolution: resolve domain names into corresponding IP addresses.
TCP connection: Establish a TCP connection with the server through a three-way handshake
Send an HTTP request to the server
The server processes the request and returns an HTTP response
The browser parses and renders the page
Disconnect: TCP performs the four-way wave and the connection ends.
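
The steps above can be traced with plain sockets. The sketch below is not how a browser is implemented; it assumes a toy local server (`serve_one_request`, a hypothetical stand-in for a real web server) so that the example runs without Internet access, and `fetch` and `demo` are illustrative names.

```python
import socket
import threading


def serve_one_request(listener):
    """Toy stand-in for a web server: answer one request with a fixed page."""
    conn, _ = listener.accept()
    with conn:
        conn.recv(4096)  # read the request (contents are ignored here)
        body = b"<h1>home</h1>"
        head = b"HTTP/1.1 200 OK\r\nContent-Length: %d\r\nConnection: close\r\n\r\n" % len(body)
        conn.sendall(head + body)


def fetch(host, port, path="/"):
    # DNS resolution: turn the host name into an IP address.
    ip = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)[0][4][0]
    # TCP connection: create_connection() performs the three-way handshake.
    with socket.create_connection((ip, port), timeout=5) as sock:
        # Send the HTTP request.
        request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # Read the HTTP response until the server closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # Leaving the with-block closes the socket, which ends the TCP connection.
    head, _, body = b"".join(chunks).partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    return status_line, body  # parsing and rendering the body is the browser's job


def demo():
    listener = socket.create_server(("127.0.0.1", 0))  # port 0: the OS picks a free port
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_one_request, args=(listener,), daemon=True)
    worker.start()
    try:
        return fetch("localhost", port)
    finally:
        worker.join(timeout=5)
        listener.close()


if __name__ == "__main__":
    print(demo())
```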

Let's take entering www.baidu.com as an example:

What protocols are used at each layer?

4. What are the security attacks?

Two types: active attack and passive attack.

Passive attack: refers to an attacker eavesdropping on other people's communication on the network. This type of attack is usually called interception. There are two main forms of passive attack: message content disclosure and traffic analysis. Because the attacker does not modify the data, this kind of attack is difficult to detect.
Active attack: directly affects existing data and services. Common active attack types include:

Tampering: The attacker deliberately tampers with messages sent on the network, or even sends completely forged messages to the recipient.
Malicious programs: There are many types of malicious programs, including computer viruses, computer worms, Trojan horses, backdoors, rogue software, etc.
Denial of Service (DoS): The attacker keeps sending packets to the server so that the server cannot provide normal service.

5. Do you understand DNS hijacking?

DNS hijacking, also known as domain name hijacking, is an attack method that causes users to access the wrong website by replacing the IP address corresponding to the original domain name, or making it impossible for users to access the website normally.

Domain name hijacking can often only be carried out within a specific network range, and DNS servers outside that range still return the correct IP addresses. An attacker can impersonate the organization that owns the original domain name, modify the organization's domain name registration information by email or transfer the domain name to another host, and save the new domain name information on a designated DNS server. As a result, users who resolve the original domain name can no longer reach the intended target address.

What are the steps for DNS hijacking?

Obtain the information of the domain name to be hijacked: The attacker first queries the registration information of the target site's domain name.
Take control of the email account associated with the domain name: After obtaining the domain name information, the attacker uses brute force or other special methods to crack the password of the email account the company used to register the domain name. More advanced attackers may even steal the email information directly.
Modify registration information: Once the email account is compromised, the attacker uses the relevant change functions to modify the registration information of the domain name, including domain name owner information, DNS server information, etc.
Use the email account to send and receive confirmation letters: After modifying the registration information, the attacker receives the confirmation email for the change before the real owner does, replies to confirm the modification, and waits for the network company to reply with a letter stating that the change succeeded. At that point the attacker has completed the DNS hijacking.

How to deal with DNS hijacking?

Access websites directly through their IP addresses, which bypasses DNS resolution and therefore avoids DNS hijacking.
Since domain name hijacking can often only be carried out within a specific network range, advanced users can point their DNS settings at a trustworthy domain name server to regain normal access to the target URL. For example, the computer's preferred DNS server address can be fixed at 8.8.8.8.

6. What is a CSRF attack? How to avoid it?

What is a CSRF attack?

CSRF, cross-site request forgery (English full name is Cross-site request forgery), is an attack method that coerces users to perform unintended operations on the currently logged-in web application.

How does CSRF attack?

The user logs into the bank website and does not log out, so the browser still holds the user's identity authentication information for the bank.
The attacker embeds a forged transfer request in a post.
The user browses the post while still logged in to the bank website.
The browser sends the forged transfer request, along with the identity authentication information, to the bank website.
The bank website sees the identity authentication information, treats the request as a legitimate operation by the user, and the user ultimately loses funds.

How to deal with CSRF attacks?

Check the Referer field

The Referer field in the HTTP header records the source address of the HTTP request. Under normal circumstances, the request to access a security-restricted page comes from the same website, and if a hacker wants to implement a CSRF attack on it, he can generally only construct the request on his own website. Therefore, CSRF attacks can be defended by verifying the Referer value.

Add verification token

Add a randomly generated token as a parameter to the HTTP request, and set up an interceptor on the server side to verify the token. If the request contains no token, or the token content is incorrect, the request is treated as a possible CSRF attack and rejected.
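
A minimal sketch of the token check, using only the Python standard library. The `session` dictionary stands in for whatever server-side session store a real framework provides, and `issue_csrf_token` / `is_request_allowed` are hypothetical names rather than the API of any particular framework.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create a random token, remember it server-side, and return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_request_allowed(session: dict, submitted_token) -> bool:
    """Interceptor check: reject requests with a missing or incorrect token."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # compare_digest compares in constant time, so timing does not leak the token
    return hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8"))


session = {}
token = issue_csrf_token(session)
print(is_request_allowed(session, token))     # True: the page's own form sent the token
print(is_request_allowed(session, "forged"))  # False: the attacker cannot guess the token
```

The defense works because the forged request is built on the attacker's page, which cannot read the token that the legitimate page received.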

Multiple verification for sensitive operations

For some sensitive operations, in addition to verifying the user's authentication information, multiple verifications can also be performed through email confirmation and verification code confirmation.

7. What is an XSS attack and how to avoid it?

XSS attacks are also relatively common. XSS stands for Cross-Site Scripting. To avoid confusion with the abbreviation of Cascading Style Sheets (CSS), Cross-Site Scripting is abbreviated as XSS. It refers to an attacker inserting malicious HTML or script code into a web page. When a user browses the page, the embedded code is executed, thereby achieving the attacker's goal of attacking the user.

XSS attacks are generally divided into three types: stored, reflected, and DOM-based XSS.

How is XSS attacked?

Simply put, the XSS attack method is to find a way to "instruct" the user's browser to execute some front-end code that does not originally exist in the web page.

Take reflected XSS as an example. The flow is as follows:

The attacker constructs a special URL that contains malicious code.
When a user opens a URL with malicious code, he or she accesses a normal website server.
The website server takes out the malicious code from the URL, splices it into HTML and returns it to the browser.
After the user's browser receives the response, it parses and executes it. The malicious code mixed in it is also executed, requests the malicious server, and sends user data.
The attacker can steal the user's data, impersonate the user's behavior, and call the target website interface to perform operations specified by the attacker.

How to deal with XSS attacks?

Filter input (tags, etc.) so that only legal values are allowed.
Escape HTML before output.
For link jumps, such as <a href="xxx">, verify the content and prohibit illegal links, such as those that start with a script scheme like javascript:.
Limit input length.
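
Three of these measures (escaping, link checking, length limiting) can be sketched with the standard library. `render_comment`, `is_safe_link`, the 200-character limit and the allowed scheme list are illustrative assumptions, not a complete XSS defense.

```python
import html
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only absolute web links are accepted


def render_comment(user_text: str, max_length: int = 200) -> str:
    """Limit the input length, then escape it before splicing it into HTML."""
    text = user_text[:max_length]
    return "<p>" + html.escape(text, quote=True) + "</p>"


def is_safe_link(url: str) -> bool:
    """Reject links such as javascript:... by allowing only known schemes."""
    scheme = urlsplit(url.strip()).scheme.lower()
    return scheme in ALLOWED_SCHEMES


print(render_comment("<script>alert(1)</script>"))
# <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
print(is_safe_link("https://example.com/page"))  # True
print(is_safe_link("javascript:alert(1)"))       # False
```

After escaping, the browser displays the payload as text instead of executing it.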

8. What are DoS, DDoS and DRDoS attacks?

DoS: Denial of Service. All attacks that can cause a denial of service are called DoS attacks. The most common DoS attacks include network bandwidth attacks and connectivity attacks.
DDoS: Distributed Denial of Service. It refers to multiple attackers in different locations launching attacks against one or several targets at the same time, or an attacker controlling multiple machines in different locations and using them to attack the victim simultaneously.

The main forms include traffic attacks and resource exhaustion attacks. Common DDoS attacks include SYN Flood, Ping of Death, ACK Flood, UDP Flood, etc.

DRDoS: Distributed Reflection Denial of Service. The attacker sends a large number of packets whose source address is forged as the victim's IP address to intermediate hosts (reflectors), and those hosts then send a large number of responses to the victim's address, resulting in a denial of service.

How to protect against DDoS?

For traffic attacks in DDoS, the most direct method is to increase the bandwidth. In theory, as long as the bandwidth is greater than the attack traffic, it is fine. However, this method is very costly. On the premise of sufficient bandwidth, we should try to improve the configuration of hardware facilities such as routers, network cards, and switches.

For resource exhaustion attacks, we can upgrade the host server hardware so that, with network bandwidth ensured, the server can effectively withstand massive numbers of SYN attack packets. We can also install a professional anti-DDoS firewall to combat traffic-based attacks such as SYN Flood. In addition, technologies such as load balancing and CDN can effectively help combat DDoS attacks.

9. What is the difference between symmetric encryption and asymmetric encryption?

Symmetric encryption: refers to the use of the same key for encryption and decryption. The advantage is that computation is fast; the disadvantage is the problem of how to transmit the key to the other party safely. Common symmetric encryption algorithms include DES, AES, etc.

Asymmetric encryption: refers to the use of different keys (i.e. a public key and a private key) for encryption and decryption. Public keys and private keys exist in pairs. If the public key is used to encrypt data, only the corresponding private key can decrypt it. Common asymmetric encryption algorithms include RSA.

What is the difference between RSA and AES algorithms?

RSA

RSA uses asymmetric encryption: the public key encrypts and the private key decrypts. RSA keys are generally longer (commonly 2048 bits or more). Because it requires calculations such as multiplication and modular arithmetic on large numbers, it is slow and is not suitable for encrypting large amounts of data.
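
The public/private key relationship can be seen in textbook RSA with deliberately tiny numbers. This is a teaching sketch only: the primes 61 and 53 and the exponent 17 are toy values chosen for readability, there is no padding, and nothing here is secure. `make_toy_keypair`, `encrypt` and `decrypt` are illustrative names.

```python
def make_toy_keypair(p: int = 61, q: int = 53, e: int = 17):
    """Build a (public, private) key pair from two small primes."""
    n = p * q                      # modulus shared by both keys
    phi = (p - 1) * (q - 1)        # Euler's totient of n
    d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)
    return (e, n), (d, n)


def encrypt(message: int, public_key) -> int:
    e, n = public_key
    return pow(message, e, n)      # modular exponentiation with the public key


def decrypt(ciphertext: int, private_key) -> int:
    d, n = private_key
    return pow(ciphertext, d, n)   # only the matching private key reverses it


public_key, private_key = make_toy_keypair()
ciphertext = encrypt(65, public_key)
print(private_key)                       # (2753, 3233)
print(ciphertext)                        # 2790
print(decrypt(ciphertext, private_key))  # 65
```

With real key sizes the same modular exponentiation runs on numbers thousands of bits long, which is where the speed cost comes from.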

AES

AES uses symmetric encryption, and the maximum key length is only 256 bits. Encryption and decryption are fast, and the algorithm is easy to implement in hardware. Because it is symmetric, both communicating parties need to know the key before data transmission.

10. What is the definition and function of IP protocol?

What is the IP protocol?

The IP protocol (Internet Protocol) is a datagram protocol that supports interconnection between networks. It works at the Internet layer, and its main purpose is to improve the scalability of the network.

Through IP, networks with different characteristics that participate in the interconnection can be treated as a single unified network.

Unlike TCP at the transport layer, IP provides a connectionless, unreliable, best-effort packet delivery service. Together with TCP, it forms the core of the TCP/IP protocol suite.

What does the IP protocol do?

The IP protocol mainly has the following functions:

Addressing and routing: The source and destination IP addresses carried in the IP datagram indicate the source host and destination host. During transmission, each intermediate node (IP gateway, router) forwards the datagram based only on the network address. If the intermediate node is a router, it selects an appropriate path according to its routing table. IP forwards datagrams to the destination host based on the routing information provided by routing protocols.
Fragmentation and reassembly: IP datagrams may pass through different networks during transmission, and different networks impose different maximum datagram lengths. IP assigns an identifier to each datagram and records fragmentation and reassembly information in the header, so that the datagram can be carried across these networks. Each fragment is forwarded independently. After the fragments reach the destination host, the destination host reassembles them and restores the original IP datagram.
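
The fragmentation arithmetic can be sketched as follows. The sketch assumes a 20-byte IPv4 header without options and expresses offsets in units of 8 bytes, as the IPv4 Fragment Offset field does; `fragment` is a hypothetical helper, and the 3980-byte payload and 1500-byte MTU are example values.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20):
    """Return a list of (offset_in_8_byte_units, data_len, more_fragments)."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small to carry any data")
    fragments = []
    sent = 0
    while sent < payload_len:
        size = min(max_data, payload_len - sent)
        more = sent + size < payload_len      # MF flag: set on all but the last fragment
        fragments.append((sent // 8, size, more))
        sent += size
    return fragments


for piece in fragment(3980, 1500):
    print(piece)
# (0, 1480, True)
# (185, 1480, True)
# (370, 1020, False)
```

The destination host uses the shared identifier to group the fragments and the offsets to put the data back in order.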

What is the difference between transport layer protocols and network layer protocols?

The network layer protocol is responsible for providing logical communication between hosts; the transport layer protocol is responsible for providing logical communication between processes.

11. What are the classifications of IP addresses?

An IP address is unique across the entire Internet. It can generally be written as IP address = {<network number>, <host number>}.

Network number: It indicates which network of the Internet the host is connected to.
Host number: It identifies which host within that network the address belongs to.

IP addresses are divided into five categories: A, B, C, D, and E:

Class A address (1~126): starts with bit 0; the network number occupies the first 8 bits, and the host number occupies the last 24 bits.
Class B address (128~191): starts with bits 10; the network number occupies the first 16 bits, and the host number occupies the last 16 bits.
Class C address (192~223): starts with bits 110; the network number occupies the first 24 bits, and the host number occupies the last 8 bits.
Class D address (224~239): starts with bits 1110; reserved as multicast addresses.
Class E address (240~255): starts with bits 1111; reserved for future use.
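
The leading-bit rule can be checked directly on the first octet. `address_class` is an illustrative helper; note that it classifies purely by bit pattern, so first octets 0 and 127 (which are reserved and therefore excluded from the 1~126 range above) also come out as class A.

```python
def address_class(ip: str) -> str:
    """Classify a dotted-decimal IPv4 address by the leading bits of its first octet."""
    first = int(ip.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError("first octet out of range")
    if first >> 7 == 0b0:       # 0xxxxxxx
        return "A"
    if first >> 6 == 0b10:      # 10xxxxxx
        return "B"
    if first >> 5 == 0b110:     # 110xxxxx
        return "C"
    if first >> 4 == 0b1110:    # 1110xxxx
        return "D"
    return "E"                  # 1111xxxx


for ip in ("10.0.0.1", "172.16.0.1", "192.168.1.1", "224.0.0.1", "240.0.0.1"):
    print(ip, address_class(ip))
# 10.0.0.1 A
# 172.16.0.1 B
# 192.168.1.1 C
# 224.0.0.1 D
# 240.0.0.1 E
```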

15. How to solve the problem of insufficient IPv4 addresses?

We know that an IPv4 address has 32 bits, which can identify 2 to the power of 32 addresses (about 4.3 billion). That sounds like a lot, but the number of network devices in the world has far exceeded it, so IPv4 addresses are no longer enough. So how can this be solved?

DHCP: Dynamic Host Configuration Protocol. It allocates IP addresses dynamically and assigns them only to devices that are currently connected to the network. Therefore, a device with the same MAC address does not necessarily get the same IP address every time it connects. This allows idle IP addresses to be fully utilized.
CIDR: Classless Inter-Domain Routing. CIDR eliminates the traditional concepts of Class A, Class B, and Class C addresses and of subnets, thereby allocating IPv4 address space more efficiently, but it cannot fundamentally solve the problem of address exhaustion.
NAT: Network Address Translation. Hosts belonging to different LANs can use the same private IP addresses, which alleviates IP address depletion to a certain extent. However, the addresses used by hosts inside a LAN cannot be used on the public network. When a LAN host wants to communicate with a public network host, NAT converts the host's IP address into a global IP address. This effectively eases the shortage of IP addresses.
IPv6: As the next-generation Internet protocol that succeeds IPv4, it provides 2 to the power of 128 addresses, enough to assign an IP address to every grain of sand on Earth. IPv6 can fundamentally solve the problem of insufficient IPv4 addresses.
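
Python's standard `ipaddress` module makes a few of these ideas concrete. The helper names (`split_block`, `needs_nat`, `address_space_size`) and the example addresses are assumptions chosen for illustration.

```python
import ipaddress


def split_block(cidr: str, new_prefix: int):
    """CIDR: carve one block into smaller blocks of any prefix length."""
    network = ipaddress.ip_network(cidr)
    return [str(subnet) for subnet in network.subnets(new_prefix=new_prefix)]


def needs_nat(ip: str) -> bool:
    """NAT: private addresses must be translated before reaching the public network."""
    return ipaddress.ip_address(ip).is_private


def address_space_size(version: int) -> int:
    """Total number of addresses in IPv4 (version 4) or IPv6 (version 6)."""
    everything = "0.0.0.0/0" if version == 4 else "::/0"
    return ipaddress.ip_network(everything).num_addresses


print(split_block("192.168.0.0/22", 24))
# ['192.168.0.0/24', '192.168.1.0/24', '192.168.2.0/24', '192.168.3.0/24']
print(needs_nat("192.168.1.10"), needs_nat("8.8.8.8"))  # True False
print(address_space_size(4) == 2 ** 32)    # True
print(address_space_size(6) == 2 ** 128)   # True
```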

Why do we need a MAC address when we have an IP address?

A device can be assigned an IP address only after it connects to a network, based on the subnet it has joined. When the device has no IP address yet, or while one is being assigned, we need MAC addresses to distinguish different devices.
The IP address can be compared to a delivery address and the MAC address to the recipient. In a communication process, both are indispensable.

Tell us about the working process of the ARP protocol?

ARP (Address Resolution Protocol) is used to map IP addresses to MAC addresses.

First, each host maintains an ARP list in its own ARP cache that records the correspondence between IP addresses and MAC addresses.
When the source host needs to send a data packet to the destination host, it first checks its ARP list for a MAC address corresponding to the destination IP address. If there is one, it sends the packet directly to that MAC address. If not, it broadcasts an ARP request on the local network segment to query the MAC address of the destination host. The ARP request contains the IP address and hardware address of the source host and the IP address of the destination host.
After receiving this ARP request, every host on the network checks whether the destination IP in the packet matches its own IP address. If it does not match, the host ignores the packet. If it matches, the host first adds the sender's MAC address and IP address to its own ARP list, overwriting any existing entry for that IP, and then sends an ARP response to the source host, telling it the MAC address it is looking for.
After the source host receives this ARP response, it adds the IP address and MAC address of the destination host to its own ARP list and uses this information to start data transmission. If the source host never receives an ARP response, the ARP query has failed.
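
The ARP exchange above can be modelled as a small in-memory simulation. No real network traffic is involved: the `Host` class, its methods, and the addresses are all made up for illustration, and the broadcast is modelled as a loop over the hosts on the segment.

```python
class Host:
    def __init__(self, ip: str, mac: str):
        self.ip = ip
        self.mac = mac
        self.arp_table = {}  # IP address -> MAC address

    def handle_request(self, sender_ip: str, sender_mac: str, target_ip: str):
        """Process a broadcast ARP request; return our MAC if it is meant for us."""
        if target_ip != self.ip:
            return None                          # not our IP: ignore the packet
        self.arp_table[sender_ip] = sender_mac   # add or overwrite the sender's entry
        return self.mac                          # ARP response back to the sender

    def resolve(self, target_ip: str, segment):
        """Look up target_ip in the cache, falling back to a broadcast request."""
        if target_ip in self.arp_table:
            return self.arp_table[target_ip]
        answer = None
        for host in segment:                     # broadcast: every other host sees it
            if host is self:
                continue
            reply = host.handle_request(self.ip, self.mac, target_ip)
            if reply is not None:
                answer = reply
        if answer is not None:
            self.arp_table[target_ip] = answer   # cache the result for next time
        return answer                            # None means the ARP query failed


a = Host("192.168.1.2", "aa:aa:aa:aa:aa:aa")
b = Host("192.168.1.3", "bb:bb:bb:bb:bb:bb")
c = Host("192.168.1.4", "cc:cc:cc:cc:cc:cc")
segment = [a, b, c]
print(a.resolve("192.168.1.3", segment))   # bb:bb:bb:bb:bb:bb
print(b.arp_table)                         # {'192.168.1.2': 'aa:aa:aa:aa:aa:aa'}
print(a.resolve("192.168.1.99", segment))  # None
```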

16. What are the functions of the ICMP protocol?

ICMP stands for Internet Control Message Protocol.

ICMP is a connectionless protocol used to transmit error reports and control information.
It is a very important protocol, and it matters greatly for network security. It is a network layer protocol, mainly used to transfer control information between hosts and routers, including reporting errors and exchanging limited control and status information.
ICMP messages are sent automatically in situations such as IP data being unable to reach the target, or an IP router being unable to forward packets at the current transmission rate.

For example, ping, which we use every day, is based on ICMP.

Tell me about the principle of ping?

ping (Packet Internet Groper) is a program used to test network connectivity. Ping is a service command that works at the application layer of the TCP/IP architecture. It sends ICMP (Internet Control Message Protocol) echo request messages to a specific destination host to test whether the destination is reachable and to learn its status.

Generally speaking, ping can be used to check whether the network is reachable. It works on top of the ICMP protocol. Assuming that machine A pings machine B, the working process is as follows:

Ping asks the system to create a fixed-format ICMP request packet.
ICMP hands the packet, together with the IP address of target machine B, to the IP layer.
The IP layer uses the local IP address as the source address and the IP address of machine B as the destination address, adds some other control information, and constructs an IP packet.
The MAC address of target machine B is then obtained (through ARP).
The data link layer constructs a data frame. The destination address is the MAC address obtained in the previous step, and the source address is the MAC address of the local machine.
After receiving the frame, machine B compares the destination address with its own MAC address. If they match, it processes the frame and returns a reply; if not, it discards the frame.
Machine A calculates the round-trip time from the timestamp in the ICMP echo reply returned by the destination host.
The final output includes: the IP address of the destination host; the number of packets sent, received, and lost; and the minimum, maximum, and average round-trip time.
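
The first step, creating a fixed-format ICMP request packet, can be sketched without sending anything (sending would need a raw socket and elevated privileges). The layout follows the ICMP echo request header: type 8, code 0, checksum, identifier, sequence number. `internet_checksum` and `build_echo_request` are illustrative names, and the payload is an arbitrary example.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP (and IP, UDP, TCP)."""
    if len(data) % 2:
        data += b"\x00"                                  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                                   # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    """Build an ICMP echo request: type=8, code=0, checksum, id, sequence, payload."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)  # checksum field = 0 first
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


packet = build_echo_request(identifier=1, sequence=1)
print(len(packet))                # 12: 8-byte header + 4-byte payload
print(internet_checksum(packet))  # 0: a packet with a valid checksum sums to zero
```

The receiver runs the same checksum over the whole message; a result of zero means the packet arrived intact.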

Talk about the difference between TCP and UDP?

The most fundamental difference: TCP is connection-oriented, while UDP is connectionless.

Talk about the application scenarios of TCP and UDP?

TCP application scenarios: Scenarios with relatively low efficiency requirements but relatively high accuracy requirements. Because operations such as acknowledgment, retransmission, and ordering are required during transmission, TCP is less efficient than UDP. For example: file transfer (high accuracy requirements, but the speed can be relatively slow), sending and receiving email, and remote login.
UDP application scenarios: Scenarios with relatively high efficiency requirements and relatively low accuracy requirements. For example: QQ chat, online video, Internet voice calls (real-time communication with high speed requirements, where occasional interruptions are acceptable and retransmission is of no use), and broadcast communication (broadcast, multicast).

Why does QQ use UDP protocol?

First of all, QQ is not entirely based on UDP. For example, when using QQ for file transfer and other activities, TCP will be used as a guarantee of reliable transmission.
The advantage of using UDP for interactive communication is that the delay is shorter and the handling of data loss is relatively simple. At the same time, TCP is a full-duplex protocol and needs to establish a connection, so the network overhead will be relatively large.
If you use QQ Voice and QQ Video, the advantages of UDP are even more prominent. First, the delay is smaller. The most important point is unreliable transmission, which means that if data is lost, there will be no retransmission. Because generally speaking, users can accept that the image is slightly blurred and the sound is slightly unclear, but if the previously lost picture and sound appear again after a few seconds, this may be difficult to accept.
Since QQ's servers are designed for massive scale, a single server must accommodate hundreds of thousands of concurrent connections. Therefore, the server can only sustain this ultra-large-scale service by using the UDP protocol to communicate with clients.

To summarize briefly: UDP protocol is a connectionless protocol. It is highly efficient, fast, takes up less resources, and puts less pressure on the server. However, its transmission mechanism is unreliable and must rely on auxiliary algorithms to complete transmission control. The communication protocol used by QQ is mainly UDP, supplemented by TCP protocol.

Why is the UDP protocol unreliable?

UDP does not need to establish a connection before transmitting data. After receiving a UDP message, the transport layer of the remote host does not send any acknowledgment, so delivery is unreliable. This can be summarized in four points:

No guarantee of message delivery: no acknowledgment, no retransmission, no timeout
Delivery order is not guaranteed: no packet sequence number is set, no rearrangement is performed, and head-of-line blocking does not occur.
No tracking of connection status: no need to establish a connection or restart the state machine
No congestion control: no built-in client or network feedback mechanism
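
The lack of connection setup and acknowledgment is visible in the socket API itself. The sketch below sends one datagram over the loopback interface; `udp_round_trip` is an illustrative name, and the 2-second timeout is an arbitrary choice.

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local receiver without any handshake."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver, \
         socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sender:
        receiver.bind(("127.0.0.1", 0))      # port 0: the OS picks a free port
        receiver.settimeout(2)               # UDP never retransmits, so do not wait forever
        # No connect(), no listen(), no accept(): sendto() just fires the datagram.
        sender.sendto(message, receiver.getsockname())
        # If the datagram were lost, recvfrom() would time out; nothing resends it.
        data, _ = receiver.recvfrom(2048)
        return data


print(udp_round_trip(b"hello"))  # b'hello'
```

Any acknowledgment, ordering, or retransmission has to be added by the application on top of these calls.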

Why does DNS use UDP?

More precisely, DNS uses both TCP and UDP.

TCP is used for zone transfers (the primary domain name server sends the changed data to the secondary domain name server). The amount of data synchronized is larger than that of a single request and response, and TCP allows longer messages. To ensure the correctness of the data, TCP, which provides reliable connections, is used.

When a client queries a DNS server for a domain name (domain name resolution), the returned content generally does not exceed 512 bytes, the traditional maximum length of a DNS message over UDP. UDP needs no connection setup, which greatly improves the response speed, but it requires both the resolver and the domain name server to handle timeouts and retransmissions themselves to ensure reliability.
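
To see how small a typical query is, the sketch below builds a DNS query message by hand (12-byte header plus one question) and measures it. Nothing is sent over the network; `build_dns_query` and `fits_in_udp` are illustrative names, and the query ID 0x1234 is arbitrary.

```python
import struct

MAX_UDP_DNS_MESSAGE = 512  # traditional size limit for DNS over UDP, in bytes


def build_dns_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a standard query for the A record of `domain`."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label is prefixed with its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in domain.split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", 1, 1)  # QTYPE=1 (A), QCLASS=1 (IN)
    return header + question


def fits_in_udp(message: bytes) -> bool:
    return len(message) <= MAX_UDP_DNS_MESSAGE


query = build_dns_query("www.baidu.com")
print(len(query))          # 31
print(fits_in_udp(query))  # True
```

A 31-byte request and a similarly small reply make the cost of a TCP handshake hard to justify for ordinary lookups.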