Computer network interview questions (super detailed)

computer network architecture


application layer

The application layer specifies the protocols that application processes follow when communicating; many application-layer protocols are based on the client-server model. Client and server refer to the two application processes involved in the communication: the client-server model describes the relationship between the serving and served processes, where the client is the service requester and the server is the service provider, and messages are exchanged between them.

Protocols involved:
Domain Name System DNS: a distributed database that maps domain names to IP addresses, letting people access the Internet by name instead of having to remember the numeric IP strings that machines read directly.
HTTP Protocol: Hypertext Transfer Protocol, all World Wide Web documents must comply with this standard. HTTP was originally designed to provide a way to publish and receive HTML pages.
Mail transfer protocol: SMTP
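
As a small illustration of the application layer depending on DNS, the sketch below (Python standard library only; the hostname is just an example) resolves a domain name to its IP addresses the way a browser would before opening a connection:

```python
import socket

def resolve(hostname: str) -> list[str]:
    """Ask the system resolver (which uses DNS) for the host's IPv4 addresses."""
    infos = socket.getaddrinfo(hostname, 80, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, (ip, port)); keep the unique IPs.
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    print(resolve("example.com"))  # prints the resolved IPv4 addresses
```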

transport layer

Only the protocol stacks of hosts at the edge of the network have a transport layer. Unlike the network layer, which provides logical communication between hosts, the transport layer provides general data transmission services between application processes (end-to-end services). The transport layer has functions such as flow control (matching the sending rate to what the receiver can handle) and congestion control (preventing too much data from being injected into the network by suppressing the sending rate), providing the upper layers with reliable, transparent end-to-end data transmission, so that users of the service do not need to care about the implementation details of the communication subnet.

The transport layer mainly uses the following two protocols:

  • Transmission Control Protocol TCP: A connection-oriented protocol that provides reliable data transmission services.
  • User Datagram Protocol UDP: A connectionless protocol that does not provide reliable delivery.

Protocols running on the TCP protocol:

HTTP (Hypertext Transfer Protocol), used to transfer hypertext from web servers to the local browser. Port: 80
HTTPS (HTTP over SSL, Hypertext Transfer Protocol Secure), a secure version of the HTTP protocol. Port: 443
FTP (File Transfer Protocol), used for file transfer. Port: 21
SMTP (Simple Mail Transfer Protocol), used to send e-mail. Port: 25
SSH (Secure Shell, a replacement for the insecure TELNET), used for encrypted, secure remote login. Port: 22

Protocols running on the UDP protocol:

DNS (Domain Name System), used for address lookup (domain name resolution), mail routing, and other work.
SNMP (Simple Network Management Protocol) is used for network information collection and network management.
NTP (Network Time Protocol) is used for clock synchronization between hosts on the network.
DHCP (Dynamic Host Configuration Protocol), dynamically configures IP addresses.

Network layer

The network layer adopts the IP datagram service: two nodes are addressed through IP, and the segments or user datagrams produced by the transport layer are encapsulated into packets for transmission. No connection is established first, and each packet is sent independently, regardless of the packets before and after it. The network layer therefore does not provide end-to-end reliable transport; it delivers on a best-effort basis (reliable delivery is the responsibility of the transport layer in the hosts at the edge of the network).

The Internet Protocol IP is one of the two most important protocols in the TCP/IP system. There are three other protocols used in conjunction with the IP protocol:

  • ARP (Address Resolution Protocol)
  • Internet Control Message Protocol ICMP (Internet Control Message Protocol)
  • Internet Group Management Protocol IGMP (Internet Group Management Protocol)

ARP (Address Resolution Protocol) maps IP addresses to physical (MAC) addresses and is used to dynamically resolve Ethernet hardware addresses.
Working principle:

  • First, each host maintains an ARP table in its own ARP cache that records the correspondence between IP addresses and MAC addresses.
  • When the source host needs to send a data packet to the destination host, it will first check whether the MAC address corresponding to the IP address exists in its ARP list:
  • If there is, send the packet directly to this MAC address;
  • If not, send an ARP request broadcast packet to the local network segment to query the MAC address corresponding to the destination host. The ARP request packet includes the IP address of the source host, the hardware address, and the IP address of the destination host.
  • After all hosts in the network receive this ARP request, they will check whether the destination IP in the data packet is the same as their own IP address.
  • If it is not the same, the host ignores the packet; if it is the same, the host first adds the sender's MAC address and IP address to its own ARP table (overwriting any existing entry for that IP), and then sends an ARP reply to the source host, telling it the MAC address it is looking for;
  • After the source host receives the ARP response packet, it adds the obtained IP address and MAC address of the destination host to its own ARP list, and uses this information to start data transmission.
  • If the source host never receives an ARP reply, the ARP query has failed.
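
A toy simulation of the lookup logic described above; this is not a real ARP implementation, and the addresses and the broadcast stand-in below are invented purely for illustration:

```python
# Hypothetical ARP-cache sketch: map IP -> MAC, falling back to a broadcast query.
arp_cache: dict[str, str] = {"192.168.1.1": "aa:bb:cc:dd:ee:01"}   # previously learned entry

def broadcast_arp_request(target_ip: str) -> str | None:
    """Stand-in for a 'who has target_ip?' broadcast on the local segment."""
    replies = {"192.168.1.23": "aa:bb:cc:dd:ee:17"}   # pretend this host answered
    return replies.get(target_ip)

def resolve_mac(target_ip: str) -> str | None:
    if target_ip in arp_cache:                        # cache hit: send the frame directly
        return arp_cache[target_ip]
    mac = broadcast_arp_request(target_ip)            # cache miss: broadcast an ARP request
    if mac is not None:
        arp_cache[target_ip] = mac                    # learn the mapping for next time
    return mac                                        # None means the ARP query failed

print(resolve_mac("192.168.1.23"))                    # aa:bb:cc:dd:ee:17
```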

data link layer

Data link layer: data transmission between two devices can be regarded as taking place over a pipe, and the unit of transmission is the frame (each frame contains data plus the necessary control information, such as synchronization information, address information, and error control); the link layer is responsible for the correctness of the transmitted data.

Three key issues of the data link layer:

  • Encapsulation into frames: a header and a trailer are added before and after a block of data to form a frame; an important role of the header and trailer is to delimit the frame.
  • Transparent transmission: if a byte in the data happens to have the same binary code as the control characters SOH or EOT, the data link layer would mistakenly take it as a frame boundary. Transparent transmission means any data can pass through the data link layer unchanged and without error; the problem is solved by character (byte) stuffing, i.e. inserting escape characters.
  • Error detection: bit errors may occur during transmission: a 1 may become a 0 and a 0 may become a 1. Over a period of time, the ratio of erroneous bits to the total number of transmitted bits is called the bit error rate (BER), which depends strongly on the signal-to-noise ratio. Computer networks must therefore use error-detection measures when transmitting data; in the frames transmitted by the data link layer, the cyclic redundancy check (CRC) is the widely used error-detection technique (see the sketch below).
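
To make the CRC idea concrete, here is a minimal sketch using the CRC-32 routine from Python's standard zlib module (the frame contents are arbitrary): a single flipped bit changes the checksum, so the receiver detects the error.

```python
import zlib

frame = b"hello, link layer"
fcs = zlib.crc32(frame)                  # sender appends this frame check sequence

# Simulate a bit error in transit: flip one bit of the first byte.
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]

print(zlib.crc32(frame) == fcs)          # True  -> frame accepted
print(zlib.crc32(corrupted) == fcs)      # False -> error detected, frame discarded
```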


physical layer

The physical layer considers how to transmit the data bit stream over the transmission media that connect the computers. Its role is to shield, as far as possible, the differences between transmission media and communication means, so that the data link layer above it does not have to care what the specific transmission medium of the network is. Data is transmitted in bits.

The main task of the physical layer: to define the characteristics of the interface to the transmission medium (the physical characteristics of the interface, the voltage ranges used, and the corresponding functions of the signals).


The benefits of network protocol layering?

  • Simplify problem difficulty and complexity. Since the layers are independent, we can split large problems into small ones.
  • Good flexibility. When the technology of one of the layers changes, as long as the interface relationship between the layers remains unchanged, the other layers are not affected.
  • Easy to implement and maintain.
  • Promote standardization work. With the layers separated, the functions of each layer can be described relatively simply.

TCP vs UDP

  • TCP (Transmission Control Protocol) is a connection-oriented protocol: a connection must be established before data is sent. TCP provides a reliable service, meaning data transmitted over a TCP connection is not lost, not duplicated, and arrives in order. (Similar to making a phone call.)
  • UDP (User Datagram Protocol) is part of the TCP/IP protocol suite. It is a connectionless protocol: no connection needs to be established before sending data, and it is unreliable. Because there is no connection, datagrams may travel over any available path in the network, so neither delivery, nor arrival time, nor content integrity can be guaranteed. (Similar to sending a WeChat message.)
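
The difference shows up directly in the sockets API. A minimal sketch (Python standard library; the address is a placeholder and assumes something is listening there): the TCP socket must connect() before sending, which triggers the handshake, while the UDP socket simply fires datagrams at an address.

```python
import socket

HOST, PORT = "127.0.0.1", 9000   # placeholder address used only for illustration

# TCP: connection-oriented; connect() triggers the three-way handshake and the
# byte stream is then delivered reliably and in order.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as tcp_sock:
    tcp_sock.connect((HOST, PORT))
    tcp_sock.sendall(b"hello over TCP")

# UDP: connectionless; no handshake, each datagram is sent independently with no
# guarantee of delivery, ordering, or duplicate suppression.
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp_sock:
    udp_sock.sendto(b"hello over UDP", (HOST, PORT))
```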



TCP three-way handshake

In network data transmission, TCP is a connection-oriented, reliable transport-layer protocol; the process of establishing a TCP connection is called the three-way handshake.

The specific details of the three-way handshake: the client sends SYN to the server -> the server returns SYN, ACK -> the client sends ACK

  • Before the handshake, the TCP server process on host B first creates the transmission control block (TCB) and prepares to accept the connection request from the client process.
  • The first handshake: Host A's TCP sends a connection request segment to host B, the synchronization bit in its header is SYN = 1, and the sequence number seq = x is selected, indicating that the sequence number of the first data byte when transmitting data is x.
  • The second handshake: After the TCP of host B receives the connection request segment, if it agrees, it will send back an acknowledgment. Host B should set SYN = 1 and ACK = 1 in the acknowledgment segment, its acknowledgment number ack = x + 1, and its own selected sequence number seq = y.
  • The third handshake: After host A receives this segment, it sends an acknowledgment to host B with ACK = 1, acknowledgment number ack = y + 1, and sequence number seq = x + 1. Host A's TCP then notifies its upper-layer application process that the connection has been established. After receiving host A's acknowledgment, host B's TCP also informs its upper-layer application process that the TCP connection has been established.

The purpose of the three-way handshake is to establish a reliable communication channel.

  • The first handshake: the client can confirm nothing; the server confirms that its own receiving and the other party's sending are normal.
  • The second handshake: the client confirms that its own sending and receiving, and the other party's sending and receiving, are all normal; the server can still only confirm that its own receiving and the other party's sending are normal.
  • The third handshake: the client confirms that both sides' sending and receiving are normal; the server now also confirms that its own sending and the other party's receiving are normal. The three-way handshake therefore confirms that both sides can send and receive, and none of the three steps can be omitted.

What if the connection is established, but the client fails?
Keep-alive timer: used to deal with a TCP connection that stays idle for a long time. Each time the server receives data from the client, it resets the timer, which is usually set to 2 hours. If the server has heard nothing from the client for 2 hours, it sends a probe segment; if 10 probe segments in a row (sent 75 seconds apart) get no response, the client is assumed to have failed and the connection is terminated.
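
On most systems this keep-alive behaviour can be switched on per socket. A hedged sketch in Python; the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT options are platform-specific (mainly Linux), hence the getattr guards:

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)   # turn on keep-alive probes

# The timings are OS defaults unless overridden; the values below mirror the text
# (2 hours idle, probes every 75 seconds, give up after 10 unanswered probes).
for name, value in (("TCP_KEEPIDLE", 7200), ("TCP_KEEPINTVL", 75), ("TCP_KEEPCNT", 10)):
    option = getattr(socket, name, None)
    if option is not None:                # these constants are not defined on every platform
        sock.setsockopt(socket.IPPROTO_TCP, option, value)
```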


What are the four waves of TCP?

After data transmission ends, both parties of the communication can release the connection; this connection-release process is called the four waves:

The specific details of the four waves

  • The first wave: The application process of host A first sends a connection release segment to its TCP, stops sending data again, and actively closes the TCP connection. Host A sets FIN = 1 in the header of the connection release segment, its sequence number is seq = u, and waits for B's confirmation (FIN_WAIT_1 state).
  • The second wave: Host B sends an acknowledgment with ACK = 1, acknowledgment number ack = u + 1, and its own sequence number seq = v. The TCP server process notifies the higher-level application process that the connection in the direction from host A to host B has been released; the TCP connection is now in the CLOSE_WAIT state. If host B still sends data at this point, host A must still receive it.
  • The third wave: When host B has no more data to send to host A, its application process notifies TCP to release the connection, and host B sends a segment with FIN = 1.
  • The fourth wave: After host A receives the connection release segment, it must send an acknowledgment and enters the TIME_WAIT state. In this acknowledgment segment, ACK = 1, acknowledgment number ack = w + 1, and its own sequence number seq = u + 1. After host B receives this acknowledgment, it enters the CLOSED state and no longer sends data to the client. The client enters the CLOSED state after waiting 2*MSL (maximum segment lifetime), completing the four waves.


Simple understanding: waving four times, both sides confirm that the other side is closed

  • The client sends a close connection request to the server,
  • After the server receives the client's request to close the connection, it replies with a message confirming receipt
  • After the server determines that it will no longer send messages to the client, it sends a message to the client, ready to close the connection
  • The client receives the server's close-connection message and replies to the server that the close-connection message has been received.

Why can't the ACK and FIN sent by the server be combined into three waves (what is the meaning of the CLOSE_WAIT state)?
Because when the server receives the client's request to disconnect, it may still have data that has not been sent. It therefore replies with ACK first, indicating that it has received the disconnect request, and only after the remaining data has been sent does it send FIN to close the server-to-client direction of the connection.

What happens if the server's ACK doesn't reach the client on the second wave?
The client does not receive an ACK confirmation and will resend the FIN request.

What is the significance of the client's TIME_WAIT state?
It guards against two problems:

  • The last ACK segment sent by host A may not reach host B (lost),
  • old, already-invalid segments from this connection may otherwise linger in the network and appear in a subsequent connection

The TIME_WAIT state is used to resend an ACK that may have been lost. If the server does not receive the ACK, it will resend the FIN. If the client receives this FIN within 2*MSL, it resends the ACK and waits another 2*MSL, preventing the case where the server never receives the ACK for its FIN.

MSL (Maximum Segment Lifetime) is the maximum time a segment can survive in the network; 2*MSL is the maximum time needed for one send and one reply. If the client does not receive another FIN within 2*MSL, it concludes that the ACK was received successfully and ends the TCP connection.


How the TCP protocol guarantees reliable transmission

  • Stop and wait: after sending each packet, stop and wait for the other party's acknowledgment; send the next packet only after the acknowledgment arrives.
  • Timeout retransmission: When TCP sends a segment, it starts a timer and waits for the destination to acknowledge receipt of the segment. If an acknowledgment cannot be received in time, the segment will be resent;
  • Flow Control: Each side of a TCP connection has a fixed size buffer space. The receiver side of TCP only allows the other side to send as much data as the receiver's buffer can accommodate. This prevents the faster host from overflowing the buffer of the slower host. This is flow control. The flow control protocol used by TCP is a variable size sliding window protocol.
  • Congestion control: when too much data is injected into the network, the network may become congested or even collapse; TCP then reduces the rate at which it sends data.
  • Packet checksum: TCP maintains a checksum over its header and data, an end-to-end checksum whose purpose is to detect any change to the data in transit. If a segment fails the checksum, it is discarded and not acknowledged; the sending end then retransmits the data after its timer expires;
  • Reordering out-of-order packets: Since TCP segments are transmitted as IP datagrams, and IP datagrams may arrive out of order, TCP segments may also arrive out of sequence. TCP numbers each packet sent, the receiver sorts the packets, and transmits the ordered data to the application layer.
  • Acknowledgment mechanism: When TCP receives data from the other end of the TCP connection, it will send an acknowledgment. This acknowledgment is not sent immediately, it will usually be delayed by a fraction of a second;
  • Discard duplicate data: The receiver of TCP discards duplicate data.

TCP congestion control?

Congestion control prevents too much data from being injected into the network, so that routers and links are not overloaded. Flow control is point-to-point traffic control, while congestion control is global control of the overall network traffic. The sender maintains a congestion window (cwnd).

  • Slow start: the sender's congestion window starts at 1 and grows from small to large; cwnd is doubled for each transmission round. When cwnd exceeds the slow start threshold, the congestion avoidance algorithm takes over to keep cwnd from growing too fast.
  • Congestion avoidance (algorithm): When cwnd exceeds the slow start threshold, cwnd increases by 1 every time a round-trip time RTT elapses. In the process of slow start and congestion avoidance, once network congestion is found, the slow start threshold is set to half of the current value, and cwnd is reset to 1, and the slow start is restarted.
  • Fast retransmission: The receiver sends a duplicate acknowledgment immediately after receiving an out-of-sequence segment, and the sender retransmits immediately after receiving 3 duplicate acknowledgments.
  • Fast recovery: when the sender receives three duplicate acknowledgments, it halves the slow start threshold, sets cwnd to this new threshold, and continues with the congestion avoidance algorithm. (With fast recovery, slow start is used only when the connection is first established or when a timeout occurs.)
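
A toy simulation of the slow start and congestion avoidance rules above (cwnd measured in segments per RTT; the threshold and round count are arbitrary):

```python
def simulate_cwnd(rounds: int = 12, ssthresh: int = 16) -> list[int]:
    """Slow start doubles cwnd each round until ssthresh, then congestion avoidance adds 1."""
    cwnd, history = 1, []
    for _ in range(rounds):
        history.append(cwnd)
        cwnd = cwnd * 2 if cwnd < ssthresh else cwnd + 1
    return history

print(simulate_cwnd())   # [1, 2, 4, 8, 16, 17, 18, 19, 20, 21, 22, 23]
```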

What is the HTTP protocol?

HTTP is a protocol for transferring data built on top of the TCP/IP communication protocols. It works in the client-server architecture and specifies how hypertext data such as text, pictures, audio, and video are transferred, hence the name "Hypertext Transfer Protocol". HTTP belongs to the application layer, the layer users interact with first.

Features:

  • ①Simple and fast: When the client sends a request to the server, it only needs to transmit the request method and path.
  • ②Flexible: HTTP allows the transmission of any type of data object.
  • ③No connection: each connection handles only one request; after the server processes the client's request and the client receives the response, the connection is closed.
  • ④ Stateless: The protocol has no memory capability for transaction processing.
  • ⑤Support B/S and C/S mode.

Difference between Http and Https?

  • Different ports: HTTP uses port 80, HTTPS uses port 443
  • Security: http is a hypertext transfer protocol, information is transmitted in clear text, and https is a transfer protocol processed through SSL encryption, which is more secure.
  • Cost: HTTPS requires a CA certificate, which generally costs money
  • Connection method: HTTP and HTTPS use completely different connection methods (an HTTP connection is simple and stateless; HTTPS is SSL/TLS layered on top of HTTP, providing encrypted transmission and identity authentication, and is therefore more secure than HTTP.)

HTTPS encrypts data with keys during transmission and is therefore more secure, but certificates cost money and the extra SSL layer also adds latency.


How HTTPS works

  • First, the client requests the server's certificate; the client checks the certificate's validity period, its legitimacy, whether its domain name matches the requested domain name, and extracts the certificate's public key (for RSA encryption);
  • If the certificate passes verification, the client generates a random number and encrypts it with the certificate's public key (RSA encryption);
  • A digest of the message body is computed with MD5 (or SHA1), producing the signature;
  • This is sent to the server; at this point only the server (holding the RSA private key) can decrypt it;
  • The random number obtained by decryption is then used as the key for symmetric AES encryption of the subsequent traffic (at this point only the client and the server know this key).
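
From the application's point of view all of this is hidden inside the TLS handshake. A minimal sketch using Python's ssl module (the hostname is just an example): wrap_socket performs the certificate check and key exchange, after which application data is carried over the negotiated symmetric encryption.

```python
import socket
import ssl

hostname = "example.com"                   # placeholder host used only for illustration
context = ssl.create_default_context()     # verifies the certificate chain and hostname

with socket.create_connection((hostname, 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=hostname) as tls_sock:
        # The handshake (certificate validation + key exchange) has completed here;
        # everything sent below is encrypted with the negotiated symmetric key.
        print(tls_sock.version())          # e.g. 'TLSv1.3'
        tls_sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
        print(tls_sock.recv(200))          # beginning of the decrypted response
```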

Encryption Algorithm

Encryption algorithm: The technology of encoding and decoding information. Encoding is to translate the original readable information (also known as plaintext) into code form (also known as ciphertext), and the inverse process is decoding (decryption).

The main point of encryption technology is encryption algorithm, which can be divided into three categories:

Symmetric encryption, such as AES.
Basic principle: divide the plaintext into N blocks, encrypt each block with the key to produce its ciphertext, and combine all the block ciphertexts into the final ciphertext.
Advantages: the algorithm is public, computation is light, encryption is fast and efficient.
Disadvantages: both parties use the same key, so key security cannot be guaranteed.

Asymmetric encryption, such as RSA.
Basic principle: two keys are generated at the same time, a private key and a public key; the private key is kept secret and the public key can be distributed to trusted clients.
Data encrypted with the private key can only be decrypted with the public key;
data encrypted with the public key can only be decrypted with the private key.
Advantages: secure, difficult to crack.
Disadvantages: the algorithm is time-consuming.

Irreversible encryption, such as MD5 and SHA.
Basic principle: no key is needed; the plaintext is processed directly into ciphertext by the hash algorithm, and the result cannot be decrypted: the plaintext cannot be recovered from the ciphertext.
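
The irreversible category is easy to demonstrate with Python's built-in hashlib (the symmetric and asymmetric examples would need a third-party library such as cryptography, so only hashing is shown here):

```python
import hashlib

plaintext = b"interview notes"

md5_digest = hashlib.md5(plaintext).hexdigest()
sha256_digest = hashlib.sha256(plaintext).hexdigest()

print(md5_digest)      # 32 hex characters; the plaintext cannot be recovered from it
print(sha256_digest)   # 64 hex characters; changing one byte of input changes the whole digest
```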


How many steps does a complete HTTP request go through?

The HTTP protocol adopts a request/response model. The client sends a request message to the server, and the request message contains the request method, URL, protocol version, request header and request data. The server responds with a status line that includes the protocol version, success or error code, server information, response headers, and response data.

The following 7 steps will be completed between the web browser and the web server:

  • Establish a TCP connection, three-way handshake
  • The web browser sends the request line to the web server
  • The web browser sends the request headers. After sending its request command, the browser sends some additional information to the web server in the form of header fields, then sends a blank line to notify the server that it has finished sending the headers.
  • Web server response: after the client sends the request, the server sends back a response such as HTTP/1.1 200 OK; the first part of the response is the protocol version and the response status code.
  • The web server sends response headers: just as the client sends information about itself with the request, the server sends data about itself and about the requested document along with the response.
  • The web server sends data to the browser: after sending the header information, the server sends a blank line to indicate that the headers have ended, and then sends the actual data requested by the user in the format described by the Content-Type response header.
  • Web server closes TCP connection
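
Steps 2 through 6 can be observed by writing the request by hand over a plain socket. A sketch (plain HTTP, no TLS; the host is a placeholder): it sends the request line, the headers, and the blank line that ends the headers, then prints the status line and response headers.

```python
import socket

host = "example.com"                      # placeholder host used only for illustration
request = (
    "GET / HTTP/1.1\r\n"                  # request line: method, path, protocol version
    f"Host: {host}\r\n"                   # request headers
    "Connection: close\r\n"
    "\r\n"                                # blank line: end of the headers
)

with socket.create_connection((host, 80)) as sock:
    sock.sendall(request.encode("ascii"))
    response = b""
    while chunk := sock.recv(4096):       # read until the server closes the connection
        response += chunk

head, _, body = response.partition(b"\r\n\r\n")
print(head.decode("iso-8859-1"))          # status line (e.g. "HTTP/1.1 200 OK") plus headers
```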

What does an HTTP request message contain?
The HTTP request message is the data the client sends to the server when making a request; after all, to get data from the server you must first state what you want. The HTTP request message consists of a request line, request headers, and request data (the request body).

What does an HTTP response message contain?
The HTTP response message is the data returned by the server; a request must come first and then the response. The response message contains three parts: the status line, the response header fields, and the response body (entity).


The process of entering a URL to get a page?

  • The browser searches its own DNS cache, searches the operating system's DNS cache, reads the local Host file, and queries the local DNS server.
  • When the local DNS server is queried: if the domain name being queried is in the server's locally configured zone, the resolution result is returned to the client and the resolution is complete (an authoritative answer); if the domain name is not in the local zone but the server has cached the mapping, the cached IP address is returned (a non-authoritative answer); if the local DNS server has no cached mapping either, it initiates a recursive or iterative query according to its configuration;
  • Once the browser has obtained the IP address corresponding to the domain name, it asks the server to establish a connection and initiates the three-way handshake;
  • After the TCP connection is established, the browser sends an HTTP request to the server;
  • The server receives the request, maps it to a specific request processor for processing according to the path parameters, and returns the processing result and the corresponding view to the browser;
  • The browser parses and renders the view. If it encounters references to static resources such as js files, css files, and pictures, it repeats the above steps and requests these resources from the server;
  • The browser renders the page according to the requested resources and data, and finally presents a complete page to the user

Comparison of http versions

Features of HTTP 1.0: HTTP 1.0 and earlier versions are stateless, connectionless application-layer protocols (short connections).
HTTP 1.0 stipulates that the browser and the server maintain only short connections: every browser request must establish a new TCP connection with the server, and the server disconnects immediately after processing the request (connectionless); the server does not track clients and does not log past requests (stateless).

New features of HTTP 1.1 (persistent connections):
Persistent connections are the default, which saves traffic: as long as neither the client nor the server explicitly closes the TCP connection, the connection is kept open. HTTP requests can also be pipelined: the client can issue multiple requests in a row instead of waiting for each response one by one.
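
The persistent connection is easy to observe with http.client from the standard library: both requests below reuse the single TCP connection opened by HTTPConnection (the host is a placeholder):

```python
from http.client import HTTPConnection

conn = HTTPConnection("example.com", 80)   # one TCP connection (placeholder host)

for path in ("/", "/index.html"):          # two HTTP requests over the same connection
    conn.request("GET", path)              # HTTP/1.1, so the connection is kept alive by default
    resp = conn.getresponse()
    print(path, resp.status)
    resp.read()                            # drain the body before reusing the connection

conn.close()
```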

Features of HTTP 2.0 version:

  • Binary framing (messages are encoded in a binary format)
  • Header compression (headers are compressed with the HPACK algorithm designed specifically for this purpose)
  • Flow control (per-stream flow control over how many bytes of a data stream may be received)
  • Multiplexing (requests and responses can be sent concurrently over a single shared TCP connection)
  • Request priority (performance can be further optimized by tuning the interleaving and transmission order of frames)
  • Server push (the server can send multiple responses to a single client request, pushing resources to the client without an explicit request; a major update)

How are common HTTP status codes classified, and what are the common status codes?

The HTTP status code indicates the result of the client's HTTP request, identifies whether the server's processing was normal, and indicates errors in the request. Status codes fall into five classes: 1xx (informational), 2xx (success), 3xx (redirection), 4xx (client error), and 5xx (server error).

Common status codes:
200: The request was processed normally
204: The request was accepted but there is no resource to return
301: Permanent redirect
302: Temporary redirect
304: Not modified; the cached resource can be used
400: The request message has a syntax error that the server cannot recognize
403: Access to the requested resource is forbidden
404: The server cannot find the requested resource
500: Internal server error
503: The server is busy or under maintenance


Request method in HTTP protocol

  • GET: used to request access to the resource identified by a URI (Uniform Resource Identifier); parameters can be passed to the server through the URL
  • POST: used to transmit information to the server, the main function is similar to the GET method, but the POST method is generally recommended.
  • PUT: To transfer a file, the message body contains the file content and saves it to the corresponding URI location.
  • HEAD: Get the header of the message, similar to the GET method, except that it does not return the body of the message. It is generally used to verify whether the URI is valid.
  • PATCH: the data sent by the client replaces part of the content of the specified resource (partial modification)
  • TRACE: echoes back the client's original request message, used for "loopback" diagnosis along the request path
  • DELETE: Deletes a file, in contrast to the PUT method, deletes the file at the corresponding URI location.
  • OPTIONS: Query the HTTP methods supported by the corresponding URI.

Difference between GET method and POST method

  • Functionally: GET is generally used to obtain resources from the server, and POST is generally used to update resources on the server;
  • Security: GET is less secure because the data submitted in a GET request appears in the URL in plain text, which may leak private information; POST request parameters are wrapped in the request body, which is relatively safer.
  • Data volume: The amount of data transmitted by Get is small, because it is limited by the length of the URL, but the efficiency is high; Post can transmit a large amount of data, so only Post mode can be used when uploading files;
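
Where the parameters travel can be seen with urllib from the standard library (httpbin.org is a public echo service, used here purely as an example):

```python
from urllib import parse, request

params = parse.urlencode({"q": "tcp", "page": "1"})

# GET: the parameters are appended to the URL in plain text.
with request.urlopen("https://httpbin.org/get?" + params) as resp:
    print(resp.status, resp.url)

# POST: the same parameters travel in the request body instead of the URL.
with request.urlopen("https://httpbin.org/post", data=params.encode("ascii")) as resp:
    print(resp.status, resp.url)
```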

Session vs Cookie

Cookie: A cookie is a file (key-value format) stored on the user's browser by the Web server, and can contain user-related information. When the client initiates a request to the server, it will carry the cookie previously created by the server, and the server distinguishes different users through the data carried in the cookie.

Session: a session is a piece of storage space the server allocates for the conversation between a browser and the server. By default the server puts a sessionid into a cookie in the client's browser; on each request the browser sends this cookie, the server reads the sessionid from it, looks up the data stored in the corresponding session, and thereby identifies the session.

  • Security: cookie data is stored on the client, which is less secure, and session data is stored on the server, which is relatively more secure
  • Size limit: The cookie has a size limit, the data saved by a single cookie cannot exceed 4K, and the session has no such limit. In theory, it is only related to the memory size of the server;
  • Server resource consumption: Session is stored on the server and will exist for a period of time before disappearing. When access increases, it will have an impact on server performance
  • Implementation mechanism: The implementation of Session often relies on the Cookie mechanism, and the SessionID is returned through the Cookie mechanism;

The HTTP protocol itself cannot identify the user, which is why a cookie or session is needed.
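
A minimal sketch of the mechanism; the in-memory session store, the cookie name JSESSIONID, and the helper functions are invented for illustration:

```python
import uuid
from http.cookies import SimpleCookie

sessions: dict[str, dict] = {}                  # server-side session storage (in memory here)

def handle_login(username: str) -> str:
    """Server: create a session and hand the browser its id via a Set-Cookie header."""
    session_id = uuid.uuid4().hex
    sessions[session_id] = {"user": username}   # the real data stays on the server
    cookie = SimpleCookie()
    cookie["JSESSIONID"] = session_id           # only the opaque id goes to the client
    return cookie.output()

def handle_request(cookie_header: str) -> dict | None:
    """Server: read the id from the Cookie header and look up the session data."""
    cookie = SimpleCookie(cookie_header)
    morsel = cookie.get("JSESSIONID")
    return sessions.get(morsel.value) if morsel else None

set_cookie = handle_login("alice")
print(set_cookie)                                         # Set-Cookie: JSESSIONID=<random id>
session_id = set_cookie.split("=", 1)[1]
print(handle_request(f"JSESSIONID={session_id}"))         # {'user': 'alice'}
```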


Origin blog.csdn.net/Lzy410992/article/details/119667393