Standard Interview Questions: Computer Networks

network model

Seven-Layer Network Model

  1. Physical layer: uses the transmission medium to provide a physical connection for the data link layer and carries the bit stream transparently.
  2. Data link layer
  3. Network layer: IP
  4. Transport layer: TCP, UDP
  5. Session layer
  6. Presentation layer: interprets the commands and data passed up from the lower layers
  7. Application layer: application protocols such as DNS, HTTP, and SMTP; users interact with the network at this layer

TCP/IP four layers

  1. network interface layer
  2. Internet Layer IP
  3. Transport layer TCP, UDP
  4. Application layer HTTP, SMTP, FTP, etc.

Five-layer protocol model

  1. Physical layer: transparently transmits the bit stream between adjacent computers
  2. Data link layer: assembles IP datagrams into frames and transfers them, along with control information, over the link between two adjacent nodes
  3. Network layer (IP): provides packet-switched communication between different hosts
  4. Transport layer (TCP, UDP): provides general-purpose data transfer services for communication between two hosts
  5. Application layer: HTTP, SMTP, FTP, etc.

HTTP

The difference between HTTP and HTTPS

  • HTTPS requires a certificate issued by a CA, which has traditionally meant a fee (some CAs now issue certificates free of charge)
  • HTTP is the Hypertext Transfer Protocol and sends information in plain text; HTTPS runs HTTP over an SSL/TLS-encrypted channel, which adds handshake and encryption overhead
  • An HTTP connection is simple and stateless; HTTPS is HTTP layered on SSL/TLS, so it adds encryption and identity authentication
  • HTTP uses port 80 by default, and HTTPS uses port 443

The content of the HTTP request message

  • The request line includes the request method (GET, POST...), URL, HTTP protocol version
  • Request headers, each in the format "header field: value"
  • request body
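
As a minimal sketch of the three parts listed above, the following builds and re-parses a raw request. The function names `build_request` and `parse_request`, the host `example.com`, and the form body are made up for illustration; real clients handle many more cases (repeated headers, chunked bodies, and so on).

```python
def build_request(method, path, headers, body=b""):
    """Serialize a request: request line, header lines, blank line, body."""
    lines = [f"{method} {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


def parse_request(raw):
    """Split a raw request back into request line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    request_line, *header_lines = head.decode("ascii").split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return method, target, version, headers, body


# Hypothetical example request
raw = build_request(
    "POST", "/login",
    {"Host": "example.com", "Content-Length": "9"},
    b"user=demo",
)
```

Calling `parse_request(raw)` returns the method, URL, and protocol version from the request line, then the header dictionary, then the body bytes.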

What are the main fields in the HTTP header

  • Host: the address of the server the request is sent to, either an IP address or a domain name

  • User-Agent: the name of the application sending the request

  • Connection: connection-related options, such as keep-alive (persistent connection)

  • Accept-Charset: tells the server which character encodings the client can accept

  • Accept-Encoding: tells the server which compression formats the client can accept

  • Accept-Language: tells the server which languages the client prefers

HTTP response message

  • Status line: protocol version, status code, status code description
  • response header
  • response body
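
The same structure can be seen by splitting a raw response. This is a rough sketch: `parse_response` and the sample bytes are invented for illustration, and header names are lowercased because HTTP header names are case-insensitive.

```python
def parse_response(raw):
    """Split a raw response into status line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


# Hypothetical sample response
sample = (
    b"HTTP/1.1 404 Not Found\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Content-Length: 9\r\n"
    b"\r\n"
    b"not found"
)
```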

HTTP response header main fields

  • Server: the name and version of the server application software
  • Content-Type: the type of the response body
  • Content-Length: The length of the response body
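  • Character encoding of the response body: carried as the charset parameter of Content-Type, for example Content-Type: text/html; charset=utf-8 (Content-Charset is not a standard header)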
  • Content-Encoding: The data compression format used by the response body
  • Content-Language: The language used in the response body

HTTP status code

1xx Informational: the server has received the request and the requester should continue

2xx Success: the request was received and processed successfully

3xx Redirection: the resource has moved, and further action is needed to complete the request

4xx Client error: for example, 403 Forbidden (access to the resource is refused) and 404 Not Found (the resource was not found)

5xx Server error: for example, 500 (internal server failure) and 503 (server overloaded or down for maintenance)
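
The first digit of a status code determines its class. A small sketch using Python's standard `http.HTTPStatus` enum; the `CLASSES` table and the `describe` helper are illustrative names, not part of any library.

```python
from http import HTTPStatus

# Class names keyed by the first digit of the status code
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return the code, its standard reason phrase, and its class."""
    category = CLASSES.get(code // 100, "unknown")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # The code is not registered in the standard library's enum
        phrase = "unregistered"
    return f"{code} {phrase} ({category})"
```

For example, `describe(404)` yields `"404 Not Found (client error)"`.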

"200 OK" means the request was successful.

"301 Moved Permanently" is a permanent redirect: the requested resource has moved for good, and the client should use the new URL from now on.

"302 Found" is a temporary redirect: the requested resource is still there, but for now it must be accessed through another URL.

Both 301 and 302 use the Location field in the response header to give the target URL, and the browser automatically redirects to it.

"304 Not Modified" does not redirect anywhere. It indicates that the resource has not changed since the client cached it, so the client should keep using its cached copy. It is sometimes called a cache redirect and is used for cache control.

"400 Bad Request" indicates that the client's request message contains an error, but it is only a generic error.

"403 Forbidden" means that the server refuses access to the resource, not that the client's request is malformed.

"404 Not Found" means that the requested resource does not exist or cannot be found on the server, so it cannot be provided to the client.

"500 Internal Server Error", like 400, is a generic catch-all code: it says that something went wrong on the server but not what.
"501 Not Implemented" means that the function requested by the client is not yet supported, similar to "opening soon, please look forward to it".
"502 Bad Gateway" is usually returned by a server acting as a gateway or proxy. It indicates that the gateway itself is working normally but an error occurred when it contacted the back-end server.
"503 Service Unavailable" means that the server is currently busy and temporarily unable to respond to the client, similar to "the network service is busy, please try again later".

HTTP hijacking

Specific packets are inserted into the normal data flow so that the client interprets the wrong data, typically showing the user small advertisements or other web content in pop-up windows.

Steps:

Identify the HTTP traffic among the TCP connections;

Modify the HTTP response body;

Send the tampered packets back to the user before the genuine ones arrive. The genuine packets that arrive later are then discarded, and the client displays the modified page.

Prevention:

Before the event, encryption: use HTTPS, which prevents plaintext traffic from being hijacked (but does not stop DNS hijacking)

During the event, evasion: split the HTTP request across several packets. The operator's bypass device does not have a complete TCP/IP protocol stack and cannot recognize and mark the split request, while the web server has a complete stack and reassembles the packets into a complete HTTP request, so the service is unaffected

After the event, shielding: the front end inspects the content when the page is displayed and triggers a callback when the DOM structure changes

DNS hijacking: the attacker hijacks the DNS server, gains control over a domain name's resolution records, and modifies the resolution result. A request for domain name A is redirected to domain name B, and the wrong query result is returned. It is often used to keep promoting certain products

Difference: DNS hijacking tends to be persistent, for example forcing advertisements on every visit to a page. HTTP hijacking happens at a varying frequency and very quickly, and it typically shows up as small pieces of content appended to a web page.

HTTPS hijacking: uses forged certificates for hijacking....

cross domain

A cross-domain (cross-origin) restriction means that a script loaded from one origin cannot freely read or manipulate resources from another origin. It comes from the browser's same-origin policy, a security restriction enforced by the browser.

Same origin: same protocol, domain name, and port

That is, a page's scripts can only freely access resources that share the page's protocol, domain name, and port. Access to a different origin is blocked by the browser unless the target explicitly allows it (for example, through CORS).
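
The same-origin test compares the (protocol, domain name, port) triple. A minimal sketch with `urllib.parse`; the helper names `origin` and `same_origin` are made up, and filling in default ports 80 and 443 when the URL omits the port is an assumption of this sketch.

```python
from urllib.parse import urlsplit

# Default ports assumed when the URL does not state one
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    """True when both URLs share protocol, domain name, and port."""
    return origin(a) == origin(b)
```

Note that `http://example.com` and `http://api.example.com` are different origins even though they share a parent domain.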

Reasons an HTTP request or response is interrupted

The network is down or congested, the request timed out, the browser has a problem, or the server has a problem

How to check
Check the network, check the local machine...

HTTP request methods

HTTP/1.0 defines three request methods: GET, POST, and HEAD.

HTTP/1.1 adds five more: OPTIONS, PUT, DELETE, TRACE, and CONNECT. PATCH, which is often listed alongside them, was standardized later in RFC 5789.

The difference between GET and POST

GET puts its parameters in the URL and is used to retrieve data from the server; POST carries its parameters in the request body and submits data to the server for processing.

Because GET parameters are exposed in the URL, they can end up in browser history and server logs, and browsers and servers impose URL length limits in practice. Neither method encrypts anything by itself: without HTTPS, a POST body is also sent in plain text.

Application scenarios
GET is used to query data. POST is used to modify data and in scenarios where parameters should not appear in the URL, such as submitting passwords
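
To make the placement of parameters concrete, here is a small sketch that builds (but does not send) one request of each kind with the standard library. The URL `http://example.com/search` and the parameters are placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

params = {"q": "tcp handshake", "page": "2"}
encoded = urlencode(params)

# GET: parameters travel in the URL's query string, and there is no body
get_req = Request("http://example.com/search?" + encoded)

# POST: the same parameters travel in the request body instead
post_req = Request(
    "http://example.com/search",
    data=encoded.encode("ascii"),
    method="POST",
)
```

Inspecting `get_req.full_url` and `post_req.data` shows the same encoded string in two different places.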

cookies and sessions

What are cookies?
A cookie is actually a small piece of text information. The client requests the server, and if the server needs to record the user status, it uses the response to issue a cookie to the client browser. The client will save the cookie.

When the browser requests the website again, the browser submits the requested URL together with the cookie to the server. The server checks the cookie to identify the user status. The server can also modify the content of the cookie as needed.

What is Session?
Session is another mechanism for recording client status. The difference is that Cookie is saved in the client browser, while Session is saved on the server. When the client browser accesses the server, the server records the client information on the server in some form. This is Session. When the client browser visits again, it only needs to find the status of the client from the Session.

The difference between Session and Cookie?
1. Data storage location: cookie data is stored in the client's browser, and session data is stored on the server.

2. Security: cookies are not very secure. Other people can inspect cookies stored locally and forge them (cookie spoofing). When security matters, use sessions.

3. Server performance: a session is kept on the server for a period of time, so as the number of visitors grows, sessions consume more server resources. To reduce the load on the server, use cookies.

4. Data size: a single cookie cannot hold more than about 4 KB, and many browsers limit a site to around 20 cookies.

5. Importance of information: consider storing important information such as login state in the session. Other information that needs to be kept can go in a cookie.
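
A common way to combine the two is to keep the data in a server-side session and give the browser only a session ID in a cookie. The following is a sketch under that assumption: the in-memory `sessions` dictionary, the cookie name `session_id`, and the helpers `login` and `current_user` are all hypothetical.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # server-side store: session ID -> user data (in memory only)


def login(username):
    """Create a session and return the value for a Set-Cookie header."""
    session_id = secrets.token_hex(16)
    sessions[session_id] = {"user": username}
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


def current_user(cookie_header):
    """Look up the user from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    if morsel is None:
        return None
    data = sessions.get(morsel.value)
    return data["user"] if data else None
```

Only the random ID leaves the server, so a forged or unknown ID simply fails the lookup.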

DNS lookup process (application layer)

DNS resolves user-supplied hostnames to IP addresses

  1. The browser extracts the domain name from the URL and passes it to the client side of the DNS application
  2. Check whether the browser cache or the local hosts file has a mapping for this domain name; if so, use the mapped IP address
  3. If not, check whether the local DNS resolver cache has a mapping for this domain name; if so, return it
  4. If not, send a query to the DNS server
  5. When the server receives the query, it looks in its locally configured zone data and returns the result if found
  6. If it is not found there but the server has cached the mapping, the server returns the cached result
  7. If there is no cached result, the request is forwarded to the upper-level DNS servers. The resolution result is passed back down to the local DNS server, which returns it to the client and stores the mapping in its cache
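
The lookup order above can be sketched as a chain of tables. This is a toy model, not a real DNS client: the `Resolver` class, the `lookup` function, and the address 192.0.2.10 (from a range reserved for documentation) are all invented, and the resolver cache and DNS server are folded into one class for brevity.

```python
class Resolver:
    """A toy DNS server: zone data first, then cache, then upstream."""

    def __init__(self, zone=None, upstream=None):
        self.zone = zone or {}   # locally configured records
        self.cache = {}          # answers learned from upstream
        self.upstream = upstream

    def resolve(self, name):
        if name in self.zone:
            return self.zone[name]
        if name in self.cache:
            return self.cache[name]
        if self.upstream is None:
            return None
        answer = self.upstream.resolve(name)
        if answer is not None:
            self.cache[name] = answer  # remember it for next time
        return answer


def lookup(name, browser_cache, hosts, local_dns):
    """Client side: browser cache, then hosts file, then the DNS server."""
    for table in (browser_cache, hosts):
        if name in table:
            return table[name]
    answer = local_dns.resolve(name)
    if answer is not None:
        browser_cache[name] = answer
    return answer
```

A second lookup of the same name is answered from the browser cache without touching the resolver at all.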

Persistent (long) connections and short connections

  • HTTP/1.0 uses short connections by default: every HTTP operation between the browser and the server opens a connection, and the connection is closed when the operation finishes. If an HTML page or other web page that the client browser accesses contains other web resources, such as JavaScript files, image files, or CSS files, the browser opens a new HTTP session for each such resource it encounters.
  • Starting with HTTP/1.1, persistent connections are the default. An HTTP exchange that uses a persistent connection carries this line in the response header: Connection: keep-alive
  • With a persistent connection, the TCP connection that carries HTTP data between the client and the server is not closed once a page has been opened. If the client requests another page from the same server, it reuses the established connection. Keep-Alive does not keep the connection open forever: it has a timeout that can be configured in the server software (such as Apache). Both the client and the server must support persistent connections for them to work.

How a page gets reset when searching for sensitive words

Under the TCP protocol, a three-way handshake is required to establish a connection between the user and the server. In the first handshake, the user sends a SYN packet to the server to request a connection (SYN, x:0). In the second, the server answers with a SYN/ACK packet (SYN/ACK, y:x+1). In the third, the user sends an ACK packet to the server as confirmation (ACK, x+1:y+1). At this point the TCP connection is established. Here x is the sequence number sent by the user to the server, and y is the sequence number sent by the server to the user.
Keyword detection: plaintext or weakly encoded traffic (such as base64) is matched against a prepared list of sensitive words. When a sensitive word is found, the packet the user receives from the server side is replaced with one that signals a connection reset (the source describes this as changing the server's SYN/ACK to SYN/ACK, y:0; in practice this is usually done with a forged segment that has the RST flag set). The user's machine treats the TCP connection as reset and reports that the connection failed. The user believes the server refused the connection and gives up, so pages containing sensitive words are blocked automatically

IP

IP address classification

Class A: 1-byte (8-bit) network number, 3-byte (24-bit) host number. The first bit of the network number is fixed at 0, and the remaining 7 bits can be used freely. Reserved: network number 0 (00000000) means "this network", and 127 (01111111) is used for local loopback software testing

Class B: 2-byte (16-bit) network number, 2-byte (16-bit) host number. The first two bits of the network number are fixed at 10, and the remaining 14 bits can be used freely.

Class C: 3-byte (24-bit) network number, 1-byte (8-bit) host number. The first three bits are fixed at 110, and the remaining 21 bits are available.

Class D: the first four bits are fixed at 1110; used for multicast.

Class E: the first four bits are fixed at 1111; reserved.

An IP address whose host number is all 0s denotes the network itself, that is, the single network to which "this host" is connected.

An IP address whose host number is all 1s is the broadcast address for all hosts on that network.

Class A addresses range from 1.0.0.0 to 126.255.255.255 (network numbers 0 and 127 are reserved), and the default network mask is 255.0.0.0. Class A addresses are assigned to large-scale networks.

Class B addresses range from 128.0.0.0 to 191.255.255.255, and the default network mask is 255.255.0.0. Class B addresses are assigned to medium-sized networks

Class C addresses range from 192.0.0.0 to 223.255.255.255, and the default network mask is 255.255.255.0. Class C addresses are allocated to small networks, such as LANs

Class D addresses (224.0.0.0 to 239.255.255.255) are multicast addresses, used by special protocols to send information to a selected group of nodes.
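
Because the class is fixed by the leading bits, it can be read off the first octet. A short sketch; `address_class` and `DEFAULT_MASKS` are illustrative names, and the boundaries 224 and 240 for classes D and E follow from the standard leading-bit patterns 1110 and 1111.

```python
import ipaddress

# Default masks for the three unicast classes
DEFAULT_MASKS = {"A": "255.0.0.0", "B": "255.255.0.0", "C": "255.255.255.0"}


def address_class(address):
    """Classify an IPv4 address by the leading bits of its first octet."""
    first_octet = int(ipaddress.IPv4Address(address)) >> 24
    if first_octet < 128:   # leading bit 0
        return "A"
    if first_octet < 192:   # leading bits 10
        return "B"
    if first_octet < 224:   # leading bits 110
        return "C"
    if first_octet < 240:   # leading bits 1110
        return "D"
    return "E"              # leading bits 1111
```

Classful addressing is mostly of historical and interview interest today, since real networks use CIDR prefixes instead.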

Transition from IPv4 to IPv6

The transition from IPv4 to IPv6 is a gradual process. While users enjoy the benefits of IPv6, they can still communicate with the IPv4 users on the network.

Mainstream technologies:

  1. Dual-stack strategy (the most direct way): add the IPv4 protocol stack to the IPv6 node. Nodes with both protocol stacks are called "IPv6/v4 nodes". They can talk to IPv4 nodes over IPv4 and to IPv6 nodes directly over IPv6.
  2. Tunneling (solves the island problem that arises when local pure-IPv6 networks are separated by the IPv4 backbone): tunnels across the existing IPv4 Internet connect the isolated IPv6 islands, gradually expanding the reach of IPv6. At the tunnel entrance, the router encapsulates the IPv6 packet in an IPv4 packet whose source and destination addresses are the IPv4 addresses of the tunnel entrance and exit. At the tunnel exit, the IPv6 packet is extracted and forwarded to the destination node.
    In practice there are four forms of tunneling: manually configured tunnels, automatically configured tunnels, multicast tunnels, and 6to4.
  3. Tunnel broker (TB). The purpose is to simplify tunnel configuration and provide a means of automatic configuration. A TB can be regarded as a virtual IPv6 ISP that lets users connected to the IPv4 network reach the IPv6 network; those IPv4-connected users are the TB's customers.
  4. Protocol translation. The main idea is that communication between IPv6 nodes and IPv4 nodes goes through an intermediate protocol translation server, whose main job is to convert network-layer protocol headers between IPv6 and IPv4 to match the protocol of the peer.
  5. SOCKS64. One part is a SOCKS library introduced in the client, which sits between the application layer and the socket layer and replaces the application's socket API and DNS resolution API. The other part is a SOCKS gateway.
  6. Transport-layer relay. The mechanism is similar to SOCKS64, except that the protocol translation is performed at the transport layer, on the transport-layer relay.
  7. Application Layer Gateway (ALG). Similar, but protocol translation is performed at the application layer.

ARP protocol

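ARP (Address Resolution Protocol) maps IP addresses to MAC addresses, that is, it asks for the MAC address that corresponds to a target IP

  1. The host checks whether its cache (its ARP table) already records the MAC address for this IP

  2. If it does, the host sends directly. If not, it broadcasts a request to all hosts on its network segment, carrying its own IP and MAC addresses and asking who owns the target IP address and what that host's MAC address is

  3. Each host that receives the broadcast compares the requested address with its own. If they match, it extracts the source host's IP and MAC addresses from the packet, writes them into its own ARP table, puts its own MAC address in a response packet, and replies to the source host

  4. After receiving the ARP response, the source host can use this information to send data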

Why ARP is needed: OSI divides the network into 7 layers, and layers do not reach into each other directly; they communicate only through specific interfaces. IP works at layer 3 (the network layer), and MAC addresses work at layer 2 (the data link layer). When sending a packet, the protocol stack must encapsulate both the IP address and the MAC address, but it knows only the IP and cannot look up the MAC across layers by itself, so it relies on ARP to obtain the MAC address of the destination node
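
The ARP lookup steps listed earlier in this section can be sketched as a toy simulation. Everything here is hypothetical: the `Host` class, the `resolve` function, and the addresses. A Python list stands in for the broadcast domain; no real packets are sent.

```python
class Host:
    def __init__(self, ip, mac):
        self.ip = ip
        self.mac = mac
        self.arp_table = {}  # cache: IP address -> MAC address

    def handle_request(self, sender_ip, sender_mac, target_ip):
        """Reply with our MAC only if the request asks for our IP."""
        if target_ip != self.ip:
            return None
        # The target also learns the sender's mapping from the request
        self.arp_table[sender_ip] = sender_mac
        return self.mac


def resolve(sender, target_ip, segment):
    """Check the sender's cache first, then broadcast to the segment."""
    if target_ip in sender.arp_table:
        return sender.arp_table[target_ip]
    for host in segment:
        if host is sender:
            continue
        reply = host.handle_request(sender.ip, sender.mac, target_ip)
        if reply is not None:
            sender.arp_table[target_ip] = reply
            return reply
    return None
```

After one exchange, both the sender and the target hold each other's mapping, so later traffic in either direction needs no broadcast.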

What happens after you enter a URL in the browser and press Enter

A URL (uniform resource locator) is, put simply: URL = protocol + IP address or domain name + port number + resource path + parameters + anchor
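
The URL components named above can be pulled apart with Python's standard `urllib.parse`. The URL itself is a made-up example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL containing every component
url = "https://www.example.com:8443/docs/index.html?lang=en&page=2#install"
parts = urlsplit(url)

components = {
    "protocol": parts.scheme,           # https
    "domain": parts.hostname,           # www.example.com
    "port": parts.port,                 # 8443
    "path": parts.path,                 # /docs/index.html
    "params": parse_qs(parts.query),    # query string as a dict of lists
    "anchor": parts.fragment,           # install
}
```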

1. The browser first resolves the domain name in the URL to an IP address by querying DNS (working up through the DNS servers level by level until the IP address is found)

2. Once it has the IP address and port number of the target server (port 80 for HTTP, port 443 for HTTPS), the browser calls the system socket library function to request a TCP stream socket. The client then sends an HTTP request message to the server, which passes down through the layers:

(1) Application layer: the client produces the HTTP request message.

(2) Transport layer: adds the source and destination ports. Before any data is actually sent, the client and server establish a TCP connection with the three-way handshake.

(3) Network layer: adds the IP header and handles routing and addressing.

(4) Data link layer: adds the frame header and transmits the frame.

(5) Physical layer: transmits the bits over the physical medium.

3. On the server side, the request travels up the stack (physical layer→data link layer→network layer→transport layer→application layer). The server parses the request message and sends back the HTTP response message.

4. The connection is closed with TCP's four-way wave.

5. The client parses the HTTP response message, and the browser starts to render the HTML

Causes of web page lag

Slow network speed, insufficient bandwidth, low hardware configuration, or full memory.

The JS script is too large and blocks the loading of the page.

There are too many web resources, so receiving the data takes a long time, or one particular resource loads slowly.

Slow DNS resolution.

How to check

Hardware problems: check whether the network cable or wireless network card is plugged in and whether the machine is connected to the router, that is, whether the lower layers are connected;

Software problems: check whether the correct driver is installed, whether the server is healthy, whether the DNS settings are correct, and whether a proxy has been left on

When a web page loads slowly, how do you analyze the cause and solve the problem?

Too many HTTP requests

Too many resources

JS script is too large

Slow network

In what order are the parts of a domain name resolved?

Domain name hierarchy: from right to left are the top-level domain, the second-level domain, and so on; the leftmost label is the host name (server name). For example, in www.baidu.com, com is the top-level domain. In email.tsinghua.edu.cn, cn is the top-level domain (the country-code domain for China), edu is the domain for education and research institutions, and email is the server name.

During resolution, the most specific matching subdomain is looked up first. If the subdomain exists, the result comes from the subdomain's configuration file. If it does not exist, the result is looked up in the configuration file one level up.
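
Both orderings can be derived from the labels of the name. A small sketch with two illustrative helpers (`domain_levels` and `candidate_zones` are made-up names): the first lists the labels from the top-level domain downward, and the second lists the zones to consult from most specific to least specific.

```python
def domain_levels(hostname):
    """Labels from right to left: top-level domain first, host name last."""
    labels = hostname.rstrip(".").split(".")
    return list(reversed(labels))


def candidate_zones(hostname):
    """Zones to check, from the most specific subdomain up to the TLD."""
    labels = hostname.rstrip(".").split(".")
    return [".".join(labels[i:]) for i in range(len(labels))]
```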

TCP and UDP

The TCP three-way handshake

Initial state: client CLOSED, server LISTEN

  1. Client A sends a SYN packet (SYN, x:0) to server B to request a connection. The client enters SYN_SENT, indicating that it has sent a SYN.
  2. Server B receives it and replies with a SYN/ACK packet (SYN/ACK, y:x+1). The server state changes from LISTEN (the server socket is listening and can accept connections) to SYN_RECV, indicating that a SYN has been received
  3. Client A receives the SYN/ACK and sends a confirming ACK (ACK, x+1:y+1), and the connection is established. Both sides are now ESTABLISHED
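
The sequence and acknowledgment numbers in the three steps can be traced with a small simulation. This is only a bookkeeping sketch: `three_way_handshake` is a made-up function, the initial sequence numbers are passed in as arbitrary integers, and no sockets are involved.

```python
def three_way_handshake(client_isn, server_isn):
    """Return a log of (segment, client_state, server_state) per step."""
    client, server = "CLOSED", "LISTEN"
    log = []

    # Step 1: client sends SYN with its initial sequence number x
    syn = {"flags": "SYN", "seq": client_isn}
    client = "SYN_SENT"
    log.append((syn, client, server))

    # Step 2: server replies SYN/ACK with its own y, acknowledging x + 1
    syn_ack = {"flags": "SYN/ACK", "seq": server_isn, "ack": syn["seq"] + 1}
    server = "SYN_RECV"
    log.append((syn_ack, client, server))

    # Step 3: client acknowledges y + 1; both sides are now established
    ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": syn_ack["seq"] + 1}
    client = server = "ESTABLISHED"
    log.append((ack, client, server))
    return log
```

With `three_way_handshake(100, 300)`, the final ACK carries seq 101 and ack 301, matching the (ACK, x+1:y+1) notation.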

The TCP four-way wave (connection teardown)

Initial state: both sides ESTABLISHED

  1. Client A sends a FIN to close the data flow from client A to server B. The client enters FIN_WAIT_1: it has actively closed the connection, sent a FIN, and is waiting for the other side's acknowledgment

  2. Server B receives the FIN and sends back an ACK whose acknowledgment number is the received sequence number plus 1. The client enters FIN_WAIT_2, a half-closed state: the server may still have data to send and will close its side later. The server enters CLOSE_WAIT.

  3. Server B closes its side of the connection and sends a FIN to client A. The server enters LAST_ACK and waits for the final ACK

  4. Client A sends back an ACK whose acknowledgment number is the received sequence number plus 1. The client enters TIME_WAIT: it has received the other side's FIN and sent the ACK, and it returns to CLOSED after waiting 2MSL.
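
The state changes in the four steps can be written as a transition table. A minimal sketch: `TEARDOWN` and `close_connection` are illustrative names, and the 2MSL wait is represented by a final log entry rather than a real timer.

```python
# (sender, segment, resulting state changes) for each of the four steps
TEARDOWN = [
    ("client", "FIN", {"client": "FIN_WAIT_1"}),
    ("server", "ACK", {"client": "FIN_WAIT_2", "server": "CLOSE_WAIT"}),
    ("server", "FIN", {"server": "LAST_ACK"}),
    ("client", "ACK", {"client": "TIME_WAIT", "server": "CLOSED"}),
]


def close_connection():
    """Replay the teardown and record both states after every step."""
    state = {"client": "ESTABLISHED", "server": "ESTABLISHED"}
    history = []
    for sender, segment, changes in TEARDOWN:
        state.update(changes)
        history.append((sender, segment, dict(state)))
    # The client leaves TIME_WAIT only after waiting 2MSL
    state["client"] = "CLOSED"
    history.append(("client", "2MSL timeout", dict(state)))
    return history
```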

Why TIME_WAIT exists:

The final ACK may be lost. If the server, in the LAST_ACK state, does not receive the ACK, it resends its FIN. The client stays in TIME_WAIT so that it can resend the ACK in that case.

Characteristics of TCP and UDP

TCP is connection-oriented; UDP is connectionless

TCP is reliable and guarantees delivery; UDP offers best-effort delivery with no delivery guarantee

TCP is point-to-point (one-to-one); UDP supports one-to-one, one-to-many, many-to-one, and many-to-many communication

TCP is byte-stream-oriented and has congestion control; UDP is message-oriented and has no congestion control

TCP has a large overhead; UDP has a small overhead

Application scenarios of TCP and UDP

UDP usage scenarios: DNS (because UDP is fast), video streaming, voice, QQ chat, multimedia classroom screen broadcasting

TCP usage scenarios: HTTP, QQ file transfer, email, login

How does TCP achieve reliable transmission

Acknowledgment and retransmission: segments are acknowledged when the connection is established and when data is sent. If a segment fails its checksum in transit, is lost, or is delayed past the timeout, the sender retransmits it

Data ordering: the data is split into many segments that are numbered, so the receiver can put them back in order

Flow control: sliding window and timer

Congestion control: slow start, congestion avoidance, fast retransmission, fast recovery

Flow control

Flow control is driven by the receiver: it limits the sender's rate so that the receiver can keep up, which prevents packet loss.

It is implemented with a sliding window

Sliding window

This is how TCP performs flow control: the receiver tells the sender its own window size and thereby controls the sender's rate, so that it is not overwhelmed by a sender that transmits too fast

Timer

After receiving a window size of 0, the sender starts a timer. When the timer expires, the sender sends a probe packet to ask for the current window size. This prevents a deadlock in which the receiver's later non-zero window update is lost and both parties wait for each other
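
A rough sketch of the receiver-driven window and the zero-window probe. The function `send_with_flow_control` is invented for illustration, and the model is heavily simplified: each entry of `window_updates` is the window the receiver advertises for one round, and a zero window results in a probe instead of data.

```python
def send_with_flow_control(data_len, window_updates):
    """Send up to data_len bytes, never exceeding the advertised window.

    window_updates: receive window (in bytes) advertised for each round.
    Returns the total bytes sent and a log of what each round did.
    """
    sent = 0
    rounds = []
    for rwnd in window_updates:
        if sent >= data_len:
            break
        if rwnd == 0:
            # Zero window: send nothing now; when the timer fires,
            # a probe asks the receiver for its current window
            rounds.append(("probe", 0))
            continue
        chunk = min(rwnd, data_len - sent)
        sent += chunk
        rounds.append(("data", chunk))
    return sent, rounds
```

For 1000 bytes and windows of 400, 0, 0, 300, and 500, the sender transmits 400 bytes, probes twice, then sends 300 and the remaining 300.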

Congestion control

Congestion control acts on the network as a whole: it prevents too much data from being injected into the network and so avoids overloading it.

Congestion: the demand for a network resource exceeds what the resource can provide, which degrades network performance

Congestion control: prevent excessive data from being injected into the network so that its routers and links are not overloaded.

Congestion window
A window maintained by the sender for congestion control. Because the receiver's capacity must also be taken into account, the actual sending window is the smaller of the congestion window and the receiver's window, so it may be smaller than the congestion window.

Slow start and congestion avoidance

Slow start: do not send a large amount of data at the beginning. First probe how congested the network is, that is, grow the congestion window gradually from small to large.

The congestion window starts at 1 and doubles every round-trip time (each acknowledgment received increases it by 1). When the window reaches 16 (the slow start threshold in this example), growth becomes additive, +1 per round trip, until the network becomes congested. When congestion occurs, the new slow start threshold is set to half of the congestion window at the time of congestion, the congestion window is reset to 1, and the process repeats. The amount of data sent into the network drops sharply at that moment.

Congestion avoidance: the congestion avoidance algorithm makes the congestion window grow slowly, that is, the sender's congestion window cwnd increases by 1 instead of doubling each time a round-trip time (RTT) passes.
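
The window growth described above can be simulated round by round. This is a simplified sketch, not a faithful TCP implementation: `simulate_cwnd` is a made-up function, the window is counted in segments, the initial threshold of 16 comes from the example in the text, and the floor of 2 on the threshold and the capping of slow start at the threshold are assumptions of this sketch.

```python
def simulate_cwnd(rounds, ssthresh=16, loss_rounds=()):
    """Return the congestion window at the start of each round trip.

    loss_rounds: round indices in which congestion (a timeout) occurs.
    """
    cwnd = 1
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Congestion: halve the threshold and restart from 1
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif cwnd < ssthresh:
            # Slow start: double each round trip, up to the threshold
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: grow by 1 each round trip
            cwnd += 1
    return history
```

Without loss, eight rounds give 1, 2, 4, 8, 16, 17, 18, 19: exponential growth up to the threshold, then linear growth.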

Fast retransmission and fast recovery

Fast retransmission: whenever the receiver gets an out-of-order segment (for example, segment 4 arrives right after segment 2, which suggests that 3 is lost), it immediately sends a duplicate acknowledgment for segment 2, so that the sender learns about the loss as soon as possible.

After receiving three consecutive duplicate acknowledgments, the sender retransmits segment 3 immediately, without waiting for the retransmission timer

Fast recovery: when the sender receives three consecutive duplicate acknowledgments, it halves the slow start threshold, sets the congestion window to the new (halved) threshold instead of 1, and runs the congestion avoidance algorithm, growing the window by 1 per round-trip time
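
A sketch of the sender's reaction to duplicate acknowledgments. It follows the convention of the example in the text, where an acknowledgment names the last segment received in order (real TCP acknowledgments carry the next expected byte instead). The `on_ack` function and the layout of the `state` dictionary are assumptions, and the floor of 2 on the threshold is an arbitrary choice.

```python
def on_ack(state, acked_segment):
    """Process one acknowledgment.

    state: dict with keys cwnd, ssthresh, last_acked, dup_acks.
    Returns the segment number to retransmit, or None.
    """
    if acked_segment == state["last_acked"]:
        state["dup_acks"] += 1
        if state["dup_acks"] == 3:
            # Fast retransmit: resend the missing segment right away.
            # Fast recovery: halve the threshold and continue from it
            # in congestion avoidance instead of restarting from 1.
            state["ssthresh"] = max(state["cwnd"] // 2, 2)
            state["cwnd"] = state["ssthresh"]
            return acked_segment + 1
        return None
    # A new acknowledgment: record it and reset the duplicate counter
    state["last_acked"] = acked_segment
    state["dup_acks"] = 0
    return None
```

Starting with a window of 16, one acknowledgment for segment 2 followed by three duplicates triggers a retransmission of segment 3 and leaves both the window and the threshold at 8.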

Origin blog.csdn.net/weixin_44200259/article/details/128079248