Linux [Network Basics] Data Link Layer & IP Protocol Technology Supplement & DNS & DHCP

1. Data link layer

(1) The association between the data link layer and the network layer

The main function of the data link layer is to transmit data between adjacent devices.
The network layer handles end-to-end communication: the IP protocol describes the path from the start point to the end point. The link layer supplements the network layer by carrying the data between each pair of adjacent nodes along that path.
The network layer is mainly concerned with getting from the start point to the end point.
The link layer is concerned with communication between adjacent nodes on the path.

The example of Tang Monk fetching the Buddhist scriptures was given earlier.
Let's take another example:
Suppose we want to travel from Henan to Beijing. Henan and Beijing are the start point and the end point, but these two alone are not enough. We also need to describe the itinerary in between, such as taking a car from Henan to a certain airport, then flying from that airport to point A, and so on until we arrive in Beijing. This is the route planned for the journey.
The entire itinerary from the start point to the end point corresponds to the network layer, and moving from one place to the next, that is, the specific means of travel on each leg, corresponds to the data link layer.
Without the itinerary, knowing how to travel is of no use, because you do not know where to go next. With only the itinerary and no means of travel, you stay where you are. This is the relationship between the network layer and the data link layer. It is a little like how TCP assists IP: TCP provides the policy and IP provides the action.

(2) Principle of LAN communication

Cross-network transmission is essentially the result of forwarding across many LANs. To thoroughly understand cross-network forwarding, you must first understand how packets are forwarded within a LAN. Hosts in the same LAN can communicate directly.
As shown in the figure:
[Figure: hosts, including m1 and m6, attached to a shared bus]
Suppose m1 wants to communicate with m6. m1 sends a MAC frame onto the bus, and every host connected to the bus receives it. Each host then checks whether the frame is addressed to itself, which is possible because every host has a unique identifier, its MAC address. If the frame is not addressed to the host, the host discards it; if it is, the host delivers it upward.
Conclusion:
All hosts in the LAN actually receive the MAC frame, but each host decides at its own data link layer whether to process it further by comparing the destination MAC address in the frame with its own MAC address.
The principle of packet capture tools:
A network card has a promiscuous mode, in which it does not discard any data frame and delivers every frame upward.
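The filtering rule above can be sketched in a few lines of Python. This is only an illustration: the function name `accepts_frame` and the MAC addresses assigned to m1, m2 and m6 are hypothetical, and a real network card performs this check in hardware.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

def accepts_frame(own_mac: str, dest_mac: str, promiscuous: bool = False) -> bool:
    """Return True if the network card should deliver the frame upward."""
    if promiscuous:
        return True  # promiscuous mode: never discard a frame
    dest = dest_mac.lower()
    return dest == own_mac.lower() or dest == BROADCAST

# Hypothetical MAC addresses for three hosts on the shared bus.
hosts = {
    "m1": "00:00:00:00:00:01",
    "m2": "00:00:00:00:00:02",
    "m6": "00:00:00:00:00:06",
}

# m1 sends a frame addressed to m6; every host sees it, only m6 keeps it.
receivers = [name for name, mac in hosts.items() if accepts_frame(mac, hosts["m6"])]
print(receivers)  # ['m6']
```

With `promiscuous=True` the same frame would also be accepted by m2, which is what a packet capture tool relies on.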
The essence of LAN communication:
In a LAN, only one host can send at any given moment. If several hosts send at the same time, the data collides and becomes invalid, so the LAN needs its own strategy, such as collision detection and collision avoidance algorithms, to keep hosts from colliding as far as possible.
Looking at the LAN from the operating system's perspective:
We can regard it as a critical resource. Collision detection and collision avoidance ensure that only one host writes data to the critical resource at any one time.
Why a LAN cannot be too large:
The larger the LAN, the more hosts it has and the higher the probability of a collision at any moment. Wireless Wi-Fi, for example, also communicates within a LAN. When many people share a single nearby base station whose power is fixed, the connection becomes slow.
Mitigation: switches can be used to confine collisions to a local segment and avoid forwarding collided data, which reduces the probability of collisions.
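The text does not say which collision avoidance algorithm is used. As one well-known example, classic shared Ethernet (CSMA/CD) uses truncated binary exponential backoff: after each collision a host waits a random number of slot times, and the range of that random number doubles with every further collision. The sketch below assumes that scheme; the function name `backoff_slots` is hypothetical.

```python
import random

def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Pick how many slot times to wait after the n-th consecutive collision."""
    exponent = min(collisions, 10)  # the range stops growing after 10 collisions
    return rng.randint(0, 2 ** exponent - 1)

rng = random.Random(0)  # seeded only so the demo is repeatable
for n in range(1, 5):
    print(f"collision {n}: wait {backoff_slots(n, rng)} slot(s)")
```

Because two colliding hosts draw their waiting times independently, they are unlikely to retry at the same moment, and the growing range spreads retries further apart when the LAN is busy.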

(3) Ethernet protocol

Ethernet protocol:
The IP protocol at the network layer is responsible for route selection. The network layer does not care how data actually gets from device A to the adjacent device B; the data link layer is responsible for forwarding between adjacent devices.
Ethernet frame format:
[Figure: Ethernet frame format]
48-bit destination MAC address/source MAC address: the MAC addresses of the receiver and the sender, used to describe and identify adjacent devices. A MAC address is the hardware address of the network card; it is 48 bits long and is fixed when the network card leaves the factory.
16-bit upper-layer protocol type: used to select the upper-layer protocol when the payload is handed upward. The type field has three values, corresponding to IP (0x0800), ARP (0x0806), and RARP (0x8035), so that the receiver can identify which protocol the payload of the Ethernet frame belongs to.
32-bit FCS: the end of the frame is a CRC check code, which verifies whether the data was corrupted during transmission.
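To make the layout concrete, here is a minimal sketch that unpacks the 14-byte Ethernet header (destination MAC, source MAC, type) with Python's `struct` module. The helper names and the sample frame are made up for illustration, and the trailing FCS is not checked.

```python
import struct

ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x8035: "RARP"}

def format_mac(raw: bytes) -> str:
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)

def parse_ethernet_header(frame: bytes):
    """Return (destination MAC, source MAC, protocol name) of a frame."""
    dest, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dest), format_mac(src), ETHERTYPES.get(ethertype, hex(ethertype))

# Hypothetical broadcast frame carrying ARP, padded with zero bytes.
frame = bytes.fromhex("ffffffffffff" "001122334455" "0806") + b"\x00" * 46
header = parse_ethernet_header(frame)
print(header)  # ('ff:ff:ff:ff:ff:ff', '00:11:22:33:44:55', 'ARP')
```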
Understanding MAC addresses and IP addresses:
The network layer selects the route, and the data link layer is responsible for forwarding between adjacent devices.
The data link layer forwards according to the routing entry selected by the network layer, so it must know the MAC address of the adjacent device that the route points to. That is
either the MAC address of a machine in the subnet
or the MAC address of the directly connected router.

(4) ARP protocol

The Ethernet frame format above shows that the destination and source addresses in the frame are MAC addresses. In other words, at the data link layer the MAC address of the target host must be known, but the data that the network layer hands down to the data link layer contains only the IP address of the target host, so the corresponding MAC address is unknown.
How do we get the MAC address of the target host?
During network communication, the application on the source host knows the IP address and port number of the destination host but not its hardware address. A data packet is first received by the network card and only then processed by the upper-layer protocols; if the hardware address in the received packet does not match the local machine, the packet is discarded immediately.
Therefore, the hardware address of the destination host must be obtained before communication.

That is to say, to build the Ethernet frame, a host must first know the MAC address of the adjacent device, so that the data can reach the destination host hop by hop. But what if the current host does not know the MAC address of the adjacent device?
It uses the ARP protocol to obtain the MAC address of the adjacent device:

  • Obtain the MAC address of the machine inside the subnet
  • Obtain the MAC address of the connected router

The ARP protocol solves this problem: it obtains the MAC address that corresponds to an IP address. The IP address here is essentially the next-hop IP address computed from the routing entry, not necessarily the final destination IP address.
Therefore, ARP is a protocol that sits between the network layer and the data link layer. In other words, ARP establishes the mapping between a host's IP address and its MAC address.
ARP datagram format:
[Figure: ARP datagram format]
Header:
Ethernet destination address: in an ARP request, the destination MAC address is filled with FF:FF:FF:FF:FF:FF, indicating that the frame is delivered to every machine in the subnet.
Ethernet source address: the source MAC address, that is, the MAC address of the current host.
Frame type: the upper-layer protocol (ARP, 0x0806).
28-byte ARP request/reply:
Hardware type: the type of the current network, such as Ethernet or token ring.
Protocol type: the type of address to be converted; here an IP address is converted to a MAC address.
Hardware address length: the length of the MAC address (6 bytes).
Protocol address length: the length of the IP address (4 bytes).
op: identifies whether the packet is a request or a response.
1: Request
2: Response
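The 28-byte body can be assembled with `struct` as a quick check of the field sizes listed above. This is a sketch only: `build_arp_request` is a hypothetical helper, the sender MAC and both IP addresses are illustrative, and the result is the ARP payload without the Ethernet header in front of it.

```python
import socket
import struct

# hardware type, protocol type, hw addr len, proto addr len, op,
# sender MAC, sender IP, target MAC, target IP
ARP_FORMAT = "!HHBBH6s4s6s4s"

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build the 28-byte ARP request body for IPv4 over Ethernet."""
    return struct.pack(
        ARP_FORMAT,
        1,            # hardware type: Ethernet
        0x0800,       # protocol type: IPv4
        6,            # hardware address length
        4,            # protocol address length
        1,            # op: 1 = request
        sender_mac,
        socket.inet_aton(sender_ip),
        b"\x00" * 6,  # target MAC is unknown, that is what we are asking for
        socket.inet_aton(target_ip),
    )

packet = build_arp_request(bytes.fromhex("001122334455"), "172.20.1.1", "172.20.1.2")
op = struct.unpack(ARP_FORMAT, packet)[4]
print(len(packet), op)  # 28 1
```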
The working principle of the ARP protocol:
ARP is a protocol for resolving addresses. Using the target IP address as a clue, it locates the MAC address of the next network device that should receive the data packet. Put simply, it obtains the MAC address of the adjacent device. If the target host is not on the same link, ARP is used to find the MAC address of the next-hop router. ARP applies only to IPv4, not IPv6.
How does ARP learn the MAC address?
ARP determines the MAC address with two types of packets: the ARP request and the ARP response.
Assume that host A sends an IP packet to host B on the same link. The IP address of host A is 172.20.1.1 and the IP address of host B is 172.20.1.2. Neither knows the other's MAC address.
To obtain the MAC address of host B, host A first broadcasts an ARP request packet.
This packet contains the IP address of the host whose MAC address is wanted, so the ARP request already carries host B's IP address, 172.20.1.2. Because a broadcast packet is received by every host and router on the same link, the ARP request is also examined by all of them.
If the target IP address in the ARP request matches a node's own IP address, that node puts its own MAC address into an ARP response packet and returns it to host A.
Summary:
A host sends an ARP request for an IP address to obtain the corresponding MAC address (the request also serves to tell the other party the sender's own MAC address), and the target host fills its own MAC address into the ARP response and returns it to the source. In this way ARP resolves an IP address to a MAC address, which makes IP communication within the link possible.

The source host sends an ARP request asking "what is the hardware address of the host whose IP address is 192.168.0.20" and broadcasts it to the local network segment (the destination hardware address FF:FF:FF:FF:FF:FF in the Ethernet frame header means broadcast).
When the destination host receives the broadcast ARP request and finds that the IP address matches its own, it sends an ARP response packet to the source host with its own hardware address filled in.
ARP cache table
Each host maintains an ARP cache table, which can be viewed with the arp -a command. Entries in the cache table have an expiration time (usually 20 minutes). If an entry is not used again within that time, it becomes invalid, and the next transmission triggers a new ARP request to obtain the hardware address of the destination host.
Why is there an ARP cache table?
If an ARP request were made every time an IP datagram is sent, it would cause unnecessary network traffic. The usual practice is therefore to cache the obtained MAC address for a period of time. Caching means anticipating that the same information may be needed again and setting aside an area of memory to store it.
That is, the MAC address first obtained through ARP is recorded in the ARP cache table as an IP-to-MAC mapping. The next time a datagram is sent to that IP address, there is no need to send another ARP request; the MAC address in the cache table is used directly, which improves efficiency.
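The caching behaviour described above can be modelled as a small table with an expiration time. The class below is a simplified sketch, not how any particular operating system implements its ARP cache; the name `ArpCache`, the injectable clock, and the sample addresses are assumptions made for the example.

```python
import time

class ArpCache:
    """IP -> MAC table whose entries expire when unused for ttl_seconds."""

    def __init__(self, ttl_seconds: float = 20 * 60, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}  # ip -> (mac, time of last use)

    def store(self, ip: str, mac: str) -> None:
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip: str):
        """Return the cached MAC, or None if an ARP request is needed."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, last_used = entry
        if self.clock() - last_used > self.ttl:
            del self.entries[ip]  # expired: forget the mapping
            return None
        self.entries[ip] = (mac, self.clock())  # using the entry refreshes it
        return mac

# A fake clock lets the demo jump forward in time without sleeping.
now = [0.0]
cache = ArpCache(clock=lambda: now[0])
cache.store("172.20.1.2", "00:11:22:33:44:66")
hit = cache.lookup("172.20.1.2")
now[0] += 21 * 60  # 21 minutes pass without any use
miss = cache.lookup("172.20.1.2")
print(hit, miss)  # 00:11:22:33:44:66 None
```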
Note:
Every entry obtained through ARP is eventually cleared from the cache when it expires. Until then, the desired MAC address can be obtained without performing ARP again, which to some extent prevents large numbers of ARP packets from being broadcast on the network.
ARP can only be used within a subnet, because ARP requests are broadcast only to machines inside the subnet. In other words, only the MAC addresses of machines inside the subnet can be obtained.
Question:
Are both the IP address and the MAC address really indispensable?
If you know the MAC address of the receiving end on the data link, don't you already know that the data is to be sent to host B, so why is the IP address also needed? Conversely, if you know the IP address, couldn't you skip ARP and simply broadcast on the data link to reach host B? Why are both an IP address and a MAC address needed? As shown in the figure below:
[Figure: host A reaches host B through router C]
When host A sends an IP datagram to host B, the datagram must pass through router C. Even if host A knows host B's MAC address, router C separates the two networks, so host A still cannot send the datagram to host B directly. ARP works only within a single link. Host A must therefore first send the datagram to C1, the MAC address of router C.
Therefore, both the IP address and the MAC address are indispensable, and the ARP protocol exists to associate the two. To avoid the extra network traffic caused by this two-stage communication, ARP caches the mapping between IP address and MAC address. With this cache it is not necessary to send an ARP request every time an IP packet is sent, which prevents performance degradation.
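The decision of whose MAC address to ask for can be sketched as follows. The function `next_hop_ip`, the /24 prefix, and the router address 172.20.1.254 are assumptions chosen for the example; a real host consults its full routing table rather than a single subnet and gateway.

```python
import ipaddress

def next_hop_ip(dest_ip: str, local_network: str, gateway_ip: str) -> str:
    """Return the IP address whose MAC should be resolved with ARP."""
    network = ipaddress.ip_network(local_network)
    if ipaddress.ip_address(dest_ip) in network:
        return dest_ip   # same link: resolve the destination itself
    return gateway_ip    # other network: resolve the router instead

same_link = next_hop_ip("172.20.1.2", "172.20.1.0/24", "172.20.1.254")
via_router = next_hop_ip("163.221.120.9", "172.20.1.0/24", "172.20.1.254")
print(same_link, via_router)  # 172.20.1.2 172.20.1.254
```

The IP header still carries the final destination address in both cases; only the destination MAC address of the frame changes from hop to hop.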

2. NAT protocol

The NAT server divides the network into a public network and a private network. It translates the source IP address of requests sent from the private network and the destination IP address of responses returned from the public network. The translation is transparent to both parties, who are unaware of it. A request can only be initiated by a host in the private network; it passes through NAT translation, and the public network host responds. The NAT server must save the mapping after translating.
As mentioned above, IPv4 does not have enough IP addresses. NAT is currently the main means of coping with this shortage and is an important function of routers. NAT stands for network address translation.
As shown in the figure:
[Figure: NAT address translation]
Workflow:
As shown in the figure above, take the communication between the source host 10.0.0.10 and the destination host 163.221.120.9 as an example. The NAT router on the path rewrites the source address from 10.0.0.10 to the global IP address 202.244.174.37 before forwarding the data. Conversely, when a packet arrives from 163.221.120.9, the destination address 202.244.174.37 is first rewritten to the private IP address 10.0.0.10 and the packet is then forwarded. In TCP and UDP the IP addresses in the IP header take part in the checksum calculation, so when an IP address changes, the TCP or UDP header must be updated accordingly.
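The checksum dependency can be demonstrated with the standard Internet checksum computed over the TCP pseudo-header (source IP, destination IP, protocol number, segment length) plus the segment. The sketch below builds a bare 20-byte SYN header with hypothetical ports 1025 and 80 and shows that rewriting the source address changes the result. A NAT device would normally update the checksum incrementally rather than recompute it from scratch.

```python
import socket
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF

def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum over the pseudo-header followed by the TCP segment."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 6, len(segment)))  # 6 = TCP
    return internet_checksum(pseudo + segment)

# Minimal SYN header: ports, seq, ack, offset/flags, window, checksum=0, urgent.
segment = struct.pack("!HHIIHHHH", 1025, 80, 0, 0, 0x5002, 65535, 0, 0)

before = tcp_checksum("10.0.0.10", "163.221.120.9", segment)
after = tcp_checksum("202.244.174.37", "163.221.120.9", segment)
print(hex(before), hex(after), before != after)
```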
Summary:

  • Request from the private network to the public network: the private source IP address in the packet is converted to a public IP address.
  • Response from the public network to the private network: the public destination IP address in the packet is converted back to the private IP address.
  • Static NAT: a fixed one-to-one mapping between one private IP address and one public IP address.
  • Dynamic NAT: the NAT device manages a pool of more than one public IP address. When data arrives from the private network, an idle address is selected for the mapping.

However, whether static or dynamic, the mapping is one-to-one: each private IP address is mapped to one public IP address, so address exhaustion is not substantially mitigated.

3. NAPT protocol

Problem:
NAT alone does not substantially solve the problem. Suppose several hosts in the private network access the same external server at the same time. The destination IP addresses in the data returned by the server are all identical, because every host was mapped to the same public IP address by NAT. How, then, does the NAT router decide which host each packet should be forwarded to?

NAPT was introduced to solve this problem by using the IP address together with the port number.
As can be seen from the figure below, NAPT rewrites the source port numbers of the different hosts when sending, so that although the same IP address is used, the port numbers differ. These mappings are stored in the translation table, and when data comes back it is forwarded to the corresponding host.
[Figure: NAPT address and port translation]
Example:
The host 163.221.120.9 listens on port 80. Two clients, 10.0.0.10 and 10.0.0.11, communicate with it at the same time, and both use local port 1025. If only the IP addresses were converted to the global address 202.244.174.37, the translated address and port pairs would be exactly the same. Converting the port number of 10.0.0.11 to 1026 solves the problem. The NAPT router builds a NAPT translation table that correctly converts each combination of address and port, so that clients A and B can communicate with the server at the same time.
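The example can be reproduced with a toy translation table. This is a deliberately simplified sketch: the class `NaptRouter` is hypothetical, public ports are handed out sequentially starting at 1025 to match the example, and the table is keyed only on the private address and port, whereas a real NAPT device also tracks the protocol and the remote endpoint.

```python
class NaptRouter:
    """Toy NAPT table mapping (private IP, private port) <-> public port."""

    def __init__(self, public_ip: str, first_port: int = 1025):
        self.public_ip = public_ip
        self.next_port = first_port
        self.outbound = {}  # (private_ip, private_port) -> public_port
        self.inbound = {}   # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip: str, private_port: int):
        """Rewrite the source of an outgoing packet, creating an entry if needed."""
        key = (private_ip, private_port)
        if key not in self.outbound:
            self.outbound[key] = self.next_port
            self.inbound[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.outbound[key]

    def translate_in(self, public_port: int):
        """Find the private endpoint for a returning packet, or None."""
        return self.inbound.get(public_port)

router = NaptRouter("202.244.174.37")
a = router.translate_out("10.0.0.10", 1025)
b = router.translate_out("10.0.0.11", 1025)
print(a, b)                       # ('202.244.174.37', 1025) ('202.244.174.37', 1026)
print(router.translate_in(1026))  # ('10.0.0.11', 1025)
```

A packet arriving on a public port with no table entry yields `None`, which mirrors the restriction that communication must be initiated from the private side.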
When is this table generated?
The translation table is generated automatically on the NAT router. With TCP, the entry is created as soon as the SYN packet of the first handshake is sent when the connection is established, and it is deleted from the table once the acknowledgment of the FIN packet sent at connection close has been received. With UDP, the applications at the two ends do not necessarily start and stop communicating at the same time, so generating the translation table is more difficult in that case.
Why is the start and end time of both ends of UDP communication not necessarily consistent?
This is because UDP is a connectionless protocol, which does not provide reliable data transmission and flow control mechanisms. Since UDP has no process of establishing a connection, it does not maintain any connection state or send acknowledgments. The sender encapsulates messages into data packets and sends them to the receiver, and the receiver will receive and process these data packets as best as possible, which means that the delivery order, reliability or timing of data packets cannot be guaranteed.
In UDP, the sender and receiver work independently without interfering with each other. This means that the sender can send packets at its own speed, while the receiver can receive and process packets at a different speed. This can lead to inconsistent start and end times, as the sender may send a large number of packets in a short period of time, and the receiver may take longer to receive and process those packets.
Summary:
In communication over TCP or UDP, packets are considered to belong to the same connection only when the destination address, source address, destination port, source port, and protocol type (TCP or UDP) are all the same. NAPT relies on exactly this. In the NAPT scenario, one public IP address can theoretically carry up to 65536 translations, because the port number is 16 bits wide.
The NAT gateway is transparent to both the private network host and the public network host; neither is aware of it during communication.
The NAT gateway saves the mapping after translation so that it can translate back when the response returns.
Private network to public network: the source IP address in the packet is changed to the public IP address.
Public network to private network: the destination IP address in the packet is changed to the private IP address.
NAPT adds port translation, so that one public IP address can serve multiple private network hosts, which alleviates IP address exhaustion. Data must first flow from the private network to the public network; a connection cannot be initiated from the public network to the private network.

4. ICMP protocol

ICMP is a network layer protocol. A newly built network usually needs a simple test first to verify that it works, but the IP protocol does not provide reliable transmission. If a packet is lost, the IP protocol cannot tell the transport layer whether the packet was dropped or why.
ICMP functions:
Confirm whether an IP packet reached the target IP address successfully.
Report the reason why an IP packet was dropped along the way.
ICMP itself is carried by the IP protocol, but it does not provide transport layer functions, so it is still classified as a network layer protocol.
ICMP can only be used with IPv4. For IPv6, ICMPv6 is needed.
The ping command:
ping is followed by a domain name, not a URL. The domain name is resolved to an IP address through DNS (described later).
ping verifies network connectivity and also reports the response time and the TTL (Time To Live, the life cycle field in the IP packet). The ping command first sends an ICMP Echo Request to the peer.
After the peer receives it, it returns an ICMP Echo Reply.
Note:
telnet uses port 23 and ssh uses port 22, so which port does ping use?
ping is based on ICMP, which belongs to the network layer. Port numbers belong to the transport layer, so ICMP carries no port number information at all.
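The absence of a port field is easy to see when the Echo Request is built by hand: the ICMP header holds only a type, a code, a checksum, an identifier, and a sequence number. The sketch below only constructs the bytes and does not send them (sending requires a raw socket and usually administrator rights). The function names, identifier and payload are illustrative.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"ping") -> bytes:
    """Build an ICMP Echo Request: type 8, code 0, no port numbers anywhere."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)  # checksum = 0 for now
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(0x1234, 1)
# The receiver recomputes the checksum over the whole message; 0 means intact.
print(packet[0], internet_checksum(packet))  # 8 0
```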

5. DNS

Now let's talk about the defects of NAT:
Because NAT translation depends entirely on the translation table, it has several restrictions.

  • A connection to an internal server cannot be established from outside the NAT;
  • Generating and destroying the translation table requires additional overhead;
  • If the NAT device fails during communication, all TCP connections are dropped, even if there is a hot standby.

This is where the proxy service comes in.
A proxy server is an intermediary between the personal network and the Internet service provider.
For example, accessing a server directly is like going straight to the manufacturer to buy something, which may take a long time because of the distance or the cumbersome procedures. Requesting the service through a proxy server is like buying from a consignment dealer: the dealer brings the goods in from far away and handles the procedures, and we only need to pay for the goods.
Proxy servers are a widely used technology.

Bypassing network restrictions ("over the wall"): a proxy in the WAN.
Load balancing: a proxy in the LAN.

Example:
For example, some foreign websites cannot be accessed directly. If we instead send the request to a proxy server, let it fetch the data, and have it forward the data to us, we can still obtain the content of the website.
When many people use the proxy server to access the same target server, the proxy server can improve efficiency by caching the content of the target server in advance and returning the cached data directly when someone visits, instead of forwarding every request. This is the reverse proxy technique, and the plain forwarding described above is the forward proxy.
The forward proxy server essentially gathers various requests together, which makes the requests easier to manage.
The reverse proxy server often acts as a cache.
Distributing incoming traffic or requests more evenly across the hosts of a cluster through the reverse proxy server is called load balancing.
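The text does not say how the reverse proxy chooses a host. One simple and common policy is round robin, where requests are handed to the backends in turn. The sketch below assumes that policy; the class `RoundRobinProxy` and the backend addresses are hypothetical, and no real network traffic is involved.

```python
from itertools import cycle

class RoundRobinProxy:
    """Choose a backend for each incoming request in strict rotation."""

    def __init__(self, backends):
        self._backends = cycle(backends)

    def pick_backend(self) -> str:
        return next(self._backends)

proxy = RoundRobinProxy(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
picks = [proxy.pick_backend() for _ in range(6)]
print(picks)
# ['10.0.0.1', '10.0.0.2', '10.0.0.3', '10.0.0.1', '10.0.0.2', '10.0.0.3']
```

Other policies, such as picking the least loaded host, follow the same shape: the client always talks to the proxy, and the proxy decides which host actually serves the request.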
Note:
Although they look similar, the proxy service described here is completely different from the NAT service.
The proxy service is an application layer service. It can be deployed on any device and works at the application layer. We explicitly send our request to the proxy server, and the proxy server then accesses the target server.
The NAT service is a network layer service. It is deployed on the gateway device and works at the network layer. We address our request directly to the target server.

6. DHCP

DHCP is a network protocol used to automatically assign IP addresses and other network configuration information to devices on the network.
Having to set an IP address manually every time you go to a new place or connect a new device would be very troublesome. DHCP was created to set IP addresses automatically and to manage their allocation centrally.

When a host connects to the gateway device, it broadcasts a DHCP request. The DHCP service replies with the gateway's IP address and an IP address assigned to the host, and the host uses this information to generate its own routing table.
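The allocation side of this can be sketched as an address pool that hands out leases keyed by the client's MAC address. This is a simplified model, not the DHCP wire protocol: the real exchange uses the DISCOVER, OFFER, REQUEST, and ACK messages and leases that expire. The class `DhcpPool`, the 192.168.1.0/29 network, and the MAC addresses are assumptions for illustration.

```python
import ipaddress

class DhcpPool:
    """Toy address pool that assigns one IP address per client MAC."""

    def __init__(self, network: str, gateway: str):
        net = ipaddress.ip_network(network)
        self.gateway = gateway
        self.netmask = str(net.netmask)
        # Every usable host address except the gateway's own.
        self.free = [str(ip) for ip in net.hosts() if str(ip) != gateway]
        self.leases = {}  # MAC -> assigned IP

    def request(self, mac: str) -> dict:
        """Return the configuration for a client, reusing an existing lease."""
        if mac not in self.leases:
            if not self.free:
                raise RuntimeError("address pool exhausted")
            self.leases[mac] = self.free.pop(0)
        return {"ip": self.leases[mac], "netmask": self.netmask, "gateway": self.gateway}

pool = DhcpPool("192.168.1.0/29", gateway="192.168.1.1")
first = pool.request("00:11:22:33:44:55")
again = pool.request("00:11:22:33:44:55")
second = pool.request("00:11:22:33:44:66")
print(first["ip"], again["ip"], second["ip"])  # 192.168.1.2 192.168.1.2 192.168.1.3
```

The returned netmask and gateway are the pieces the host needs to build its routing table: one route for the local subnet and a default route through the gateway.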
Note:

  • In a large network environment, there may be one or more dedicated DHCP servers that manage and assign IP addresses and configuration information for a large number of devices. These DHCP servers centrally manage the IP address pool of the entire network and dynamically assign suitable IP addresses to the devices that connect, so that each device can communicate in the network.
  • A DHCP server usually also assigns other configuration parameters, such as the subnet mask, default gateway, and DNS server. With DHCP, network administrators can manage and configure large numbers of devices more easily, without manually configuring network information on each one.
  • In a small network or home network environment, the router usually integrates the DHCP server function, so the router itself acts as the DHCP server. For large networks or environments that require more complex network configuration, a dedicated DHCP server is set up separately.

Origin blog.csdn.net/m0_59292239/article/details/132042437