Data Link Layer Explanation

Table of contents

1. Problems solved by the data link layer

2. Ethernet protocol

2.1 Understanding Ethernet

2.2 Ethernet frame format

2.3 MAC address

2.3.1 Know the MAC address

2.3.2 Compare MAC address and IP address

2.4 MTU

2.4.1 Understanding MTU

2.4.2 Impact of MTU on IP protocol

2.4.3 The influence of MTU on UDP protocol

2.4.4 The impact of MTU on the TCP protocol

2.5 The process of data transmission across the network

3. ARP protocol

3.1 Understanding the ARP protocol

3.2 ARP data format

3.3 Workflow of ARP protocol


1. Problems solved by the data link layer

  • IP has the ability to send data from one host to another across the network, but IP cannot guarantee that the data reaches the peer host reliably every time, so it relies on the upper-layer TCP protocol to provide reliability. For example, when a packet is lost, TCP has the data sent again through IP. With the reliability mechanism provided by TCP, data handed to IP can be delivered reliably to the peer host.
  • However, when data is transmitted over the network, it is forwarded hop by hop from one host to the next before it finally reaches the target host. The prerequisite for sending data to the target host is therefore that the data can be forwarded to the next-hop host directly connected to the current host. Two directly connected hosts belong to the same network segment, so forwarding data to the next-hop host is actually LAN communication, which is the problem the link layer needs to solve
  • The network layer IP provides the ability to send data across the network, the transport layer TCP provides reliability assurance for data transmission, and the link layer solves the communication problem between two connected hosts

2. Ethernet protocol

2.1 Understanding Ethernet

LAN technology

The communication technologies adopted by different LANs may be different, and there are three common LAN technologies as follows:

  • Ethernet: Ethernet is a computer local area network technology, the most widely used
  • Token ring network: Token ring network is often used in IBM systems. In this network, a special frame called "token" is continuously transmitted on the ring to determine when a node can send packets.
  • Wireless LAN/WAN: The wireless local area network is a supplement to and extension of the wired network, and is now an important component of computer networks

Although the communication technologies used by each LAN in the network may be different, IP shields the differences in the underlying network. The IP layer of both communicating parties, and the protocols above it, do not need to care which LAN technology the underlying layer uses

  • The data is encapsulated before being sent. At this time, the data link layer encapsulates the data with the header of the corresponding LAN.
  • If the data is to be transmitted across the network, it needs to be forwarded by a router
  • When the data is delivered upwards at the data link layer of the router, the LAN header corresponding to the data will be removed
  • Before the router forwards the data to the next hop, it will encapsulate the data with the LAN header corresponding to the next hop network

Routers in the network constantly remove the old LAN header from the data and add a new one, so even if the networks being crossed use different LAN technologies, the data can still be carried across them correctly.

Ethernet Communication Principles

  • "Ethernet" is not a specific network but a technical standard. It covers not only the data link layer but also part of the physical layer. For example, Ethernet specifies the network topology, the access control mode, the transmission rate, and so on.
  • Ethernet cabling is most commonly twisted pair (optical fiber and, historically, coaxial cable are also used), and the transmission rates include 10 Mbps, 100 Mbps, 1000 Mbps, and so on
  • All hosts in the Ethernet share a communication channel. When a host in the LAN sends data, all hosts in the LAN can receive the data

  • For example, when host A in the local area network wants to send data to host B, every host in the local area network can in fact receive the data sent by host A, but in the end only host B delivers that data upwards
  • Although other hosts in the LAN also received the data sent by host A, after identification, they found that the data was not sent to themselves, so they directly discarded the data without delivering it upwards.

That is, during LAN communication, all hosts in the LAN can see any data transmitted in the LAN, but each host only cares about the data sent to itself

extension:

  • A network packet capture tool can capture not only the messages sent to the local machine but also messages sent to other machines. This is because, during a capture, the host delivers upward all the message data it receives from the LAN instead of discarding frames not addressed to it.
  • The network card has a mode called promiscuous mode. A network card set to promiscuous mode accepts all data passing through it, regardless of whether the destination address is its own

Collision avoidance algorithm

Since all the hosts in the Ethernet share a communication channel, only one host is allowed to send data at the same time, otherwise the data sent by each host will interfere with each other. From the perspective of the system, the communication channel shared by each host here is a critical resource, and this critical resource is only allowed to be used by one host at a time.

  • To handle this, Ethernet does not restrict when each host may send. Each host in the LAN sends data directly whenever it wants to, but as soon as its data collides with data sent by another host, it executes a collision avoidance algorithm
  • Collision avoidance algorithm: when the data sent by a host collides, the host waits for a random period of time before resending (the randomness makes another collision less likely), and the data already on the LAN has time to drain while the host waits
  • After the data sent by a host in Ethernet collides, the host executes the collision avoidance algorithm, so Ethernet is a LAN communication standard based on collision domains and collision detection

The collision avoidance algorithm amounts to waiting for a while and then resending the data, so the bottom layer of Ethernet also has a retransmission mechanism. However, this mechanism only ensures that data gets from one host in the LAN to another host in the same LAN
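The random wait can be sketched as follows. The text above only says that the host waits a random time; the doubling window shown here is the truncated binary exponential backoff associated with classic CSMA/CD Ethernet. The function name and the 51.2 microsecond slot time (the value for 10 Mbps Ethernet) are choices made for this illustration.

```python
import random

SLOT_TIME_US = 51.2  # slot time of classic 10 Mbps Ethernet, in microseconds


def backoff_delay_us(collision_count, rng=random):
    """Return a random wait in microseconds after the n-th consecutive collision."""
    # The window of possible waits doubles with each collision, capped at 2**10 slots.
    exponent = min(collision_count, 10)
    slots = rng.randrange(2 ** exponent)  # integer in [0, 2**exponent - 1]
    return slots * SLOT_TIME_US
```

Because each colliding host draws its own random number of slots, two hosts that collided once are unlikely to pick the same wait again, and the growing window spreads retries further apart under heavy load.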

Token Ring

  • The transmission method of the Token-ring network adopts a star topology physically, but a ring topology logically
  • The communication transmission medium of token ring network can be unshielded twisted pair, shielded twisted pair and optical fiber, etc.
  • The nodes in the token ring network are connected together by a multistation access unit (MAU). The MAU is a specialized hub that passes transmissions around the ring of workstations.

In the token ring network, there is a special frame called "token". The "token" will be continuously transmitted on the ring. Only the host with the "token" can send data, so the data will not collide.

  • The "token" in the token ring network is like a mutex used to protect critical resources in a system. Like a mutex, the "token" has two states, "busy" and "idle": "busy" means the token is occupied, and "idle" means the token is not occupied
  • A computer that wants to send data must first detect the "idle" token and set it to a "busy" state before it can send data, similar to the process of applying for a mutex
  • Since the "token" is passed in sequence on the network ring, all computers connected to the network have an equal chance of obtaining the token, so there will be no starvation problem

2.2 Ethernet frame format

  • The source address and destination address refer to the hardware address of the network card (that is, the MAC address, which is fixed when the network card leaves the factory), and the length is 48 bits
  • The frame protocol type field has three values, corresponding to IP protocol, ARP protocol and RARP protocol
  • At the end of the frame is the CRC check code

How does a MAC frame separate the header from the payload?

Both the frame header and the frame trailer of the Ethernet MAC frame are of fixed length. When the bottom layer receives a MAC frame, it extracts the fixed-length frame header and frame tail in the MAC frame, and the rest is the payload

How does the MAC frame decide which protocol to deliver the payload to the upper layer?

There are more than one upper-layer protocols corresponding to the Ethernet MAC frame, so after separating the header and payload of the MAC frame, it is necessary to determine which protocol to deliver the separated payload to the upper layer

There is a 2-byte type field in the frame header of the MAC frame, so after separating the header and payload, the payload can be delivered to the corresponding upper layer protocol according to this field
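The header/payload split and the type-based dispatch can be sketched as below. The function names are hypothetical; the layout assumed is a 14-byte header (6-byte destination, 6-byte source, 2-byte type) and a 4-byte CRC trailer, with the three type values for IP, ARP and RARP.

```python
import struct

ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8035: "RARP"}
HEADER_LEN = 14   # 6-byte destination + 6-byte source + 2-byte type
TRAILER_LEN = 4   # CRC check code


def split_frame(frame):
    """Split a raw frame into (dst, src, ethertype, payload, crc)."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:HEADER_LEN])
    payload = frame[HEADER_LEN:-TRAILER_LEN]   # everything between header and trailer
    crc = frame[-TRAILER_LEN:]
    return dst, src, ethertype, payload, crc


def upper_protocol(ethertype):
    """Name the upper-layer protocol that should receive the payload."""
    return ETHERTYPES.get(ethertype, "unknown")
```

Since both the header and the trailer have fixed lengths, no length field is needed to locate the payload; slicing is enough.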

Example understanding

Suppose host A in the LAN wants to send an IP datagram to host B in the same LAN. In the MAC frame encapsulated by host A, the destination address is the MAC address of host B, the source address is the MAC address of host A, and the frame protocol type is 0800. The IP datagram to be sent follows, and the tail of the frame carries the CRC check code

When host A sends the MAC frame to the LAN, all hosts in the LAN can receive the MAC frame, including host A itself.

  • After host A receives the MAC frame, it performs a CRC check on it. If the check fails, a collision occurred during transmission; host A then executes the collision avoidance algorithm and retransmits the MAC frame
  • After host B receives the MAC frame, it extracts the destination address in the MAC frame and finds that the destination address is the same as its own MAC address, so after the CRC check is successful, it delivers the payload to the upper IP layer for further processing
  • After receiving the MAC frame, other hosts in the LAN will also extract the destination address in the MAC frame, but find that the destination address does not match its own MAC address, so it will directly discard the MAC frame

That is, when the bottom layer receives a MAC frame, it first judges from the destination address whether the frame is addressed to itself. If so, it performs a CRC check; if the check succeeds, the payload is delivered to the corresponding upper-layer protocol according to the frame protocol type
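A minimal sketch of this receive path is shown below. Here `zlib.crc32` stands in for the frame check code, and the byte order used for the trailer, the helper names, and the decision to also accept broadcast frames are assumptions made for the illustration rather than details taken from the text.

```python
import struct
import zlib

BROADCAST = b"\xff" * 6


def make_frame(dst, src, ethertype, payload):
    """Build header + payload and append a CRC-32 trailer."""
    body = dst + src + struct.pack("!H", ethertype) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def receive(frame, my_mac):
    """Return (ethertype, payload) if the frame is accepted, else None."""
    dst = frame[:6]
    if dst != my_mac and dst != BROADCAST:
        return None                      # not addressed to this host: discard
    body, trailer = frame[:-4], frame[-4:]
    if struct.pack("<I", zlib.crc32(body)) != trailer:
        return None                      # CRC mismatch: discard
    (ethertype,) = struct.unpack("!H", body[12:14])
    return ethertype, body[14:]          # hand the payload to the upper layer
```

The destination check comes first so that frames meant for other hosts are dropped before any further work is done on them.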

2.3 MAC address

2.3.1 Know the MAC address

  • MAC addresses are used to identify connected nodes in the data link layer
  • The length is 48 bits, that is, 6 bytes, and is generally expressed in the form of a hexadecimal number plus a colon. For example: 08:00:27:03:fb:19
  • It is determined when the network card leaves the factory and normally cannot be modified. The MAC address is usually unique (the MAC address in a virtual machine is not a real MAC address and may conflict; some network cards also support user-configurable MAC addresses)

You can use the ifconfig command to view the MAC address
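The colon-separated hexadecimal notation maps directly onto 6 raw bytes. The two helper functions below are hypothetical names used only to illustrate that conversion.

```python
def mac_to_bytes(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError("expected 6 colon-separated groups")
    return bytes(int(part, 16) for part in parts)


def bytes_to_mac(raw):
    """Convert 6 raw bytes back into colon-separated hexadecimal text."""
    return ":".join(f"{byte:02x}" for byte in raw)
```

For example, `mac_to_bytes("08:00:27:03:fb:19")` yields 6 bytes, that is, 48 bits.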

2.3.2 Compare MAC address and IP address

There will be two sets of addresses in the data routing process, source IP address and destination IP address, source MAC address and destination MAC address

  • The IP address describes the overall start and end of the journey
  • The MAC address describes the start and end of each interval on the road

During the routing process of data, the source IP address and destination IP address can be understood as not changing (in fact, they may change, NAT technology), and the source MAC address and destination MAC address of the data will change after each hop

2.4 MTU

2.4.1 Understanding MTU

MTU (Maximum Transmission Unit, maximum transmission unit) describes the maximum amount of data that can be sent by the underlying data frame at one time. This limitation is generated by the physical layer corresponding to different data link layers

  • The value corresponding to the MTU of Ethernet is generally 1500 bytes, and different network types have different MTUs. If the data to be sent at one time exceeds the MTU, the data needs to be fragmented at the IP layer.
  • Ethernet stipulates that the minimum length of the data in a MAC frame is 46 bytes. If the data is shorter than 46 bytes, padding must be added after it. For example, an ARP packet is shorter than 46 bytes, so it has to be padded

2.4.2 Impact of MTU on IP protocol

Because the data link layer imposes a maximum transmission unit (MTU), if the amount of data the IP layer wants to send at one time exceeds the MTU, the IP layer must fragment the data before delivering it downward. This assumes that fragmentation has not been prohibited in the IP header; if it has, the packet is discarded

Fragmentation and assembly of data occurs at the IP layer. Not only the source host may fragment the data, but also the router during the data routing process may fragment the data. Because the MTU of different networks is different, if the MTU of a network on the transmission path is smaller than the MTU of the source network, the router may fragment the IP datagram again
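As a rough illustration of how an IP payload is cut to fit an MTU, the sketch below computes the fragment boundaries. It assumes a 20-byte IP header without options and that fragment offsets are counted in 8-byte units, so every fragment except the last carries a multiple of 8 bytes; the function name and the tuple layout are hypothetical.

```python
def fragment(payload_len, mtu=1500, header_len=20):
    """Return a list of (offset_in_8_byte_units, data_len, more_fragments).

    payload_len is the length of the IP payload, excluding the IP header.
    """
    # Largest amount of data per fragment, rounded down to a multiple of 8.
    max_data = (mtu - header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < payload_len:
        data_len = min(max_data, payload_len - offset)
        more = offset + data_len < payload_len   # True for all but the last fragment
        fragments.append((offset // 8, data_len, more))
        offset += data_len
    return fragments
```

With the default 1500-byte MTU, a 4000-byte payload is split into fragments carrying 1480, 1480 and 1040 bytes. A router facing a smaller next-hop MTU could apply the same calculation again to each fragment.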

For the specific fragmentation and reassembly process, see another article by the same author:

Network layer - IP protocol: https://blog.csdn.net/GG_Bruse/article/details/130640438

2.4.3 The influence of MTU on UDP protocol

If the option field is not carried in the IP header, the length of the IP header is 20 bytes, while UDP uses a fixed-length 8-byte header. With an MTU of 1500 bytes, if the data carried by UDP at one time exceeds 1500 − 20 − 8 = 1472 bytes, the data needs to be fragmented at the IP layer

If any one of the multiple IP datagrams obtained after fragmentation is lost during transmission, the IP layer reassembly at the receiving end will fail. Assuming that the probability of packet loss during network transmission is 1/10,000, if the data is split into 100 parts and sent, then the probability of packet loss will rise to 1%. Because as long as there is a packet loss of a fragmented packet, it is equivalent to the loss of the entire packet, so fragmentation will increase the probability of UDP packet loss
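The two numbers in this section can be checked with a short calculation. The sketch assumes that each fragment is lost independently with the same probability, which is a simplification; the function names are hypothetical.

```python
IP_HEADER = 20   # bytes, IP header without options
UDP_HEADER = 8   # bytes, fixed-length UDP header


def max_udp_payload(mtu=1500):
    """Largest UDP payload that fits in one unfragmented IP datagram."""
    return mtu - IP_HEADER - UDP_HEADER


def datagram_loss_probability(per_packet_loss, fragments):
    """Probability that at least one of the fragments is lost."""
    return 1 - (1 - per_packet_loss) ** fragments
```

With a per-packet loss probability of 1/10,000 and 100 fragments, the result is just under 1%, matching the estimate above.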

2.4.4 The impact of MTU on the TCP protocol

For TCP, fragmentation will also increase the probability of TCP packet loss. Unlike UDP, TCP will retransmit after packet loss. Therefore, TCP should minimize data retransmission caused by fragmentation.

  • A segment sent by TCP cannot be infinitely large; it is still subject to the MTU. The maximum amount of data in a single TCP segment is called the MSS (Maximum Segment Size)
  • While establishing a connection, the two parties in the TCP communication negotiate the MSS and select the smaller of the MSS values supported by the two sides as the final MSS
  • The value of MSS is in the 40-byte option field of the TCP header (kind=2)
  • Ideally, the value of MSS is exactly the maximum length of data that will not be fragmented at the IP layer
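A small sketch of the MSS arithmetic follows. It assumes 20-byte IP and TCP headers without options and follows the text in taking the smaller of the two advertised values; the function names are hypothetical.

```python
IP_HEADER = 20    # bytes, IP header without options
TCP_HEADER = 20   # bytes, TCP header without options


def mss_for_mtu(mtu):
    """Largest TCP payload that avoids IP fragmentation on a link with this MTU."""
    return mtu - IP_HEADER - TCP_HEADER


def negotiate_mss(local_mss, peer_mss):
    """Pick the smaller of the two advertised MSS values."""
    return min(local_mss, peer_mss)


def segment(data, mss):
    """Cut application data into chunks of at most mss bytes."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]
```

For an Ethernet MTU of 1500 bytes this gives an MSS of 1460 bytes, so 3000 bytes of application data would be sent as segments of 1460, 1460 and 80 bytes.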

2.5 The process of data transmission across the network

Take host A transmitting data across the network to host B as an example, the data routing process is as follows:

  • If host A wants to transmit data across the network to host B, it first needs to hand the data to router A in the same LAN. Host A therefore sends the encapsulated MAC frame to the current LAN; the source MAC address and destination MAC address in this frame are the MAC address of host A and the MAC address of router A, respectively.
  • All hosts in host A's LAN can receive this MAC frame, but only router A finds that the destination MAC address in the frame is the same as its own, so it unpacks the MAC frame and delivers the remaining IP datagram to the IP layer
  • After the IP layer of router A gets the IP datagram, it extracts the destination IP address from the IP header and, after querying the routing table, determines that the data needs to be forwarded to router B. Router A then delivers the data downward and re-encapsulates it with a MAC frame header and trailer; this time the source MAC address and destination MAC address in the frame are the MAC address of router A and the MAC address of router B.
  • Although there may be many hosts directly connected to router A, only router B finds that the destination MAC address in the MAC frame is the same as its own, so it unpacks the MAC frame and delivers the remaining IP datagram to the IP layer
  • After the IP layer of router B gets the IP datagram, it also extracts the destination IP address from the IP header and, after querying the routing table, determines that the data needs to be forwarded to router C. Router B then delivers the data downward and re-encapsulates it with a MAC frame header and trailer; the source MAC address and destination MAC address change again, becoming the MAC address of router B and the MAC address of router C
  • Repeat the above process until the final data is forwarded to host B

Therefore, when data is transmitted across the network, its source IP address and destination IP address generally do not change, but its source MAC address and destination MAC address change at every hop. The fundamental reason is that the previous-hop host and next-hop host of the data are constantly changing
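The hop-by-hop rewriting can be illustrated with a toy model. The function name and the dictionary layout are hypothetical; the sketch ignores NAT and simply lists the addresses that would appear on each link of a path.

```python
def frames_along_path(src_ip, dst_ip, hops):
    """List the addresses seen on each link of a path.

    hops is a list of (name, mac) pairs from the source host to the
    destination host, in forwarding order.
    """
    frames = []
    for (_, sender_mac), (_, receiver_mac) in zip(hops, hops[1:]):
        frames.append({
            "src_mac": sender_mac,    # rewritten at every hop
            "dst_mac": receiver_mac,  # rewritten at every hop
            "src_ip": src_ip,         # unchanged end to end
            "dst_ip": dst_ip,         # unchanged end to end
        })
    return frames
```

For a path host A, router A, router B, host B, this yields three frames whose IP addresses are identical while each frame's source MAC equals the previous frame's destination MAC.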

3. ARP protocol

The Address Resolution Protocol (ARP) is a TCP/IP protocol that obtains a MAC address based on an IP address.

3.1 Understanding the ARP protocol

Why does a protocol like ARP exist?

Continuing the previous example, after the data from host A has been forwarded hop by hop to router D, router D needs to forward the data to host B to complete the routing.

  • Since router D and host B belong to the same LAN, router D can deliver data to host B directly. However, to send data to a host in the same LAN, the sender must first know the MAC address of the other party (it is the destination address in the MAC frame header)
  • But router D only knows the IP address of host B at this time, so router D must obtain the MAC address of host B in some way

That is, if you want to send a message to the other party in the same LAN, you must know the MAC address of the other party, but in most cases you only know the IP address of the other party, so you need to use the ARP protocol to obtain the MAC address of the target host according to the IP address.

Positioning of the ARP protocol

In the TCP/IP four-layer model, the network protocol stack is divided into application layer, transport layer, network layer and data link layer from top to bottom

Among them, the most typical protocols of the application layer are HTTP, HTTPS and DNS, etc., the most typical protocols of the transport layer are TCP and UDP, the most typical protocol of the network layer is IP, and the most typical protocol of the data link layer is the MAC frame protocol. There are two other protocols at the data link layer called ARP and RARP

Although ARP, RARP, and the MAC frame protocol all belong to the data link layer, ARP and RARP are upper-layer protocols of the MAC frame

  • The upper-layer protocol of the MAC frame is not necessarily the protocol of the network layer directly, and the upper-layer protocol of the MAC frame may also belong to the protocol of the data link layer
  • Similarly, although the ICMP and IGMP protocols belong to the network layer together with the IP protocol, they are upper-layer protocols of IP

3.2 ARP data format

  • The hardware type refers to the network type of the link layer, 1 is Ethernet
  • The protocol type refers to the address type to be converted, 0x0800 is the IP address
  • The hardware address length is 6 bytes for Ethernet, because the MAC address is 48 bits
  • The protocol address length is 4 bytes for IP, because the IP address is 32 bits
  • An op field of 1 indicates an ARP request, and an op field of 2 indicates an ARP response

The ARP data format also shows that ARP is an upper-layer protocol of the MAC frame protocol: in the format diagram, the first 3 fields and the last field correspond to the Ethernet frame that carries the ARP packet rather than to ARP itself. Because the ARP packet is only 28 bytes long, which is less than the 46-byte minimum, an 18-byte padding field must be added when it is encapsulated into a MAC frame
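The ARP fields listed above add up to 28 bytes, which can be checked by packing them with `struct`. The function names, dictionary keys and the IP addresses used in any example call are hypothetical; the field order follows the list above (hardware type, protocol type, the two lengths, op, then sender and destination addresses).

```python
import socket
import struct

ARP_FORMAT = "!HHBBH6s4s6s4s"   # 2+2+1+1+2+6+4+6+4 = 28 bytes


def build_arp(op, sender_mac, sender_ip, target_mac, target_ip):
    """Pack an ARP packet for Ethernet and IP."""
    return struct.pack(
        ARP_FORMAT,
        1,        # hardware type: Ethernet
        0x0800,   # protocol type: IP
        6,        # hardware address length in bytes
        4,        # protocol address length in bytes
        op,       # 1 = request, 2 = response
        sender_mac, socket.inet_aton(sender_ip),
        target_mac, socket.inet_aton(target_ip),
    )


def parse_arp(packet):
    """Unpack the first 28 bytes of an ARP packet into a dictionary."""
    htype, ptype, hlen, plen, op, smac, sip, tmac, tip = struct.unpack(
        ARP_FORMAT, packet[:28]
    )
    return {
        "hardware_type": htype, "protocol_type": ptype,
        "hardware_len": hlen, "protocol_len": plen, "op": op,
        "sender_mac": smac, "sender_ip": socket.inet_ntoa(sip),
        "target_mac": tmac, "target_ip": socket.inet_ntoa(tip),
    }
```

Slicing to 28 bytes in `parse_arp` means any padding added by the MAC frame layer is simply ignored.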

3.3 Workflow of ARP protocol

Continuing the previous example, router D wants to forward data to host B in the same LAN. To do so, router D must know the MAC address of host B, but at this point it only knows the IP address of host B. Router D therefore needs to initiate an ARP request for host B and wait for host B's ARP response to learn its MAC address

The process of ARP request

First, router D needs to build an ARP request

  • First, because router D constructs an ARP request, the op field in the ARP request is set to 1
  • The hardware type field in the ARP request is set to 1, indicating that Ethernet communication is currently used
  • The protocol type in the ARP request is set to 0800, because the router obtains the MAC address of host B based on the IP address of host B
  • The length of the hardware address and the length of the protocol address in the ARP request are set to 6 and 4 respectively, because the length of the MAC address is 48 bits and the length of the IP address is 32 bits
  • The sender's Ethernet address and sender's IP address in the ARP request correspond to the MAC address and IP address of router D.
  • The destination Ethernet address and destination IP address in the ARP request correspond to the MAC address and IP address of host B. Since router D does not yet know the MAC address of host B, the destination Ethernet address field cannot carry a real address and is filled with a placeholder (commonly all 0s; the receiver ignores this field in a request). The all-1s broadcast address belongs in the MAC frame header rather than in this field

After the ARP request is constructed, the ARP packet is delivered to the MAC frame protocol and encapsulated into a MAC frame

  • When encapsulating the MAC frame header, the Ethernet destination address and the Ethernet source address should correspond to the MAC addresses of host B and router D respectively. Since router D does not know the MAC address of host B, the Ethernet destination address in the MAC frame header can only be set to all 1s, which means broadcasting in the LAN
  • Because an ARP request packet is encapsulated here, the frame type field in the MAC frame is set to 0806
  • Since the ARP request packet is only 28 bytes long, which is less than 46 bytes, an 18-byte padding field must be added to the payload of the MAC frame. Finally, the CRC check code is computed and appended to the MAC frame.
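The encapsulation steps above can be sketched as follows. The ARP packet is treated as an opaque 28-byte value; `zlib.crc32` stands in for the CRC check code, and the trailer byte order and the function name are assumptions made for the illustration.

```python
import struct
import zlib

BROADCAST_MAC = b"\xff" * 6   # all 1s: broadcast in the LAN
ETHERTYPE_ARP = 0x0806
MIN_PAYLOAD = 46              # minimum data length of a MAC frame


def encapsulate_arp(arp_packet, src_mac, dst_mac=BROADCAST_MAC):
    """Wrap an ARP packet in a MAC frame: header, padded payload, CRC."""
    padding = b"\x00" * max(0, MIN_PAYLOAD - len(arp_packet))
    header = dst_mac + src_mac + struct.pack("!H", ETHERTYPE_ARP)
    body = header + arp_packet + padding
    crc = struct.pack("<I", zlib.crc32(body))   # byte order is illustrative
    return body + crc
```

A 28-byte ARP packet therefore produces a 64-byte frame: 14 bytes of header, 28 bytes of ARP data, 18 bytes of padding and 4 bytes of CRC. Passing the requester's MAC address as `dst_mac` gives the unicast frame used for a response.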

After the MAC frame is encapsulated, router D can broadcast the encapsulated MAC frame to the LAN

  • Because the MAC frame is sent in broadcast mode, each host in the LAN will unpack the MAC frame after receiving the MAC frame. When these hosts recognize that the frame type field in the MAC frame is 0806, they know that this is an ARP request or response packet, so they will deliver the payload of the MAC frame to the ARP layer
  • When the ARP layer receives this packet, it finds that the op field is 1, determines that this is an ARP request, and then extracts the destination IP address field. Although all hosts in the LAN hand the packet to their own ARP layer, only host B finds that the destination IP address is the same as its own, so only host B responds to the ARP request; the other hosts discard the ARP request packet once they see that the destination IP address does not match
  • After receiving the ARP request message, other irrelevant hosts in the LAN do not discard it at the MAC frame layer, but discard it after finding that the destination IP of the ARP packet does not match its own IP at the ARP layer

Summary:

  • The initiator builds an ARP request and broadcasts it to each host
  • Every host accepts the broadcast frame and delivers the payload to its own ARP layer according to the frame type field of the MAC frame
  • Other irrelevant hosts immediately discard the ARP request in the ARP protocol stack according to the destination IP, and only the target host processes the request

The process of ARP reply

When host B replies, it first needs to construct an ARP reply

  • Because host B builds an ARP reply, the op field in the ARP reply is set to 2
  • The values of the hardware type, protocol type, hardware address length, and protocol address length in the ARP response are the same as those set in the ARP request
  • The sender's Ethernet address and sender's IP address in the ARP response correspond to the machine's MAC address and IP address
  • The destination Ethernet address and destination IP address in the ARP response correspond to the MAC address and IP address of router D. The ARP request sent by router D notifies its MAC address and IP address, so host B knows it

After the construction of the ARP response is completed, in order to send the ARP response to the Ethernet, the ARP packet must also be delivered to the MAC frame protocol and encapsulated into a MAC frame.

  • When encapsulating the MAC frame header, the Ethernet destination address and the Ethernet source address correspond to the MAC addresses of router D and host B respectively
  • Because an ARP response packet is encapsulated here, the frame type field in the MAC frame is set to 0806
  • Since the ARP response packet is only 28 bytes long, which is less than 46 bytes, an 18-byte padding field must be added to the payload of the MAC frame. Finally, the CRC check code is computed and appended to the MAC frame

After the MAC frame is encapsulated, host B can send the encapsulated MAC frame to the LAN

  • At this time, every host in the LAN can receive the MAC frame at the bottom layer, but the irrelevant hosts discard it after finding that its Ethernet destination address differs from their own and do not deliver it to the upper ARP layer. Only router D delivers the payload of the unpacked MAC frame to its own ARP layer
  • When the ARP layer of router D receives this packet, it finds that the op field is 2, so it determines that this is an ARP response. It then extracts the sender's Ethernet address and sender's IP address from the ARP packet; at this point router D has obtained the MAC address of host B.

After receiving the ARP reply message, other irrelevant hosts in the LAN discard it directly at the MAC frame layer, and do not deliver it to their own ARP layer.

ARP cache table

In fact, it is not necessary to initiate an ARP request every time the MAC address of the other party is needed. After an ARP exchange completes, the mapping between the host's IP address and MAC address is recorded. Each host maintains an ARP cache table, which can be viewed with the arp -a command

The entries in the cache table have an expiration time, which is generally 20 minutes. If an entry is not used again within 20 minutes, it becomes invalid, and the next time it is needed an ARP request must be initiated again to obtain the MAC address of the destination host.
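A cache with this kind of expiry can be sketched as below. The class and method names are hypothetical, the injectable clock exists only to make the behavior easy to test, and real operating systems use more elaborate entry states than this.

```python
import time


class ArpCache:
    """IP-to-MAC table whose entries expire when unused for ttl_seconds."""

    def __init__(self, ttl_seconds=20 * 60, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # ip -> (mac, time of last use)

    def learn(self, ip, mac):
        """Record a mapping obtained from an ARP exchange."""
        self._entries[ip] = (mac, self._clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if a new ARP request is needed."""
        entry = self._entries.get(ip)
        if entry is None:
            return None
        mac, last_used = entry
        if self._clock() - last_used > self._ttl:
            del self._entries[ip]                    # entry has expired
            return None
        self._entries[ip] = (mac, self._clock())     # using an entry refreshes it
        return mac
```

A `None` result from `lookup` is the signal to broadcast an ARP request and call `learn` once the response arrives.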

When do you need to initiate an ARP request?

The example above only covers router D obtaining the MAC address of host B through ARP before sending data to it. In practice, an ARP request may be needed at every hop along the route to ask for the MAC address of the next-hop host, because at each hop usually only the IP address of the next hop is known, not its MAC address.

Note: ARP is a protocol standard for LAN communication, so a host cannot initiate an ARP request to another host across the network

The source and destination MAC addresses are already included in the header of the MAC frame, why are there these two fields in the ARP header?

  • Although both MAC frames and ARP are in the data link layer, they are the relationship between the upper and lower layers after all, so they will not care about each other's data
  • If the underlying network is not Ethernet but some other type of network, the hardware addresses carried in the ARP layer are still necessary

When performing LAN communication, why not send data directly by broadcasting?

During LAN communication, even if you only know the IP address of the other party but not its MAC address, you could still broadcast the data to the LAN. The hosts in the LAN could then compare the destination IP address with their own at the IP layer to determine whether the received data is meant for them

In theory this works, but it is not a good approach

  • For most hosts in the LAN, the received message should have been discarded long ago, but now the message is delivered to the IP layer, which is a waste of network and system resources. Therefore, at the underlying MAC frame layer, it should be determined whether the message is sent to the current host, rather than when the data is delivered to the IP layer.
  • In addition, if data is always sent by broadcast without thinking, the distinction between broadcast and unicast becomes blurred. Using broadcast when you clearly want to send data to a single host in the LAN is obviously unreasonable

RARP protocol

RARP (Reverse Address Resolution Protocol) is a TCP/IP protocol that obtains an IP address based on a MAC address

In some cases, only the MAC address of a host may be known. At this time, the RARP protocol can be used to know the IP address of the host.

Theoretically speaking, the RARP protocol must be simpler than the ARP protocol, because you already know the MAC address of a host, then you can already send a message directly to the host, just send a message directly to ask the other party's IP address

Origin blog.csdn.net/GG_Bruse/article/details/130711915