MAC table, ARP table, routing table summary

Full text summary
A key step in any computer network is that the nodes along a communication path forward the packets passing through them. The common forwarding devices are switches (layer 2 and layer 3) and routers (layer 3). Each maintains table structures that help it address and forward packets correctly. This article describes three crucial tables in detail: the forwarding (MAC) table, the ARP table, and the routing table, covering the role each plays in packet forwarding and how they work together. Along the way it continues the discussion of switches and routers from the previous article.

Network layered protocol
A computer network physically connects geographically separated computing nodes (by twisted pair, optical fiber, wireless signals, and so on). It is organized in layers, here five from top to bottom: application layer -> transport layer -> network layer -> data link layer -> physical layer, with a matching network protocol stack. Each layer runs multiple protocols. A lower layer provides services to the layer immediately above it and hides the communication details, and an upper layer calls the services of the layer immediately below it. In this way peer layers on different hosts communicate with each other, and the higher layers (application, transport, network) achieve logical communication. It is called "logical communication" because peer layers on different hosts appear to exchange data horizontally, while in fact there is no horizontal physical connection between them. For example:

The transport layer protocol implements logical communication between application processes. A process is a logical concept invented to simplify resource allocation and management; it is not a physical entity and cannot be physically connected to anything.

The network layer protocol implements logical communication between hosts. Two communicating hosts are usually at different positions in the network topology and, most of the time, are not directly connected physically, so a communication path between them has to be determined by a routing protocol. The network layer protocol is responsible for delivering packets to the destination host, but upward (that is, to the transport layer) it only offers a best-effort packet service.

The data link layer is used to realize point-to-point communication. Since different nodes are physically connected through real data links, it can no longer be regarded as logical communication.

Forwarding Table (Forwarding Table)
MAC address
A MAC address (Media Access Control address), usually also called an Ethernet address or physical address, identifies a network device on a link. Every network interface, for example a network adapter (Network Interface Card, NIC), has a globally unique MAC address. If a device has several network cards, each card has its own unique MAC address, which the manufacturer assigns before the hardware leaves the factory.

A MAC address has 48 bits, that is, 6 bytes. Every 4 bits form one hexadecimal digit, so the address is usually written as xx:xx:xx:xx:xx:xx, where each x is a hexadecimal digit. ff:ff:ff:ff:ff:ff is the broadcast address: a switch sends a frame with this destination address out of all its other ports, so it reaches every LAN segment connected to those ports. An address whose first byte has its lowest bit set, such as 01:xx:xx:xx:xx:xx, is a multicast address.
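To make the address format concrete, here is a minimal Python sketch that parses the xx:xx:xx:xx:xx:xx notation and classifies a destination address. The function names parse_mac and classify_mac are made up for this illustration. The multicast test relies on the lowest bit of the first byte (the group bit), of which 01:xx:xx:xx:xx:xx is the most familiar case.

```python
def parse_mac(text: str) -> bytes:
    """Convert 'xx:xx:xx:xx:xx:xx' into its 6 raw bytes."""
    parts = text.split(":")
    if len(parts) != 6:
        raise ValueError(f"expected 6 octets, got {len(parts)}")
    # int(..., 16) rejects non-hex digits, bytes(...) rejects values over 255.
    return bytes(int(part, 16) for part in parts)


def classify_mac(text: str) -> str:
    """Return 'broadcast', 'multicast' or 'unicast' for a destination MAC."""
    mac = parse_mac(text)
    if mac == b"\xff" * 6:
        return "broadcast"
    # Lowest bit of the first byte set: the address names a group of stations.
    if mac[0] & 0x01:
        return "multicast"
    return "unicast"
```

Calling classify_mac("ff:ff:ff:ff:ff:ff") yields "broadcast", while an ordinary network card address such as 00:1a:2b:3c:4d:5e (an invented value) is reported as "unicast".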

Workflow
The forwarding table, also known as the MAC table, cannot be discussed without the switch, because the switch forwards data frames based on it.

A switch is essentially a computer with computing (CPU), storage (SRAM or TCAM), and network resources (forwarding chips and links), and it may even run a dedicated operating system. It maintains a table that maps the MAC addresses of hosts on the LAN to the switch ports through which they are reached, and it delivers each data frame to the right host according to this table.

The switch has a "store and forward" function:

After the switch receives the data frame, it first records the source MAC address and the corresponding arrival port in the data frame into the MAC table. This process is usually called "self-learning" and does not require any manual intervention;

Next, the switch checks its MAC table for an entry matching the destination MAC address of the frame. If there is one, it forwards the frame out of the port recorded in that entry. This forwarding method is called "unicast". If there is none, the frame is sent out of every port except the one it arrived on. This is called "broadcast" (for a frame with an unknown unicast destination, the usual term is flooding).
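The two steps above can be sketched in a few lines of Python. This is a toy model and not how any real switch is implemented: the class name LearningSwitch, the port numbers, and the MAC addresses are all invented for the example. It also assumes one detail the text does not spell out, namely that a frame whose destination sits on the arrival port is not forwarded at all.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class LearningSwitch:
    """Toy model of a self-learning layer 2 switch (illustrative only)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # source MAC address -> port it was last seen on

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out of."""
        # Self-learning: remember which port the sender is reachable through.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if dst_mac != BROADCAST and out_port is not None:
            # Known destination: unicast. A frame whose destination sits on
            # the arrival port needs no forwarding at all (assumption).
            return [] if out_port == in_port else [out_port]

        # Unknown or broadcast destination: every port except the arrival port.
        return [p for p in self.ports if p != in_port]


switch = LearningSwitch(ports=[0, 1, 2])
mac_a, mac_b = "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"  # made-up addresses
first = switch.receive(0, mac_a, mac_b)   # B unknown: sent to ports 1 and 2
reply = switch.receive(1, mac_b, mac_a)   # A already learned: port 0 only
second = switch.receive(0, mac_a, mac_b)  # B now learned: port 1 only
```

After the reply the table holds two entries, one per host, so all later traffic between the two hosts goes out of exactly one port.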

The following will explain in detail the process of data frame transmission by the switch in the form of an illustration. Let's take a look at the forwarding situation of a single switch:
[Figure: forwarding through a single switch, with host A on port 0 and host B on port 1]

The steps are as follows:

Host A sends switch 1 a data frame whose source MAC address is the physical address of host A's network card and whose destination MAC address is the physical address of host B's network card. On receiving the frame, the switch first records the source MAC address and the corresponding input port 0 in its MAC address table.
The switch then checks its MAC address table for the destination MAC address of the frame. If an entry exists, the frame is sent out of the port recorded there. If not, the frame is sent out of every port other than the receiving port, which here means only port 1.
At this point every host on the LAN (every host connected through the switch) receives the frame, but only host B accepts it and answers with a frame of its own, whose source MAC address is that of host B's network device. What triggers the answer depends on the upper-layer protocol: an ARP reply, an ICMP echo reply, or a TCP ACK are typical cases.
When the switch receives host B's answer, it likewise records the source MAC address of that frame, that is, host B's MAC address. From then on, frames between host A and host B are delivered by unicast according to the MAC address table. In effect, the switch has "learned" two forwarding table entries in one round trip.
So when multiple switches are interconnected in the LAN, how to record the MAC address table of the switches? The following figure shows this situation:
[Figure: two interconnected switches, with hosts A and B on switch 1 and hosts C and D on switch 2]

The steps are as follows:

Host A sends switch 1 a data frame whose source MAC address is the physical address of host A's network card and whose destination MAC address is the physical address of host C's network card.
On receiving the frame, switch 1 learns the source MAC address and checks its MAC address table. If there is no entry for the destination MAC address, it broadcasts the frame, and both host B and switch 2 receive it.
On receiving the frame, switch 2 also records the source MAC address and the corresponding port in its MAC address table and checks the table. If there is no entry for the destination MAC address, it broadcasts the frame, and both host C and host D receive it.
Host C accepts the frame and replies with a frame whose source MAC address is the physical address of its own network card. On the way back to host A, both switch 2 and switch 1 record host C's MAC address in their MAC address tables and forward the reply to host A by unicast.
From then on, host A and host C exchange data frames by unicast. Communication between A and D, B and C, and B and D follows the same process, so the MAC address table of switch 2 ends up recording that the MAC addresses of host A and host B both map to its port 3.
Summary
As the two figures show, the switch dynamically learns the mapping from source MAC addresses to physical ports. One switch port can correspond to multiple MAC addresses, but one MAC address can correspond to only one port.

Note: A dynamically learned MAC address is only valid for a limited time, 300 s by default. If no frame from that address refreshes the entry within 300 s, the entry is deleted automatically. A timer in the switch maintains this.
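The aging rule in the note can be illustrated with a small sketch. It is a simplification under stated assumptions: the class AgingMacTable is hypothetical, time is passed in explicitly as a number of seconds instead of being read from a clock, and stale entries are dropped lazily at lookup time rather than by a background timer as in a real switch.

```python
class AgingMacTable:
    """MAC table whose entries expire when not refreshed (illustrative only)."""

    def __init__(self, max_age=300.0):
        self.max_age = max_age  # seconds an entry stays valid without refresh
        self.entries = {}       # MAC address -> (port, time last seen)

    def learn(self, mac, port, now):
        """Record or refresh the entry for a source MAC address."""
        self.entries[mac] = (port, now)

    def lookup(self, mac, now):
        """Return the port for mac, or None if unknown or aged out."""
        entry = self.entries.get(mac)
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.max_age:
            del self.entries[mac]  # entry is stale: remove it
            return None
        return port


table = AgingMacTable()
table.learn("aa:aa:aa:aa:aa:aa", port=0, now=0.0)  # made-up address
still_valid = table.lookup("aa:aa:aa:aa:aa:aa", now=299.0)
expired = table.lookup("aa:aa:aa:aa:aa:aa", now=301.0)
```

Here still_valid is port 0, while the lookup at 301 s finds the entry expired and removes it, so the next frame for that address would be broadcast again.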

ARP table (Address Resolution Table)
ARP protocol
First of all, in an Ethernet environment, hosts on the same network segment need to know each other's MAC address before they can communicate.

The previous section introduced how a switch works: it addresses by MAC, looks up its table to pick the output port, and forwards the frame. That raises a question we have ignored so far: when the source host first builds a packet, how does it obtain the MAC address of the destination host's network device? This is the job of the ARP protocol. Every node or host in the network maintains an ARP table, which records the mapping between IP addresses (network addresses) and MAC addresses (physical addresses).

ARP, the Address Resolution Protocol, runs on every network node and maps a host's IP address to its MAC address. It is often classified as a network layer protocol, although its messages are carried directly in data link layer frames.

Workflow
Next, according to the following figure, explain in detail the working principle of the ARP protocol:
[Figure: hosts A and B on the same network segment, connected through a switch]

The steps are as follows:

Suppose host A wants to send a packet to host B on the same network segment (nodes connected by a switch are on the same segment). The application on A already knows B's IP address or its domain name (DNS maps host names to IP addresses, which is not the focus here). Host A first checks its own ARP cache for a mapping from host B's IP address to a MAC address. If one exists, host A uses that MAC address as the destination MAC address, which gives it everything needed to encapsulate the frame, and sends the frame. If not, host A sends an ARP request whose target IP address is host B's IP address, whose destination MAC address is the MAC-layer broadcast address (ff:ff:ff:ff:ff:ff), and whose source IP and MAC addresses are those of host A.
When the switch receives this frame, it sees that it is a broadcast frame and sends it out of every port except the receiving one.
Every node on the segment receives the ARP request. Nodes whose IP address does not match the target ignore it. Host B finds that the target IP address is its own, so it first records the mapping from host A's IP address to host A's MAC address in its own ARP cache and then sends an ARP reply whose source MAC address is that of host B's network device. The switch forwards the reply to host A.
On receiving the reply, host A records the mapping from host B's IP address to its MAC address in its own ARP cache. Host A can now finish encapsulating the frames destined for host B. The switch has meanwhile learned which ports the MAC addresses of host A and host B are on, so the frames host A sends are forwarded through the switch to host B.
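The exchange above can be modelled roughly as follows. This is only a sketch: the Host class, its method names, and all addresses are invented for the illustration, the switch is reduced to "every other host on the segment sees the request", and no real ARP packets are built or sent.

```python
class Host:
    """A host with an ARP cache on one network segment (illustrative only)."""

    def __init__(self, name, ip, mac):
        self.name, self.ip, self.mac = name, ip, mac
        self.arp_cache = {}  # IP address -> MAC address

    def handle_arp_request(self, sender_ip, sender_mac, target_ip):
        """Return our MAC address if the request is for us, otherwise None."""
        if target_ip != self.ip:
            return None  # target IP does not match: ignore the request
        # Remember the requester first, then answer with our own MAC address.
        self.arp_cache[sender_ip] = sender_mac
        return self.mac

    def resolve(self, target_ip, segment):
        """Return the MAC for target_ip, asking the segment on a cache miss."""
        if target_ip in self.arp_cache:
            return self.arp_cache[target_ip]
        # Cache miss: the request is broadcast, so every other host sees it.
        for other in segment:
            if other is self:
                continue
            reply = other.handle_arp_request(self.ip, self.mac, target_ip)
            if reply is not None:
                self.arp_cache[target_ip] = reply
                return reply
        return None  # nobody on the segment owns this IP address


host_a = Host("A", "192.168.1.10", "aa:aa:aa:aa:aa:aa")  # made-up addresses
host_b = Host("B", "192.168.1.20", "bb:bb:bb:bb:bb:bb")
host_c = Host("C", "192.168.1.30", "cc:cc:cc:cc:cc:cc")
segment = [host_a, host_b, host_c]
mac_of_b = host_a.resolve("192.168.1.20", segment)
```

After the call, host A's cache maps B's IP address to B's MAC address, host B's cache already holds the reverse mapping for A, and host C, which ignored the request, has learned nothing.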

One thing worth noting here is the change of IP address and MAC address during transmission:

Within a single broadcast domain the MAC addresses of a frame stay the same. When the packet crosses a broadcast domain boundary (that is, passes through a router), they change because the packet is re-encapsulated. The source MAC address becomes the MAC address of the router's output port, and the destination MAC address depends on the topology: if the router is directly connected to the network segment of the destination host, the destination MAC address is that of the destination host; otherwise it is that of the next-hop router. The source and destination IP addresses, by contrast, do not change during transmission (assuming no address translation along the path).
Summary
By running the ARP protocol, each node in the network maintains an ARP cache that maps IP addresses to MAC addresses. Before sending data, the node looks up the destination IP address in its local ARP table. If there is no entry, it broadcasts an ARP request and waits for the corresponding host to reply. Once the mapping between the destination IP address and the MAC address contained in the reply has been added to the ARP cache, the data link layer can encapsulate the frame with that MAC address as the destination and send it.

Routing Table
IP address
IP address (Internet Protocol Address), namely Internet Protocol address, also known as network layer address or host address, is an address assigned to each network device on the network.

Two versions of the IP protocol are in common use: IPv4 (Internet Protocol Version 4) and IPv6 (Internet Protocol Version 6). An IPv4 address is 32 bits, that is, 4 bytes. For ease of use, each 8-bit byte is converted from binary to decimal and the address is written as xxx.xxx.xxx.xxx. This notation is called dotted decimal. Addresses are traditionally divided into five classes: A, B, C, D, and E. The all-ones 32-bit address 255.255.255.255 is called the "limited broadcast destination address". A packet sent to it is delivered to all hosts on the local network, and routers do not let it through, which confines the broadcast to that network. This is why routers are said to isolate broadcast domains (while switches isolate collision domains).
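As a small illustration of dotted decimal notation, the following sketch converts between a 32-bit integer and its xxx.xxx.xxx.xxx form. The function names are made up for the example, and the code is meant to show the byte arithmetic rather than replace a library.

```python
def to_dotted_decimal(value: int) -> str:
    """Write a 32-bit integer as four decimal bytes separated by dots."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("an IPv4 address must fit in 32 bits")
    # Extract the bytes from the most significant to the least significant.
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def from_dotted_decimal(text: str) -> int:
    """Parse 'a.b.c.d' back into a 32-bit integer."""
    octets = [int(part) for part in text.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a dotted decimal IPv4 address: {text!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet  # shift earlier bytes left, append next
    return value
```

For instance, to_dotted_decimal(0xFFFFFFFF) gives "255.255.255.255", the limited broadcast address in which all 32 bits are 1.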

As networks and node counts kept growing, concern that the 32-bit IPv4 address space would soon be exhausted led to IPv6. An IPv6 address is 128 bits, or 16 bytes. Every 4 bits are written as one hexadecimal digit, and the 16 bytes are split into 8 groups of 2 bytes (4 hexadecimal digits) each, separated by colons, for example ffff:ffff:ffff:ffff:ffff:ffff:ffff:ffff. Unlike IPv4, IPv6 does not define a broadcast address; multicast addresses such as ff02::1 (all nodes on the link) take over that role.
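The grouping can be checked with Python's standard ipaddress module. The helper describe_ipv6 and the sample addresses are chosen only for this illustration: 2001:db8::1 comes from the range reserved for documentation, and ff02::1 is the all-nodes multicast group.

```python
import ipaddress


def describe_ipv6(text):
    """Summarize the structure of an IPv6 address (illustrative helper)."""
    addr = ipaddress.IPv6Address(text)
    return {
        "groups": addr.exploded.split(":"),  # 8 groups of 4 hex digits
        "compressed": addr.compressed,       # shortest standard spelling
        "byte_length": len(addr.packed),     # 16 bytes = 128 bits
        "is_multicast": addr.is_multicast,
    }


info = describe_ipv6("2001:db8::1")
all_nodes = describe_ipv6("ff02::1")
```

The exploded form of the first address has 8 groups, starting with "2001" and "0db8", and packs into 16 bytes; the second address is reported as multicast.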

IP VS MAC
Both a MAC address and an IP address are, in essence, properties of a network device's port, and both can be used to address the device. If they serve similar purposes and merely work at different layers, one might expect one to have replaced the other. So why do they still coexist?

One analogy that has been used to compare the two: if a person already has a mobile phone number (IP address), why do they also need an ID number (MAC address)? The ID number uniquely identifies the person, so anyone who has it can find them. Why, then, do friends use the phone number rather than the ID number? Because it is convenient. But if the person commits a crime and the police call that phone number, only a fool would answer... and what if the number has been changed? In that case the police have to issue a nationwide wanted notice under the ID number and search the person's records (purchases, social security, medical) in systems across the country to finally locate and arrest them.

The example captures why both kinds of address exist. An IP address is a logical address. Under the network protocols, a host that joins the Internet from a different location is assigned a completely different IP address (for example dynamically, through DHCP). Because it belongs to the network layer, a relatively high level of abstraction, it is designed to simplify communication and be easy to use, especially for user processes. A MAC address is a physical address that works at the data link layer. The manufacturer fixes it and burns it into the network device's EPROM before the device leaves the factory, so it is a fixed, globally unique address that does not change with location or time. It is less convenient to use and exposes details of the lower-level data link, but precisely because it does not change it can always be used for addressing in data communication.

A more rigorous and complete explanation is as follows:

Question: Since every Ethernet device has a unique MAC address when it leaves the factory, why assign each host an IP address as well? Or, put the other way round: if each host is assigned a unique IP address, why embed a unique MAC address in network devices (network cards, hubs, routers, and so on) at production time?
Answer: The main reasons are as follows:
(1) IP addresses are allocated according to the network topology, not according to who manufactured the network device. An efficient routing scheme cannot be built on the device manufacturer instead of the device's position in the topology.
(2) With an additional layer of addressing, devices are easier to move and repair. For example, if an Ethernet card fails it can be replaced without obtaining a new IP address, and if an IP host moves from one network to another it can be given a new IP address without replacing its network card.
(3) Whether on a local area network or a wide area network, communication between computers ultimately comes down to moving a packet over some form of link from the initial node, from one node to the next, until it reaches the destination node. Between these nodes, the packet is moved by having ARP map IP addresses to MAC addresses.

Let's take another example to see how the IP address and MAC address are combined to transmit data packets:

Suppose a packet (call it PAC) is to be sent from a host in Beijing (named A, with IP address IP_A and MAC address MAC_A) to a host in New York (named B, with IP address IP_B and MAC address MAC_B). The two hosts are unlikely to be directly connected, so the packet must pass through many intermediate nodes (routers, gateway servers, and so on). Assume it passes through C1, C2, and C3, whose input and output ports have the MAC addresses M1_In/M1_Out, M2_In/M2_Out, and M3_In/M3_Out. Before sending PAC, A sends an ARP request to find M1_In, the MAC address of the arrival port of C1, the first intermediate node on the way to IP_B, and then encapsulates the packet with the addresses IP_A, IP_B, MAC_A, and M1_In. When PAC reaches C1, C1 uses the destination IP address IP_B to determine the second intermediate node C2, finds the MAC address M2_In of its arrival port through ARP, re-encapsulates the packet with source MAC address M1_Out and destination MAC address M2_In, and sends it to C2. This repeats until ARP finally yields MAC_B, the MAC address of host B with IP address IP_B, and the packet is delivered to host B. Throughout the transmission, the source IP address IP_A and destination IP address IP_B stay unchanged, while the source and destination MAC addresses change at every intermediate node as the frame is re-encapsulated, until the destination MAC address is MAC_B and the packet reaches destination host B.
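The walk-through above can be replayed with a short sketch that lists the frame headers seen on each link. The Frame class and the function frames_along_path are hypothetical names, the symbolic addresses (IP_A, M1_In, and so on) are taken from the example, and the ARP lookups at each hop are assumed to have already succeeded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str


def frames_along_path(src_ip, dst_ip, src_mac, dst_mac, routers):
    """Return the frame seen on each link of the path.

    routers is a list of (input port MAC, output port MAC) pairs, one per
    intermediate node, in the order the packet passes through them.
    """
    frames = []
    current_src = src_mac
    for in_mac, out_mac in routers:
        # On the link towards this node the destination MAC is its input port.
        frames.append(Frame(current_src, in_mac, src_ip, dst_ip))
        # The node re-encapsulates: its output port becomes the new source.
        current_src = out_mac
    # Last link: the final node delivers directly to the destination host.
    frames.append(Frame(current_src, dst_mac, src_ip, dst_ip))
    return frames


path = frames_along_path(
    "IP_A", "IP_B", "MAC_A", "MAC_B",
    [("M1_In", "M1_Out"), ("M2_In", "M2_Out"), ("M3_In", "M3_Out")],
)
```

The list path holds four frames, one per link. Their MAC pairs are MAC_A to M1_In, M1_Out to M2_In, M2_Out to M3_In, and M3_Out to MAC_B, while every frame carries the same IP_A and IP_B.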

In summary, the similarity between IP address and MAC address is that they can be used as device address identifiers. The differences are mainly reflected in the following aspects:

For a given device on the network, such as a computer or a router, the IP address is variable (but must be unique), while the MAC address is fixed. We can assign any IP address to a host as needed. For example, a computer on the LAN can be given the IP address 192.168.0.112 and later changed to 192.168.0.200. Once a network device (a network card, a router) has been produced, its MAC address is unique and is not meant to be changed by the user.
The lengths are different. An IPv4 address is 32 bits (4 bytes), and a MAC address is 48 bits (6 bytes).
The allocation basis is different. IP addresses are allocated according to the network topology, MAC addresses according to the manufacturer.
The protocol layers are different. The IP address is used at layer 3 of the OSI model, the network layer, and the MAC address at layer 2, the data link layer. The data link layer protocol moves data from one node to another on the same link (addressed by MAC address), while the network layer protocol moves data from one network to another (ARP finds the MAC address of the intermediate node for the destination IP address, the packet is forwarded through the intermediate nodes, and it finally reaches the destination network).
Workflow
The router is responsible for communication between different network segments (subnets). Each network connected to a router port is called a subnet or network segment and forms one broadcast domain. The router also keeps a table, called the routing table. By running routing protocols, network nodes record and update path information to the different network segments in it. The entries in the routing table are divided into direct routes and non-direct routes:

Direct route: a network segment directly connected to a router port. The router generates this entry automatically.
Non-direct route: a network segment not directly connected to a router port. This entry has to be added manually or generated by dynamic routing.
On a Linux machine with two network interfaces, em1 (114.212.84.179) and virbr0 (192.168.122.1), running the command route -n displays the kernel IP routing table in numeric form:
[Figure: output of route -n on this machine]

The routing table reads as follows:

First entry: the destination network is 114.212.80.0/21 and the gateway address is 0.0.0.0 (the numerical form of "..."), which means the segment is directly connected to one of the machine's ports. Packets for it are sent out of the em1 interface.
Second entry: the destination network is 192.168.122.0/24 and the gateway address is also 0.0.0.0, so this segment is directly connected as well. Packets for it are sent out of the virbr0 interface.
Third entry: the destination network 0.0.0.0 (the numeric form of "default") matches any network segment. Because lookups prefer the most specific match (longest prefix), the third entry is used only when the destination IP address matches neither of the first two. Its gateway is called the default gateway: the next-hop address to forward to when the routing table holds no entry for the destination network. Packets are sent out of the em1 port.
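A lookup over these three entries can be sketched with Python's ipaddress module. The sketch picks the matching entry with the longest prefix, so the default route wins only when nothing more specific matches. Two things are assumptions: the function name lookup, and the default gateway address 114.212.80.1, which is a placeholder because the real value is not given in the text.

```python
import ipaddress

# (destination network, gateway, interface) for the three entries above.
ROUTES = [
    ("114.212.80.0/21", "0.0.0.0", "em1"),
    ("192.168.122.0/24", "0.0.0.0", "virbr0"),
    ("0.0.0.0/0", "114.212.80.1", "em1"),  # gateway is a made-up placeholder
]


def lookup(dst, routes=ROUTES):
    """Return (gateway, interface) of the most specific matching route."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, gateway, iface in routes:
        net = ipaddress.ip_network(prefix)
        # Keep the match with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, iface)
    if best is None:
        return None  # no matching entry and no default route
    return best[1], best[2]
```

With this table, lookup("192.168.122.50") selects the virbr0 entry, while lookup("8.8.8.8") matches only the default route and is sent to the gateway through em1.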

Some of the entries recorded in the router need to be added manually, which is called static routing; some are obtained dynamically, called dynamic routing. Each entry in the table has the following attributes:

Destination network address (Destination): the result of ANDing an address with the network mask, defining the range of destinations reachable through this entry. The destination usually takes one of the following forms:
(1) Host address: the network address of one specific host;
(2) Subnet address: the network address of one specific subnet;
(3) Default route: every address not otherwise listed in the routing table is matched by 0.0.0.0, which is used to configure the default gateway.

Network mask (Genmask): also known as the subnet mask (Subnet Mask), a 32-bit value that divides a 32-bit IPv4 address into a network address (Network Address) and a host address (Host Address). A subnet mask is meaningless on its own and must be used together with an IP address. It is the basis for deciding whether two hosts are on the same network segment: AND each host's IP address bitwise with the subnet mask configured on the machine. If the results are equal, the two hosts are on the same segment and can communicate directly without going through a router.

Gateway (Gateway, also known as the next hop): for a given destination network, the gateway is the next-hop IP address to which IP packets are sent. For a network segment directly connected to the router, the gateway is usually the IP address of the corresponding router port, and the interface must agree with it. For a remote network or the default route, the gateway is usually a server or router on a network connected to the router. If the target is the network the host itself belongs to, no routing is needed and the gateway is displayed as "*".

Interface (Iface): for a given destination network, the interface is the network interface (a physical port of the router) through which the router forwards packets. The gateway must be in the same subnet as the interface (except for the default gateway). Otherwise using this entry requires consulting other entries, which may cause a routing deadlock.

Hop count (Metric): the metric expresses the cost of a route. It usually represents the total number of hops needed to reach the destination, where one hop means passing through one router. (The TTL field in the IP header, by comparison, is the maximum number of hops a datagram may still travel before it is discarded.) The fewer the hops, the lower the cost; the more hops, the higher the cost. When several routes reach the same destination network, the routing algorithm chooses the one with fewer hops.

Flags: the flags of a routing table entry have the following meanings:
(1) U: the route is up (usable);
(2) H: the target is a host;
(3) G: the route goes through a gateway;
(4) R: the route is reinstated for dynamic routing;
(5) D: the route was installed dynamically by a routing daemon or a redirect;
(6) M: the route was modified by a routing daemon or a redirect;
(7) !: reject route.

References (Refs): Not used in the Linux kernel, generally 0;

Search times (Use): The number of times this routing item was searched by the routing software.
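The bitwise AND described under Network mask (Genmask) above can be written out directly. This is a minimal sketch: same_network is a made-up helper name, and the addresses in the usage note are example values from the private 192.168.x.x range.

```python
import ipaddress


def same_network(ip_a: str, ip_b: str, mask: str) -> bool:
    """Decide whether two IPv4 hosts share a network segment under mask."""
    mask_bits = int(ipaddress.IPv4Address(mask))
    # AND each address with the mask; equal results mean the same network.
    network_a = int(ipaddress.IPv4Address(ip_a)) & mask_bits
    network_b = int(ipaddress.IPv4Address(ip_b)) & mask_bits
    return network_a == network_b
```

Under the mask 255.255.255.0, the hosts 192.168.1.10 and 192.168.1.20 share a segment and can talk directly, whereas 192.168.1.10 and 192.168.2.2 do not, so traffic between them has to go through a router.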

The router works at the network layer, where logical addresses, that is, IP addresses, are visible. In other words, when analyzing a packet the router unpacks the data frame only as far as the IP packet. It does not touch the payload of the datagram and works only on the IP header. When a packet arrives on a router port, the router reads the destination IP address and looks it up in the routing table. If an entry for the destination is found, the packet is forwarded out of the corresponding port. If none is found and the router has a default route (default gateway), every packet for an unresolved destination network is sent to the default gateway for further forwarding. If no default route is configured either, the packet is discarded and a destination-unreachable message is returned to the source host. This is the process of packet routing.

Use the following figure to introduce the working principle of the router in detail:
[Figure: host A (192.168.1.0/24) reaching host B (192.168.2.0/24) through router 1 and router 2]

The steps are as follows:

Host A encapsulates the message from the upper layer into an IP datagram at the network layer. The source IP address in the IP header is its own, and the destination IP address is that of host B. Host A ANDs the destination address with its configured 24-bit subnet mask and finds that the destination is not on its own network segment (host A is in 192.168.1.0/24 and host B in 192.168.2.0/24, that is, in different subnets), so the packet for host B has to be forwarded by the gateway, router 1.
Host A obtains the MAC address of port E0 of router 1 through an ARP request. Its data link layer puts that MAC address into the Ethernet frame header as the destination MAC address and its own MAC address as the source, then sends the frame to router 1.
Router 1 receives the frame on port E0, strips the data link layer header, and checks its routing table for an entry matching the network segment of the destination IP address (192.168.2.2/24, in the 192.168.2.0/24 segment). According to the routing table, the next hop (gateway) for packets to hosts in 192.168.2.0/24 is 10.1.1.2/8 (in fact the IP address of port E1 of router 2), and this next hop lies in the segment (10.0.0.0/8) directly connected to router 1's own port E1. The packet is therefore re-encapsulated on port E1 of router 1: the source MAC address of the Ethernet frame is that of router 1's port E1, and the destination MAC address is that of router 2's port E1, obtained through an ARP broadcast. Once encapsulated, the frame is sent to router 2.
Router 2 receives the frame on port E1, strips the data link layer header, reads the destination IP address, and matches it against its routing table. It finds that the destination host's network segment is directly connected to its own port E0, so router 2 obtains host B's MAC address through an ARP broadcast. The packet is re-encapsulated on port E0 of router 2 with the MAC address of that port as the source and host B's MAC address as the destination, and the frame is sent to host B.
After these four steps, host B finally receives the packet from host A.
In summary, the seemingly "simple" communication between hosts on different networks is really not that easy.

Summary
The routing table records the paths from one network to another. The router relies on routing protocols and the routing table they produce to forward data at layer 3, the network layer. The most important information in a routing table entry is the mapping between the destination network segment and the gateway, that is, the next-hop IP address. The gateway is usually a dedicated gateway server or router, and it is responsible for forwarding the packet onward to the destination network segment.

Origin blog.csdn.net/grimmp/article/details/107215053