Macroscopic understanding of Linux [network foundation]

1. Computer network background

To learn computer networks, we first need a macroscopic view. A network involves two parties, the sender of the data and the receiver of the data, and the problem it solves is communication between processes running on different computers.
Networking has developed through four stages:
The first is independent mode: computers are independent of each other and share no data.
The second is network interconnection: multiple computers are connected together so that they can share data.
The third is the local area network (LAN): there are more computers, and they are connected together through switches and routers.

The fourth is the wide area network (WAN), which connects computers thousands of miles apart. "Local area network" and "wide area network" are only relative concepts. For example, China's wide area network can also be regarded as a relatively large local area network.

2. Computer network protocol

(1) The concept of network protocol

Protocol concept:
A protocol is a convention: a set of rules and standards established for data exchange in a computer network.

If the two communicating parties use different data formats, transmission methods, or character sets, they can hardly communicate, because neither side knows the other's standards. By following the same protocol, the two parties agree on the format of the data, so that communication can be completed.
Network protocol and network protocol suite (cluster):
A network protocol settles the format of the data that the two parties send over the network. In essence, it is an agreement on the format of the data sent and received.

A protocol always belongs to a particular layer. More precisely, when entities at the same layer communicate, the set of rules and conventions governing that communication is the protocol of that layer, such as a physical layer protocol, a transport layer protocol, or an application layer protocol. With a network protocol, the two communicating parties agree on the data format to use and then carry out network communication.

There are many computer manufacturers, many operating systems, and many kinds of network hardware. How can computers produced by these different manufacturers communicate with each other smoothly? Someone needs to define a common standard for everyone to follow. That standard is the network protocol. A collection of many cooperating network protocols is called a network protocol suite (also translated as protocol cluster).
The composition of a network protocol:
A network protocol consists of three parts: semantics, syntax, and timing.

  1. Semantics explains the meaning of each part of the control information. It specifies what control information needs to be sent, what action is performed, and what response is made. (what to do)
  2. Syntax is the structure and format of user data and control information, and the order in which the data appears. (how to do it)
  3. Timing is a detailed specification of the order in which events occur. (order of doing)
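
The three parts can be illustrated with a toy request/response protocol. This is only a sketch: the message layout, the opcode values, and the names `OP_ECHO_REQUEST`, `OP_ECHO_REPLY`, `pack_message`, `unpack_message`, and `handle` are invented for illustration and do not belong to any real protocol.

```python
import struct

# Syntax: every message is a 1-byte opcode, a 2-byte big-endian payload
# length, then the payload bytes. (Hypothetical layout for this example.)
HEADER = struct.Struct("!BH")

# Semantics: what each opcode means and what response it calls for.
OP_ECHO_REQUEST = 1   # "please send this payload back to me"
OP_ECHO_REPLY = 2     # "here is the payload you sent"


def pack_message(opcode: int, payload: bytes) -> bytes:
    """Build a message that follows the agreed syntax."""
    return HEADER.pack(opcode, len(payload)) + payload


def unpack_message(message: bytes) -> tuple[int, bytes]:
    """Split a message into (opcode, payload) using the agreed syntax."""
    opcode, length = HEADER.unpack_from(message)
    return opcode, message[HEADER.size:HEADER.size + length]


def handle(message: bytes) -> bytes:
    """Timing: a reply is produced only after a request has arrived."""
    opcode, payload = unpack_message(message)
    if opcode != OP_ECHO_REQUEST:
        raise ValueError("a reply must be preceded by a request")
    return pack_message(OP_ECHO_REPLY, payload)


request = pack_message(OP_ECHO_REQUEST, b"hello")
reply = handle(request)
print(unpack_message(reply))
```

Both sides can talk to each other only because they share the same header layout, the same opcode meanings, and the same expected order of messages.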

(2) Protocol layering

To understand protocol layering, we must first understand what a network architecture is. It specifies how the network is layered, the functions that each layer must complete, and the protocols that each layer has. It also specifies the relationships between the layers.
1. The OSI seven-layer model

OSI, the Open Systems Interconnection reference model, is a seven-layer network model. It is a logical definition and specification.
The network is logically divided into 7 layers. Several of the layers have corresponding physical devices, such as routers and switches.
The OSI seven-layer model is a framework design method, and its main function is to help different types of hosts exchange data.
Its biggest advantage is that it clearly distinguishes the three concepts of service, interface, and protocol. The concepts are clear and the theory is relatively complete. Through the seven-layer structure, different systems and different networks can achieve reliable communication. Simply put, the functions of the seven layers are:
Application layer: the interface between network services and end users
Presentation layer: data representation, security, and compression
Session layer: establishing, managing, and terminating sessions
Transport layer: responsible for data transmission between processes
Network layer: responsible for address management and routing
Data link layer: responsible for data transmission between adjacent devices
Physical layer: responsible for the transmission of physical optical/electrical signals

In practice, the application layer, the presentation layer, and the session layer are rarely implemented separately, so most systems now use the TCP/IP five-layer (or four-layer) model, which merges these three layers.
2. TCP/IP five-layer (four-layer) model
TCP/IP is the collective name for a group of protocols, which together form the TCP/IP protocol suite. The TCP/IP model described here has a five-layer hierarchical structure, and each layer calls the services provided by the layer below it to fulfill its own needs.
Physical layer: Responsible for how optical/electrical signals are transmitted. For example, the network cable (twisted pair) commonly used in Ethernet, the coaxial cable used in early Ethernet (now mainly used for cable TV), optical fiber, and the electromagnetic waves used by today's Wi-Fi wireless networks all belong to the physical layer. The capability of the physical layer determines the maximum transmission rate, the transmission distance, resistance to interference, and so on. The hub works at the physical layer.

Data link layer: Responsible for the transmission and identification of data frames between devices. Examples include the network card driver, frame synchronization (that is, which signal detected on the line counts as the beginning of a new frame), collision detection (with automatic retransmission if a collision is detected), and data error checking. Standards at this layer include Ethernet, token ring, and wireless LAN. The switch works at the data link layer.

Network layer: Responsible for address management and routing. For example, in the IP protocol, a host is identified by an IP address, and the path (route) that data takes between two hosts is planned through routing tables. The router works at the network layer.

Transport layer: Responsible for data transmission between processes on two hosts, for example with TCP or UDP.

Application layer: Responsible for communication between applications, for example the Simple Mail Transfer Protocol (SMTP), the File Transfer Protocol (FTP), and the remote login protocol (Telnet). Our network programming is mainly aimed at the application layer.
To put it simply, the TCP/IP five-layer model merges the OSI application layer, presentation layer, and session layer, which have similar functions, into a single application layer.

  • Application layer: responsible for data communication between applications - (commonly used protocols: HTTP, FTP, DNS, DHCP)

  • Transport layer: responsible for data transmission between processes - (commonly used protocols: TCP, UDP)

  • Network layer: responsible for communication between end hosts across networks, including address management and routing - (commonly used protocols and devices: IP, ICMP, IGMP, routers)

  • Data link layer: responsible for data transmission between adjacent devices - (commonly used protocols and devices: Ethernet protocol, switches)

  • Physical layer: responsible for the transmission of physical optical/electrical signals - (commonly used standards and devices: Ethernet physical layer standards, hubs)

3. The basic process of network transmission
Outgoing data first passes down through the sender's own network protocol stack, is converted into optical/electrical signals, and is transmitted over the network to the peer machine. After the peer machine receives the data, the data passes up through the layers of that machine's protocol stack and is finally delivered to the application program at the application layer.

(3) Data encapsulation and distribution

1. An intuitive view of encapsulation, distribution, and protocols:
When data is transmitted between different devices over the network, it must reach the destination reliably and accurately, and transmission resources (transmission equipment and transmission lines) must be used efficiently. To achieve this, the data is first split and packaged: the destination address, the local address, and some bytes for error checking are attached to it, and it is encrypted when security requirements are high. These operations are called data encapsulation. The rules that the two communicating parties negotiate and follow when processing data packets are protocols. By analogy with mailing a parcel, the data itself is the item being mailed, encapsulation is like filling in the various mailing details, and the protocol is the regulation on how to fill them in.
2. A closer look:
Different protocol layers have different names for data packets. They are called segments at the transport layer, datagrams at the network layer, frames at the link layer, and bit streams at the physical layer. When application layer data is sent to the network through the protocol stack, each layer's protocol adds its own header, which is called encapsulation.
The header contains information such as how long the header is, how long the payload is, and what the upper layer protocol is. After the data is encapsulated into a frame, it is sent onto the transmission medium. After it reaches the destination host, each layer's protocol strips off its own header and, according to the "upper layer protocol field" in that header, hands the payload to the corresponding upper layer protocol for processing. This is called distribution (demultiplexing). In short, data packet = header + payload, and almost every protocol has to solve both encapsulation and distribution.

The specific encapsulation steps are:

  • 1. User information is converted into data for transmission on the network
  • 2. Data is converted into data segments and a reliable connection is established between the sender and receiver hosts
  • 3. The data segment is converted into a data packet or datagram, and a logical address is placed in the header, so that each data packet can be transmitted through the Internet
  • 4. Packets or datagrams are converted into frames for transmission in the local network. On the local network segment, use the hardware address to uniquely identify each host.
  • 5. Frames are converted into bit streams, and digital encoding and clock schemes are used

Decapsulation removes the headers and trailers of the packet in the reverse order of encapsulation and restores the original data. It is simply the reverse operation of encapsulation.
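
Encapsulation and its reverse can be sketched in a few lines. This is a minimal illustration, assuming a toy header shared by every layer (a 1-byte upper layer protocol id plus a 2-byte payload length). The protocol ids `APP`, `TRANSPORT`, `NETWORK` and the functions `encapsulate` and `decapsulate` are hypothetical names. Real TCP, IP, and Ethernet headers carry far more fields.

```python
import struct

# Toy header used by every layer in this sketch:
# 1-byte "upper layer protocol" id and a 2-byte big-endian payload length.
HEADER = struct.Struct("!BH")

# Hypothetical protocol ids, used only in this example.
APP, TRANSPORT, NETWORK = 1, 2, 3


def encapsulate(upper_protocol: int, payload: bytes) -> bytes:
    """Prepend a header that records who handed the payload down."""
    return HEADER.pack(upper_protocol, len(payload)) + payload


def decapsulate(packet: bytes) -> tuple[int, bytes]:
    """Strip one header and return (upper layer protocol id, payload)."""
    upper_protocol, length = HEADER.unpack_from(packet)
    return upper_protocol, packet[HEADER.size:HEADER.size + length]


# Sender: each layer wraps what the layer above handed down.
data = b"hello"
segment = encapsulate(APP, data)             # transport layer adds its header
datagram = encapsulate(TRANSPORT, segment)   # network layer adds its header
frame = encapsulate(NETWORK, datagram)       # link layer adds its header

# Receiver: each layer strips its own header and uses the protocol field
# to decide which upper layer receives the payload (distribution).
proto, payload = decapsulate(frame)
assert proto == NETWORK
proto, payload = decapsulate(payload)
assert proto == TRANSPORT
proto, payload = decapsulate(payload)
assert proto == APP
print(payload)  # b'hello'
```

Each header is removed in the opposite order from which it was added, so the receiver ends up with exactly the bytes the sending application produced.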
3. Summary:
Summary 1: The peer layers of the sender and the receiver must use the same protocol in order to establish a connection and communicate normally.

  • The same encoding and decoding rules must be adopted between application layers to ensure the correctness of user information transmission.
  • The same association between port numbers and protocols must be used between the transport layers to ensure communication between the upper layer application processes.
  • The same logical addressing process must be used between network layers to ensure that data will not be transmitted to the wrong destination.
  • If the protocol is different between the data link layers, the receiver cannot "understand" the content of the data.
  • The hardware interface specifications must be the same between the physical layers, otherwise no signal can be received.

Summary 2: In a real network environment, many hardware devices sit between the sender and the receiver and act as relays. Assume a communication structure in which two switches and a router are placed between the two computers. The data from the sending host then reaches the receiving host through these intermediate devices. The sending host first performs data encapsulation.

  • The electrical signal sent from the physical network card of the sending host reaches the switch through the network cable. The switch converts the electrical signal into binary data and passes it to its data link layer. According to the destination MAC address in the frame header, the switch forwards the frame toward the router, converting the binary data back into electrical signals before sending.
  • After the router receives the frame, it removes the data link layer MAC header and passes the packet to the network layer. The network layer uses the destination IP address in the packet header to choose the next hop, which may be another router or the destination network. Before forwarding, the router encapsulates the packet in a new MAC header and converts the data back into signals.
  • In short, the router receives the electrical signal, converts it into binary data, passes it up to the network layer, re-encapsulates it according to the IP address and the new MAC addresses, and sends it out as an electrical signal. The switch on the receiving side then receives the signal and, according to the destination MAC address, delivers the frame to the network card of the receiving host.
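
The router's part of this process can be sketched as follows. This is a simplified model under stated assumptions: `Packet`, `Frame`, and `route` are hypothetical names, all addresses are made up, and a plain dictionary stands in for the routing table and the IP-to-MAC lookup. Details such as TTL handling and checksums are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Network layer unit: logical (IP) addresses plus data."""
    src_ip: str
    dst_ip: str
    data: bytes


@dataclass(frozen=True)
class Frame:
    """Link layer unit: hardware (MAC) addresses wrapped around a packet."""
    src_mac: str
    dst_mac: str
    packet: Packet


def route(frame: Frame, router_mac: str, next_hop_mac: dict) -> Frame:
    """Remove the old MAC header, pick the next hop by IP, add a new MAC header."""
    packet = frame.packet                    # strip the link layer header
    dst_mac = next_hop_mac[packet.dst_ip]    # simplified routing lookup
    return Frame(src_mac=router_mac, dst_mac=dst_mac, packet=packet)


# All addresses below are made up for the example.
HOST_A_MAC = "02:00:00:00:00:01"
ROUTER_MAC = "02:00:00:00:00:02"
HOST_B_MAC = "02:00:00:00:00:03"

packet = Packet("192.168.0.1", "10.0.0.2", b"hello")
sent = Frame(HOST_A_MAC, ROUTER_MAC, packet)
forwarded = route(sent, ROUTER_MAC, {"10.0.0.2": HOST_B_MAC})

print(forwarded.src_mac, "->", forwarded.dst_mac)
print(forwarded.packet == sent.packet)   # True: IP addresses and data are untouched
```

The MAC addresses change at every hop because they only identify adjacent devices, while the IP addresses stay the same from the sending host to the receiving host.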

(4) Address management

1. IP address:
The IP address is the unique identifier of a host in the network. Whenever one host communicates with another, the IP address is needed to locate it. During communication, each piece of data carries a source address and a destination address, which specify the two parties to the communication.
For the commonly used IPv4, the IP address is a uint32_t value, that is, an unsigned 32-bit integer.
We usually write IP addresses as dotted decimal strings, such as 192.168.0.1. Each of the four bytes ranges from 0 to 255.
An IP address uniquely identifies a host in the network. A public IP address can be held by only one machine at a time, while one machine can have multiple IP addresses.
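
The relationship between the dotted decimal string and the 32-bit integer can be checked with Python's standard library. This is a small sketch in which the variable names are arbitrary:

```python
import ipaddress
import socket
import struct

text = "192.168.0.1"

# inet_aton returns the 4 bytes of the address in network (big-endian) order.
packed = socket.inet_aton(text)
# "!I" reads those 4 bytes as one unsigned 32-bit integer.
(as_int,) = struct.unpack("!I", packed)

print(packed.hex())   # c0a80001
print(as_int)         # 3232235521

# The standard library ipaddress module performs the same conversion.
assert int(ipaddress.IPv4Address(text)) == as_int
# And back from the integer to the dotted decimal string.
assert socket.inet_ntoa(struct.pack("!I", as_int)) == text
```

Each decimal group in the string is one byte of the integer: 192 is 0xc0, 168 is 0xa8, and so on.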
2. Port number:
The port number is the unique identifier of a process on a host. When writing a program, you need to tell the computer which port the data should be sent to.
A port can be occupied by only one process, but a process can use multiple ports at the same time. During communication, each piece of data also carries a source port and a destination port, which specify which process the data comes from and which process should handle it.
The port number is a uint16_t value, that is, an unsigned 16-bit integer.
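
These rules can be observed with UDP sockets on the loopback interface. The sketch below assumes default socket options on a typical system (no SO_REUSEADDR or SO_REUSEPORT). Port 0 asks the operating system to pick any free port, so the actual numbers differ from run to run.

```python
import socket

# One process opens two UDP sockets, each bound to its own port.
first = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
second = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
first.bind(("127.0.0.1", 0))
second.bind(("127.0.0.1", 0))
second.settimeout(2)  # avoid waiting forever if something goes wrong

port_a = first.getsockname()[1]
port_b = second.getsockname()[1]

# A third socket that tries to claim an already bound port is refused.
third = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
try:
    third.bind(("127.0.0.1", port_a))
    conflict = False
except OSError:
    conflict = True
finally:
    third.close()

# Each datagram carries the source port, so the receiver knows who sent it.
first.sendto(b"ping", ("127.0.0.1", port_b))
data, (peer_ip, peer_port) = second.recvfrom(1024)

first.close()
second.close()

print(port_a, port_b, conflict, peer_port == port_a)
```

The receiver never has to be told the sender's port separately: it arrives with the data, which is how a reply can be addressed back to the right process.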
3. MAC address:
The MAC address identifies a connected node at the data link layer and can usually be regarded as a physical (hardware) address. It is assigned when the network card leaves the factory and normally does not change, although many operating systems allow it to be overridden in software. MAC addresses are usually unique.
A MAC address is 48 bits long and is generally written as six two-digit hexadecimal numbers separated by colons, such as 08:00:27:03:fb:19.
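
As a quick check of the 48-bit size, the textual form can be converted to raw bytes and back. This is a minimal sketch with arbitrary variable names:

```python
mac_text = "08:00:27:03:fb:19"

# Each colon-separated group is one byte written as two hexadecimal digits.
mac_bytes = bytes(int(part, 16) for part in mac_text.split(":"))

print(len(mac_bytes) * 8)   # 48
print(mac_bytes.hex(":"))   # 08:00:27:03:fb:19
```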
On the receiving side, all network data passes through the network protocol stack, and the port number determines which application the data belongs to.
4. Network byte order and host byte order
In network communication, you also need to pay attention to byte order.
Byte order (endianness) is the order in which the CPU stores the bytes of a multi-byte value in memory. There are two kinds: big-endian and little-endian.
Big-endian storage mode: the low-order bytes of the data are stored at the high addresses of memory, and the high-order bytes are stored at the low addresses.
Little-endian storage mode: the low-order bytes of the data are stored at the low addresses of memory, and the high-order bytes are stored at the high addresses.
Network byte order is big-endian, while host byte order varies from host to host. Home computers are generally little-endian, but communication over the network cannot assume a single host byte order, because any machine on the network may be the peer. If the two communicating hosts use different byte orders, the same bytes will be interpreted as different values. To avoid this, data is converted from host byte order to the common network byte order before it is sent, and converted back after it is received.
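
The difference between the two byte orders, and the conversion to network byte order, can be seen with the standard `struct` and `socket` modules. This is a small sketch in which the value 0x12345678 and the port 8080 are arbitrary examples:

```python
import socket
import struct
import sys

value = 0x12345678

big = struct.pack(">I", value)      # big-endian: high-order byte first
little = struct.pack("<I", value)   # little-endian: low-order byte first
network = struct.pack("!I", value)  # "!" means network order (big-endian)

print(big.hex())      # 12345678
print(little.hex())   # 78563412
print(sys.byteorder)  # byte order of the machine running this script

# htons/htonl convert host order to network order, ntohs/ntohl convert back.
port = 8080
assert socket.ntohs(socket.htons(port)) == port
assert network == big
```

On a big-endian host the conversion functions leave the value unchanged, and on a little-endian host they swap the bytes, so the same code produces the same bytes on the wire everywhere.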


Origin blog.csdn.net/m0_59292239/article/details/132009926