The tortuous journey of an MQTT message from an IoT device to the cloud

To understand the whole path along which an IoT device reads data from its sensors and sends it over the network to the cloud, let's first look at the network layering model:

9cc3b7894eb58d07d69e74326921384c.png

The above figure exemplifies the most common protocols in network layering:

  • Application layer : The application packages the data according to the rules of a protocol and hands it to the transport layer

    • MQTT: Message Queuing Telemetry Transport

    • CoAP: Constrained Application Protocol

    • HTTP: Hypertext Transfer Protocol

    • FTP: File Transfer Protocol

    • SMTP: Simple Mail Transfer Protocol

  • Transport layer : Splits the data handed down from the application layer into segments. So that the receiving end can restore the order and integrity of the data, each segment is marked before it is handed to the network layer

    • TCP: Transmission Control Protocol

    • UDP: User Datagram Protocol

  • Network layer : Responsible for sending data packets from the transport layer to the target terminal

    • IP: Internet Protocol

    • ICMP: Internet Control Message Protocol

    • IGMP: Internet Group Management Protocol

  • Link layer : Sends and receives data units for the network layer

    • ARP: Address Resolution Protocol

    • RARP: Reverse Address Resolution Protocol

  Encapsulation and Demultiplexing  

As data passes down through each layer, the corresponding protocol wraps it, which is called encapsulation. When it reaches the receiving end, it is unwrapped layer by layer, which is called demultiplexing.

When sending, the application encapsulates the business data collected by the device into an MQTT message. Each layer then treats the message handed down from the layer above as its own data block, adds its own header (which contains a protocol identifier), and passes the result down as a packet of that layer.

When receiving, the data flows from bottom to top. Each layer removes its header, uses the protocol identifier in that header to choose the correct upper-layer protocol, and the data is finally processed by the application at the application layer.

98dc51e0ee924849be6e5cb1b5ded141.png

The application on the device encapsulates the business data collected by the IoT device into an MQTT message. The MQTT message is transmitted in order, as a data stream, over an established long-lived TCP connection. TCP divides the stream it receives into small data blocks; each block together with the TCP header added to it forms a TCP segment. The segment is handed to the network layer, which follows the IP protocol: on receiving the send request, it puts the segment into an IP datagram, fills in the header, and sends the datagram through the link layer.

After the cloud system receives the data from the link layer, the network layer parses the datagram and passes its contents to the transport layer, which verifies the order and integrity of the segments, extracts the data blocks to recover the MQTT message, and passes it to the application layer for processing. Headers are stripped layer by layer along the way, until the business data collected by the IoT device is restored.
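The wrap-and-unwrap idea can be sketched as a toy model in Python. The "headers" here are made-up text tags chosen only for illustration, not real TCP, IP, or Ethernet header layouts, and the function names are hypothetical.

```python
# Toy model of layered encapsulation. The "headers" are simplified
# placeholders (hypothetical), not real TCP/IP/Ethernet header layouts.
LAYERS = [b"TCP|", b"IP|", b"ETH|"]  # order in which headers are added when sending


def encapsulate(mqtt_message: bytes) -> bytes:
    """Sending side: each layer prepends its own header to the upper layer's data."""
    data = mqtt_message
    for header in LAYERS:
        data = header + data
    return data


def decapsulate(frame: bytes) -> bytes:
    """Receiving side: strip headers from the outermost layer inward."""
    data = frame
    for header in reversed(LAYERS):
        if not data.startswith(header):
            raise ValueError(f"expected header {header!r}")
        data = data[len(header):]
    return data


frame = encapsulate(b"temperature=21.5")
print(frame)               # b'ETH|IP|TCP|temperature=21.5'
print(decapsulate(frame))  # b'temperature=21.5'
```

The outermost header belongs to the lowest layer, which is why the receiver strips headers in the reverse of the order in which the sender added them.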

  Application layer-MQTT protocol  

MQTT is a client-server message transport protocol based on the publish/subscribe pattern. Its design goals are to be lightweight, open, simple, standardized, and easy to implement. These characteristics make it a good choice in many scenarios, especially constrained environments such as machine-to-machine (M2M) communication and the Internet of Things (IoT).

The packet format of the MQTT protocol is very simple, consisting of three parts: Fixed header, Variable header, and Payload.

86ae2669f51d478b3c45aa729455656f.png

Fixed header : contains control message type, flag bits and remaining length.

9fcfc72cc86bb8723482da88e9f9a6ac.png

The upper 4 bits (bit7~bit4) of the first byte of an MQTT message indicate the control message type, which allows 16 possible type values in total:

598fe8f3a34387c7a2bffdf888722155.png

The lower 4 bits (bit3~bit0) of the first byte of an MQTT message are the flag bits of the packet. Only the PUBLISH control message uses them as real flags (DUP, QoS, and RETAIN); for the other types they are reserved.

af7f38023ebe6987a0492ac0d5330d2f.png
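As a minimal sketch, the first byte can be split into its two halves with a shift and a mask. The type table below is only a subset of the MQTT 3.1.1 control packet types, and the function name is hypothetical.

```python
# Control packet type values from MQTT 3.1.1 (a subset, for illustration).
PACKET_TYPES = {1: "CONNECT", 2: "CONNACK", 3: "PUBLISH", 8: "SUBSCRIBE",
                12: "PINGREQ", 14: "DISCONNECT"}


def parse_first_byte(byte: int) -> dict:
    """Split the first fixed-header byte into packet type and flag bits."""
    packet_type = byte >> 4          # upper 4 bits (bit7~bit4)
    flags = byte & 0x0F              # lower 4 bits (bit3~bit0)
    info = {"type": PACKET_TYPES.get(packet_type, f"type {packet_type}"),
            "flags": flags}
    if packet_type == 3:             # the flags carry meaning only for PUBLISH
        info["dup"] = bool(flags & 0x08)
        info["qos"] = (flags >> 1) & 0x03
        info["retain"] = bool(flags & 0x01)
    return info


print(parse_first_byte(0x33))
# {'type': 'PUBLISH', 'flags': 3, 'dup': False, 'qos': 1, 'retain': True}
```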

Remaining length : Starting at the second byte of the MQTT message, this field gives the number of bytes that follow it in the packet (variable header plus payload). It occupies at least one byte and at most four bytes. The lower 7 bits of each byte carry the value (range 0~127), and the highest bit indicates whether another length byte follows.

11bcde1d7dd3037920d22c2b25ac6b7e.jpeg
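The variable-length scheme can be sketched as follows: each byte carries 7 bits of the value, least significant group first, and the top bit marks that another byte follows. The function names are hypothetical; the upper limit of 268,435,455 follows from four bytes of 7 bits each.

```python
def encode_remaining_length(length: int) -> bytes:
    """Encode a length as 1 to 4 bytes, 7 value bits per byte."""
    if not 0 <= length <= 268_435_455:
        raise ValueError("length does not fit in four bytes")
    out = bytearray()
    while True:
        digit = length % 128
        length //= 128
        if length > 0:
            digit |= 0x80            # continuation bit: more bytes follow
        out.append(digit)
        if length == 0:
            return bytes(out)


def decode_remaining_length(data: bytes) -> tuple:
    """Return (value, number of length bytes consumed)."""
    value = 0
    multiplier = 1
    for i, byte in enumerate(data[:4]):
        value += (byte & 0x7F) * multiplier
        if (byte & 0x80) == 0:       # no continuation bit: this was the last byte
            return value, i + 1
        multiplier *= 128
    raise ValueError("malformed remaining length")


print(encode_remaining_length(321).hex())   # c102
print(decode_remaining_length(b"\xc1\x02"))  # (321, 2)
```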

Variable header : exists in some types of MQTT packets, and the specific content is determined by the corresponding type of packets.

Payload : It exists in some MQTT data packets and stores the specific business data of the message.
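Putting the three parts together, here is a minimal sketch that assembles a QoS 0 PUBLISH packet by hand. It assumes MQTT 3.1.1 framing, where the variable header of a QoS 0 PUBLISH is just the length-prefixed topic name, and it only handles packets whose remaining length fits in one byte. The function name and the topic are made up for illustration.

```python
import struct


def build_publish(topic: str, payload: bytes) -> bytes:
    """Build a QoS 0 PUBLISH packet (no DUP, no RETAIN, no packet identifier)."""
    topic_bytes = topic.encode("utf-8")
    # Variable header: 2-byte big-endian topic length followed by the topic name.
    variable_header = struct.pack("!H", len(topic_bytes)) + topic_bytes
    remaining = len(variable_header) + len(payload)
    if remaining > 127:
        raise ValueError("sketch only handles a one-byte remaining length")
    fixed_header = bytes([0x30, remaining])  # type 3 (PUBLISH), flags 0000
    return fixed_header + variable_header + payload


packet = build_publish("a/b", b"hi")
print(packet.hex(" "))  # 30 07 00 03 61 2f 62 68 69
```

In practice a client library builds these bytes for you; the sketch only makes the layout visible.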

  Transport layer - TCP protocol  

An MQTT connection is established on top of a TCP connection, and TCP provides a reliable data connection. When an MQTT message is transmitted, its data is sent in order, as a stream, over an open TCP connection. TCP divides the data it receives into small blocks, and each block becomes a TCP segment.

Because the data is sent in small blocks, complete and reliable transmission comes down to four questions: whether every segment arrives, whether the segments are in order, whether a segment is damaged, and whether segment data is duplicated. TCP handles these through checksums, sequence numbers, acknowledgments, retransmission control, connection management, and the window mechanism.

TCP stands for Transmission Control Protocol. Transmission control relies mainly on the six flags in the header, which control the transmission state of the message and the actions the sender and receiver take on the data. When a flag is 1, the function corresponding to that flag is enabled. For example, when URG is 1, the urgent pointer field of the header is valid.

  • URG: the urgent pointer is valid

  • ACK: the acknowledgment number is valid

  • PSH: the receiver should deliver this segment to the application layer as soon as possible

  • RST: reset the connection

  • SYN: synchronize sequence numbers; used to initiate a connection

  • FIN: the sender has finished sending data

97c1a3a8cd536fbb6bab7699bd12ca11.png
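As a small sketch, the six flags can be read out of the flags byte of the TCP header with bit masks. The bit values are the standard positions of these flags; the function name is hypothetical.

```python
# Bit values of the six classic TCP flags in the flags byte.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def decode_flags(flag_byte: int) -> list:
    """Return the names of the flags that are set to 1."""
    return [name for name, bit in TCP_FLAGS.items() if flag_byte & bit]


print(decode_flags(0x12))  # ['SYN', 'ACK']
print(decode_flags(0x18))  # ['PSH', 'ACK']
```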

Source port and destination port : Identify the port numbers of the sender and receiver. A TCP connection is identified by four values: source IP, source port, destination IP, and destination port. The source IP and destination IP are carried in the IP header.

Header length : Indicates the length of the TCP header, expressed in 32-bit words. Because the header can contain options, this field also tells the receiver where the data begins.

Sequence number : The sequence number of the first data byte in this segment. Every byte of data sent on the connection has a sequence number. Numbering starts from an initial sequence number chosen when the connection is established and increases by 1 per byte; after reaching 2^32 - 1 it wraps around to 0.

Acknowledgment number : Valid when the ACK flag in the header is 1. After the receiving end receives a TCP segment, it sends an acknowledgment number back to the sending end; the value is the sequence number of the last byte received in order plus 1, that is, the next byte it expects.

Checksum : Calculated by the sender and verified by the receiver. If the receiver finds that the checksum is incorrect, the TCP segment may be damaged and is discarded without being acknowledged. The sender retransmits the segment when no acknowledgment arrives in time, or earlier if it receives repeated acknowledgments carrying the same acknowledgment number, which is the receiver's way of stating which sequence number it still expects.

Urgent pointer : When the header flag URG is 1, the urgent pointer is valid, indicating that the sending end wants to send urgent data to the receiving end. The urgent pointer is a positive offset that is added to the sequence number of the segment to locate the end of the urgent data. For example, if the sequence number of the segment is 1000 and the urgent pointer is 1000, the urgent data runs from sequence number 1000 to sequence number 2000. It is up to the receiver to decide how to handle this data.

Window size : Limits the throughput of a TCP data stream. Note that it is the amount of data that the sender of the segment allows the other party to send. For example, if the window size in the header is 1000, the sender of that segment can accept up to 1000 bytes from the other party. The value depends on the free space in that side's receive buffer and affects TCP performance.

Header flag PSH : To tell the receiver to hand all data to the receiving process immediately, the sender sets PSH to 1. "All data" means the data carried with the PSH flag plus everything received before it. A receiver that sees PSH set to 1 submits the data to the receiving process right away instead of waiting for more data to arrive.

Reset flag RST : When RST is 1, the connection is in an abnormal state. The receiver terminates the connection, and the application layer must establish a new connection if it wants to continue.

Synchronization flag SYN : Used to establish a connection through TCP's three-way handshake.

  1. To start a connection, the client sends the server a TCP segment with SYN set to 1 that carries the client's initial sequence number, indicating a connection request.

  2. If the server accepts the connection, it replies with a TCP segment in which both SYN and ACK are 1. It carries the server's own initial sequence number and an acknowledgment number equal to the client's initial sequence number + 1, indicating that the connection has been accepted.

  3. After receiving that segment, the client sends a confirmation segment to the server with ACK set to 1. Its acknowledgment number is the server's initial sequence number from step 2 + 1. Once the server receives this confirmation, it enters the connected state.

The confirmation segment in the third step may already carry data to be sent.
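The number arithmetic of the handshake can be sketched without any networking, assuming the usual rule that each side acknowledges the other side's initial sequence number plus 1. The function name and the sample initial sequence numbers are made up for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the (flags, seq, ack) triples exchanged during connection setup."""
    mod = 2 ** 32  # sequence numbers wrap around modulo 2^32
    syn = ("SYN", client_isn, None)
    syn_ack = ("SYN+ACK", server_isn, (client_isn + 1) % mod)
    ack = ("ACK", (client_isn + 1) % mod, (server_isn + 1) % mod)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
# ('SYN', 1000, None)
# ('SYN+ACK', 5000, 1001)
# ('ACK', 1001, 5001)
```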

Connection termination flag FIN : Used to close the connection. When one end has finished sending data, it sends a segment with the FIN flag set. Because TCP carries data in two directions (client to server and server to client), each direction needs its own FIN and acknowledgment to close, so there are four interactions in total, also known as the four-way wave.

  1. When the client application has finished sending data, the client's TCP sends a FIN, telling the server that it is closing its side of the data transmission.

  2. After the server receives the FIN, it sends back an ACK whose acknowledgment number is the received sequence number plus 1. At the same time, the server's TCP delivers an end-of-file indication to the server application.

  3. The server application then closes its side of the connection, which causes the server's TCP to send a FIN as well.

  4. After receiving it, the client sends back an ACK whose acknowledgment number is the received sequence number + 1, and the connection is completely closed.

The sequence number and acknowledgment number of a TCP segment ensure the order of the data, the checksum ensures its integrity, and the urgent pointer lets urgent data be processed promptly. TCP also has timeout retransmission, congestion avoidance, and slow start mechanisms, which together help ensure that the data reaches the target end completely and in order.

  Network layer - IP protocol  

If a TCP segment is a container packed with goods, then IP is the truck that delivers the container. The IP protocol carries data between two nodes and tries to deliver TCP data from the source to the destination as quickly as possible, but it does not guarantee reliable delivery.

The IP layer encapsulates the TCP segments handed down from the upper layer by adding its own header, then selects a route, fragments and reassembles the data where necessary, and finally delivers it to the destination. The IP header plays an important role in this process, so let's look at its structure.

a1b8708c08168f83ab6633861a03e063.png

Version : Indicates the version of the IP protocol. The version number here is 4; the other version in use is 6, that is, IPv4 and IPv6. If the sending and receiving ends use different versions, the IP datagram is discarded.

Header Length : The length of the entire header, expressed in 32-bit words, up to 60 bytes.

Type of Service (TOS) : Intended to distinguish service types, although in practice the IP layer has made little use of it. The field contains a 4-bit TOS subfield and 1 unused bit, which must be set to 0. At most one of the 4 TOS bits may be set to 1 to indicate the requested service type. The four service types corresponding to the 4 bits are: minimum delay, maximum throughput, highest reliability, and minimum cost.

Total length : The total length of the datagram in bytes. Combined with the header length, it gives the size and starting position of the data in the datagram.

The following three header fields relate to the fragmentation and reassembly of IP datagrams. The link layer generally limits the maximum length of each data frame, so when the IP layer sends a datagram it selects a route and also checks the maximum transmission length of a data frame on the outgoing link. A datagram that exceeds this length is fragmented and then reassembled after it reaches the destination, and the following three fields are the basis for reassembly. Note that the maximum frame length can differ at every router the datagram passes through, so fragmentation may occur at any hop along the route.

Group ID (identification) : Identifies the datagram. All fragments of one datagram carry the same value, which is how the destination knows they belong together. The sending IP layer typically increases the value by 1 for each datagram it sends.

Flag : Occupies three bits: R, D, and M. R is reserved, while D and M control fragmentation. If D is 1, the datagram must not be fragmented. If M is 1, the datagram has been fragmented and more fragments follow; if M is 0, this is the last fragment, or the only one.

Fragment offset : The position of this fragment's data relative to the beginning of the original datagram, counted in units of 8 bytes. After fragmentation, the total length field of each fragment holds the length of that fragment, not the length of the entire datagram.
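How the three fields work together can be sketched by computing the fragments for a given payload and MTU. This sketch assumes a 20-byte IP header without options and relies on the fact that the offset field counts in units of 8 bytes, so every fragment except the last carries a multiple of 8 data bytes. The function name is hypothetical.

```python
def fragment(payload_len: int, mtu: int, header_len: int = 20) -> list:
    """Return (offset_in_8_byte_units, data_length, more_fragments) per fragment."""
    # All fragments except the last must carry a multiple of 8 bytes of data.
    max_data = (mtu - header_len) // 8 * 8
    if max_data <= 0:
        raise ValueError("MTU too small")
    fragments = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        more = offset + size < payload_len   # value of the M flag
        fragments.append((offset // 8, size, more))
        offset += size
    return fragments


print(fragment(payload_len=4000, mtu=1500))
# [(0, 1480, True), (185, 1480, True), (370, 1040, False)]
```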

Time to Live (TTL) : Determines whether a datagram is dropped. IP forwards data hop by hop, so a datagram may be forwarded between several IP layers that have routing enabled. The time to live is the maximum number of routers the datagram may pass through: each router decreases the value by 1, and when the value reaches 0 the datagram is discarded and an error message is sent back to the source (via ICMP, a component of the IP layer used to convey error messages). Time to live prevents datagrams from being forwarded forever in a routing loop.

Header checksum : Checks the integrity of the header. The sender computes a checksum over the header and stores the result in this field, and the receiver computes it again. If the results agree, the header arrived intact; otherwise the datagram is discarded.
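The header checksum is the 16-bit one's-complement sum used throughout the Internet protocols. Below is a minimal sketch with a sample 20-byte header; the header bytes are an illustrative example, and the function name is hypothetical.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of the data, then complemented."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:                       # fold the carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# A sample 20-byte IPv4 header with the checksum field (bytes 10-11) set to zero.
header = bytearray.fromhex("4500003c1c4640004006" "0000" "ac100a63ac100a0c")
checksum = internet_checksum(bytes(header))
header[10:12] = struct.pack("!H", checksum)
print(hex(checksum))                      # 0xb1e6
print(internet_checksum(bytes(header)))   # 0, which means the header verifies
```

The receiver runs the same sum over the header including the stored checksum; a result of 0 means the header is consistent.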

Upper-layer protocol : determines which upper-layer protocol the receiving end will hand over the data to for processing, such as TCP or UDP.

Source IP : The IP of the sender is recorded, which is used when sending back error messages.

Destination IP : Indicates the destination IP, which is used to make decisions for each route selection.

Routing

Because the IP header contains only the destination IP address and not the complete path, the IP layer makes a routing decision each time it sends a datagram, based on a lookup of the destination IP in the local routing table. The datagram is carried to the destination hop by hop, and each hop involves one routing decision.

The IP layer can be configured as a router or as a host. When routing is enabled, it forwards datagrams. When it is configured as a host, a datagram whose destination IP is not a local IP is discarded.

When the destination IP is not a local address, how does an IP layer with routing enabled decide where to forward the datagram? To answer this, you need to understand the structure of the routing table. The following is the routing table maintained by the IP layer:

01086963fbb3113afca5a65a240c137e.png

  • Destination (destination IP): Indicates the network address or host address that the IP datagram will eventually arrive at or pass through.

  • Gateway (next hop address): the address of a neighboring router, adjacent to the device that maintains this routing table

  • Flags: Indicates the attributes of the current route record, specifically represented by five different flags:

    • U: The route can be used

    • G: If this flag is present, the next hop is a gateway; if not, the destination is in the same network segment as the current device and the datagram can be sent directly

    • H: Indicates whether the destination is a host or a network; if this flag is present, the destination is a host address, otherwise it is a network address

    • D: The route is created by a redirect message

    • M: The route has been modified by a redirect message

  • Interface: the physical port of the current routing entry

Every time it handles a datagram, the IP layer looks up the destination IP in the routing table. The lookup proceeds in three steps:

  • If a routing entry exactly matches the destination IP, send the datagram to the next-hop router (Gateway) or network interface (Interface) of that entry

  • Otherwise, if a routing entry matches the network number of the destination IP, send the datagram to the next-hop router (Gateway) or network interface (Interface) of that entry

  • If neither is found, check whether the routing table has a default entry (default), and if so, send the datagram to the next-hop router (Gateway) it specifies

If none of the three steps finds a match, the datagram cannot be sent. This is how an IP datagram travels to the destination host hop by hop. A datagram also has a length, and whenever that length exceeds the MTU of the link it is about to be sent on, it is fragmented.
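The lookup order (host entry, then network entry, then default) can be sketched with Python's `ipaddress` module. The routing table below is entirely hypothetical, and the sketch expresses the three steps as a longest-prefix match, which yields the same preference order.

```python
import ipaddress

# Hypothetical routing table: (destination, gateway, interface).
ROUTES = [
    ("192.168.1.20/32", None, "eth0"),           # host route
    ("192.168.1.0/24", None, "eth0"),            # directly connected network
    ("10.0.0.0/8", "192.168.1.254", "eth0"),     # network reached via a gateway
    ("0.0.0.0/0", "192.168.1.1", "eth0"),        # default route
]


def lookup(dest: str):
    """Return the matching route, preferring host, then network, then default."""
    addr = ipaddress.ip_address(dest)
    matches = [r for r in ROUTES if addr in ipaddress.ip_network(r[0])]
    if not matches:
        return None  # no route: the datagram cannot be sent
    # The longest prefix wins: /32 host route, then network, then /0 default.
    return max(matches, key=lambda r: ipaddress.ip_network(r[0]).prefixlen)


print(lookup("192.168.1.20"))  # ('192.168.1.20/32', None, 'eth0')
print(lookup("10.1.2.3"))      # ('10.0.0.0/8', '192.168.1.254', 'eth0')
print(lookup("8.8.8.8"))       # ('0.0.0.0/0', '192.168.1.1', 'eth0')
```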

The concept of datagram fragmentation

During the handshake, each end announces the maximum amount of data it can accept in a single TCP segment (the MSS), which it derives from the maximum transmission unit (MTU) of its own link. TCP then splits the data into segments no larger than the MSS, and each segment is packed into an IP datagram. At any router along the path, the datagram may still exceed the MTU of the next link and be fragmented. In every fragment except the last, the M flag in the 3-bit flag field of the IP header is set to 1, indicating that more fragments follow. The headers of the fragments are largely the same, but the fragment offsets differ. At the destination, the fragments are reassembled according to their fragment offsets into a complete IP datagram (one TCP segment). IP does not preserve order, so the reassembled datagrams may arrive out of order, but as long as the data is complete TCP puts it back in order using the fields in its header. If an IP fragment is lost, the IP layer cannot rebuild the datagram and discards the pieces; TCP never receives that segment, so the sender retransmits it.

  Link Layer - ARP Protocol  

After the IP layer encapsulates the data, it knows only the IP address of the target host. An IP address alone is not enough to send the datagram, because each hardware device is addressed by its own MAC address, a 48-bit value. Knowing the target IP, we therefore need to find the MAC address that corresponds to it. This is done by querying the routing table and using the ARP protocol of the link layer.

The ARP protocol maps IP addresses to MAC addresses. At first, the sender knows only the target's IP, not its MAC address, and obtaining that address involves an ARP request and response. ARP has its own packet, so let's first look at the packet format.

2cee7bd928212041f2ccef596297def0.png

Ethernet Destination Address : The MAC address of the destination. For an ARP request, sent when the ARP cache has no entry for the target, this is the broadcast address.

Ethernet source address : MAC address of the sender.

Frame type : Different frame types have different formats and MTU values, and different types have different numbers. Here, the number corresponding to ARP is 0x0806.

Hardware type : Refers to the link layer network type, 1 is Ethernet.

Protocol type : Refers to the type of protocol address being mapped; 0x0800 means an IP address. For example, mapping an IP address to an Ethernet address.

Operation type : There are four types, namely ARP request (1), ARP reply (2), RARP request (3), and RARP reply (4).

Source MAC address : Indicates the MAC address of the sender.

Source IP address : Indicates the IP address of the sender.

Destination Ethernet address : Indicates the MAC physical address of the target device.

Destination IP address : Indicates the IP address of the target device.
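To make the layout concrete, here is a minimal sketch that packs the ARP part of a request (everything after the Ethernet frame header) with `struct`. It assumes IPv4 over Ethernet, which means two extra one-byte fields for the hardware and protocol address lengths (6 and 4) sit before the operation type. The MAC and IP values and the function name are made up.

```python
import socket
import struct


def build_arp_request(src_mac: bytes, src_ip: str, target_ip: str) -> bytes:
    """Build the 28-byte ARP request body for IPv4 over Ethernet."""
    return struct.pack(
        "!HHBBH6s4s6s4s",
        1,                            # hardware type: Ethernet
        0x0800,                       # protocol type: IP address
        6,                            # hardware (MAC) address length in bytes
        4,                            # protocol (IPv4) address length in bytes
        1,                            # operation type: ARP request
        src_mac,                      # source MAC address
        socket.inet_aton(src_ip),     # source IP address
        b"\x00" * 6,                  # destination MAC: unknown in a request
        socket.inet_aton(target_ip),  # destination IP address
    )


request = build_arp_request(bytes.fromhex("020000000001"), "192.168.1.10", "192.168.1.1")
print(len(request))   # 28
print(request.hex())
```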

Before two devices exchange a message, the link layer at the source uses ARP to find the MAC address of the destination. ARP broadcasts a request, and every host on the Ethernet receives it. The request asks for the MAC address of the target IP: the sender first states its own IP and MAC address, then asks whoever owns the target IP to reply with its hardware address. A host that receives the broadcast and finds that it owns this IP sends an ARP reply directly to the source host, using the source IP and MAC address contained in the request. A host that does not own the target IP discards the request. In short, the request is broadcast, and the response is sent to a single host.

It would be wasteful to go through a request-response exchange before every communication. Once a response is received, the mapping between IP and MAC address is stored in the ARP cache table, generally for 20 minutes, so that the next frame can be encapsulated directly. The complete process is therefore:

After the IP layer receives a TCP segment, and before encapsulating and sending it, it queries the routing table:

  • When the target IP is in the same network segment as the sender, the ARP cache table is checked for the MAC address of the target IP. If it is there, the datagram is handed to the link layer to be encapsulated and sent. If it is not, an ARP request is broadcast and the returned MAC address is cached; the IP layer then encapsulates the TCP segment and hands it to the link layer to be encapsulated and sent.

  • When the target IP is not in the same network segment as the sender, the packet has to be sent to the default gateway. If the ARP cache table contains the MAC address of the gateway IP, the datagram is handed to the link layer to be encapsulated and sent. If not, an ARP request is broadcast and the returned address is cached; the IP layer then encapsulates the TCP segment and hands it to the link layer to be encapsulated and sent.
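The two cases above can be sketched as a next-hop decision plus a cache with an expiry time. This is a simplified model, not how any particular operating system implements it; the class, function names, and addresses are hypothetical, and the lifetime uses the 20 minutes mentioned above.

```python
import ipaddress
import time

ARP_TTL = 20 * 60  # cache lifetime in seconds (20 minutes)


class ArpCache:
    def __init__(self):
        self._entries = {}  # ip -> (mac, expiry time)

    def put(self, ip, mac, now=None):
        now = time.monotonic() if now is None else now
        self._entries[ip] = (mac, now + ARP_TTL)

    def get(self, ip, now=None):
        now = time.monotonic() if now is None else now
        entry = self._entries.get(ip)
        if entry is None or entry[1] <= now:
            self._entries.pop(ip, None)
            return None  # miss: an ARP request must be broadcast
        return entry[0]


def next_hop(dest_ip, local_net, gateway_ip):
    """Pick the IP whose MAC is needed: the target itself or the default gateway."""
    if ipaddress.ip_address(dest_ip) in ipaddress.ip_network(local_net):
        return dest_ip
    return gateway_ip


cache = ArpCache()
hop = next_hop("8.8.8.8", "192.168.1.0/24", "192.168.1.1")
print(hop)             # 192.168.1.1
print(cache.get(hop))  # None, so an ARP request for the gateway would be broadcast
```

The optional `now` argument exists only so that expiry can be exercised without waiting.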

Ethernet data frame

Everything above is ready, and what is encapsulated and sent is actually an Ethernet data frame. The Ethernet destination address, Ethernet source address, and frame type form the frame header. A preamble and a frame start delimiter are also inserted before the header to inform the receiver to do some preparations. The frame check sequence FCS is added to the tail to detect whether the frame is wrong.

ed354871adeb2e2688dc9b64f42d5133.jpeg

Preamble : Coordinate the clock frequency of the terminal receiving adapter so that it is the same as the frequency of the transmitting terminal.

Frame start delimiter : The sign of the start of the frame, indicating that the frame information is coming and ready to receive.

Destination address : The MAC address of the network adapter that receives the frame. When the receiving end receives the frame, it will first check whether the destination address matches the local address. If not, it will discard it.

Source address : MAC address of the sending device.

Type : Determines which protocol processes the data after the frame is received.

Data : The data handed over to the upper layer. In the context of this article, it refers to IP datagrams.

Frame check sequence : Detects whether the frame is corrupted. The sender calculates the cyclic redundancy check (CRC) value of the frame and writes it into the frame. The receiver recalculates the CRC and compares it with the value of the FCS field. If the two values differ, data was lost or altered during transmission. The receiver then discards the frame, and recovery is left to higher layers such as TCP retransmission.
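The compute-and-compare idea can be sketched with `zlib.crc32`, which implements the same CRC-32 polynomial that Ethernet uses. On real hardware the network adapter computes the FCS; appending the value in little-endian byte order is an assumption of this sketch, and the function names are hypothetical.

```python
import struct
import zlib


def add_fcs(frame: bytes) -> bytes:
    """Append a CRC-32 frame check sequence to the frame."""
    return frame + struct.pack("<I", zlib.crc32(frame))


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC over the frame and compare it with the trailing 4 bytes."""
    frame, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(frame)) == fcs


good = add_fcs(b"example frame contents")
damaged = bytes([good[0] ^ 0x01]) + good[1:]   # flip one bit in transit
print(fcs_ok(good))     # True
print(fcs_ok(damaged))  # False
```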

Transmit and receive

  • After receiving a datagram from the upper layer, the sender decides, from the MTU and the size of the datagram, whether to divide it into smaller pieces. This is the IP datagram fragmentation process.

  • The datagram (or fragment) is encapsulated into a frame and passed to the underlying hardware, which converts the frame into a bit stream and sends it out.

  • A device on the Ethernet receives the frame and checks its destination address. If the address matches the local address, the frame is processed and passed up layer by layer (the demultiplexing process).

Epilogue

Above, we walked through the complete network path: an IoT device collects data from its sensors, the application on the device encapsulates it into an MQTT message, the network protocols encapsulate it layer by layer, and the receiving system in the cloud unwraps it layer by layer. I hope this helps you understand the MQTT, TCP, IP, and ARP protocols used in IoT.

Origin blog.csdn.net/klandor2008/article/details/131820839