Talk about network telemetry technology: from active and passive detection to Netflow and INT

Network telemetry. Why is it called "telemetry"? My personal understanding is that it "collects" the data in the network; it is essentially a means of gathering network data. Due to work needs I have come into contact with several network telemetry technologies, and this article surveys the mainstream ones: first the traditional flow-based technologies, represented by Cisco's Netflow and the IETF's IPFIX standard, then the recently popular INT technologies, focusing on two variants, Cisco's IOAM and Huawei's PBT.

【Scenarios】

Network telemetry is a network information collection technology whose purpose is to gather information about the network. The larger a network, the harder it is to troubleshoot, so technologies are needed to analyze and monitor traffic in real time and to automatically locate "circuit breaks" (forwarding interruptions). In a DCN (Datacenter Network) in particular, a set of techniques is needed to monitor and probe the network in real time, and network telemetry was born to solve exactly this problem. Network telemetry is generally divided into two kinds: active detection and passive detection. Passive detection is represented by Cisco's Netflow, while active detection mostly resembles the guided-probe component in Microsoft's Everflow (there is also Traceflow for virtual networks, represented by VMware, and Huawei's FusionNetDoctor, though to be honest Huawei's performance is a bit underwhelming...).

【Active Detection】

Active detection, as the name suggests, means "actively probing the status of the network", and is represented by the guided-probe component in Microsoft's Everflow system. Microsoft published a paper called "Packet-Level Telemetry in Large Datacenter Networks", which introduces a packet-level network telemetry system Microsoft implemented; if you are interested you can look it up, as I personally think it is a very good paper. It mentions an important component called the "guided probe". So what is this guided probe used for?

Figure 1. Typical virtual network topology

Figure 1 shows a typical virtual network topology. Traffic from virtual machine A to virtual machine B needs to pass through three virtual switches and two virtual routers. Now suppose that at some moment a "circuit break" occurs between VM A and VM B, so that service traffic from A to B is blocked. How do we locate the exact position of the fault and its cause?

Microsoft's guided probe and VMware's Traceflow do just that. The guided probe injects a "detection packet" at VM A, and "detection identification points" are set on the network elements. Each time the packet passes a network element, status information is uploaded, as shown in Figure 2.

Figure 2. An implementation of active probing

Taking Figure 2 as an example, suppose I set "detection identification points" at the input and output of each network element, and the "detection packet" uploads status information to the controller every time it passes one. If the input of virtual switch 3 on machine B fails, the final uploaded information will lack "virtual switch 3 OUT", so the controller can conclude that a fault is occurring at virtual switch 3. It can then notify the management plane through the northbound channel, and the management plane notifies the user or raises an alarm directly.
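The controller-side inference described above can be sketched as follows. This is a hypothetical illustration, not Everflow's real API; the checkpoint names and `locate_fault` function are my own assumptions.

```python
# Hypothetical sketch: the controller compares the checkpoint reports a probe
# should generate against the reports actually received, and flags the first
# missing checkpoint as the fault location.

# The checkpoints a probe from VM A to VM B should traverse, in order.
EXPECTED_PATH = [
    "vswitch1.IN", "vswitch1.OUT",
    "vrouter1.IN", "vrouter1.OUT",
    "vswitch2.IN", "vswitch2.OUT",
    "vrouter2.IN", "vrouter2.OUT",
    "vswitch3.IN", "vswitch3.OUT",
]

def locate_fault(reported):
    """Return the first expected checkpoint that did not report, or None."""
    seen = set(reported)
    for checkpoint in EXPECTED_PATH:
        if checkpoint not in seen:
            return checkpoint
    return None

# Example from the text: everything up to "vswitch3.IN" reported, so the
# probe was lost inside virtual switch 3.
reports = EXPECTED_PATH[:9]          # "vswitch3.OUT" is missing
print(locate_fault(reports))         # vswitch3.OUT
```

A real controller would of course also handle out-of-order and duplicated reports, but the core judgment is this simple comparison.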

Of course, it does not stop there. If the data plane is developed in-house, there is an upgraded way to play. If you can perceive that "a specific background flow has broken", where a "flow" is identified by its 5-tuple (src ip, dst ip, src port, dst port, l4 protocol, optionally plus ToS), then when injecting the probe packet you can use the broken background flow as a template and construct probe packets identical to it. And since the data plane is in-house, you can also place a "detection identification point" at the packet-drop points inside the data plane's forwarding path. That way you learn not only the exact location where the background flow is interrupted but also the reason for the drop, and can tell the user directly: "traffic xxx is broken, at location xxx, and the reason for the drop and packet loss is xxx". Figure 3, for example, shows the output of VMware's Traceflow.
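Cloning a broken flow's 5-tuple into a probe can be sketched as below. The field and marker names are illustrative assumptions; the point is only that the probe's headers match the real traffic, so it takes the same forwarding path (ECMP hashing included).

```python
# Hypothetical sketch: build a probe packet descriptor from the 5-tuple of a
# flow reported as "broken", so the probe follows the same path as the flow.
from dataclasses import dataclass

@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str          # l4 protocol

def build_probe(broken_flow, probe_id):
    """Build a probe matching the broken flow's 5-tuple.

    Only a marker field distinguishes it from real traffic, so the
    detection identification points can recognise and report it.
    """
    return {
        "tuple": broken_flow,       # identical headers -> identical path
        "probe_marker": True,       # lets checkpoints recognise the probe
        "probe_id": probe_id,       # correlates reports from checkpoints
    }

flow = FiveTuple("10.0.0.1", "10.0.0.2", 40000, 443, "tcp")
probe = build_probe(flow, probe_id=1)
print(probe["tuple"] == flow)   # True: same 5-tuple as the broken flow
```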

Figure 3. VMWare's Traceflow effect diagram

The above process covers most implementations of active detection; Microsoft's Everflow, VMware's Traceflow and Huawei's FusionNetDoctor all work this way (here I must praise VMware's version, it is really good, really cool...). But active detection also has some fatal flaws:

1) Only faults that are currently occurring can be detected. If the failure is intermittent, active detection is powerless.

2) The fidelity of the detection is limited. For example, if the ongoing fault is packet loss, and only some of the packets are lost, then the injected probe packets may well not be dropped, showing a result inconsistent with reality. (This can be mitigated by injecting probes in batches and observing statistically.)

3) It cannot look back in time. This is essentially the same as the first point: if the break occurred at some moment in the past, say at midnight, active detection has no way to learn its cause or location after the fact.

4) Detection identification points must be set on the data plane, and they must recognize probe packets without affecting normal data-plane forwarding performance.

As Microsoft notes in the Everflow paper, this kind of active detection mainly targets breaks that are in progress, because such faults have the highest priority; achieving that is enough, and no single technology can cover every scenario.

Of course, I will later write a dedicated article on the technical difficulties and implementation methods of active detection.

【Passive Detection】

Passive detection is the counterpart of active detection. Its definition is "detection without affecting the current network traffic" (strictly speaking, active detection does have some impact, since a new stream of traffic appears). Passive detection is represented by Cisco's Netflow, but in recent years some new technologies have emerged, such as the INT technology proposed by Cisco and its concrete application IOAM, and the PBT technology Huawei proposed as an optimization of Cisco's IOAM. Since passive detection is technically more involved than active detection, the following sections introduce each technology separately, starting with Netflow.

【Netflow】

Netflow was proposed by Cisco in 1996 (the same year I was born...), 24 years ago now. It is already a very mature technology, and various companies have derived their own "dialect" versions from it, such as jFlow, sFlow and so on. The Netflow architecture is usually drawn as in Figure 4.

Figure 4. Netflow architecture diagram

Any flow-based collection technology (in fact not only Netflow; traffic collection technologies are basically all alike...), no matter how it is built or how large it scales, is essentially inseparable from these components of the Netflow architecture:

1) Netflow Exporter. The Netflow exporter is used to export the flow data to the Netflow Collector through the Netflow protocol.

2) Netflow Collector. The Netflow collector is used to analyze the Netflow data packets after receiving them, and store the analyzed flow data in Flow Storage.

3) Flow Storage. The flow database stores the parsed Netflow data. A TSDB (time-series database) is usually used here, because we need to look back at a certain point in time, or at how traffic distribution and status changed over a past period, which a TSDB suits well; popular choices are InfluxDB and Prometheus.

4) Analyzer/Monitor. The analyzer analyzes and presents the data in the flow database.

There is one final component: the flow cache on the Netflow device itself. The device keeps a table that stores flow information, as shown in Figure 5.

Figure 5. Schematic diagram of Netflow Main Cache

The principle is very simple: when a packet of some flow enters the Netflow device for the first time, the device extracts the packet's 5-tuple and adds a Flow Entry to the Netflow cache. This entry represents the flow, so the device can then count the flow's statistics and periodically (the Netflow cache usually has an aging mechanism) export them, encapsulated as Netflow protocol packets, to the Netflow Collector.
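The flow-cache behavior just described can be sketched like this. It is a minimal illustration assuming a simplified 5-tuple key and an active timeout; real devices also track TCP flags, ToS, interface indices and more.

```python
# Minimal sketch of a Netflow-style flow cache with per-flow counters and an
# aging/export step. All names and the timeout policy are assumptions.
class FlowCache:
    def __init__(self, active_timeout=30.0):
        self.active_timeout = active_timeout
        self.entries = {}    # 5-tuple -> {"packets", "bytes", "first_seen"}

    def observe(self, five_tuple, size, now):
        entry = self.entries.get(five_tuple)
        if entry is None:
            # First packet of a new flow: create the Flow Entry.
            self.entries[five_tuple] = {
                "packets": 1, "bytes": size, "first_seen": now,
            }
        else:
            entry["packets"] += 1
            entry["bytes"] += size

    def export_expired(self, now):
        """Age out entries and return the records the exporter would send."""
        expired = [k for k, e in self.entries.items()
                   if now - e["first_seen"] >= self.active_timeout]
        return [(k, self.entries.pop(k)) for k in expired]

cache = FlowCache(active_timeout=30.0)
key = ("10.0.0.1", "10.0.0.2", 40000, 443, "tcp")
cache.observe(key, size=1500, now=0.0)
cache.observe(key, size=1500, now=1.0)
records = cache.export_expired(now=31.0)
print(records[0][1]["packets"])   # 2
```

The exported records are what the Exporter would then encapsulate into Netflow packets for the Collector.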

The principle of Netflow is thus very simple: collect flow information, upload it, then analyze it. Of course Netflow also has disadvantages, namely:

1) The collection granularity is the "flow"; it cannot reach the finer granularity of individual "packets", so it is difficult to measure packet loss and delay.

2) It costs the Netflow device performance, and that performance loss often produces an "observer effect". Take network congestion or packet loss as an example: originally the network would have had no congestion or loss, but because the Netflow collection function is enabled, the collection consumes device performance, the device's capacity to process traffic drops, and congestion or loss then occurs. That is the "observer effect": the final result did not already exist, it happened because of the act of "observing", and would not have happened without it, which is quite painful. This performance problem brings the following issues:

1. Due to the performance cost, it is impossible in a DCN to update the flow cache for every packet, so a sampling rate is usually applied; a common approach is "collect the first packet out of every 100" to reduce the overhead. But this creates additional problems. Networks typically contain "elephant flows" and "mouse flows". A typical elephant flow is an FTP data transfer: large in volume, but it disappears after a while. Typical mouse flows are control or negotiation protocols: small in volume, but long-lived and with high reliability requirements. With a sampling rate in place, the mouse flows may well never be sampled, so in the end the elephant flows drown out the mouse flows: the FTP traffic is visible, while some other TCP traffic cannot be seen at all.

2. The information the Netflow protocol can upload is limited. Although Netflow v9 already supports templates for data collection, it is still restricted and cannot carry enterprise-private data. Moreover, Netflow v9's transport protocol is UDP, whose reliability is poor; once uploaded information is lost, extra handling is required.

3. There are too many dialect versions of Netflow, such as sFlow and jFlow, causing compatibility problems; and Netflow itself is a traffic export protocol proposed by Cisco, not a standard.

Among these shortcomings, the first is not really a defect of Netflow itself but a defect common to all "flow"-based network information sampling. The third is a defect in the design of Netflow as an export protocol. The second is more of a "philosophical" problem: if you want to see more, you have to pay more. A gun with long range, high accuracy, large magazine, small size and high rate of fire is unrealistic.
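The sampling problem from point 1 is easy to demonstrate with a small simulation (the flow names and 1-in-100 rate are just the example values from the text):

```python
# Sketch of the 1-in-100 sampling problem: a large "elephant" FTP flow is
# sampled many times, while a short "mouse" control flow can slip through
# entirely unsampled.
SAMPLE_RATE = 100   # collect the first of every 100 packets

def sampled_counts(packet_stream):
    """Count sampled packets per flow under 1-in-N packet sampling."""
    counts = {}
    for i, flow in enumerate(packet_stream):
        if i % SAMPLE_RATE == 0:
            counts[flow] = counts.get(flow, 0) + 1
    return counts

# ~10,000 elephant packets with a 50-packet mouse flow buried in the middle.
stream = ["elephant"] * 5010 + ["mouse"] * 50 + ["elephant"] * 4940
counts = sampled_counts(stream)
print(counts.get("elephant", 0))   # 100
print(counts.get("mouse", 0))      # 0: the mouse flow is invisible
```

Depending on where the mouse flow happens to fall relative to the sampling positions it may occasionally be caught, but a 50-packet flow under 1-in-100 sampling is invisible more often than not.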

【IPFIX】

The IPFIX protocol is a "standard flow information export protocol" proposed by the IETF, aiming to replace Netflow v9 and its many dialect protocols. It can be regarded as the future trend; most vendors are already compatible with IPFIX. Cisco, H3C and Huawei switches support it, and VMware also uses IPFIX as its standard flow information export protocol.

Note, however, that IPFIX is only an "information export protocol": its purpose is to replace the Exporter-to-Collector protocol in the Netflow architecture, while the overall architecture remains essentially the Netflow architecture. IPFIX evolved from Netflow v9: the version field in a Netflow v9 header is 9, while the IPFIX header is basically consistent with Netflow v9's but carries version 10 (0x000a), making it effectively "Netflow v10", as shown in Figure 6.

Figure 6. Comparison of IPFIX header format and Netflow v9 header format
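Because both formats begin with the same 2-byte version field, a collector can peek at it to choose the right parser. A minimal sketch, with the header layouts following RFC 3954 (Netflow v9) and RFC 7011 (IPFIX):

```python
# Sketch: dispatch on the shared version field to parse either a Netflow v9
# or an IPFIX message header.
import struct

def parse_header(data):
    (version,) = struct.unpack_from("!H", data, 0)
    if version == 9:
        # v9: version, count, sysUpTime, unix_secs, sequence, source_id
        fields = struct.unpack_from("!HHIIII", data, 0)
        return dict(zip(
            ["version", "count", "sys_uptime", "unix_secs",
             "sequence", "source_id"], fields))
    elif version == 10:
        # IPFIX: version, total length, export time, sequence,
        # observation domain id
        fields = struct.unpack_from("!HHIII", data, 0)
        return dict(zip(
            ["version", "length", "export_time",
             "sequence", "obs_domain_id"], fields))
    raise ValueError(f"unsupported export version {version}")

ipfix = struct.pack("!HHIII", 10, 16, 1_600_000_000, 1, 42)
print(parse_header(ipfix)["version"])        # 10
print(parse_header(ipfix)["obs_domain_id"])  # 42
```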

As the standard flow information protocol launched by the IETF, IPFIX makes the following improvements over Netflow v9:

1) IPFIX's transport options are more diverse than Netflow v9's: UDP, TCP and SCTP are all supported as the transport protocol (see Section 10.1 of RFC 7011);

2) Compared with Netflow v9, IPFIX adds features such as enterprise-specific fields and variable-length fields, which support exporting enterprise-private data (to me the most convenient feature; RFC 7012 explains this in detail);

3) Compared with Netflow v9, the IPFIX template vocabulary has grown considerably, so more kinds of information can be exported (see also RFC 7012).

It can be seen that IPFIX is more of a protocol standard than Cisco's Netflow. Its enterprise fields accommodate the dialect flow-export protocols well, solving many compatibility headaches. Functionally IPFIX covers Netflow v9 (meaning its exportable content is a superset of Netflow v9's), but at the wire level the two are not compatible (that is, supporting IPFIX still requires additional development).

【INT】

INT is a new class of network telemetry from recent years. Its full name is In-band Network Telemetry. It is both a technical means and an idea, proposed by Cisco; its application in Cisco products is called IOAM (in-band OAM). This brings a new concept: the literal translation of "inband" is "in-band", but how should "in-band" be understood?

My personal understanding: flow-monitoring technologies like Netflow or IPFIX can be called "out-of-band". A real-life analogy: out-of-band is like being a monitor who watches the state of a flow from the position of a bystander, and from that position the method and vantage point of observation inevitably introduce large errors. How can this error be minimized? Life experience tells us to "put yourself in the other's shoes": look at the matter from the participant's perspective and you see more of its true face. In-band network telemetry works exactly this way. Instead of a bystander watching the flow, the detection information is implanted into the packets themselves, which is equivalent to mounting a monitoring point on every packet of the flow, so whatever the packet experiences, the monitoring point experiences too.

Figure 7. Implementation of INT

As shown in Figure 7, INT is implemented by inserting the monitoring checkpoint inside the packet itself, namely the OAM layer in the figure. The usual practice is to insert an OAM layer between the packet's header and its payload, turning an ordinary packet into one "marked" by us. Another analogy: friends who have watched nature documentaries will know that oceanographers attach telemetry tags to marine animals such as sea turtles, dolphins and penguins in order to monitor their behavior. Eventually the scientists recover, say, a dolphin's tag and analyze its trajectory and behavioral characteristics from the recorded data. INT follows the same principle, as shown in Figure 8.

Figure 8. Schematic diagram of the principle of INT technology monitoring network status

As in the example of Figure 8, after a packet (the dolphin) enters the "starting" network element (is captured by the scientists), an OAM layer is inserted (a tag is attached). Then, at each network element the packet passes (each stretch of ocean), detection information is inserted into the OAM layer (recorded by the tag), including what the element did with the packet: forwarded or dropped (the dolphin dies). Finally, when the packet reaches the end, the OAM layer is stripped off and uploaded to a remote analysis server through an information export protocol (the tag is recovered and analyzed by the scientists). But one question remains: how does a network element, that is, a transit node in the OAM domain, know what information to collect?

Figure 8. The OAM layer in the IOAM packet

In fact the OAM layer consists of two parts: one called instruction, and one called data. The instruction carries which information is to be collected, and a transit node only needs to insert the corresponding information into the data part to complete the collection.
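The instruction/data split can be sketched as below. This is a toy model with a simplified instruction bitmap and dict-based "packets"; real IOAM encodings are binary TLVs, and all field names here are assumptions.

```python
# Sketch of an OAM layer: an instruction bitmap chosen at the encapsulation
# node, and a data list each transit node appends to.
COLLECT_NODE_ID   = 0b001
COLLECT_TIMESTAMP = 0b010
COLLECT_QUEUE     = 0b100

def encapsulate(packet, instruction):
    """Encapsulation node: insert an OAM layer between header and payload."""
    packet["oam"] = {"instruction": instruction, "data": []}
    return packet

def transit(packet, node_id, ts, queue_depth):
    """Transit node: append only the fields the instruction asks for."""
    instr = packet["oam"]["instruction"]
    record = {}
    if instr & COLLECT_NODE_ID:
        record["node"] = node_id
    if instr & COLLECT_TIMESTAMP:
        record["ts"] = ts
    if instr & COLLECT_QUEUE:
        record["queue"] = queue_depth
    packet["oam"]["data"].append(record)
    return packet

def decapsulate(packet):
    """End node: strip the OAM layer and hand the records to the exporter."""
    return packet.pop("oam")["data"]

pkt = encapsulate({"hdr": "...", "payload": "..."},
                  COLLECT_NODE_ID | COLLECT_TIMESTAMP)
for node, ts in [("s1", 100), ("s2", 105), ("s3", 112)]:
    transit(pkt, node, ts, queue_depth=7)
records = decapsulate(pkt)
print(len(records), records[0])   # 3 {'node': 's1', 'ts': 100}
```

Note that the queue depth is never recorded, because the instruction bitmap did not request it: that is exactly the role of the instruction part.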

The principle of INT itself is thus very simple: insert a custom layer inside the packet, record information at every network element passed, and finally upload the detection information to a remote analysis server. So what advantages does INT bring, or rather, where can it be applied?

1) Packet-granularity loss monitoring. Since the monitoring granularity goes from flows down to individual packets, INT can ideally tell you whether packets are being lost in the network, where they are lost, and why.

2) Capturing jitter in the network. Jitter appears from time to time, and its short duration makes it hard to capture; since INT is an in-band technology, it can ideally still capture delay or loss jitter in the network.

3) Intelligent routing and path selection. From the OAM data you can learn the delay, bandwidth and packet loss of every link in the network, then perform intelligent path selection based on this data and reroute important traffic.

However, INT also brings some obvious disadvantages. Take the traditional INT technology, Cisco's IOAM, as an example:

1) In a DCN or carrier network the number of packets is enormous. Inserting an OAM layer into every packet and monitoring it generates a huge volume of information; how is such a volume to be compressed and deduplicated?

2) The "starting" device (called the encapsulation node in INT) must insert a new OAM layer into every packet, which costs a great deal of performance. If the forwarding path is implemented in software, this is devastating to forwarding performance (you must repeatedly allocate new space, copy the original packet, and insert the new layer; Cisco noticed this too, which is why there are two ways for the encapsulation node to insert the OAM layer, to be discussed in a later detailed analysis of INT).

3) Forwarding devices must continually insert detection data into packets and recalculate the packets' checksums, which is another performance cost.

4) In IOAM, transit nodes keep inserting data into the OAM layer between the header and the payload. Considering that packets may be fragmented, the OAM layer's length must be limited, which in turn limits how many forwarding devices an OAM packet can traverse (in IOAM, the domain formed by these forwarding devices is called the OAM domain).

For the software implementation of IOAM, you can go to the source code of VPP. Versions after VPP 17.xx all have IPv6 IOAM implementation.
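Drawback 4 can be made concrete with a back-of-the-envelope calculation. All the sizes below are assumed example values, not taken from any spec:

```python
# Rough sketch: how many transit nodes fit before the growing OAM layer
# risks pushing the packet past the MTU and into fragmentation.
MTU          = 1500   # bytes
HEADERS      = 40     # e.g. an IPv6 header, for illustration
PAYLOAD      = 1200   # application payload
OAM_FIXED    = 8      # instruction + bookkeeping
PER_HOP_DATA = 12     # e.g. node id + timestamp + queue depth per hop

budget = MTU - HEADERS - PAYLOAD - OAM_FIXED
max_hops = budget // PER_HOP_DATA
print(max_hops)   # 21
```

With these numbers the OAM domain tops out at around 21 hops; a larger payload or a richer per-hop instruction shrinks that budget quickly.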

【PBT】

PBT (Postcard-Based Telemetry) is essentially an INT technology: an "upgraded version" of Cisco's IOAM proposed by Huawei, which can be understood as an optimized version. Huawei mainly optimized items 2, 3 and 4 of the IOAM defects above, and launched two variants: PBT-I and PBT-M, the former the more radical version, the latter the compatible version. So how does PBT do it? PBT's optimization is somewhat similar to the probe-injection technique mentioned under active detection. Since PBT is essentially the same kind of technology as IOAM, let us reuse the dolphin analogy:

  • IOAM: A dolphin is captured by oceanographers, who attach a tag to it. The tag cannot connect to the Internet, but it has a built-in hard disk. The dolphin swims all over the Pacific with the tag, which records information along the way; a year later the dolphin is captured again, the tag is removed, and the oceanographers analyze the data on its built-in hard disk.
  • PBT: A dolphin is captured by oceanographers, who attach a tag to it. The tag has no hard disk, but it has a communication module. The dolphin swims all over the Pacific with the tag, which collects information and immediately sends it to the oceanographers' remote server. The oceanographers can analyze the data as it arrives, or first store it in a data center. A year later they capture the dolphin and, out of responsibility for protecting marine animals, remove the tag, then analyze the complete data from the data center.

It can be seen that the biggest difference between PBT and IOAM is that with PBT the packet does not carry the OAM detection data; instead, every forwarding node it passes (strictly speaking, every transit node in the OAM domain) uploads the information immediately. The realization principle of PBT-I, for example, is shown in Figure 9.

Figure 9. Implementation principle of PBT-I technology

The principle of PBT-I is as follows. When a packet enters the OAM domain (which can be understood as INT's monitoring domain), it reaches the "start node" (encapsulation node), which marks the packet. Note: it marks the packet, it does not insert an OAM layer, just like the probe-injection technique in active detection. The marked packet can then be called a PBT detection packet. As the PBT detection packet passes each network device in the OAM domain (each transit node), the node checks the flag bit in the packet to see whether it is a PBT detection packet, and if so, uploads the monitoring data immediately.

But one problem remains: since the original IOAM packet's "monitoring template" + "monitoring data" has been reduced to a single flag bit, how does a device know what information to collect and upload? This can be solved with an IPFIX-style template: the management or control plane delivers configuration to every transit node in the OAM domain specifying what to collect. In that case the management and control plane must deliver the data-collection template configuration to all nodes in the OAM domain, whereas IOAM only needs to deliver it to the encapsulation node.
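The mark-plus-template mechanism can be sketched like this. The mark bit position, template format and field names are all illustrative assumptions, not Huawei's actual encoding.

```python
# Sketch of PBT-I: the encapsulation node only sets a 1-bit mark (here an
# assumed reserved bit in the ToS byte); every transit node that sees the
# mark immediately exports a "postcard" instead of growing the packet.
POSTCARDS = []                     # stands in for the remote collector

PBT_MARK = 0x01                    # assumed reserved ToS bit

# Template pushed by the control plane to EVERY node in the OAM domain,
# telling it which fields a postcard should carry.
TEMPLATE = ("node", "ts", "queue")

def encapsulate(packet):
    """Encapsulation node: mark the packet; no OAM layer is inserted."""
    packet["tos"] = packet.get("tos", 0) | PBT_MARK
    return packet

def transit(packet, node_id, ts, queue_depth):
    """Transit node: if marked, export a postcard right away."""
    if packet["tos"] & PBT_MARK:
        values = {"node": node_id, "ts": ts, "queue": queue_depth}
        POSTCARDS.append({k: values[k] for k in TEMPLATE})
    return packet

pkt = encapsulate({"hdr": "...", "payload": "...", "tos": 0})
for node, ts in [("s1", 100), ("s2", 105), ("s3", 112)]:
    transit(pkt, node, ts, queue_depth=7)
print(len(POSTCARDS))          # 3: one postcard per transit node
print("oam" in pkt)            # False: the packet itself never grows
```

Compare this with the IOAM sketch earlier in spirit: the per-hop data ends up at the collector either way, but here the packet is never reallocated and no checksum over a grown payload is recomputed.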

From the principle above we can see that, compared with IOAM, PBT-I makes the following obvious improvements:

1) The amount of collected data is no longer limited by the packet length, because the collected data is no longer inserted into the packet.

2) There is no need to perform the performance-hungry operations of inserting collected data and recalculating checksums for every packet. For an IPv4 packet you only need to look at the mark in the IPv4 header, such as a reserved bit in the ToS field, to know what information to upload.

These two improvements alone amount to a qualitative change, especially the second, which greatly increases the feasibility of INT on software forwarding platforms (what a software forwarding platform fears most is restructuring packets: allocating memory + copying memory + recalculating checksums all cost a great deal of performance). But there is no perfect technology in the world, and PBT-I's improvements bring other disadvantages:

1) The management and control plane must deliver template configuration to every node in the OAM domain, telling each node what information to report when it encounters a PBT detection packet.

2) Since information is uploaded at every transit node the packet passes, the number of uploads increases further, which also brings the risk that some node's upload is lost.

3) Time synchronization and information reassembly are required: after a PBT detection packet passes n network elements, how do you stitch together and order the information those n elements uploaded?

4) It is inflexible and cannot be customized. For example, suppose the network has five flows A, B, C, D, E, and the set of information I want to collect for flows A and B is M, while for flows C, D, E it is N. With only one mark bit, there is no way to make such a fine-grained distinction.
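The reassembly problem from point 3 can be sketched as follows, assuming each postcard carries a packet id and a synchronized timestamp (both field names are assumptions):

```python
# Sketch: stitch the postcards that n nodes uploaded for one detection
# packet back into an ordered path.
def reassemble(postcards, packet_id):
    """Order one packet's postcards by timestamp to recover its path."""
    mine = [p for p in postcards if p["packet_id"] == packet_id]
    mine.sort(key=lambda p: p["ts"])
    return [p["node"] for p in mine]

uploads = [                 # postcards arrive in any order, interleaved
    {"packet_id": 7, "node": "s2", "ts": 105},
    {"packet_id": 9, "node": "s1", "ts": 101},
    {"packet_id": 7, "node": "s1", "ts": 100},
    {"packet_id": 7, "node": "s3", "ts": 112},
]
print(reassemble(uploads, packet_id=7))   # ['s1', 's2', 's3']
```

This only works if the nodes' clocks are synchronized tightly enough that timestamp order matches hop order, which is exactly why point 3 lists time synchronization as a requirement.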

Despite these shortcomings, PBT-I's improvements really showed the feasibility of implementing INT in software. Having covered PBT-I, let us talk about the "compromise version", PBT-M. Huawei's draft itself admits that PBT-I is too extreme, which leads to the problems above; can we be a little less extreme and find a compromise? Hence the compromise version, PBT-M.

Compared with PBT-I, PBT-M does not completely abandon the OAM layer; compared with IOAM, it abandons inserting the detection data into the OAM layer. PBT-M is thus a compromise between IOAM and PBT-I, namely:

When a packet first passes the encapsulation node, an OAM layer is still inserted, but only the instruction, not the data. This is equivalent to inserting only the collection instruction, which tells each network element the PBT detection packet passes what information to collect from it; each element then exports that information to the remote analysis server. From this we can see that PBT-M actually fixes shortcomings 1 and 4 of PBT-I's four shortcomings: the control plane only needs to deliver configuration to the encapsulation node. For example, the network has five flows A, B, C, D, E; the set of information to collect for A and B is M, and for C, D, E it is N. When packets of A and B pass the encapsulation node, it inserts an OAM layer whose instruction is M according to the configuration delivered by the control plane, and inserts an OAM layer with instruction N for C, D and E. When a packet of flow A then passes a transit node, the node reads instruction M from the built-in OAM layer and knows which data to upload, so flexibility is greatly enhanced. However, the compromise of PBT-M retains disadvantage 2 of IOAM: the encapsulation node still has to insert an OAM layer into the packet, which remains unfriendly to software implementations.

【Summary: some of my own views】

1) Active probe-injection can only troubleshoot network faults that are currently occurring, so its limitations are relatively large; but its performance cost is small, it is easy to use, and its diagnostic granularity is very fine, immediately yielding the ongoing fault, its location and its cause.

2) Netflow and IPFIX, the traditional flow-based passive detection technologies, cost more performance than active detection, suffer from inaccuracy and error, and their granularity is unsatisfying; but as monitoring technologies they remain mainstream and easy to implement.

3) INT can be seen as the network-detection counterpart to the passive flow-based monitoring of Netflow or IPFIX. However, the IOAM flavor of INT is currently ill-suited to software implementation. Barefoot did launch a forwarding chip that supports INT, which makes IOAM practical, but the chip's price is touching: reportedly 8,000 dollars apiece (go grab it...). Barefoot has now been acquired by Intel, and what happens next is unknown. At present IOAM-style INT is mostly implemented in hardware, combined with P4 programming.

4) Among INT technologies, PBT-I is currently the one friendlier to software implementation. As for PBT-M, if the encapsulation node is a hardware forwarding device that supports INT, then PBT-M's flexibility is indeed better than PBT-I's.

Reposted from: https://www.cnblogs.com/jungle1996/p/12209348.html 
