2.1 VPP (Vector Packet Processing): analysis of technical principles

  1. Packet vectors and scalar packets
    "Vector" is used here in contrast to "scalar" packet processing. In physics, a vector is a quantity that has both magnitude and direction; in computer graphics, a vector graphic can be scaled indefinitely without distortion. In VPP, a vector is simply an array of packets that is processed as one unit.
    Scalar packet processing:
    This is the intuitive approach: packets are processed in order of arrival, the first packet to completion, then the second, and so on. In practice each packet also involves handling an interrupt and walking a deep call stack (for example, a calls b, b calls c, c calls d, ..., return, return, return) before returning from the interrupt, so functions are called in frequent, nested succession. Each packet ends in one of three outcomes: a. it is left untouched; b. it is dropped or rewritten; c. it is forwarded.
    Traditional scalar processing has an obvious weakness: I-cache misses (thrashing of the CPU instruction cache). A cache pays off only when accesses show temporal and spatial locality, and scalar processing runs the whole chain of functions for every packet, so each packet can evict and reload the same instructions and incur I-cache misses. Within the scalar model, the only remedy is a larger cache.

Vector packet processing:
Vector processing handles multiple packets at a time, that is, one packet vector per step rather than one packet. A batch of packets received from the underlying hardware Rx ring (receive queue) is assembled into an array called a Packet Vector, and the Packet Processing Graph then organizes the processing flow.
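
The batching step can be sketched as follows. This is a minimal illustration rather than VPP code: the function name `poll_rx_ring`, the string packets, and the cap of 256 packets per vector are assumptions made for the example.

```python
from collections import deque

MAX_VECTOR_SIZE = 256  # assumed cap on packets per vector, for illustration


def poll_rx_ring(rx_ring, max_size=MAX_VECTOR_SIZE):
    """Drain up to max_size packets from the Rx ring into one packet vector."""
    vector = []
    while rx_ring and len(vector) < max_size:
        vector.append(rx_ring.popleft())
    return vector


# A hypothetical Rx ring holding 600 received packets.
rx_ring = deque(f"pkt-{i}" for i in range(600))

# Each poll yields one packet vector; record how large each one is.
sizes = []
while rx_ring:
    sizes.append(len(poll_rx_ring(rx_ring)))
```

Each call returns whatever is waiting, up to the cap, so the vector size adapts to the load: a busy ring yields full vectors and a quiet one yields small ones.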

Vector packet processing deliberately exploits the temporal locality of the cache by grouping packets into a Packet Vector. For a given piece of node code, the packets in the group share the same fate: if the code is already in the I-cache, they all hit; if it is not, only the first packet, Packet-1, takes the misses, and in doing so it warms the I-cache so that the remaining packets run against a warm CPU instruction cache. The remaining packets in the Packet Vector can therefore be processed at close to the best achievable speed.
In short, vector packet processing concentrates the cache miss time for the entire Packet Vector in the first packet (Packet-1) and amortizes it over all the packets in the vector, which significantly reduces the per-packet processing overhead.
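
As a rough model of this amortization (the symbols are assumptions introduced here, not taken from VPP): let a Packet Vector hold $N$ packets, let $t_{\text{hit}}$ be the time to process one packet in a node with a warm I-cache, and let $t_{\text{miss}}$ be the extra time spent on I-cache misses, paid once per vector by Packet-1. Compared with the worst scalar case, in which every packet pays the miss, the average per-packet cost in that node is:

```latex
\[
T_{\text{vector}} = t_{\text{hit}} + \frac{t_{\text{miss}}}{N},
\qquad
T_{\text{scalar}} = t_{\text{hit}} + t_{\text{miss}}
\]
```

Under this simplified model the miss overhead per packet shrinks in proportion to the vector size, so larger vectors bring the average cost closer to $t_{\text{hit}}$.
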
Vector packet processing therefore addresses the main performance weakness of scalar processing and has the following advantages:
a. It solves the I-cache thrashing problem;
b. It warms the I-cache, which reduces instruction-fetch latency and makes performance both higher and more stable.
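
The difference between the two processing orders can be illustrated with a toy simulation. It assumes a deliberately tiny I-cache that holds the code of only one node at a time, which is far smaller than any real cache. The node names are borrowed from the text; the miss counts illustrate the principle only and are not measurements.

```python
def count_icache_misses(schedule):
    """Count misses in a toy I-cache that holds exactly one node's code."""
    cached_node = None
    misses = 0
    for node, _packet in schedule:
        if node != cached_node:
            misses += 1          # the node's code must be (re)loaded
            cached_node = node
    return misses


NODES = ["ethernet-input", "ip6-input", "ip6-lookup"]
PACKETS = list(range(256))

# Scalar order: each packet runs through every node before the next packet starts.
scalar_schedule = [(node, pkt) for pkt in PACKETS for node in NODES]

# Vector order: each node processes the whole packet vector before the next node runs.
vector_schedule = [(node, pkt) for node in NODES for pkt in PACKETS]

scalar_misses = count_icache_misses(scalar_schedule)
vector_misses = count_icache_misses(vector_schedule)
```

In this model the scalar order reloads node code on every step (3 nodes × 256 packets = 768 misses), while the vector order loads each node's code once (3 misses), paid by the first packet of the vector in each node.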

  2. The processing graph for packet vectors
    The VPP software architecture consists of a development framework and a set of Graph Nodes organized into the Packet Processing Graph.
    (2.1) Packet Processing Graph:
    It is composed of multiple Graph Nodes. The graph breaks the whole packet processing flow into a sequence of connected service nodes: a Packet Vector is processed by the first Graph Node, then by the second, and so on through the Nth.

(2.2) Development framework:
It contains basic data structures, timers, drivers, a scheduler that allocates CPU time among Graph Nodes, and performance tuning tools (for example, counters and packet capture tools). The framework uses a plugin mechanism in which plugin Graph Nodes and VPP's built-in Graph Nodes are treated identically, so developers can reuse existing modules and develop new functions quickly and flexibly. A plugin is essentially a Graph Node that implements a specific function, although it can also be a driver or a CLI command. A plugin Graph Node can be inserted at any position in VPP's Packet Processing Graph.

  3. Packet Processing Graph processing flow
    VPP first polls the receive queue of the Ethernet interface from an input node to obtain a batch of packets. These packets are then grouped into Packet Vectors, also called frames, according to the Graph Node that will process them next. For example, the ethernet-input node collects all IPv6 packets and passes them to the ip6-input node.

When the ip6-input node is scheduled, it starts with the first packet, packet-1, and processes the frame using a dual-loop or quad-loop together with prefetching of upcoming packets into the CPU cache. This effectively reduces the number of cache misses and brings performance close to optimal.
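
The shape of a dual-loop can be sketched as below. Python has no cache-prefetch instruction, so `prefetch` is a do-nothing placeholder that only marks where a real implementation would issue prefetches; the per-packet work (`decrement_hop_limit`) and the dictionary packets are likewise hypothetical stand-ins.

```python
def prefetch(packet):
    """Placeholder only: marks where a real node would prefetch packet data."""


def decrement_hop_limit(packet):
    """Hypothetical per-packet work standing in for real header processing."""
    packet["hop_limit"] -= 1


def process_vector_dual_loop(vector):
    i, n = 0, len(vector)
    # Dual loop: handle two packets per iteration while at least four remain,
    # prefetching the pair that the next iteration will touch.
    while n - i >= 4:
        prefetch(vector[i + 2])
        prefetch(vector[i + 3])
        decrement_hop_limit(vector[i])
        decrement_hop_limit(vector[i + 1])
        i += 2
    # Single loop: finish the leftover packets one at a time.
    while i < n:
        decrement_hop_limit(vector[i])
        i += 1
    return vector


frame = [{"hop_limit": 64} for _ in range(7)]
process_vector_dual_loop(frame)
```

Handling two packets per iteration gives the CPU independent work to overlap, and prefetching the next pair hides memory latency behind the current pair's processing. A quad-loop follows the same pattern with four packets per iteration.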

After the ip6-input node has processed all the packets in the current frame, it passes them to different successor nodes. For example, a packet that fails validation is passed to the error-drop node, while a valid packet is passed to the ip6-lookup node. Packets move through successive Graph Nodes in this way until they are sent out by the interface-output node.
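
The flow described above can be sketched as a small dispatcher. The node names come from the text; the packet fields, the checks inside each node, and the `run_graph` scheduler are simplified assumptions, not VPP's actual logic. In particular, this sketch sends every non-IPv6 packet to error-drop because no other protocol nodes are modeled.

```python
from collections import defaultdict, deque


def ethernet_input(pkt):
    # 0x86DD is the EtherType for IPv6; other protocols are not modeled here.
    return "ip6-input" if pkt["ethertype"] == 0x86DD else "error-drop"


def ip6_input(pkt):
    # Stand-in validity check: a real node validates far more than the version.
    return "ip6-lookup" if pkt["version"] == 6 else "error-drop"


def ip6_lookup(pkt):
    return "interface-output"


NODES = {
    "ethernet-input": ethernet_input,
    "ip6-input": ip6_input,
    "ip6-lookup": ip6_lookup,
}
TERMINAL = {"error-drop", "interface-output"}


def run_graph(vector):
    """Run a packet vector through the graph, one whole frame per node visit."""
    pending = deque([("ethernet-input", vector)])
    finished = defaultdict(list)
    while pending:
        name, frame = pending.popleft()
        if name in TERMINAL:
            finished[name].extend(frame)
            continue
        # The node handles every packet in the frame, sorting each one
        # into the frame of the node that should process it next.
        next_frames = defaultdict(list)
        for pkt in frame:
            next_frames[NODES[name](pkt)].append(pkt)
        pending.extend(next_frames.items())
    return dict(finished)


packets = [
    {"id": 1, "ethertype": 0x86DD, "version": 6},  # valid IPv6
    {"id": 2, "ethertype": 0x86DD, "version": 4},  # fails the ip6-input check
    {"id": 3, "ethertype": 0x0800, "version": 4},  # not IPv6
]
result = run_graph(packets)
```

Packet 1 reaches interface-output, while packets 2 and 3 end up at error-drop. Each node runs once per frame rather than once per packet, which is what lets its code stay warm in the I-cache.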

  4. The plugin mechanism of the Packet Processing Graph
    Nodes in VPP's Packet Processing Graph can be replaced. Combined with VPP's support for dynamically loading plugin nodes, this allows new functions to be developed and deployed quickly without creating and compiling a custom version of the code.
    In short, VPP's Graph Node organization lets users insert new Graph Nodes or rearrange the processing order of Graph Nodes through plugins as needed, which makes extension convenient and does not affect the original core processing flow.
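
The idea of inserting a plugin node into an existing graph can be sketched with a simple successor table. The `Graph` class, its methods, and the node name `my-plugin-node` are hypothetical and exist only for this illustration; VPP's real plugin registration works differently and is not shown here.

```python
class Graph:
    """Toy processing graph in which each node has a single next node."""

    def __init__(self):
        self.next_of = {}  # node name -> name of the next node

    def add_arc(self, src, dst):
        self.next_of[src] = dst

    def insert_between(self, new_node, src, dst):
        """Splice new_node into the existing arc src -> dst."""
        if self.next_of.get(src) != dst:
            raise ValueError(f"no arc {src} -> {dst}")
        self.next_of[src] = new_node
        self.next_of[new_node] = dst

    def path_from(self, start):
        """Follow next-node arcs from start until a node has no successor."""
        path = [start]
        while path[-1] in self.next_of:
            path.append(self.next_of[path[-1]])
        return path


# Core graph, using the node names mentioned in the text.
graph = Graph()
graph.add_arc("ethernet-input", "ip6-input")
graph.add_arc("ip6-input", "ip6-lookup")
graph.add_arc("ip6-lookup", "interface-output")

# A plugin adds its node without modifying any of the core nodes.
graph.insert_between("my-plugin-node", "ip6-input", "ip6-lookup")
path = graph.path_from("ethernet-input")
```

Only the arc table changes: the existing nodes keep their code, which is why a plugin can extend the flow without touching the core.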

  5. Features of the Packet Processing Graph

(5.1) Each Graph Node uses the Packet Vector as its smallest unit of input and output;
(5.2) From a software engineering perspective, each Graph Node is independent and self-contained;
(5.3) From a performance perspective, the graph makes better use of the CPU instruction cache (I-cache) and of the CPU's vector instructions, and it interleaves packet memory loads with packet processing so that the CPU pipeline is used more efficiently;
(5.4) Forwarding state that is reused across packets (such as neighbor table and routing table lookups) can be anticipated, and packet contents can be preloaded into the CPU's local data cache (D-cache) for the next iteration. These hardware-aware techniques give VPP finer-grained parallelism.
The Packet Processing Graph gives VPP a loosely coupled and highly cohesive software architecture. Loose coupling comes from every Graph Node using the Packet Vector as its smallest unit of input and output; high cohesion comes from grouping related functions within each Graph Node.

Reference links:
https://fd.io/
https://fd.io/documentation/
https://openfastpath.org/
https://www.metaswitch.com/blog/fd.io-takes-over-vpp
https://blog.csdn.net/Rong_Toa/article/details/107040703


Origin blog.csdn.net/weixin_38387929/article/details/115057120