High Performance Network SIG Monthly News: Long-Term Contributions Recognized by the Industry, and a New virtio Reviewer Added

High Performance Network SIG (Special Interest Group): In the cloud computing era, software and hardware are evolving rapidly, and new application forms such as cloud native and microservices keep emerging, with ever more data flowing between processes. The network, as the carrier of this data, plays an unprecedentedly important role in the cloud era. In this age of the Internet of Everything, the efficiency of network communication in the cloud is crucial to all kinds of services. The High Performance Network SIG is committed to building a high-performance network protocol stack with new, highly efficient communication technologies such as XDP, RDMA, and virtio, combined with hardware/software co-design, to improve the network performance of data center applications in the cloud computing era.

01 Overall progress of SIG this month

This month, the High Performance Network SIG focused on Anolis OS kernel networking, SMC, and virtio.

Key developments this month:

  • SIG member Xuan Zhuo has become a reviewer for the virtio core/virtio-net subsystem in the upstream Linux kernel community. His contributions to the virtio community over the past three years have been widely recognized.

  • This month, the SIG completed upstream support for XDP socket (AF_XDP) zero copy in virtio-net, which can greatly improve XDP socket transmit performance on virtio. This feature has been supported in the Anolis (Dragon Lizard) ANCK kernel for more than a year, and we are now contributing it to the upstream Linux community. The community has finished reviewing the virtio-net XDP refactoring part, and the remainder is expected to be merged after the 5.8 window period.

02 ANCK kernel network

A new security vulnerability fix was added in the networking area this month: CVE-2023-1074 (SCTP-related).

03 High-performance network protocol stack: SMC

This month, the High Performance Network SIG's work on SMC focused on pushing two solutions in the Linux upstream community: local (native) high-performance communication, and an eBPF-based fallback policy.

Native high-performance communication solution

Local loopback and inter-container (cross-netns) communication is already a common data path, widely used in data processing and cloud-native scenarios. For example, in cloud-native service mesh deployments, business processes communicate with sidecar proxy processes. SMC provides a local (loopback and inter-container) high-performance communication solution. Compared with traditional user-space IPC and the kernel's TCP loopback or UNIX domain sockets, it not only offers performance advantages (for detailed data, see https://lwn.net/Articles/929934/) but is also transparent to applications, requiring no intrusion or modification.
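
For readers less familiar with SMC, here is a minimal sketch of how this transparency looks from the application side (generic SMC usage, not code from the SIG's patches): an application can request SMC explicitly by creating an AF_SMC socket, while unmodified TCP applications are usually switched over with the smc_run preload wrapper from smc-tools; in either case the connection falls back to TCP automatically when SMC cannot be used. The port and address below are placeholders.

```c
/* Minimal sketch: explicitly requesting SMC from an application.
 * Unmodified programs usually run via `smc_run ./app` (smc-tools),
 * which preloads a library that converts AF_INET sockets to AF_SMC. */
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#ifndef AF_SMC
#define AF_SMC 43                 /* address family used by SMC */
#endif

int main(void)
{
    /* protocol 0 selects SMC over IPv4; the stack negotiates with the
     * peer and transparently falls back to TCP if SMC cannot be used. */
    int fd = socket(AF_SMC, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket(AF_SMC)");
        return 1;
    }

    struct sockaddr_in addr = {
        .sin_family = AF_INET,    /* addressing is still plain IPv4 */
        .sin_port   = htons(8080),
        .sin_addr   = { htonl(INADDR_LOOPBACK) },
    };

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0)
        printf("connected (SMC, or TCP fallback)\n");

    close(fd);
    return 0;
}
```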

In the review of the previous version, the community gave some positive feedback: s390 PCI maintainer Niklas approved the loopback negotiation process consisting of a 64-bit random GID plus a 64-bit token. Probabilistically, reaching a collision probability of 10^(-15) would require about 8.2×10^11 attempts, so a random GID is acceptable. We have communicated fully with IBM's protocol owner Jerry and are now waiting for the community to finish the discussion and settle on the final plan.

This month, the SIG sent v5 of the SMC loopback solution to the Linux upstream community. The new version adds SEID selection logic, fixes a potential problem with unregister_dmb, and fixes abnormal behavior on the s390 architecture.

eBPF-based fallback policy solution

SMC can dynamically fall back to TCP. The current fallback decision is driven mainly by whether the RDMA/ISM connection is established successfully. Since SMC's performance on short-lived connections is not as good as TCP's, we plan to add a policy-based TCP fallback capability to SMC to make it more general and better suited to different application models and scenarios.

This month, we sent the RFC and the formal patch set to the Linux bpf and net communities and received feedback from the eBPF maintainers, including issues with module symbols and file names. We are revising the patches and continuing to push for acceptance by the SMC and bpf communities.
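
To make the idea of a policy-based fallback more concrete, here is a purely illustrative eBPF sketch. The hook name smc_select_proto, the context struct smc_fallback_ctx, and the return codes are hypothetical placeholders, since the real interface is exactly what is still being reviewed upstream; the point is only that a small BPF program could inspect a new connection and tell SMC whether to negotiate SMC or go straight to TCP.

```c
/* Illustrative only: the struct name, field layout, hook name and return
 * codes below are hypothetical; the actual interface is still under
 * review in the bpf/net communities. */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_endian.h>

#define SMC_USE_SMC      0    /* hypothetical: try the SMC handshake */
#define SMC_FALLBACK_TCP 1    /* hypothetical: use TCP directly */

struct smc_fallback_ctx {     /* hypothetical context passed to the policy */
    __u32 peer_ip;            /* peer IPv4 address (network byte order) */
    __u16 peer_port;          /* peer TCP port (network byte order) */
};

SEC("struct_ops/smc_select_proto")       /* hypothetical hook name */
int BPF_PROG(smc_select_proto, struct smc_fallback_ctx *ctx)
{
    /* Example policy: connections to port 80 are typically short-lived
     * HTTP requests, where the SMC handshake cost outweighs its benefit,
     * so fall back to TCP; everything else tries SMC first. */
    if (ctx->peer_port == bpf_htons(80))
        return SMC_FALLBACK_TCP;
    return SMC_USE_SMC;
}

char _license[] SEC("license") = "GPL";
```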

04 virtio

Community influence

This month, Xuan Zhuo, a member of the High Performance Network SIG, became a reviewer for the virtio core/virtio-net subsystem in the upstream Linux kernel community. His contributions to the virtio community over the past three years have been widely recognized.

This month, the SIG also fixed a virtio-net bug in the XDP scenario: [PATCH net] virtio_net: bugfix overflow inside xdp_linearize_page() - Xuan Zhuo

virtio-net support for AF_XDP zero copy

AF_XDP is a new framework for sending and receiving packets that bypasses the kernel protocol stack. Packets received by the driver can be delivered directly to user space, and packets can likewise be passed directly from user space to the driver for transmission. Compared with the kernel's UDP stack, it can improve PPS by 3-7x, but zero copy depends on driver support. This work is divided into several parts (a brief user-space usage sketch follows the list):

  • virtio core support for pre-mapped DMA

This part enables the virtio core framework to accept DMA addresses submitted by the driver (link). In the current implementation, all DMA operations are completed inside the virtio core. We need to allow the driver to pass DMA addresses to the virtio core, because AF_XDP completes all DMA mapping of its buffer addresses, along with some virtqueue-reset-related operations, in advance.

This part touches on how virtio uses the DMA-related APIs. Since virtio cannot use the DMA API in some scenarios, we hoped that either the DMA API or AF_XDP could accommodate virtio in those scenarios. This triggered a fairly extensive discussion, and the current conclusions are:

  • AF_XDP may move to using dma-bufs for its DMA handling in the future.

  • The DMA API does not support special scenarios like virtio's.

These factors made our upstreaming work much more difficult. After consideration and discussion, we decided not to support, for the time being, virtio devices without VIRTIO_F_ACCESS_PLATFORM, which usually only occurs with some older virtio devices.

  • virtio-net XDP refactoring

This part has been reviewed; another version will be submitted after the 5.8 window period.
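
As referenced above, the following is a minimal user-space sketch of AF_XDP usage built on the libxdp xsk helpers (generic AF_XDP code, not part of the SIG's virtio patches; the interface name "eth0" and queue 0 are placeholders). The program creates a UMEM, binds an XDP socket to one queue of a NIC, and requests zero-copy mode, which is the bind path the virtio-net work enables on virtio devices.

```c
/* Minimal AF_XDP setup sketch using libxdp's xsk helpers: create a UMEM,
 * then bind an XDP socket to queue 0 of "eth0" requesting zero-copy mode.
 * The fill/RX/TX processing loops are omitted. */
#include <stdlib.h>
#include <unistd.h>
#include <linux/if_xdp.h>
#include <xdp/xsk.h>

#define NUM_FRAMES 4096
#define FRAME_SIZE XSK_UMEM__DEFAULT_FRAME_SIZE

int main(void)
{
    void *buf = NULL;
    struct xsk_umem *umem;
    struct xsk_ring_prod fq;   /* fill queue */
    struct xsk_ring_cons cq;   /* completion queue */
    struct xsk_socket *xsk;
    struct xsk_ring_cons rx;
    struct xsk_ring_prod tx;

    /* UMEM: the packet buffer area shared by the kernel and user space. */
    if (posix_memalign(&buf, getpagesize(), NUM_FRAMES * FRAME_SIZE))
        return 1;
    if (xsk_umem__create(&umem, buf, NUM_FRAMES * FRAME_SIZE, &fq, &cq, NULL))
        return 1;

    struct xsk_socket_config cfg = {
        .rx_size    = XSK_RING_CONS__DEFAULT_NUM_DESCS,
        .tx_size    = XSK_RING_PROD__DEFAULT_NUM_DESCS,
        /* Explicitly request driver zero copy; if the driver does not
         * support it, this bind fails (omitting the flag would let the
         * kernel fall back to copy mode instead). */
        .bind_flags = XDP_ZEROCOPY,
    };
    if (xsk_socket__create(&xsk, "eth0", 0 /* queue id */, umem, &rx, &tx, &cfg))
        return 1;

    /* ... populate the fill ring, then poll and process rx/tx ... */

    xsk_socket__delete(xsk);
    xsk_umem__delete(umem);
    free(buf);
    return 0;
}
```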

virtio-net inner header hash

Tunneling protocols sometimes encapsulate packets from different flows into streams that share the same outer 5-tuple, so all of these flows are hashed to the same receive queue and the benefit of RSS is lost. To let virtio support inner header hashing for tunneling protocols, the High Performance Network SIG initiated the "virtio_net: support inner header hash" proposal, hoping to standardize it in the virtio specification.
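
The effect is easy to see with a toy example (illustrative only; this is neither the Toeplitz hash used by real devices nor part of the spec proposal): if RSS hashes the identical outer header, every tunneled flow maps to one queue, whereas a symmetric hash over the inner 5-tuple spreads the flows across queues while keeping both directions of a flow together.

```c
/* Toy illustration: hashing the inner 5-tuple instead of an identical
 * outer header spreads tunneled flows across RX queues; making the hash
 * symmetric keeps both directions of a flow on the same queue. */
#include <stdint.h>
#include <stdio.h>

struct five_tuple {
    uint32_t saddr, daddr;
    uint16_t sport, dport;
    uint8_t  proto;
};

/* Symmetric toy hash: src/dst are combined with commutative operations so
 * that hash(a->b) == hash(b->a). Real devices use Toeplitz or similar. */
static uint32_t symmetric_hash(const struct five_tuple *t)
{
    uint32_t addrs = t->saddr ^ t->daddr;
    uint32_t ports = (uint32_t)t->sport ^ (uint32_t)t->dport;
    uint32_t h = addrs * 2654435761u;   /* multiplicative mixing */
    h ^= ports * 40503u;
    h ^= t->proto;
    return h;
}

int main(void)
{
    /* Two inner flows carried inside the same VXLAN tunnel: their outer
     * 5-tuples are identical, but their inner 5-tuples differ. */
    struct five_tuple flow_a = { 0x0a000001, 0x0a000002, 10000, 80, 6 };
    struct five_tuple flow_b = { 0x0a000003, 0x0a000004, 20000, 443, 6 };

    unsigned nqueues = 8;
    printf("flow A -> queue %u\n", symmetric_hash(&flow_a) % nqueues);
    printf("flow B -> queue %u\n", symmetric_hash(&flow_b) % nqueues);
    return 0;
}
```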

This month (v12->v13), we discussed with the virtio community the earlier inner header hash design, which placed the tunnel types supported by the device in the PCI device-specific configuration space. The advantage of that design is that this capability is resident in the device and only needs to be read once by the driver at initialization; but as more and more new virtio features are added, an increasingly bloated device-specific configuration space is very unfriendly to some tiny devices. After the discussion, this information was moved from the device configuration space to the control queue (the downside of the new approach is that every GET query returns both the configured and the supported fields, which is acceptable).

In addition, the community is still debating how many tunnel types the inner header hash should support. The community leans toward supporting legacy protocols, so that they can enjoy the RSS performance improvement brought by the extra entropy, for example GRE (RFC 2784). We, however, want to also support more modern tunneling protocols such as VXLAN and GENEVE, so that these protocols can do symmetric hashing based on the inner header and gain benefits such as less locking and warmer caches in some scenarios. This is still under discussion.

The above covers the High Performance Network SIG's activities for April. Everyone is welcome to join the SIG and contribute. For more SIG news, visit the OpenAnolis (Dragon Lizard) community website.

Related Links

High Performance Networking SIG homepage:

https://openanolis.cn/sig/high-perf-network

SMC loopback solution v5 version:

https://lore.kernel.org/netdev/[email protected]/

DMA address operation:

https://lore.kernel.org/all/[email protected]/

virtio-net XDP refactoring link:

https://lore.kernel.org/all/[email protected]/

virtio_net: support inner header hash:

https://lists.oasis-open.org/archives/virtio-dev/202304/msg00465.html

-- End --

