iSCSI vs iSER vs NVMe-TCP vs NVMe-RDMA

iSCSI

iSCSI (Internet Small Computer System Interface) is a block storage protocol that extends the popular SCSI protocol across a TCP/IP network, creating an IP-based storage area network (SAN). It is also one of the block storage protocols supported natively by Daoyunxing FASS all-flash software-defined storage.

iSCSI rests on a very solid foundation: SCSI, TCP, and IP. It has been widely used over the past 20 years, and many operating systems and hypervisors now ship with built-in support. Linux Kernel 3.1 added iSCSI Target support to the in-kernel Linux SCSI Target (commonly known as LinuxIO, abbreviated LIO).

As a pure software solution, iSCSI has a very low cost: it requires no changes to existing systems, hardware or software, and managing an iSCSI storage network is really just managing IP/Ethernet. Of course, where performance is required, iSCSI can also be implemented or accelerated in hardware.
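Because iSCSI simply carries SCSI commands in PDUs over ordinary TCP, its wire format is easy to illustrate. Below is a minimal Python sketch of the 48-byte Basic Header Segment (BHS) of an iSCSI Login Request, with field offsets following RFC 7143; the ISID value in the usage line is an arbitrary example, not a real initiator's identifier.

```python
import struct

def login_request_bhs(isid: bytes, task_tag: int) -> bytes:
    """Build the 48-byte Basic Header Segment of an iSCSI Login Request.

    A sketch only: a real initiator also appends key=value login
    parameters in the data segment and tracks sequence numbers.
    """
    assert len(isid) == 6                 # ISID is always 6 bytes
    bhs = struct.pack(">BBBB",
                      0x43,               # 0x03 Login Request + 0x40 immediate bit
                      0x83,               # Transit bit, CSG=0, NSG=3 (full feature)
                      0x00, 0x00)         # Version-max, Version-min
    bhs += b"\x00"                        # TotalAHSLength
    bhs += (0).to_bytes(3, "big")         # DataSegmentLength (24-bit, empty here)
    bhs += isid                           # ISID: identifies the initiator session
    bhs += struct.pack(">H", 0)           # TSIH = 0: request a new session
    bhs += struct.pack(">I", task_tag)    # Initiator Task Tag
    bhs += struct.pack(">HH", 0, 0)       # CID (connection ID), reserved
    bhs += struct.pack(">II", 0, 0)       # CmdSN, ExpStatSN
    bhs += b"\x00" * 16                   # reserved
    assert len(bhs) == 48
    return bhs

pdu = login_request_bhs(isid=b"\x80\x00\x00\x00\x00\x01", task_tag=1)
```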

iSER

iSER (iSCSI Extensions for RDMA) is an evolution of the iSCSI protocol that uses RDMA (Remote Direct Memory Access) to improve iSCSI performance. It is also one of the block storage protocols supported natively by Daoyunxing FASS all-flash software-defined storage.

iSER efficiently maps iSCSI commands onto RDMA transactions, and RDMA has three well-known advantages, illustrated in the sketch after this list:

# 01 : Zero copy. The traditional network protocol stack requires multiple data copies, whereas with RDMA data can be sent to or received from the network card directly, reducing memory bandwidth usage.

# 02 : Kernel bypass. The traditional kernel-mode network stack must cross into and out of the operating system kernel multiple times, incurring heavy context-switching overhead. RDMA performs data transfers from user mode, greatly reducing this overhead.

# 03 : No CPU involvement. The network transport stack is implemented in hardware; besides freeing CPU cycles from a software implementation, this also avoids contention for the CPU's internal caches.
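To make the kernel-bypass point concrete, the sketch below uses pyverbs (the Python bindings shipped with rdma-core) to open an RDMA device and register a buffer entirely from user space; the device name and buffer size are assumptions, and this covers resource setup only, not an actual transfer.

```python
# User-space RDMA resource setup with pyverbs (from rdma-core): no kernel
# involvement on the data path once the memory region is registered.
import pyverbs.enums as e
from pyverbs.device import Context
from pyverbs.pd import PD
from pyverbs.mr import MR

ctx = Context(name="mlx5_0")   # open the RDMA NIC (device name is an assumption)
pd = PD(ctx)                   # protection domain scoping the resources below
# Register 4 KiB with the NIC: after this, the adapter can DMA directly
# into or out of the buffer, with no intermediate kernel copy.
mr = MR(pd, 4096, e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_READ)
print(f"MR registered: lkey={mr.lkey}, rkey={mr.rkey}")
```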

Compared with iSCSI, iSER can therefore achieve higher performance and lower latency. In effect, iSER is a hardware solution: it requires an Ethernet card that supports RDMA, either iWARP or RoCE. iWARP runs over TCP/IP, while RoCE runs over UDP/IP. Because of the difference between the layer-4 transports TCP and UDP, an iSER client and server using the two different solutions cannot interoperate.

In addition, iSER over RoCE carries the extra requirements of a RoCE network: lossless Ethernet, with features such as PFC and ECN and the corresponding management capabilities. By contrast, iSER over iWARP needs only the right network card and runs on any standard Ethernet network.

Linux Kernel 3.10 added iSER Target support to the in-kernel Linux SCSI Target.

NVMe over TCP

NVMe over TCP (abbreviated NVMe/TCP) is a new protocol added in version 1.5.0 of Daoyunxing FASS software. It is one of several transports of NVMe over Fabrics (abbreviated NVMe-oF). Somewhat like iSCSI, it extends NVMe-oF onto the TCP protocol stack, and it too is a pure software solution. Unlike iSCSI, however, it is based on the new NVMe architecture, whereas iSCSI is based on the powerful but ancient SCSI architecture. The two differ significantly:

Points of difference

In terms of hardware specifications, an NVMe device can support up to 65,535 I/O queue pairs (each pair consisting of a submission queue and a completion queue), and each queue can be up to 65,536 (64K) entries deep. In contrast, a SCSI target device has a single queue with a depth of 256. SCSI initiator specifications vary by implementation; generally it is a single queue with a depth of around 1,000 (1k), and hardware implementations with multiple queues are rare.
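A back-of-the-envelope comparison of the maximum commands each architecture allows in flight, using the spec ceilings quoted above (real devices expose far fewer queues):

```python
# Maximum in-flight commands permitted by the ceilings quoted above.
nvme_queues, nvme_depth = 65_535, 65_536   # NVMe I/O queue pairs x entries
scsi_queues, scsi_depth = 1, 256           # a typical SCSI target device

print(f"NVMe: {nvme_queues * nvme_depth:,}")   # 4,294,901,760 (~4.3 billion)
print(f"SCSI: {scsi_queues * scsi_depth:,}")   # 256
```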

In terms of software, the native NVMe protocol stack is designed so that each hardware queue pair maps to one CPU core, with no locking or synchronization needed between queues. SCSI protocol implementations vary by operating system, but generally, earlier SCSI stacks had only one software queue protected by a single big lock, which limited SCSI performance.
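On a Linux system with blk-mq (covered below), this queue-to-core mapping can be observed directly in sysfs. A small sketch that prints each hardware context's CPU list for an NVMe namespace; the device name nvme0n1 is an assumption:

```python
# Print the hardware-queue -> CPU mapping that blk-mq exposes in sysfs.
from pathlib import Path

dev = "nvme0n1"   # assumed device name; pick any blk-mq block device
for hctx in sorted(Path(f"/sys/block/{dev}/mq").iterdir(),
                   key=lambda p: int(p.name)):
    cpus = (hctx / "cpu_list").read_text().strip()
    print(f"hardware queue {hctx.name}: served by CPUs {cpus}")
```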

NVMe/TCP and iSCSI inherit the strengths and weaknesses of their respective architectures

In NVMe/TCP, each hardware queue pair is mapped to its own TCP stream and assigned (on a best-effort basis) to a dedicated CPU core, exploiting multi-core processors to raise overall performance. Earlier iSCSI, by contrast, generally used a single-process, single-TCP-stream design that could only use one CPU core, a limitation that lasted until Linux Kernel 3.17.
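The transport binding details aside, the queue-per-connection idea is easy to sketch: one TCP connection per queue pair, each serviced by a thread pinned to its own core. A simplified, Linux-only illustration (the port happens to be the IANA-assigned NVMe-oF port, but the echo loop merely stands in for real PDU handling):

```python
# Sketch of the NVMe/TCP threading model: one TCP connection per queue
# pair, each handled by a thread pinned to one CPU core (Linux only).
import os
import socket
import threading

def serve_queue(conn: socket.socket, cpu: int) -> None:
    os.sched_setaffinity(0, {cpu})       # pin this queue's thread to one core
    with conn:
        while data := conn.recv(4096):   # placeholder for real PDU processing
            conn.sendall(data)

def target(host: str = "0.0.0.0", port: int = 4420) -> None:
    ncpus = os.cpu_count() or 1
    qid = 0
    with socket.create_server((host, port)) as srv:
        while True:                      # one accepted connection per queue
            conn, _ = srv.accept()
            threading.Thread(target=serve_queue, args=(conn, qid % ncpus),
                             daemon=True).start()
            qid += 1
```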

NVMe/TCP appeared in 2018, and Linux Kernel 5.0 (numbered 4.21 during early development) added NVMe/TCP support. Daoyunxing FASS all-flash software-defined storage uses its own user-mode high-performance NVMe/TCP Target.
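As a counterpart to the iSCSI header sketched earlier, here is an equally hedged sketch of the NVMe/TCP Initialize Connection Request (ICReq), the first PDU a host sends on each queue's new TCP connection; offsets follow the NVMe/TCP transport binding:

```python
import struct

def icreq_pdu(maxr2t: int = 0) -> bytes:
    """Build a 128-byte NVMe/TCP ICReq PDU (a sketch of the transport binding)."""
    ch = struct.pack("<BBBBI",
                     0x00,    # PDU type 0x00 = ICReq
                     0x00,    # flags
                     128,     # hlen: header length
                     0,       # pdo: PDU data offset (none)
                     128)     # plen: total PDU length, little-endian
    body = struct.pack("<HBBI",
                       0,        # pfv: PDU format version 0
                       0,        # hpda: host PDU data alignment
                       0,        # dgst: header/data digests disabled
                       maxr2t)   # maxr2t: outstanding R2Ts per queue (0-based)
    return ch + body + b"\x00" * (128 - len(ch) - len(body))   # pad reserved
```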

It is worth mentioning that the evolution of the Linux Kernel has improved the performance of NVMe/NVMe-oF/SCSI/iSCSI:

blk-mq (the Linux multi-queue block IO queueing mechanism) was introduced in Linux Kernel 3.13. From then on, Linux block storage truly entered the multi-queue era; the NVMe stack benefited immediately, but the SCSI stack had to wait.

scsi-mq (the multi-queue SCSI layer) was introduced in Linux Kernel 3.17. The SCSI layer had until then been built on the old blk-sq (single-queue block layer), and it needed blk-mq to implement multiple queues; up to this point, iSCSI/SCSI had not received a notable boost.

In Linux Kernel 5.0, blk-mq became the default, which can matter a great deal in practice: before that, users had to opt in explicitly (for example by booting with scsi_mod.use_blk_mq=1) to get the best performance out of their devices.

NVMe over RDMA

NVMe (Non-Volatile Memory Express) version 1.0, released in 2011, was originally a protocol designed for directly attached high-speed flash storage. As mentioned earlier, it supports up to 65,535 queues and a queue depth of up to 65,536. The NVMe-oF specification introduced in 2016 extends NVMe into a storage network; it was originally designed for RDMA transports, with the FC (Fibre Channel) transport added later.

NVMe over RDMA (abbreviated NVMe/RDMA; in practice, the term NVMe-oF usually refers to NVMe/RDMA by default) combines the high-performance NVMe architecture with high-performance RDMA to yield an extremely high-performance storage network.

The relationship between NVMe/RDMA and NVMe/TCP is similar to that between iSER and iSCSI: like iSER, NVMe/RDMA is a hardware solution. Notably, NVMe/RDMA also supports InfiniBand (SCSI running over InfiniBand is called SRP). Its requirements are similar to iSER's: a network card that supports RDMA is needed.

Likewise, NVMe/InfiniBand requires an InfiniBand network, and NVMe/RoCE requires a lossless Ethernet network with PFC and ECN, while NVMe/iWARP needs only ordinary Ethernet to operate.

Linux Kernel 4.8 added support for NVMe-oF. Daoyunxing FASS all-flash software-defined storage uses its own user-mode high-performance NVMe/RDMA Target, featuring zero copy and kernel bypass.

Summary

iSCSI, iSER, NVMe/TCP, and NVMe/RDMA fall under the two architectures of SCSI and NVMe, and each architecture offers both a pure software design and an RDMA hardware-based design.

With RDMA hardware acceleration, both iSER and NVMe/RDMA deliver high performance; the pure software solutions, iSCSI and NVMe/TCP, perform somewhat below the hardware versions but require no extra hardware spend.

Comparing the SCSI architecture (iSCSI/iSER) with the NVMe architecture (NVMe-oF, covering both TCP and RDMA), the NVMe architecture holds a fundamental advantage: each IO requires fewer instructions and fewer context switches, making it more efficient on both the client and the server and therefore higher-performing. A unique advantage of iSCSI, however, is that it supports older operating system kernels, so users can adopt it without any modification.

Daoyunxing FASS all-flash software-defined storage supports all four of the above block storage protocols, giving users very flexible performance/cost options.
