An Engineer's Notes: A First Look at DPDK on the OK1043A-C

In June this year Forlinx Embedded launched the FET1043A-C core board, built around NXP's QorIQ® LS1043A processor with four ARMv8-A Cortex-A53 cores clocked at 1.6 GHz, combining low power consumption with high efficiency. Its four-lane SerDes, running at up to 10 Gbit/s, supports a variety of flexible configurations. The matching Forlinx carrier board is designed to get the most out of the LS1043A as a network processor: it provides one 10-Gigabit port and six Gigabit ports, and together with the processor's built-in DPAA1 acceleration engine and 2 GB of DDR4 memory, it delivers formidable network performance.


With hardware this good, is the traditional Linux kernel a good match for it? The answer is no.

The reasons are as follows:

❶ Interrupt handling. When a large number of network packets arrive, they generate frequent hardware interrupt requests. Hardware interrupts can preempt lower-priority soft interrupts and system calls that are already executing; if this happens often, the performance overhead is high.

❷ Memory copy. Normally a network packet travels from the network adapter to the application as follows: the data is transferred from the adapter via DMA into a buffer allocated by the kernel, and is then copied from kernel space to user space. In the Linux kernel protocol stack, this time-consuming copy accounts for as much as 57.1% of the total packet processing time.

❸ Context switching. Frequent hardware interrupts and soft interrupts can preempt a running system call at any time, which causes a lot of context-switching overhead. In addition, in server designs based on a multi-threaded model, scheduling between threads also incurs frequent context switches, and lock contention is likewise a serious cost.

❹ Loss of locality. Most mainstream processors today are multi-core, which means the handling of a single packet may span several CPU cores. For example, the interrupt may be taken on cpu0, kernel-mode processing done on cpu1, and user-mode processing done on cpu2. Crossing cores like this easily invalidates CPU caches and destroys locality. On a NUMA architecture it is worse still, because memory is accessed across NUMA nodes and performance suffers considerably.

❺ Memory management. Traditional servers use 4 KB memory pages. To speed up memory access and avoid cache misses, the number of entries in the cached mapping table can be increased, but this in turn slows down the CPU's lookups.
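As a rough illustration of why 4 KB pages are costly, the sketch below counts how many pages (and therefore mapping entries) are needed to cover the 2 GB of memory mentioned earlier. The helper name pages_needed is hypothetical, and the 2 MB page size is an assumption used only for comparison; the article itself does not name a larger page size.

```shell
# pages_needed is a hypothetical helper: number of pages required to map a region.
# Arguments: region size in bytes, page size in bytes.
pages_needed() {
    region_bytes=$1
    page_bytes=$2
    # Round up so that a partial page still counts as one page.
    echo $(( (region_bytes + page_bytes - 1) / page_bytes ))
}

mem=$(( 2 * 1024 * 1024 * 1024 ))            # 2 GB, the DDR4 size quoted above
pages_needed "$mem" 4096                     # 4 KB pages  -> 524288
pages_needed "$mem" $(( 2 * 1024 * 1024 ))   # 2 MB pages (assumed size) -> 1024
```

Fewer, larger pages mean far fewer entries to cache and search, which is why DPDK is normally run on hugepage-backed memory.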

Given these problems, the kernel itself is clearly a major bottleneck, and the obvious solution is to find a way to bypass it. After much research and exploration by earlier engineers, DPDK stood out among the many approaches.


"What is learned on paper always feels shallow; to truly understand something you must practice it." Let us try DPDK through an example.

First, the DPDK environment requires modifying the device tree so that the network interfaces are assigned to user space. The device tree file to use is:

OK10xx-linux-fs/flexbuild/build/linux/linux/arm64/fsl-ls1043a-rdb-usdpaa.dtb

Copy fsl-ls1043a-rdb-usdpaa.dtb to the root directory of the development board, then replace the device tree with the following commands:

mv /run/media/mmcblk0p2/fsl-ls1043a-rdb-sdk.dtb /run/media/mmcblk0p2/fsl-ls1043a-rdb-sdk.dtb.bak

cp /fsl-ls1043a-rdb-usdpaa.dtb /run/media/mmcblk0p2/boot

ln -s /run/media/mmcblk0p2/boot/fsl-ls1043a-rdb-usdpaa.dtb /run/media/mmcblk0p2/boot/fsl-ls1043a-rdb-sdk.dtb

reboot

After the board has booted with the replaced device tree, enter: ifconfig fm1-mac1

If the message "Device not found" appears, the replacement succeeded.

After finishing the DPDK test, restore the default configuration:

cp /run/media/mmcblk0p2/fsl-ls1043a-rdb-sdk.dtb.bak /run/media/mmcblk0p2/fsl-ls1043a-rdb-sdk.dtb

reboot

Once the network interfaces have been assigned to user space, how do we use them? Can we still use TCP/UDP? Not so fast: to use TCP or UDP under DPDK, a protocol stack must also be ported to DPDK. As a starting point, let us try the Layer 2 forwarding test routine that ships with DPDK.

The network topology for the Layer 2 forwarding test is as follows:

[Figure: Layer 2 forwarding network topology]

Port2 and Port3 of the OK1043A-C platform (corresponding to fm1-mac3 and fm1-mac4) forward data between the Linux host and the OK1012A-C. The Linux host and the OK1012A-C can be replaced with other network devices.

Configure the OK1043A-C:

l2fwd -c 0xf -n 1 -- -p 0xc -q 1 --no-mac-updating

Parameter            Explanation
-c                   Core mask; 0xf uses four cores
-n                   Number of memory channels
-p                   Port mask; 0xc is binary 1100, selecting port3 and port2
-q                   Number of queues per core; defaults to 1
--no-mac-updating    Optional; do not rewrite MAC addresses
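The values passed to -c and -p are plain bitmaps, with bit N selecting core N or port N. A minimal sketch of how such a mask can be derived is given here; bitmask is a hypothetical helper written for illustration, not a DPDK tool.

```shell
# bitmask is a hypothetical helper: build a hexadecimal mask
# from a list of zero-based core or port indices.
bitmask() {
    mask=0
    for idx in "$@"; do
        mask=$(( mask | (1 << idx) ))   # set bit number idx
    done
    printf '0x%x\n' "$mask"
}

bitmask 0 1 2 3   # cores 0-3       -> 0xf
bitmask 2 3       # ports 2 and 3   -> 0xc
```

Reading the masks this way makes it easy to adapt the command to a different set of ports, for example 0x3 for ports 0 and 1.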

Configure the OK1012A-C:

ifconfig eth0 192.168.1.200

tcpdump -i eth0 -vv -n -e

Configure the Linux host:

ifconfig eth0 192.168.1.120

sudo modprobe pktgen

echo "add_device eth0" > /proc/net/pktgen/kpktgend_0

echo "dst_mac 6e:56:7d:85:ce:4d" > /proc/net/pktgen/eth0

echo "dst 192.168.1.200" > /proc/net/pktgen/eth0

echo "pkt_size 64" > /proc/net/pktgen/eth0

echo "count 1000000" > /proc/net/pktgen/eth0

echo "start" > /proc/net/pktgen/pgctrl

We have the Linux host send one million 64-byte packets and test how well DPDK on the OK1043A-C forwards them.
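For a sense of scale, the theoretical maximum frame rate of the link can be estimated. This is a sketch under assumptions not stated in the article: a 1 Gbit/s link, 64-byte frames on the wire, and the standard 20 bytes of per-frame Ethernet overhead (8-byte preamble plus 12-byte inter-frame gap). line_rate_pps is a hypothetical helper name.

```shell
# line_rate_pps is a hypothetical helper: theoretical maximum frames per second.
# Arguments: link speed in bit/s, frame size in bytes.
line_rate_pps() {
    link_bps=$1
    frame_bytes=$2
    # Each frame also occupies 8 bytes of preamble and a 12-byte inter-frame gap.
    wire_bits=$(( (frame_bytes + 20) * 8 ))
    echo $(( link_bps / wire_bits ))
}

line_rate_pps 1000000000 64   # assumed 1 Gbit/s link, 64-byte frames -> 1488095
```

At that rate one million frames occupy the wire for roughly 0.67 seconds, so a larger count may be needed for a sustained test.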


The serial console output of the OK1043A-C shows that DPDK forwarded every packet it received. If you look carefully, you will also notice that while DPDK is forwarding data the CPU load stays high, because the application polls continuously to check whether any packets need processing.
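One way to observe this load is to sample /proc/stat twice and compare the counters. The sketch below is illustrative only: cpu_busy_pct is a hypothetical helper, and it reports the aggregate figure across all cores, so cores that are not polling pull the average down.

```shell
# cpu_busy_pct is a hypothetical helper, not part of DPDK.
# Arguments: two "cpu ..." lines sampled from /proc/stat at different times.
# Prints the integer percentage of non-idle time between the two samples.
cpu_busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            # Fields 2-9: user nice system idle iowait irq softirq steal
            total = 0
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5 + $6
        }
        NR == 1 { t0 = total; i0 = idle }
        NR == 2 {
            dt = total - t0
            di = idle - i0
            if (dt > 0) printf "%d\n", (dt - di) * 100 / dt
        }'
}

# Example: sample the aggregate CPU line one second apart.
if [ -r /proc/stat ]; then
    before=$(grep '^cpu ' /proc/stat)
    sleep 1
    after=$(grep '^cpu ' /proc/stat)
    cpu_busy_pct "$before" "$after"
fi
```

Passing the cpu0, cpu1, ... lines instead of the aggregate cpu line gives a per-core view, which shows more directly which cores are busy polling.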

Original link: https://www.forlinx.com/article_view_267.html


Origin blog.51cto.com/14771158/2485451