Observing DPDK performance factors with a snake test

A snake test bounces packets back and forth between a chain of ports, generating a fairly heavy full load.

DPDK's testpmd is normally used to verify the performance of two directly connected NICs, with each side sending traffic at the other. No hardware? (How can you have nothing?) We can still play: veth devices under Linux are like particles that appear in pairs, no, virtual NICs. After creation no bridge is needed; the two ends are natural good friends. . .

# ip link add ep1 type veth peer name ep2
# ifconfig ep1 up; ifconfig ep2 up
Check with ifconfig or ip link that both interfaces appear.

For the installation and operation of testpmd, see: http://dpdk.org/doc/quick-start
testpmd needs --no-shconf added in order to run multiple instances. Hugepages do not seem to be released after repeated runs, so add --no-huge; performance does not drop much without them.
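Whether hugepages are actually still held after a run can be checked from /proc/meminfo; a quick sketch:

```shell
# Check hugepage accounting before and after a testpmd run.
# HugePages_Free should return to HugePages_Total once pages are released.
grep -E 'HugePages_(Total|Free)' /proc/meminfo
```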

# ./testpmd --no-huge -c 7 -n 3 --vdev="eth_pcap0,iface=ep1" --vdev="eth_pcap1,iface=ep2" -- -i --nb-cores=2 --nb-ports=2 --total-num-mbufs=2048
testpmd> start tx_first
testpmd> show port stats all
testpmd> show port stats all    // a second time: pps is the rate measured between the two reads
  Rx-pps: 418634
  Tx-pps: 436095


Let's create another veth pair and run two sets at the same time:
# ip link add ep3 type veth peer name ep4
# ifconfig ep3 up; ifconfig ep4 up
# ./testpmd --no-huge --no-shconf -c 70 --vdev="eth_pcap2,iface=ep3" --vdev="eth_pcap3,iface=ep4" -- -i --nb-cores=2 --nb-ports=2 --total-num-mbufs=2048
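Besides pressing "1" in top, the `psr` column of ps shows which processor each thread last ran on; a sketch (using the shell's own PID as a stand-in for a testpmd PID):

```shell
# List every thread of a process together with the CPU it last ran on (PSR column).
# $$ is a placeholder; substitute the PID of a testpmd instance.
ps -Lo pid,lwp,psr,comm -p $$
```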

The two instances running at the same time each perform about the same, because the -c masks spread them across different cores (press "1" in top to see this). What happens to performance when the two pairs are chained in series? Originally the traffic was ep1<->ep2 and ep3<->ep4; now change it to ep2<->ep3 and ep4<->ep1.

# ./testpmd --no-huge --no-shconf -c 70 --vdev="eth_pcap1,iface=ep2" --vdev="eth_pcap2,iface=ep3" -- -i --nb-cores=2 --nb-ports=2 --total-num-mbufs=2048
testpmd> show port stats all
Then you will see that the pps are all 0! One side is sending into a veth whose peer end has nothing attached yet. Now connect ep4-ep1 in another window:
# ./testpmd --no-huge -c 7 -n 3 --vdev="eth_pcap0,iface=ep1" --vdev="eth_pcap3,iface=ep4" -- -i --nb-cores=2 --nb-ports=2 --total-num-mbufs=2048
testpmd> start tx_first
testpmd> show port stats all
testpmd> show port stats all
  Rx-pps: 433939
  Tx-pps: 423428
Traffic has started; go back to the first window and it shows the same rates. The snake path is now complete.

The question is: why does chaining the two pairs in series barely change the performance?!
# lscpu
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23
From top, the testpmd cores are running on 1-2 and 5-6, which straddles NUMA nodes and hurts memory efficiency. . .
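The CPU-to-node mapping can also be pulled out in parseable form (a sketch, assuming util-linux's lscpu):

```shell
# Parseable CPU -> NUMA node mapping; comment lines start with '#'.
lscpu -p=CPU,NODE | grep -v '^#'
```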
Well, change the -c argument to 15. This is a hex bitmap, so cores 4, 2 and 0 are actually used, all on node0; the ep1-ep2 test result improves by 50%:
  Rx-pps: 612871
  Tx-pps: 597219
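The -c argument is a hexadecimal bitmap of cores; a small sketch to decode a mask into its core list:

```shell
# Decode a testpmd/DPDK -c hex coremask into the cores it selects.
mask_to_cores() {
    local mask=$((0x$1)) core=0 cores=""
    while [ "$mask" -ne 0 ]; do
        # Bit N set means core N is used.
        [ $((mask & 1)) -eq 1 ] && cores="$cores $core"
        mask=$((mask >> 1))
        core=$((core + 1))
    done
    echo "cores:$cores"
}
mask_to_cores 15    # -> cores: 0 2 4
mask_to_cores 2A    # -> cores: 1 3 5
mask_to_cores 1500  # -> cores: 8 10 12
```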
Restore the snake test with cpu masks 15 and 2A (2A picks cores 1, 3, 5, which sit on node1, so the two instances straddle NUMA again); the result seems a lot slower:
  Rx-pps: 339290
  Tx-pps: 336334
With masks 15 and 1500 (cores 8, 10, 12, still node0), the result:
  Rx-pps: 540867
  Tx-pps: 496891
Much better than spanning NUMA, but still about 1/6 lower than the single-pair result. Now look at the snake over 3 veth pairs: the third instance gets cpu mask 150000, still the same NUMA node, and there is not much change:
  Rx-pps: 511881
  Tx-pps: 503456

Suppose CPUs are scarce and the third testpmd instance also runs on cpu mask 1500. The result is very sad:
  Rx-pps: 1334
  Tx-pps: 1334

The above tests show:
1. Try not to pass data across NUMA nodes.
2. With CPUs pinned and packets handed along the chain like pass-the-parcel, the total throughput is determined by the slowest stage.
3. CPUs cannot be shared between instances; scheduling and context switching seriously hurt performance.
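Point 1 and 3 generalize beyond testpmd's -c bitmap: any process can be pinned to chosen cores. A sketch with taskset (util-linux), which takes a plain core list instead of a hex mask:

```shell
# Pin a command to a specific core (core 0 is assumed to exist on any machine).
# For testpmd's -c 15 the equivalent list form would be: taskset -c 0,2,4 <cmd>
taskset -c 0 echo "ran pinned to core 0"
```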

========================
Create a bridge br0, add ep1, ep3 and ep5 into it, and use testpmd to drive ep2-ep4. This is a standard Linux bridge; let's see how much the performance drops:
# brctl addbr br0
# brctl addif br0 ep1; brctl addif br0 ep3; brctl addif br0 ep5
# ./testpmd --no-huge --no-shconf -c 15 --vdev="eth_pcap1,iface=ep2" --vdev="eth_pcap3,iface=ep4" -- -i --nb-cores=2 --nb-ports=2 --total-num-mbufs=2048

  Rx-pps: 136157
  Tx-pps: 128207
600 kpps dropped to about 130k, less than a quarter. . . Try it with OVS when there is time.
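For that OVS follow-up, the equivalent topology would look roughly like this (an untested sketch; assumes Open vSwitch's ovs-vsctl is installed and requires root):

```shell
# Replace the Linux bridge with an OVS bridge holding the same veth ends.
ovs-vsctl add-br br0
ovs-vsctl add-port br0 ep1
ovs-vsctl add-port br0 ep3
```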
