Introduction to DPDK

DPDK (Data Plane Development Kit) is a collection of libraries and drivers for fast packet processing, developed by 6WIND, Intel, and other companies. It runs mainly on Linux and can greatly improve packet-processing performance and throughput, making data plane applications more efficient.

DPDK processes packets by polling rather than by interrupts. When a packet arrives, the NIC driver that DPDK installs in place of the kernel driver does not notify the CPU with an interrupt. Instead, the packet is placed directly in memory and handed to application-layer software through interfaces that DPDK provides. This saves a great deal of interrupt-handling time and memory-copy time.
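The polling model described above can be sketched in a few lines. This is a simplified simulation, not DPDK code: `SimulatedRxQueue`, `rx_burst`, and `poll_loop` are hypothetical names chosen for illustration, and the queue is an in-memory stand-in for a NIC receive ring. In DPDK itself the equivalent step is done in C with `rte_eth_rx_burst`.

```python
from collections import deque


class SimulatedRxQueue:
    """Hypothetical stand-in for a NIC receive queue; not a DPDK API."""

    def __init__(self, packets):
        self._ring = deque(packets)

    def rx_burst(self, max_pkts):
        """Return up to max_pkts packets without blocking (may be empty)."""
        burst = []
        while self._ring and len(burst) < max_pkts:
            burst.append(self._ring.popleft())
        return burst


def poll_loop(queue, handle, burst_size=32, max_idle_polls=3):
    """Busy-poll the queue; stop after max_idle_polls empty polls in a row.

    A real data plane loop would spin forever on a dedicated core; the idle
    limit exists only so that this sketch terminates.
    """
    processed = 0
    idle = 0
    while idle < max_idle_polls:
        burst = queue.rx_burst(burst_size)
        if not burst:
            idle += 1          # nothing arrived: poll again, never sleep
            continue
        idle = 0
        for pkt in burst:      # process the whole batch in one pass
            handle(pkt)
            processed += 1
    return processed


if __name__ == "__main__":
    seen = []
    q = SimulatedRxQueue([f"pkt-{i}" for i in range(70)])
    print(poll_loop(q, seen.append))   # number of packets handled
```

The loop never waits for a notification: it keeps asking the queue for work and takes packets in batches, which is the trade DPDK makes (a busy core in exchange for no interrupt overhead).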

Why DPDK is needed
Where are the bottlenecks?
=> System calls: the application sends a request to the kernel, the kernel calls the appropriate function to do the work, and the result is returned to the application
=> Interrupt processing
=> Switching between user space and kernel space
=> Copying data from kernel space to user space
=> Scheduling and context switching of user processes
=> Inefficient memory reads/writes and many cache misses: the cache holds recently used data. When the CPU finds the data it needs in the cache, this is called a "hit"; otherwise it is called a "miss".
=> Lock/unlock operations required when accessing shared data structures
Each of the items above costs many clock cycles, which a real-time system cannot tolerate.
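To make the hit/miss distinction in the list above concrete, here is a minimal sketch of a direct-mapped cache. The function name `count_hits_and_misses` and the sizes (4 lines of 64 bytes) are assumptions chosen for illustration and do not describe any particular CPU.

```python
def count_hits_and_misses(addresses, num_lines=4, line_size=64):
    """Simulate a tiny direct-mapped cache and count hits and misses.

    num_lines and line_size are illustrative values, not those of a real CPU.
    """
    lines = [None] * num_lines      # memory block currently held by each line
    hits = misses = 0
    for addr in addresses:
        block = addr // line_size   # which memory block the address falls in
        index = block % num_lines   # the single cache line that block may use
        if lines[index] == block:
            hits += 1
        else:
            misses += 1
            lines[index] = block    # load the block, evicting the old one
    return hits, misses


if __name__ == "__main__":
    # Sequential 8-byte reads: one miss per 64-byte line, then seven hits.
    print(count_hits_and_misses(range(0, 256, 8)))
    # Stride of 256 bytes: every access maps to line 0 with a new block.
    print(count_hits_and_misses(range(0, 256 * 32, 256)))
```

Both runs perform 32 accesses, but the sequential pattern misses only 4 times while the strided pattern misses every time. This is why data layout and alignment matter for packet processing.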
How to deal with the problems?
=> Use the Poll Mode Driver (PMD) instead of an interrupt-driven network device driver
=> Access the device from user space instead of kernel space, so that peripherals are reached directly from user space
=> Optimize memory access between the PCIe device, memory, and cache
  ==> Huge pages
  ==> Batch packet processing
  ==> HW/SW-controlled prefetching, Data Direct I/O
  ==> Data alignment
=> Use lockless shared queues to optimize access to shared data structures
=> Bind a single software thread to a logical core
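The last item, binding a thread to a logical core, can be tried on Linux with the standard library alone. This is a minimal sketch: `pin_to_core` is a hypothetical helper name, and it uses `os.sched_setaffinity`, which is available on Linux only. DPDK does its own pinning of worker threads in C, so this only illustrates the underlying idea.

```python
import os


def pin_to_core(core_id):
    """Bind the caller to one logical core (Linux only) and return the new set."""
    allowed = os.sched_getaffinity(0)      # cores the caller may currently use
    if core_id not in allowed:
        raise ValueError(f"core {core_id} not in allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {core_id})     # pid 0 means "the caller"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    original = os.sched_getaffinity(0)
    print(pin_to_core(min(original)))      # now restricted to a single core
    os.sched_setaffinity(0, original)      # restore the original affinity
```

Once pinned, the scheduler no longer migrates the thread between cores, so its working set stays in that core's caches.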
What do we need to do?
=> Unload the kernel NIC driver and switch to the poll mode driver => done at DPDK setup/initialization time, for a normal network interface controller (NIC)
=> Set up huge page memory => done at DPDK setup/initialization time (check with: grep -i hugepages /proc/meminfo). A process accesses memory through virtual addresses, but the CPU must translate them into physical addresses before it can actually access memory. To speed up translation, the CPU caches recent virtual-to-physical mappings in a mapping table that it maintains (the TLB). To maximize memory access speed, as many mappings as possible should be kept in this table. Linux divides memory into pages, and each page is 4 KB by default, so a large physical memory needs a very large number of mapping entries, which hurts the CPU's lookup efficiency. Because the amount of memory is fixed, the only way to reduce the number of mapping entries is to increase the page size.
=> Bind software threads to logical cores => CpuInfo.dat
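The huge page argument above is simple arithmetic, sketched below. The helper names `mapping_entries` and `parse_hugepages` are hypothetical, and the 2 MiB huge page size is an assumption (a common size on x86-64; the real value is reported in the Hugepagesize line of /proc/meminfo).

```python
KIB, MIB, GIB = 1024, 1024**2, 1024**3


def mapping_entries(memory_bytes, page_bytes):
    """Number of page mappings needed to cover memory_bytes (rounded up)."""
    return -(-memory_bytes // page_bytes)


def parse_hugepages(meminfo_text):
    """Extract HugePages_* counters and Hugepagesize from /proc/meminfo text."""
    info = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            info[key] = int(rest.split()[0])   # drop the trailing "kB" if present
    return info


if __name__ == "__main__":
    print(mapping_entries(1 * GIB, 4 * KIB))   # entries with default 4 KiB pages
    print(mapping_entries(1 * GIB, 2 * MIB))   # entries with 2 MiB huge pages
    with open("/proc/meminfo") as f:           # Linux only
        print(parse_hugepages(f.read()))
```

Covering 1 GiB takes 262144 mappings with 4 KiB pages but only 512 with 2 MiB pages, so far more of the working set fits in the CPU's mapping table at once.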

Origin www.cnblogs.com/biggerjun2015/p/11723011.html