Research on Network Traffic Classification Technology Based on Deep Learning

Purpose: Network traffic classification has long been a focus of academia, industry, and network supervision departments. It means dividing mixed traffic into categories according to the characteristics or parameters of different network applications or protocols. In network security, it is needed to identify intrusion traffic; in network management, it is needed to classify and analyze the traffic of different applications so that resources can be controlled and allocated sensibly to guarantee network QoS. As the volume and variety of network traffic have grown massively, traditional classification methods struggle to meet these requirements, and machine-learning-based algorithms have become a research hotspot in network traffic classification. To address the bottleneck caused by feature engineering in machine learning, this paper studies the application of deep learning algorithms based on convolutional neural networks to network traffic classification.
Methods: 1. A three-dimensional convolutional neural network is applied to network traffic classification.
2. A convolutional neural network forcibly assigns samples of unknown categories to one of the known categories, which causes errors. To address this, this paper improves the category judgment layer of the network. Simulation experiments verify that when the category judgment is wrong (including the unknown-category case), the distribution of the highest class probability is clearly different from its distribution when the judgment is correct. Based on this finding, this paper sets a dynamic threshold in the category judgment layer. With the optimal threshold found in training, unknown categories can be identified effectively.
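The thresholding idea in method 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the `UNKNOWN` label, and the rule used to pick the threshold (keep as many correct predictions and reject as many wrong ones as possible on held-out samples) are all hypothetical, since the text does not say how the optimal threshold is found.

```python
import math

UNKNOWN = -1  # hypothetical label for traffic outside the known classes


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_threshold(logits, threshold):
    """Return the most probable class, or UNKNOWN if its probability is too low."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else UNKNOWN


def pick_threshold(max_probs, is_correct, candidates):
    """Pick the candidate that best separates correct from wrong predictions.

    max_probs[i] is the highest class probability for sample i, and
    is_correct[i] says whether that prediction was right (samples of an
    unknown category always count as wrong). The score of a threshold is the
    number of correct predictions kept plus the number of wrong ones rejected.
    This selection rule is an assumption made for the sketch.
    """
    def score(t):
        return sum((p >= t) == ok for p, ok in zip(max_probs, is_correct))
    return max(candidates, key=score)


# Toy validation data: confident when right, hesitant when wrong.
max_probs = [0.99, 0.97, 0.95, 0.62, 0.55, 0.48]
is_correct = [True, True, True, False, False, False]
threshold = pick_threshold(max_probs, is_correct, [0.5, 0.6, 0.7, 0.8, 0.9])

confident = classify_with_threshold([4.0, 0.5, 0.2], threshold)  # one clear winner
hesitant = classify_with_threshold([1.0, 0.9, 0.8], threshold)   # nearly uniform scores
```

With these toy values the confident sample keeps its class and the hesitant one is labeled `UNKNOWN`, which is the behavior the dynamic threshold is meant to provide.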
The data preprocessing module is divided into four parts: data stream cutting, key data extraction, dimension conversion, and time series combination.
1. Data stream cutting: divide the original traffic into discrete data stream units, each of which is one sample. Packets belong to the same data stream if they share the same 5-tuple (source IP address, source port number, destination IP address, destination port number, and transport layer protocol).
2. Key data extraction: first extract the first n data packets of each data stream and discard the rest; if the data stream has fewer than n packets, pad it with all-zero packets at the end. Then anonymize the data, that is, remove the IP addresses at the IP layer and the MAC addresses at the data link layer.
3. Dimension conversion: unify the packet length, that is, keep the first l bytes of each data packet and discard the rest; if a packet is shorter than l bytes, pad it with 0 at the end. Each byte is then one-hot encoded with m bits, so that each data packet is converted into two-dimensional data of size l × m. If each byte of the input data is regarded as a pixel value, the output of this step is a two-dimensional grayscale image, which can be analyzed with image-processing methods.
For example, suppose the input data consists of three parts whose possible values are {0, 1, 2}, {12, 13, 14}, and {20, 21, 22}. The one-hot code for each input then has 9 bits, which indicate in turn whether the first part is 0, 1, or 2, whether the second part is 12, 13, or 14, and whether the third part is 20, 21, or 22. For the input x = (1, 14, 20), the output is ((0, 1, 0), (0, 0, 1), (1, 0, 0)). Each byte of data can take the values 0 to 255, 256 values in total, so it could be encoded as 256-bit output data. To reduce the amount of computation, improve real-time performance, and fit the simulation environment of this paper, the input data is instead normalized to 16 levels (0 to 15) and encoded with 16 bits.
4. Time series combination: the two-dimensional data corresponding to the n data packets are combined in order into three-dimensional data of size l × m × n. This step is similar to combining multiple image frames into a video file. The output can be used as the input of a three-dimensional convolutional neural network, as in video processing.
Control group: 1. One-dimensional preprocessing: after data stream cutting and key data extraction, the data extracted from each data packet are concatenated in order to form one-dimensional time series data of length l × n.
2. Cutting-type two-dimensional preprocessing: if the length of the new dimension is set to i, the one-dimensional time series data is cut into i segments of length (l × n)/i each, which form the i rows of the two-dimensional data in order.
3. One-hot-coding two-dimensional preprocessing: the one-dimensional time series data is one-hot encoded to form two-dimensional input data.
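The preprocessing steps and the control-group variants above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the packet dict layout, the function names, and the quantization rule `byte * m // 256` are all hypothetical, and anonymization is assumed to have been done before this point. The result is ordered n × l × m as nested lists, whereas the text writes the size as l × m × n; the axis order is a free choice here.

```python
from collections import defaultdict


def split_flows(packets):
    """Step 1: group packet payloads into flows keyed by their 5-tuple.

    Each packet is a hypothetical dict with the keys 'src_ip', 'src_port',
    'dst_ip', 'dst_port', 'proto' and 'payload' (bytes).
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src_ip"], pkt["src_port"],
               pkt["dst_ip"], pkt["dst_port"], pkt["proto"])
        flows[key].append(pkt["payload"])
    return dict(flows)


def fix_length(items, size, pad):
    """Keep the first `size` items and pad with `pad` if there are fewer."""
    items = list(items[:size])
    return items + [pad] * (size - len(items))


def one_hot(byte, m):
    """Quantize a byte (0-255) into m levels and one-hot encode the level."""
    level = byte * m // 256  # assumed normalization, e.g. m=16 gives levels 0-15
    return [1 if i == level else 0 for i in range(m)]


def flow_to_3d(payloads, n, l, m):
    """Steps 2-4: first n packets, first l bytes each, m-bit one-hot per byte."""
    packets = fix_length(payloads, n, b"")  # missing packets become all zeros
    return [[one_hot(b, m) for b in fix_length(pkt, l, 0)] for pkt in packets]


def flow_to_1d(payloads, n, l):
    """Control 1: concatenate the n fixed-length packets into l * n values."""
    packets = fix_length(payloads, n, b"")
    return [b for pkt in packets for b in fix_length(pkt, l, 0)]


def cut_to_2d(series, rows):
    """Control 2: cut the series into `rows` equal segments, one per row."""
    width = len(series) // rows
    return [series[r * width:(r + 1) * width] for r in range(rows)]


def one_hot_2d(series, m):
    """Control 3: one-hot encode every value of the series."""
    return [one_hot(b, m) for b in series]


# Toy traffic: two packets of one flow and one packet of another.
packets = [
    {"src_ip": "10.0.0.1", "src_port": 1234, "dst_ip": "10.0.0.2",
     "dst_port": 80, "proto": "TCP", "payload": bytes([0, 255, 16])},
    {"src_ip": "10.0.0.1", "src_port": 1234, "dst_ip": "10.0.0.2",
     "dst_port": 80, "proto": "TCP", "payload": bytes([128])},
    {"src_ip": "10.0.0.3", "src_port": 5353, "dst_ip": "10.0.0.2",
     "dst_port": 53, "proto": "UDP", "payload": bytes([7, 7])},
]
flows = split_flows(packets)
tcp_flow = flows[("10.0.0.1", 1234, "10.0.0.2", 80, "TCP")]
sample = flow_to_3d(tcp_flow, n=3, l=4, m=16)   # 3 packets x 4 bytes x 16 bits
series = flow_to_1d(tcp_flow, n=3, l=4)         # 12 values
```

Plain lists keep the sketch free of dependencies; in practice the nested lists would be converted to an array or tensor before being fed to the network.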
Data set: USTC-TFC2016


Origin blog.csdn.net/qq_43360777/article/details/105727139