Long short-term memory neural network for traffic speed prediction using remote microwave sensor data

I recently read "Long short-term memory neural network for traffic speed prediction using remote microwave sensor data" carefully, and I am recording its main content and my impressions here. I originally meant to hand this in as an assignment, but the teacher changed the assignment, so I wrote it up as a blog post instead, hahaha. Reproducing the paper took a few days and was not difficult; my results are basically the same as Ma's conclusions. When I have time, I will write a separate post about the reproduction.

1. Summary
Neural networks are widely used in traffic prediction. This study applies the Long Short-Term Memory Neural Network (LSTM NN) to traffic prediction for the first time. The model can capture nonlinear traffic dynamics. LSTM NN overcomes the vanishing gradient problem that arises in gradient-based training, and therefore performs exceptionally well on time series with long-term dependencies. To verify the capabilities of LSTM NN, an empirical study was conducted on data collected by two microwave detectors in Beijing, and the prediction performance of LSTM NN was compared with that of traditional RNN models and other popular parametric and non-parametric models. The study finds that LSTM NN achieves the best prediction accuracy and stability.

2. Introduction
The successful application of intelligent transportation systems depends on the accurate acquisition of traffic information. This is especially true of Advanced Traffic Management Systems (ATMS) and Advanced Traveler Information Systems (ATIS). One of the most important pieces of information is the predicted future traffic state. Predicting future traffic conditions helps travelers choose routes and plan trips, and supports traffic experts in developing measures to relieve congestion and improve road safety.
At present, loop detector data is usually used to predict travel time and traffic volume, and video data is used to obtain the actual road speed. There are few studies on using other data sources for speed prediction. Compared with travel time, there are relatively more ways to obtain road speed, such as GPS. Remote Traffic Microwave Sensors (RTMS) are widely used in engineering practice because, among other advantages, they can be installed without closing the road. An RTMS collects volume, speed, and occupancy, and the accuracy of the speed it collects can reach 95%.
Compared with traditional statistical methods, artificial intelligence methods are becoming more and more popular for prediction. In particular, they have an advantage over traditional statistical methods when dealing with noisy data sets and missing data. The artificial neural network (ANN) is a typical representative of artificial intelligence methods, and there is already a lot of research on neural networks for prediction. The RNN model is well suited to processing spatio-temporal data because of its structure, but it has two main problems: (1) the delay time step of the RNN must be determined in advance; (2) the RNN cannot capture long-term dependencies in the data. LSTM NN is expected to overcome these two shortcomings of the traditional RNN and achieve better prediction results.
The innovations of this paper lie mainly in three aspects: (1) LSTM NN is used for speed prediction for the first time; (2) the method can automatically determine the delay time step; (3) the performance of LSTM NN is compared with that of traditional RNNs and traditional statistical methods.
The paper is structured as follows: first, it reviews the research on traffic prediction; second, it introduces the structure of LSTM NN; third, it conducts an empirical study using Beijing data and compares LSTM NN with traditional RNN models (Time-Delay NN, Elman NN, Nonlinear Autoregressive NN), SVM, ARIMA, and the Kalman filter; finally, it gives conclusions, discussion, and an outlook.

3. Literature review
Traffic flow prediction can be divided into parametric methods and non-parametric methods.
3.1 Parametric methods
Parametric methods fix the model structure in advance and then use data to calibrate its parameters. Common parametric methods include analytical models and parameter calibration models. In analytical methods, the quantities of interest are computed from equations, such as the Bureau of Public Roads (BPR) function, which calculates travel time from the demand-to-capacity ratio. However, the parameters are in practice somewhat random, so very good parameters cannot be obtained through equation-based calculation, which may lead to unstable results. For simulation models, basic traffic flow theory is the most classic way of obtaining state parameters. Many improved models have been proposed on this basis, such as the dynamic wave model, the molecular model, the three-phase traffic flow model, and so on. Although these analyses help us better understand the internal rules and mechanisms of traffic, most of these models rely on very idealized assumptions and limited data support. Because the main actors in traffic are people, and human behavior is highly variable, it is difficult to understand the mechanism thoroughly.
Time series analysis can also be divided into parametric methods and non-parametric methods. The most classic parametric method is the ARIMA model.
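To make the BPR function mentioned in this subsection concrete, it is commonly written as t = t0 * (1 + alpha * (v / c) ^ beta), where t0 is the free-flow travel time and v / c is the ratio of demand (volume) to capacity. The sketch below is only an illustration: the function name is hypothetical, and alpha = 0.15 and beta = 4 are the conventional default coefficients, which are an assumption here and are not taken from the paper.

```python
def bpr_travel_time(free_flow_time, volume, capacity, alpha=0.15, beta=4.0):
    """Link travel time from the BPR function.

    alpha and beta are the conventional defaults (an assumption in this
    sketch); in practice they are calibrated for each road type.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return free_flow_time * (1.0 + alpha * (volume / capacity) ** beta)


# Made-up example: a link with a 10-minute free-flow time, loaded to capacity.
print(bpr_travel_time(10.0, 1800, 1800))
```

With fixed coefficients the result is fully determined by the demand-to-capacity ratio, which is exactly why such a model struggles when the real parameters fluctuate.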
3.2 Non-parametric methods
The model structure and parameters of non-parametric methods are not fixed. Classical statistical models and artificial intelligence models are the two most popular families. Among them, the more commonly used and classic methods for predicting traffic parameters include the following.
3.2.1 Kalman Filter
3.2.2 Support Vector Machine
3.2.3 Artificial Neural Network
There is a special type of artificial neural network called the Recurrent Neural Network (RNN), whose structure makes it very suitable for predicting time series data. Traditional RNN models come in several different structures: (1) Elman Neural Network; (2) Time-Delay Neural Network; (3) Nonlinear autoregressive with exogenous inputs (NARX) neural network.
The traditional RNN model has good prediction performance, but it still has the following two problems.
1. The traditional RNN model cannot be trained on time series with long delays.
2. The traditional RNN needs the delay time step to be set in advance, and it is difficult to obtain this delay time step automatically.
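The second problem is easier to see in code. The sketch below (a hypothetical helper, not code from the paper) builds training samples from a series for a given delay time step: every sample holds a fixed number of past observations, so the value of `lag` has to be chosen before training starts.

```python
def make_lagged_samples(series, lag):
    """Turn a 1-D series into (inputs, target) pairs.

    Each input holds `lag` consecutive observations, and the target is the
    observation that immediately follows them.
    """
    if lag < 1 or lag >= len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for end in range(lag, len(series)):
        inputs.append(series[end - lag:end])
        targets.append(series[end])
    return inputs, targets


speeds = [60.0, 58.0, 55.0, 40.0, 35.0, 42.0]  # made-up speeds in km/h
X, y = make_lagged_samples(speeds, lag=3)
print(X[0], y[0])  # [60.0, 58.0, 55.0] 40.0
```

Changing `lag` changes the shape of every sample, so a traditional model has to be rebuilt and retrained for each candidate value.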

4. Long short-term memory neural network (LSTM NN)
The LSTM NN model used in this paper is composed of an input layer, a recurrent hidden layer, and an output layer. Unlike the traditional RNN model, however, the basic unit of its hidden layer is a memory block. The memory block contains memory cells together with forget gates, input gates, and output gates. These gates alleviate the vanishing gradient problem to a certain extent, and also automatically control how historical information is passed on and forgotten. LSTM NN therefore overcomes the shortcomings of the traditional RNN to a certain extent. The basic structure of the memory block is shown in the figure below:
[Figure: LSTM NN model structure]
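The way the gates control the memory cell can be sketched as a single forward step. This is a minimal illustration in plain Python using the widely used formulation without peephole connections, which may differ in detail from the variant in the paper; the function names and the layout of `params` are hypothetical.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def affine(W, U, b, x, h):
    """Compute W @ x + U @ h + b row by row, using plain lists."""
    return [
        sum(w * xi for w, xi in zip(W_row, x))
        + sum(u * hi for u, hi in zip(U_row, h))
        + b_i
        for W_row, U_row, b_i in zip(W, U, b)
    ]


def lstm_step(x, h_prev, c_prev, params):
    """One forward step of a memory block (standard form, no peepholes).

    params maps each of "f", "i", "o", "g" to a (W, U, b) triple; this
    layout is an assumption made only for this sketch.
    """
    f = [sigmoid(v) for v in affine(*params["f"], x, h_prev)]    # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], x, h_prev)]    # input gate
    o = [sigmoid(v) for v in affine(*params["o"], x, h_prev)]    # output gate
    g = [math.tanh(v) for v in affine(*params["g"], x, h_prev)]  # candidate
    # New cell state: keep part of the old state, add part of the candidate.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # Hidden output: the output gate scales the squashed cell state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c


# One input, one hidden unit, all-zero weights: every gate equals 0.5,
# so the previous cell state is simply halved.
zero = ([[0.0]], [[0.0]], [0.0])
params = {name: zero for name in "fiog"}
h, c = lstm_step([50.0], [0.0], [1.0], params)
print(h, c)
```

The cell state is updated additively (old state times the forget gate plus new candidate times the input gate), which is the part of the design that lets information and gradients survive over many steps.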

5. Model development
This paper uses data from two microwave detectors for the empirical study. The two detectors face opposite directions and are installed on an expressway. The data covers June 1, 2013 to June 30, 2013, is collected every 2 minutes, and includes volume, occupancy, and speed. Missing data is filled with temporally adjacent records. The paper uses the first 25 days as the training set and the last 5 days as the test set. The model predicts the speed in the next two minutes from the previous speed and volume data. All models other than LSTM NN are run with varying delay time steps and different input combinations, and each algorithm is executed 10 times to reduce the effect of randomness.
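The data preparation described above can be sketched as follows. This is an illustration under stated assumptions rather than the paper's procedure: the helper names are hypothetical, missing records are represented as `None`, and "adjacent" is interpreted as the nearest earlier valid record, since the exact filling rule is not given here.

```python
SAMPLES_PER_DAY = 24 * 60 // 2  # one record every 2 minutes -> 720 per day


def fill_missing(values):
    """Replace None entries with the nearest earlier valid value.

    Leading gaps are filled with the first valid value. This is only one
    way to read "temporally adjacent" (an assumption of this sketch).
    """
    filled = list(values)
    first_valid = next((v for v in filled if v is not None), None)
    if first_valid is None:
        raise ValueError("series contains no valid values")
    last = first_valid
    for idx, v in enumerate(filled):
        if v is None:
            filled[idx] = last
        else:
            last = v
    return filled


def split_by_days(values, train_days=25, test_days=5):
    """Chronological split: first train_days for training, then test_days."""
    n_train = train_days * SAMPLES_PER_DAY
    n_test = test_days * SAMPLES_PER_DAY
    if len(values) < n_train + n_test:
        raise ValueError("not enough records for the requested split")
    return values[:n_train], values[n_train:n_train + n_test]


# Synthetic 30-day series with two gaps (made-up values).
raw = [60.0] * (30 * SAMPLES_PER_DAY)
raw[0] = None
raw[100] = None
train, test = split_by_days(fill_missing(raw))
print(len(train), len(test))  # 18000 3600
```

The split is chronological rather than shuffled, so the test days always lie after the training days, as in the paper's setup.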

6. Results and comparison
All traditional RNN models keep the same topology: 1 input layer, 1 hidden layer, and 1 output layer, with 10 neurons in the hidden layer. The SVM model uses the Radial Basis Function (RBF) as its kernel. The parameters p, d, and q of the ARIMA model are determined using the AIC criterion. For the Kalman filter method, the noise is assumed to be Gaussian.
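As a rough picture of what the Kalman filter baseline does, here is a minimal scalar sketch. Everything specific in it is an assumption for illustration: the state model is a random walk (the speed stays the same apart from Gaussian process noise), the variances are made-up values, and the function name is hypothetical. The state model actually used in the paper is not described in this post.

```python
def kalman_one_step(observations, process_var=1.0, obs_var=4.0):
    """One-step-ahead predictions from a scalar random-walk Kalman filter.

    Returns one prediction for each observation after the first.
    """
    if len(observations) < 2:
        raise ValueError("need at least two observations")
    estimate, variance = observations[0], obs_var
    predictions = []
    for z in observations[1:]:
        # Predict: the state carries over; uncertainty grows by process noise.
        variance += process_var
        predictions.append(estimate)
        # Update: blend prediction and new measurement via the Kalman gain.
        gain = variance / (variance + obs_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
    return predictions


print(kalman_one_step([50.0, 50.0, 50.0]))  # [50.0, 50.0]
```

The linear state model and the Gaussian noise are exactly the kind of strong assumptions that a neural network does not need to make.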
LSTM NN consists of an input layer, an LSTM layer with memory blocks, and an output layer. Mean Absolute Percentage Error (MAPE) and Mean Squared Error (MSE) were chosen as the indicators for comparing the performance of the different models.
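For reference, the two indicators can be computed as in the sketch below. The function names and the sample numbers are made up for illustration; MAPE is reported in percent here, which is a presentation choice of this sketch.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an actual value is zero.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mse(actual, predicted):
    """Mean squared error, in squared units of the input."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


actual = [50.0, 40.0, 80.0]      # made-up observed speeds
predicted = [45.0, 44.0, 80.0]   # made-up predicted speeds
print(mape(actual, predicted), mse(actual, predicted))
```

MAPE measures relative accuracy, while MSE penalizes large individual errors more heavily, so a model can look good on one and unstable on the other.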
Table 1 and Table 2 show the performance comparison of each model. According to the tables, LSTM NN is the best model, and in most cases its accuracy exceeds that of the traditional RNN models and the SVM model by at least 28%. There is one case where the MAPE of Elman NN is better than that of LSTM NN, but this does not mean that Elman NN is the better model: a more detailed comparison shows that Elman NN is very unstable and has a large MSE value. This finding is consistent with that of Kikuchi and Nakanishi: Elman NN sometimes fails to learn successfully. The NARX NN model is better than the other traditional RNN models, but only by a small margin. The SVM also performs very well, comparable to the NARX NN model, but tuning the SVM is very time-consuming and labor-intensive. The Kalman filter and ARIMA models predict far worse than the other models, which may be caused by the many strong assumptions these two methods make.
[Table 1 and Table 2: performance comparison of each model]

Tables 3 and 4 show the performance when both speed and traffic flow are used as input. Similar to the results in Table 1 and Table 2, LSTM NN is still superior to the other methods. In addition, using both speed and traffic flow as input performs better than using historical speed alone, but the improvement is not obvious.
[Table 3 and Table 4: performance with speed and traffic flow as input]

To study the prediction performance further, the paper predicts the speed on June 30, 2013 and plots the NARX NN predictions against the real speed in the line graph below. NARX NN tends to underestimate the future speed, which may be due to insufficient learning ability.
[Figure: NARX NN predicted speed versus real speed]

Based on the above analysis, we can draw the following three conclusions:
(1) For the traditional RNN model, the delay time step is very important, and setting it correctly can greatly improve performance. LSTM NN determines the delay time step automatically, so it performs best.
(2) NARX NN surpasses the other traditional RNN models because it can include both previous inputs and additional inputs. The Elman NN model suffers from problems such as long training time and occasional training failure, so it is not suitable as a speed prediction model.
(3) The SVM model can achieve relatively good prediction results, but parameter tuning is very time-consuming and labor-intensive.

7. Conclusion
This paper presents a new LSTM NN method for predicting speed. LSTM NN can learn long-term dependencies in time series and can automatically determine the delay time step, which makes it well suited to predicting traffic parameters. To study the performance of LSTM NN empirically, one month of speed data was collected; the first 25 days were used as the training set and the last 5 days as the test set. In addition, three RNN models with different topologies and some other parametric and non-parametric models were compared with the LSTM NN model. The experimental results show that the accuracy and stability of the LSTM NN model are better than those of the other models, and yield three useful findings:
(1) As the delay time step increases, the accuracy of speed prediction increases. Setting the delay time step correctly can improve the accuracy of speed prediction. The LSTM NN model achieves good speed prediction without the delay time step having to be determined in advance.
(2) NARX NN can achieve better results than the other traditional RNN models. Because of insufficient learning, Elman NN may produce unstable results.
(3) SVM is also very suitable for forecasting time series data and can produce good forecasts, but it takes a lot of effort to tune its parameters.
Future research can consider using spatio-temporal information as the input of LSTM NN, for example by using the speed of adjacent lanes as an additional input. The impact of different data aggregation levels on prediction performance can also be analyzed. Improvements to the LSTM NN model itself, such as increasing the depth of the hidden layers, are another possible research direction.

References:
Ma X, Tao Z, Wang Y, et al. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data[J]. Transportation Research Part C: Emerging Technologies, 2015, 54: 187-197.

Origin blog.csdn.net/qq_39805362/article/details/105509203