2019 China Graduate Mathematical Contest in Modeling (vehicle driving cycle construction): problem, report, and code

The start of the contest conflicted with my thesis proposal, so my time was limited. I therefore picked what I thought was the simplest problem, the one where results could be produced quickly: constructing a vehicle driving cycle.

Main task: dividing vehicle operating conditions by clustering (mostly following the order of the paper).

This post is organized as: problem statement, basic report, implementation code.

Problem statement:

A vehicle driving cycle, also called a vehicle test cycle, is a speed-time curve that describes how a vehicle is driven (see Figures 1 and 2; the total duration is generally under 1,800 seconds, although there is no fixed limit: Figure 1 lasts 1,180 seconds and Figure 2 lasts 1,800 seconds). It reflects the kinematic characteristics of vehicles on the road and is an important, shared foundational technology of the automotive industry. It underlies the test methods and limits for vehicle energy consumption and emissions, and it is the main reference when calibrating and optimizing a vehicle's performance indicators. Developed automotive markets such as Europe, the United States, and Japan all use standard driving cycles adapted to their own conditions for vehicle calibration, optimization, and energy consumption/emission certification.

At the beginning of this century, China directly adopted the European NEDC driving cycle (Figure 1) to certify the energy consumption and emissions of automotive products, which effectively promoted energy-saving vehicle technology. In recent years, with the rapid growth of vehicle ownership, road traffic in China has changed greatly. Government, industry, and the public have increasingly found that, for vehicles calibrated and optimized against NEDC, actual fuel consumption deviates more and more from the certified figure, which undermines government credibility (for example, a car labeled by the Ministry at 6.5 L/100 km may actually consume 8.5 to 10 L/100 km in users' experience). In addition, after years of practice, Europe also found many shortcomings in NEDC and switched to the Worldwide harmonized Light vehicles Test Cycle (WLTC, Figure 2). However, the two most important features of that cycle, idle-time ratio and average speed, differ considerably from actual driving conditions in China. Because the driving cycle is the most basic foundation for vehicle development and evaluation, in-depth research to develop test cycles that reflect China's actual road conditions is increasingly necessary.

On the other hand, China is vast, and its cities differ in level of development, climate, and traffic, so driving cycle characteristics differ markedly from city to city. Building city-specific driving cycles from each city's own driving data is therefore increasingly urgent. The constructed cycle should match the city's actual driving as closely as possible, and ideally be fully representative of it (it can be thought of as a condensed version of actual driving). Beijing, Shanghai, Hefei, and other cities have already built their own driving cycles.

To see why constructing the driving cycle curve matters, take the fuel consumption of a given vehicle model as an example and consider how the Ministry's labeled fuel consumption is measured. The labeled figure is not the fuel consumption of the vehicle on a real road. It is measured and calculated in the laboratory according to a national standard (such as "GB27840-2011 heavy commercial vehicle fuel consumption measurement methods"), by following a driving cycle curve under specified criteria. Whether the labeled fuel consumption agrees with actual fuel consumption is therefore closely tied to the driving cycle curve.

 

Figure 1. European NEDC driving cycle

Figure 2. Worldwide WLTC driving cycle

 

Second, the objective

Given the background above, use the data in the attachment (three data files, each holding data from the same vehicle collected at different times). These are real road driving data from light vehicles in one city, sampled at 1 Hz. Construct a driving cycle curve (1200 to 1300 seconds) that reflects the driving characteristics of the vehicles that took part in data collection. The kinematic characteristics of the curve (e.g., average speed, average acceleration) should be representative of the corresponding characteristics of the collected source data: the smaller the error between the two, the more representative the constructed cycle.

 

Third, the problems to solve

1. Data preprocessing

The raw data recorded directly by the vehicle's data acquisition device often contain bad values. Bad data fall into several types:

(1) GPS signal loss caused by high-rise buildings or tunnels, which makes the recorded timestamps discontinuous;

(2) abnormal acceleration or deceleration data (for an ordinary car, 0 to 100 km/h takes more than 7 seconds, and the maximum deceleration under emergency braking is 7.5 to 8 m/s^2);

(3) abnormal data collected during long stops (e.g., stopping with the engine running while waiting for someone, or stopping with the engine off while the acquisition device keeps running);

(4) long traffic jams with intermittent low-speed driving (maximum speed below 10 km/h), which are usually treated as idling;

(5) idling longer than 180 seconds is generally considered abnormal, so idle time can be capped at 180 seconds.

Design a reasonable method to preprocess the bad data above, and report the number of records in each file after processing.
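As one possible reading of rules (2) and (5), the sketch below drops samples whose acceleration is outside the stated physical limits and caps each idle run at 180 samples. It is not the method used in the report. The function name `preprocess` and the choice to drop (rather than interpolate) bad samples are assumptions, and rules (1), (3), and (4) are not covered here.

```python
import numpy as np

# Thresholds taken from the problem statement; names are hypothetical.
MAX_ACCEL = 100 / 3.6 / 7   # m/s^2, i.e. 0-100 km/h in 7 s
MAX_DECEL = 8.0             # m/s^2, upper bound for emergency braking
MAX_IDLE_S = 180            # longest idle run kept, in seconds (1 Hz samples)

def preprocess(speed_kmh):
    """Drop abnormal-acceleration samples, then cap idle runs at MAX_IDLE_S."""
    v = np.asarray(speed_kmh, dtype=float)
    # Backward difference at 1 Hz, converted from km/h per second to m/s^2.
    accel = np.diff(v, prepend=v[0]) / 3.6
    v = v[(accel <= MAX_ACCEL) & (accel >= -MAX_DECEL)]
    kept, idle_run = [], 0
    for x in v:
        idle_run = idle_run + 1 if x == 0 else 0
        if idle_run <= MAX_IDLE_S:   # samples beyond the cap are discarded
            kept.append(x)
    return np.array(kept)
```

Calling `len(preprocess(speed))` on each file's speed column would give the post-processing record count the task asks for, under these assumptions.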

2. Extracting kinematic segments

A kinematic segment is the interval from the start of one idle state to the start of the next idle state, as shown in Figure 3. (Building the driving cycle from kinematic segments is one of the most common approaches today, but it is not a required step: some construction methods do not divide the data into kinematic segments.) Design a reasonable method to divide the processed data into kinematic segments, and report the final number of segments obtained from each data file.
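The definition above can be sketched as follows, assuming a 1 Hz speed trace in which idle means a speed of exactly zero. The function name `split_segments` is hypothetical, and a trailing piece that never reaches the next idle start is dropped.

```python
import numpy as np

def split_segments(speed_kmh):
    """Split a 1 Hz speed trace at every idle start (moving -> stopped)."""
    v = np.asarray(speed_kmh, dtype=float)
    idle = v == 0
    # An idle period starts where a sample is idle and the previous one is not.
    starts = np.flatnonzero(idle[1:] & ~idle[:-1]) + 1
    if idle[0]:
        starts = np.concatenate(([0], starts))
    # Two consecutive idle starts delimit one segment: idle part + driving part.
    return [v[a:b] for a, b in zip(starts[:-1], starts[1:])]
```

The number of segments per file is then `len(split_segments(speed))`.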

 

Figure 3. Definition of a kinematic segment

3. Constructing the vehicle driving cycle

Using the processed data, construct a driving cycle curve (1200 to 1300 seconds) that reflects the driving characteristics of the vehicles that took part in data collection. The kinematic characteristics of the curve should be representative of the corresponding characteristics of the collected source data (after processing): the smaller the error between the two, the more representative the constructed cycle. Requirements:

(1) a scientific and effective construction method (a mathematical model or algorithm; innovative methods are especially encouraged, and if an existing method is used, its source must be cited);

(2) a reasonable evaluation system for vehicle motion characteristics (including at least, but not limited to, the following indicators: average speed (km/h), average running speed (km/h), average acceleration (m/s^2), average deceleration (m/s^2), idle time ratio (%), acceleration time ratio (%), deceleration time ratio (%), speed standard deviation (km/h), acceleration standard deviation (m/s^2), etc.);

(3) using your constructed driving cycle and your evaluation system, compute each indicator (motion feature) for both the constructed cycle and the city's collected source data (after processing), and use the results to justify your constructed cycle.
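The nine indicators in requirement (2) can be computed roughly as below. This is a sketch under stated assumptions: the function names, the dictionary keys, and the 0.1 m/s^2 threshold `ACC_EPS` separating acceleration and deceleration from cruising are all hypothetical, idle is taken as zero speed, and idle samples are excluded from the acceleration and deceleration states.

```python
import numpy as np

ACC_EPS = 0.1  # m/s^2, assumed threshold between cruising and accel/decel

def motion_features(speed_kmh):
    """Nine motion indicators of a 1 Hz speed trace given in km/h."""
    v = np.asarray(speed_kmh, dtype=float)
    a = np.diff(v, prepend=v[0]) / 3.6          # acceleration in m/s^2
    idle = v == 0
    acc = (a > ACC_EPS) & ~idle
    dec = (a < -ACC_EPS) & ~idle
    n = len(v)
    return {
        "mean_speed_kmh": v.mean(),
        "mean_running_speed_kmh": v[~idle].mean() if (~idle).any() else 0.0,
        "mean_accel_ms2": a[acc].mean() if acc.any() else 0.0,
        "mean_decel_ms2": a[dec].mean() if dec.any() else 0.0,
        "idle_ratio_pct": 100 * idle.sum() / n,
        "accel_ratio_pct": 100 * acc.sum() / n,
        "decel_ratio_pct": 100 * dec.sum() / n,
        "speed_std_kmh": v.std(),
        "accel_std_ms2": a.std(),
    }

def relative_error_pct(cycle_feats, source_feats):
    """Per-indicator relative error (%) of the cycle against the source data."""
    return {k: 100 * abs(cycle_feats[k] - source_feats[k]) / abs(source_feats[k])
            for k in source_feats if source_feats[k] != 0}
```

Comparing `motion_features(cycle)` with `motion_features(source)` through `relative_error_pct` gives one simple way to argue representativeness, as requirement (3) asks.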

 

Basic report:

Model assumptions

Based on a literature review and analysis of the driving cycle construction problem, the following assumptions are made:

 

1. The GPS and other on-board recording equipment is reliable (i.e., in the absence of a fault, the measured value is the true value).

2. Segments with discontinuous timestamps can be recovered by interpolating from the continuous segments on either side.

3. Kinematic segments shorter than 20 s or longer than 700 s do not contribute to constructing the driving cycle.

4. The chosen characteristic parameters accurately reflect the kinematic characteristics of the driving cycle.

5. After preprocessing, the data contain no noise.

6. When the network selects the best segments, the reconstruction loss and the clustering loss contribute equally to the objective functional.
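Assumptions 2 and 3 translate into two small helpers, sketched here. The function names are hypothetical, linear interpolation is one choice among several, and this version fills every gap regardless of its length, which may be too permissive for long signal losses.

```python
import numpy as np

def fill_time_gaps(t_s, speed_kmh):
    """Linearly interpolate speed onto a gap-free 1 Hz time grid."""
    t = np.asarray(t_s, dtype=float)
    v = np.asarray(speed_kmh, dtype=float)
    grid = np.arange(t[0], t[-1] + 1)       # one sample per second
    return grid, np.interp(grid, t, v)

def filter_by_length(segments, min_s=20, max_s=700):
    """Keep kinematic segments whose duration lies in [min_s, max_s] seconds."""
    return [s for s in segments if min_s <= len(s) <= max_s]
```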

Model

Basic theory

(1) IDEC algorithm

Unsupervised clustering is a hot topic in big data analysis and machine learning. Compared with traditional clustering methods, clustering based on deep networks has stronger learning and classification ability and wider applicability. IDEC (Improved Deep Embedded Clustering) defines a clustering loss so that the network parameters and the cluster centers are updated simultaneously. It embeds an autoencoder into the Deep Embedded Clustering (DEC) algorithm, so that clustering and local-structure-preserving feature learning are carried out jointly, which improves the efficiency and effectiveness of the algorithm. IDEC defines its objective functional from a reconstruction loss and a clustering loss:

$$L = L_r + \gamma L_c \tag{4-3-1}$$

where $L_r$ is the reconstruction loss, $L_c$ is the clustering loss, and $\gamma > 0$ is a control coefficient. When $\gamma = 1$ and $L_r \equiv 0$, the objective degenerates to the traditional DEC objective, i.e., the algorithm no longer considers the reconstruction loss. The clustering loss is defined as in DEC:

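$$L_c = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}} \tag{4-3-2}$$

where $q_{ij}$ is the similarity between embedded point $z_i$ and cluster center $\mu_j$, measured with a Student's t-distribution, and the target distribution $p_{ij}$ is defined by:

$$p_{ij} = \frac{q_{ij}^2 / \sum_i q_{ij}}{\sum_{j'} \left( q_{ij'}^2 / \sum_i q_{ij'} \right)} \tag{4-3-3}$$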

Unlike DEC, which discards the decoder and fine-tunes only the encoder, IDEC keeps the decoder and defines a reconstruction loss alongside clustering:

$$L_r = \sum_{i=1}^{n} \left\| x_i - g_{W'}(f_W(x_i)) \right\|_2^2 \tag{4-3-4}$$

where $f_W$ and $g_{W'}$ are the encoding and decoding functions, respectively. Because the reconstruction loss comes from the autoencoder, the method preserves the local structure of the data-generating distribution.

The objective (4-3-1) can be optimized with mini-batch stochastic gradient descent (SGD) and backpropagation. Overall, each iteration optimizes three kinds of parameters: the autoencoder weights, the cluster centers, and the target distribution.

First, with the target distribution fixed, compute the gradients of the clustering loss with respect to the embedded points and the cluster centers. Then, following SGD with mini-batch size $m$ and learning rate $\lambda$, update the cluster centers, the encoder weights, and the decoder weights. The target distribution, and with it the cluster labels, is updated only once every $T$ iterations.
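To make the quantities in this section concrete, here is a minimal NumPy sketch of the soft assignment, the target distribution, and the combined loss, following the IDEC and DEC definitions as I understand them. The function names are hypothetical, the Student's t kernel uses one degree of freedom, and no network or gradient step is included, so this only evaluates the loss for given embeddings and reconstructions.

```python
import numpy as np

def soft_assign(z, mu):
    """q_ij: Student's t similarity between embedded points z and centers mu."""
    d2 = ((z[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    q = 1.0 / (1.0 + d2)
    return q / q.sum(axis=1, keepdims=True)     # each row sums to 1

def target_distribution(q):
    """p_ij: square q, divide by soft cluster frequency, renormalize rows."""
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

def idec_loss(x, x_rec, q, p, gamma):
    """L = L_r + gamma * L_c for inputs x and reconstructions x_rec."""
    l_r = ((x - x_rec) ** 2).sum()              # reconstruction loss
    l_c = (p * np.log(p / q)).sum()             # KL(P || Q) clustering loss
    return l_r + gamma * l_c
```

In a training loop, `target_distribution` would be recomputed only every T iterations while `soft_assign` is evaluated on every mini-batch.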

 

(2) t-SNE dimensionality reduction

t-distributed stochastic neighbor embedding (t-SNE) is a popular nonlinear manifold learning method grounded in information theory. It recovers low-dimensional manifold structure from high-dimensional sampled data, which makes it suitable for dimensionality reduction and visualization, and it is among the most effective methods for visualizing high-dimensional data. The core idea of t-SNE is to convert the similarities between data points into probabilities.

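Similarity in the original space is represented by a Gaussian joint probability:

$$p_{ij} = \frac{p_{j|i} + p_{i|j}}{2n}, \qquad p_{j|i} = \frac{\exp\left(-\|x_i - x_j\|^2 / 2\sigma_i^2\right)}{\sum_{k \neq i} \exp\left(-\|x_i - x_k\|^2 / 2\sigma_i^2\right)} \tag{4-3-5}$$

Similarity in the embedding space is represented by a Student's t-distribution:

$$q_{ij} = \frac{\left(1 + \|y_i - y_j\|^2\right)^{-1}}{\sum_{k \neq l} \left(1 + \|y_k - y_l\|^2\right)^{-1}} \tag{4-3-6}$$

The quality of the visualization is evaluated by the Kullback-Leibler (KL) divergence between the joint probabilities of the original space and the embedding space:

$$C = KL(P \,\|\, Q) = \sum_i \sum_j p_{ij} \log \frac{p_{ij}}{q_{ij}} \tag{4-3-7}$$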
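The three quantities (4-3-5) to (4-3-7) can be evaluated directly for a given embedding, as in the sketch below. It is an illustration rather than a full t-SNE: the function names are hypothetical, a single fixed bandwidth `sigma` replaces the per-point bandwidths that real t-SNE tunes through perplexity, and no gradient descent on the embedding is performed.

```python
import numpy as np

def pairwise_sq_dists(a):
    """Matrix of squared Euclidean distances between the rows of a."""
    diff = a[:, None, :] - a[None, :, :]
    return (diff ** 2).sum(axis=2)

def gaussian_joint(x, sigma=1.0):
    """Symmetrized Gaussian affinities P in the original space."""
    w = np.exp(-pairwise_sq_dists(x) / (2 * sigma ** 2))
    np.fill_diagonal(w, 0.0)
    cond = w / w.sum(axis=1, keepdims=True)     # conditional p_{j|i}
    return (cond + cond.T) / (2 * len(x))

def student_t_joint(y):
    """Student's t affinities Q in the embedding space."""
    w = 1.0 / (1.0 + pairwise_sq_dists(y))
    np.fill_diagonal(w, 0.0)
    return w / w.sum()

def kl_divergence(p, q):
    """KL(P || Q), skipping entries where p is zero."""
    mask = p > 0
    return float((p[mask] * np.log(p[mask] / q[mask])).sum())
```

A lower `kl_divergence(gaussian_joint(x), student_t_joint(y))` means the embedding `y` preserves the neighborhood structure of `x` better.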

(3) K-means clustering

Cluster analysis, also called group analysis, is a multivariate statistical method for classifying samples or variables. It aims to maximize homogeneity among objects within a class and heterogeneity between classes. K-means clustering can classify large data sets efficiently, so it is widely used in data analysis and processing.

The idea of K-means is simple: given a sample set, partition it into K clusters according to the distances between samples, so that points within a cluster are as close together as possible and the clusters are as far apart as possible.

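Suppose the clusters are $(C_1, C_2, \dots, C_K)$. The objective is to minimize the squared error:

$$E = \sum_{k=1}^{K} \sum_{x \in C_k} \left\| x - \mu_k \right\|_2^2 \tag{4-3-8}$$

where $\mu_k$ is the mean vector of cluster $C_k$, which can be expressed as:

$$\mu_k = \frac{1}{|C_k|} \sum_{x \in C_k} x \tag{4-3-9}$$

The K-means objective (4-3-8) is minimized with a heuristic iterative algorithm, which finally yields a reasonably good partition.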
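The usual heuristic for (4-3-8) is Lloyd's algorithm: alternate between assigning each sample to its nearest center and recomputing each center as the mean of its cluster. The sketch below is a plain NumPy version with random initialization; the function name, the iteration cap, and the seed are assumptions, and an empty cluster simply keeps its previous center.

```python
import numpy as np

def kmeans(x, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns labels, centers and the squared error E."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]

    def assign(c):
        d2 = ((x[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if (labels == j).any() else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = assign(centers)                    # final assignment
    sse = ((x - centers[labels]) ** 2).sum()    # objective (4-3-8)
    return labels, centers, sse
```

Because the result depends on initialization, running it with several seeds and keeping the lowest `sse` is a common precaution.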

4.3.2.2 Modeling

Based on the theory above, the objective functional is established as follows:

                     (4-3-10)

The input $x_i$ is a kinematic segment. The encoding function $f_W$ maps it to a feature vector $z_i = f_W(x_i)$ that characterizes the segment, called the network characteristic parameters, and the decoding function $g_{W'}$ yields the reconstructed kinematic segment $\hat{x}_i = g_{W'}(z_i)$. The reconstruction loss is defined as the two-norm between the input and the reconstructed output:

$$L_r = \sum_{i=1}^{n} \left\| x_i - \hat{x}_i \right\|_2^2 \tag{4-3-11}$$

The IDEC network yields the network characteristic parameters that characterize each kinematic segment. The feature vector has dimension 10, so there are 10 network characteristic parameters. These 10 parameters are fused with the 9 characteristic parameters computed earlier, and t-SNE reduces the fused features to 3 components that characterize each kinematic segment. K-means then clusters the 3 components into 4 classes, and the clustering loss is defined using the silhouette coefficient:

$$s_i = \frac{b_i - a_i}{\max(a_i, b_i)} \tag{4-3-12}$$

where $a_i$ is the average distance between sample $i$ and the other points in the same cluster, and $b_i$ is the average distance between sample $i$ and all points in the nearest other cluster. This completes the objective functional. The weights of the IDEC network are then updated by backpropagation, and the iterations continue until the objective converges or meets the requirement.
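The silhouette coefficient described above can be computed per sample as in the sketch below. The function name is hypothetical, at least two clusters are required, and a sample alone in its cluster gets a score of 0 by convention. How the report turns the coefficient into a loss term is not stated in this post, so the sketch stops at the coefficient itself.

```python
import numpy as np

def silhouette_scores(x, labels):
    """Per-sample silhouette coefficient (b - a) / max(a, b)."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    dist = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2))
    scores = np.zeros(len(x))
    for i in range(len(x)):
        same = labels == labels[i]
        same[i] = False                         # exclude the sample itself
        if not same.any():
            continue                            # singleton cluster keeps 0
        a = dist[i, same].mean()                # mean intra-cluster distance
        b = min(dist[i, labels == c].mean()     # nearest other cluster
                for c in np.unique(labels) if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores
```

Scores close to 1 indicate compact, well-separated clusters; the mean over all samples is the usual single-number summary.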

Following the model, the flowchart for constructing the driving cycle with the IDEC network is shown below:

 

 

 

Figure 4-3-1. Construction of the driving cycle curve based on the IDEC network

Implementation code:

Available at: https://github.com/cui-xiaoang96/2019-China-GMCM

 

 

 

 

 

 


Origin www.cnblogs.com/daangzi96/p/11649001.html