The Advantages and Disadvantages of Deep Learning


Over the course of their development, neural networks have experienced three waves of rise and decline, largely because their advantages and disadvantages came to the fore at different times. In theory, a neural network with only a single hidden layer can fit any function, but this is rarely how networks are used in practice. Data are usually fitted with a multilayer neural network containing several hidden layers.
First, disadvantages and effective countermeasures
1. In the early period, before the BP algorithm had been invented and while computing power was limited, the neural networks that could be built were small, so their performance was correspondingly limited. In the 1990s, with the BP algorithm available, back-propagation could be used: a gradient-descent optimization guided by the error signal trains the network, shrinking the gap between each predicted value and its target so that the model learns. However, as networks grew and the number of hidden layers increased, the BP algorithm ran into the vanishing-gradient problem: the error signal decays as it propagates backward, so the weights of the earlier layers are barely updated, which limits how accurately the network can fit the data. To preserve the error signal, researchers proposed various strategies to keep it from vanishing as it propagates.
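The vanishing-gradient effect described above can be seen in a small NumPy sketch (illustrative only, not from the original post): the back-propagated error is scaled at every layer by the sigmoid derivative, which never exceeds 0.25, so after 20 layers almost nothing reaches the earliest weights.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# forward pass through a deep stack of sigmoid layers
x = rng.normal(size=16)
activations = [x]
for _ in range(20):
    w = rng.normal(scale=0.5, size=(16, 16))
    activations.append(sigmoid(activations[-1] @ w))

# the back-propagated error is scaled by the sigmoid derivative
# a * (1 - a) <= 0.25 at every layer, so it shrinks geometrically
scales = []
scale = 1.0
for a in activations[1:]:
    scale *= np.mean(a * (1.0 - a))
    scales.append(scale)

print(f"error scale after  1 layer : {scales[0]:.3f}")
print(f"error scale after 20 layers: {scales[-1]:.2e}")
```

After 20 layers the error scale is many orders of magnitude below its starting value, which is why the early layers barely move.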
2. The variable part of a neural network model consists mainly of the architecture: how many layers the network has, how many hidden nodes each layer contains, which activation function each hidden node uses, how the layers are connected, and so on. Normally, once the architecture is fixed, training reduces to learning the weights between layers, and to learn them well many optimization algorithms have been designed, including gradient descent, the conjugate gradient method, Newton's method, L-BFGS, quasi-Newton methods, and trust-region methods. As networks keep growing, the number of parameters to learn increases and with it the degrees of freedom of the model, which makes training much more difficult. When training a complex neural network it is easy to fall into a local minimum and get stuck, so researchers have designed many techniques to keep training from becoming trapped in local minima.
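The local-minimum problem, and the random-restart countermeasure, can be sketched on a toy one-dimensional loss (a hypothetical example, not from the original post):

```python
import random

def f(x):
    """A simple non-convex loss with a local minimum (~1.13)
    and a global minimum (~-1.30)."""
    return x**4 - 3 * x**2 + x

def grad_f(x):
    return 4 * x**3 - 6 * x + 1

def gradient_descent(x0, lr=0.01, steps=500):
    x = x0
    for _ in range(steps):
        x -= lr * grad_f(x)
    return x

# the same algorithm lands in different minima depending on the start
x_right = gradient_descent(2.0)    # stuck in the local minimum
x_left = gradient_descent(-2.0)    # finds the global minimum

# one common countermeasure: random restarts, keep the best result
random.seed(0)
best = min((gradient_descent(random.uniform(-2.0, 2.0)) for _ in range(10)),
           key=f)
print(round(x_right, 2), round(x_left, 2), round(best, 2))
```

In higher dimensions the picture is more complicated, but the principle behind restarts and related tricks is the same: explore more than one basin of attraction.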
3. As point 2 suggests, as the network structure becomes more complex, the fitting capacity of the model grows ever more powerful, but overfitting frequently appears. This is a fatal problem in machine learning: the model performs well on the training data but generalizes poorly, performing badly on data it has not seen. Researchers have therefore proposed many methods to avoid overfitting as much as possible.

(1) One conventional approach is simply to collect much more data. But as the number of parameters grows, the parameter space expands exponentially, and to cover the sample space completely the amount of data required also grows exponentially; avoiding overfitting purely by adding data is therefore unlikely to work well.

(2) Another approach is to stop training at the right time (early stopping). This generally requires a held-out set to monitor the generalization of the network in real time, and cross-validation is usually used to decide whether to stop training.
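A rough sketch of the early-stopping rule described above (the patience value and the loss curve are illustrative assumptions, not from the original post):

```python
def early_stopping_index(val_losses, patience=3):
    """Return the epoch to stop at: the best epoch so far, once
    `patience` epochs pass without improvement on validation loss."""
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch

# validation loss falls, then rises as the network starts to overfit
losses = [1.0, 0.7, 0.5, 0.45, 0.47, 0.55, 0.62, 0.70]
print(early_stopping_index(losses))  # stops at epoch 3, the minimum
```

In practice the monitored quantity comes from a validation split, and the weights saved at the best epoch are the ones kept.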

(3) A third method is to add a regularization term, i.e., a constraint on the parameters being learned that shrinks the feasible parameter space as much as possible. This is very common in machine learning; typical choices are the Lasso (L1) penalty, the Ridge (L2) penalty, or the Elastic Net penalty (a weighted combination of the previous two).
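The three penalties can be written down directly; here is a minimal NumPy sketch (the weight vector and the λ value are illustrative):

```python
import numpy as np

def l1_penalty(w, lam):
    """Lasso: lam * sum(|w|), encourages sparse weights."""
    return lam * np.sum(np.abs(w))

def l2_penalty(w, lam):
    """Ridge: lam * sum(w^2), encourages small weights."""
    return lam * np.sum(w ** 2)

def elastic_net(w, lam, alpha=0.5):
    """Weighted blend of the L1 and L2 penalties."""
    return alpha * l1_penalty(w, lam) + (1.0 - alpha) * l2_penalty(w, lam)

w = np.array([0.5, -2.0, 1.5])
print(l1_penalty(w, 0.1))   # 0.1 * 4.0  = 0.4
print(l2_penalty(w, 0.1))   # 0.1 * 6.5  = 0.65
```

The penalty is added to the training loss, so gradient descent trades data fit against weight size.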

(4) Deep neural networks, owing to their particular structure, can also be simplified with dropout: at each training iteration, some nodes are masked out with a certain probability. This largely prevents co-adaptation between nodes and makes the network more diverse, which roughly corresponds to an ensemble of many different networks, improving the generalization of the final model.
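A minimal NumPy sketch of dropout as described above; it follows the common "inverted dropout" convention of rescaling the surviving units at training time so that no rescaling is needed at test time:

```python
import numpy as np

def dropout(activations, p_drop, rng, training=True):
    """Inverted dropout: zero each unit with probability p_drop during
    training and rescale survivors so the expected value is unchanged."""
    if not training or p_drop == 0.0:
        return activations
    keep = rng.random(activations.shape) >= p_drop
    return activations * keep / (1.0 - p_drop)

rng = np.random.default_rng(0)
a = np.ones((4, 8))
dropped = dropout(a, 0.5, rng)
print(dropped)  # roughly half the entries are 0, the rest are 2.0
```

At test time the function is a no-op, so the full (implicitly ensembled) network is used for prediction.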


Second, advantages
1. The great strength of neural networks is their fitting capacity: they can approximate any complex function, and the network's representation can in effect reach arbitrarily high dimensionality, so their ability to fit data is extremely powerful. Many existing traditional machine learning methods are, to some extent, special cases of neural networks; SVMs, logistic regression, and others can all be realized as neural networks.
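The claim that logistic regression is a special case of a neural network can be made concrete: it is a network with no hidden layer and a single sigmoid output unit, trained by gradient descent on the cross-entropy loss. A minimal sketch on synthetic, linearly separable data (all data and parameters here are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # linearly separable labels

# a "network" with zero hidden layers: weights w, bias b, sigmoid output
w = np.zeros(2)
b = 0.0
lr = 0.5
for _ in range(200):
    p = sigmoid(X @ w + b)
    grad_w = X.T @ (p - y) / len(y)  # gradient of cross-entropy loss
    grad_b = np.mean(p - y)
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1))
print(f"training accuracy: {accuracy:.2f}")
```

Adding hidden layers with nonlinear activations between `X` and the output unit turns this same recipe into a multilayer network.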
2. Because a neural network contains many hidden layers, each with many hidden nodes, its representational power is very strong, which is well expressed in Bayesian terms. The Restricted Boltzmann Machine, for instance, is a neural-network form; by training Restricted Boltzmann Machines (or Bayesian networks) layer by layer, one can build deep Boltzmann machines and deep Bayesian networks, further enhancing the representational capacity of the network. On this basis, autoencoders emerged, allowing neural networks to learn features from data without supervision, especially abstract features of images, providing good support for subsequent classification, detection, segmentation, and other tasks without hand-designed features. To some extent, though, conventional feature-extraction methods can still serve as a reference for the features a neural network learns.
3. Furthermore, convolutional neural networks and recurrent neural networks have since been proposed, further improving the performance of neural networks and making them better at specific problems in specific domains, reflecting their strength. A convolutional network assumes, to some extent, that correlations within an image are strongly local and that distant regions are only weakly correlated; this Markov-style assumption makes the network easier to train.
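The locality and weight-sharing assumptions behind convolutional networks can be seen in a naive 2-D convolution (strictly, cross-correlation, as is conventional in deep learning); each output value depends only on a small local patch, and one shared kernel is reused everywhere (illustrative sketch):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive 2-D cross-correlation with 'valid' padding: each output
    pixel depends only on a local patch, and the same kernel weights
    are shared across all positions."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16.0).reshape(4, 4)
edge_kernel = np.array([[1.0, -1.0]])  # horizontal difference filter
print(conv2d_valid(image, edge_kernel))
```

Because the kernel is shared, a convolutional layer has far fewer parameters than a fully connected layer over the same input, which is one reason such networks train more easily.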
4. Neural networks can also be combined with probabilistic models, giving the network the ability to perform inference; with random factors added, the reasoning power of the network can be further improved.


In summary, neural networks have very strong representational power, but alongside that power come many problems, which calls for the joint efforts of scientists and scholars across every aspect of this field, drawing on biological neural networks as a reference, to further improve the performance of neural networks.


----------------
Original link: https://blog.csdn.net/nicholas_liu2017/article/details/71250082

Origin www.cnblogs.com/emanlee/p/12404147.html