An Easy-to-Understand Introduction to Deep Learning

First of all, there is a lot of material on deep learning online, but most of it is not suitable for beginners. I personally read the 300-page slide deck by the Taiwanese scholar Hung-yi Lee (Li Hongyi), "Understanding Deep Learning in One Day", which explains things in a fairly approachable way; what follows is based on it. The Baidu network disk link to the slides is also attached: https://pan.baidu.com/s/1b7RVLF8HgqTOE0suFCt_uw Password: os0w. I hope it helps everyone.

So-called deep learning is simply a neural network with multiple hidden layers. Deep learning requires a certain amount of mathematics, so it is worth explaining it in a simple, easy-to-understand way that does not intimidate beginners. To understand deep learning, you should first dust off the long-forgotten concepts of derivatives, partial derivatives, and the common elementary functions. Deep learning uses neural networks to solve problems that are not linearly separable. Building a neural network has three main parts: define a set of model functions, judge how good each candidate function is, and pick the best one. The network is called a neural network because its basic unit acts like a neuron: it takes several inputs, multiplies each by a weight, adds a bias, and passes the sum through an activation function to produce a single output.
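To make that concrete, here is a minimal Python sketch of a single neuron (the input values, weights, and bias are invented for illustration, and sigmoid is just one common choice of activation):

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # A neuron: weighted sum of the inputs plus a bias,
    # passed through a nonlinear activation function
    return sigmoid(np.dot(w, x) + b)

x = np.array([1.0, 2.0, 3.0])   # example inputs (made up)
w = np.array([0.5, -0.3, 0.8])  # example weights (made up)
b = 0.1                         # example bias (made up)
print(neuron(x, w, b))          # the neuron's single output
```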

Connecting these neurons in different ways yields different neural networks. In a fully connected feed-forward network, for example, every neuron in one layer is connected to every neuron in the next layer. To judge how good or bad a model function is, the most important thing is to define the model's loss function. Once the loss function is fixed, our goal is to minimize that loss, and selecting the best model parameters becomes the problem of finding the parameters that minimize the loss function. This selection is usually done with gradient descent. However, gradient descent cannot guarantee a global optimum and often falls into a local optimum; which one it finds depends on the choice of the initial point. In deep learning, gradient descent is applied through the backpropagation algorithm: in essence, backpropagation is the process of using gradient descent to continually update the weights of the different neurons in the network.
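As a rough illustration, here is a minimal numpy sketch of gradient descent with backpropagation (the hidden-layer size, learning rate, and iteration count are all arbitrary choices for this example), trained on XOR, the classic linearly inseparable problem mentioned above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR: a linearly inseparable problem
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output
lr = 1.0  # learning rate (arbitrary choice)

for step in range(5000):
    # Forward pass: compute the network's predictions
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    loss = np.mean((out - y) ** 2)  # mean squared error

    # Backward pass (backpropagation): push the error
    # gradient back through the layers, one layer at a time
    d_out = 2 * (out - y) / len(X) * out * (1 - out)
    d_W2, d_b2 = h.T @ d_out, d_out.sum(axis=0)
    d_h = (d_out @ W2.T) * h * (1 - h)
    d_W1, d_b1 = X.T @ d_h, d_h.sum(axis=0)

    # Gradient descent: step every weight against its gradient
    W1 -= lr * d_W1; b1 -= lr * d_b1
    W2 -= lr * d_W2; b2 -= lr * d_b2

print(loss)           # should be close to 0 after training
print(out.round(2))   # predictions should approach [0, 1, 1, 0]
```

Note that with a different random initial point the same loop may settle into a worse local optimum, which is exactly the sensitivity to initialization described above.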

A deep neural network consists of an input layer, hidden layers, and an output layer.

The input layer is like this: most people have various requirements for something, and each requirement matters to a different degree; that difference in importance is the difference in weight. These different factors make up the inputs of the input layer.

The hidden layer is like a specific person adjusting those requirements according to his own wishes, modifying the proportions until they best match his actual needs. This running-in is a process of continuous learning and correction.

The output layer produces the result: after the weights in the hidden layer have shifted, the final verdict of suitable or not is given at the output layer.
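Tying the three layers together, here is a minimal forward-pass sketch in numpy (the layer sizes and every number are invented purely for illustration): the input layer holds the raw requirement values, the hidden layer re-weights them, and the output layer turns them into one final score.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Input layer: the raw requirement values (made-up numbers)
x = np.array([0.7, 0.2, 0.9])

# Hidden layer: re-weights the inputs into adjusted features
W_hidden = np.array([[ 0.4, -0.6],
                     [ 0.3,  0.8],
                     [-0.5,  0.2]])
h = sigmoid(x @ W_hidden)

# Output layer: combines the hidden activations into one score
W_out = np.array([[0.9], [-0.4]])
score = sigmoid(h @ W_out)
print(score)  # a single "suitable or not" score between 0 and 1
```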

In deep learning, both underfitting and overfitting must be avoided. Underfitting means the model has not been trained enough (or is too simple), so the function obtained is far from the best one. Although underfitting is bad, overfitting is even less desirable: an overfitted model matches the training set too closely, so it performs well on the training set but poorly on the test set.
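A quick way to see both failure modes, sketched here with numpy polynomial fitting instead of a neural network (the data is synthetic and the degrees are arbitrary): a model that is too simple misses the trend, while one that is too flexible memorizes the training noise and degrades on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.shape)

# Hold out half the points to play the role of a test set
x_train, y_train = x[::2], y[::2]
x_test, y_test = x[1::2], y[1::2]

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_err = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_err = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # degree 1 underfits (both errors high); degree 9 overfits
    # (near-zero training error, but the test error blows up)
    print(degree, round(train_err, 4), round(test_err, 4))
```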

Deep learning is a process of continuous running-in: the parameters start from some initial values and are then constantly revised, changing the weights of some of the nodes. Imagine deep learning as teaching students: we keep drilling them with practice questions and then give them the correct answers, so that when they meet similar questions in the future they can solve them on their own. This is very much like the process by which a neural network learns its weights.

It might seem that the more parameters a deep model has, the better the training result will obviously be. It has been proved theoretically that a single hidden layer with enough neurons can realize any function, but in practice, as the number of units in a single hidden layer grows, accuracy improves more and more slowly, and sometimes even drops instead of rising. Compared with training a single very wide hidden layer, deep learning's way of expressing a function is more concise: building deep neural networks encourages a modular structure, so we can achieve what we want with less training data.
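One way to see the "deep is more concise than wide" point is to simply count parameters. The sketch below (all layer sizes are arbitrary illustrations, and it says nothing about the accuracy either net would reach on a real task) compares a single very wide hidden layer against several narrow ones:

```python
def count_params(layer_sizes):
    # A fully connected layer from m units to n units has
    # m * n weights plus n biases
    return sum(m * n + n for m, n in zip(layer_sizes, layer_sizes[1:]))

# Both nets map 100 inputs to 10 outputs (sizes are made up)
wide_shallow = [100, 2000, 10]          # one very wide hidden layer
deep_narrow = [100, 128, 128, 128, 10]  # three narrower hidden layers

print(count_params(wide_shallow))  # 222010 parameters
print(count_params(deep_narrow))   # 47242 parameters
```

If the deep network's intermediate layers can reuse each other's features (the modular structure mentioned above), it can represent comparably rich behaviour with a fraction of the parameters.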

