Deep Learning Specialization Course Notes - Introduction to Deep Learning

The first course, Neural Networks and Deep Learning, is a four-week course in which you will learn how to build a neural network, including a deep neural network, and how to train it on data.

By the end of this class, you will have built a neural network that recognizes cats.

What is a Neural Network?

A picture explaining the ReLU function, illustrated by the house-price prediction curve (the same picture also shows what a single neuron is):
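The single-neuron idea from the picture can be sketched in a few lines of Python. The weight and bias values below are hypothetical, chosen only to illustrate the shape: the predicted price is zero until the house size passes a threshold, then grows linearly, exactly the ReLU curve.

```python
import numpy as np

def relu(z):
    # ReLU ("rectified linear unit"): max(0, z)
    # zero for negative inputs, identity for positive inputs
    return np.maximum(0.0, z)

# A single neuron for house-price prediction: price = relu(w * size + b)
w, b = 0.1, -5.0                          # hypothetical weight and bias
sizes = np.array([30.0, 50.0, 100.0])     # house sizes
prices = relu(w * sizes + b)              # [0., 0., 5.] -- flat, then linear
```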


Several factors jointly determine the housing price. Here the inputs x are the size, the number of bedrooms, the zip code, and the affluence of the neighborhood; the output y is the price; and the middle part is the neural network, where each small circle is called a hidden neuron of the network.


Note that each neuron in the hidden layer is fully connected to the input layer, like so:


It is up to the neural network itself to learn how strongly each hidden neuron actually depends on each input (the learned connections may end up looking like the hand-drawn picture above).
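The fully connected structure described above can be sketched as a forward pass. All numbers here are hypothetical placeholders (random weights, made-up inputs); the point is that every hidden neuron receives all four inputs, and training would later adjust the weight matrices.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical 4-feature input: size, #bedrooms, zip code, affluence
x = np.array([120.0, 3.0, 94016.0, 0.7])

rng = np.random.default_rng(0)

# Hidden layer of 3 fully connected neurons: each sees ALL 4 inputs
W1 = rng.normal(size=(3, 4)) * 0.01   # 3 neurons x 4 inputs
b1 = np.zeros(3)
a1 = relu(W1 @ x + b1)                # hidden activations, shape (3,)

# Output neuron combines the hidden activations into a price estimate
W2 = rng.normal(size=(1, 3)) * 0.01
b2 = np.zeros(1)
price = (W2 @ a1 + b2)[0]             # a single scalar prediction
```

With random untrained weights the predicted price is meaningless; gradient descent on training data is what turns this structure into a useful predictor.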


Supervised Learning with Neural Networks

In housing price prediction and online advertising, a standard NN is usually used; in image applications, CNNs with a convolutional structure; for audio, which is time-series data, RNNs with a recurrent structure; language is also sequential data with its own time ordering, and usually uses more complex RNNs; autonomous driving involves image content and usually uses more complex CNNs.

Schematic diagram of Standard NN, CNN, RNN:


Explanation of structured and unstructured data:



Why is Deep Learning taking off?

Over the past few decades, a huge amount of data has been generated, and ordinary machine learning falls short when processing big data:


As you can see, the x-axis here is the amount of data, and the y-axis is the performance.

It can be seen that good performance requires: 1. a large amount of data; 2. a large neural network, with many hidden neurons and many connections between them.

At the same time, on small data sets it is hard to rank the algorithms (with good hand-engineered features, an SVM may even outperform an NN); on large data sets, the performance of NNs far outstrips other ML algorithms.

Therefore, the early rise of deep learning was driven by the scale of data and of computation: simply training a very large neural network on CPUs or GPUs was enough to get good results.

In recent years, many algorithmic innovations have aimed at making neural networks train faster, such as replacing the sigmoid activation function with ReLU.

Reason: in some intervals the slope (gradient) of the sigmoid function is close to 0, so learning becomes very slow, while the gradient of ReLU is 1 for all positive inputs.





Note: All screenshots in this series are from Coursera's Deep Learning Specialization course.
