AI notes - deep learning

CNN
The value of CNN:

It can effectively reduce a large volume of data to a much smaller one (without materially affecting the results)
while preserving the features of the image, in a way similar to the principles of human vision.


Basic principles of CNN:

Convolutional layer – its main function is to extract and retain the features of the image.
Pooling layer – its main function is to reduce the dimensionality of the data, which helps avoid overfitting.
Fully connected layer – outputs the final result required by the task at hand.
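
The first two layers above can be sketched in a few lines of NumPy. This is a minimal illustration, not a real framework implementation: the 3×3 "vertical-edge" kernel and the 28×28 toy image are assumptions chosen just to show how convolution keeps features while pooling shrinks the data.

```python
import numpy as np

def conv2d(image, kernel):
    # Valid 2-D convolution (strictly, cross-correlation, as in most DL frameworks)
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    # Keep the strongest response in each size x size window
    H2, W2 = x.shape[0] // size, x.shape[1] // size
    return x[:H2 * size, :W2 * size].reshape(H2, size, W2, size).max(axis=(1, 3))

image = np.random.rand(28, 28)           # a toy "image"
edge_kernel = np.array([[1., 0., -1.],
                        [1., 0., -1.],
                        [1., 0., -1.]])  # detects vertical edges

features = conv2d(image, edge_kernel)    # (26, 26): features extracted
pooled = max_pool(features)              # (13, 13): dimensionality reduced
```

Note how each step shrinks the data (28×28 → 26×26 → 13×13) while the pooled map still records where the strongest edge responses were.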


Practical applications of CNN:

Image classification and retrieval,
object localization and detection,
object segmentation,
face recognition


RNN
The biggest difference between an RNN and a traditional neural network is that at each step the previous hidden state is fed back into the network together with the current input.
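
This recurrence can be written as a single step function, h_t = tanh(Wx·x_t + Wh·h_{t-1} + b). Below is a minimal NumPy sketch; the sizes (hidden = 4, inputs = 3) and random weights are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden, inputs = 4, 3
Wx = rng.normal(size=(hidden, inputs)) * 0.1   # input-to-hidden weights
Wh = rng.normal(size=(hidden, hidden)) * 0.1   # hidden-to-hidden (recurrent) weights
b = np.zeros(hidden)

def rnn_step(x_t, h_prev):
    # The new hidden state depends on the current input AND the previous state
    return np.tanh(Wx @ x_t + Wh @ h_prev + b)

h = np.zeros(hidden)
for t in range(5):                    # process a length-5 sequence
    x_t = rng.normal(size=inputs)
    h = rnn_step(x_t, h)              # previous output carried into the next step
```

The loop is what makes the network "recurrent": the same weights are reused at every time step, with `h` threading information forward through the sequence.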

Long short-term memory network – LSTM
A standard RNN has only a single tanh layer in its repeating module; LSTM adds gating mechanisms on top of it, so that only important information is retained.

GRU is a variant of LSTM. It keeps LSTM's ability to highlight important information and forget unimportant information, so the important signal is not lost during long-range propagation, while using fewer parameters. This can save a lot of time when the training data set is relatively large.

The unique value of RNN is that it can effectively process sequence data.
Based on RNN, variant algorithms such as LSTM and GRU have emerged. These variants share two main characteristics:

Long-term information can be effectively retained.
Important information is selected and kept, while unimportant information is "forgotten".
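
The "keep vs forget" behaviour comes from gates. As an illustration, here is a single GRU step in NumPy (the standard GRU equations; the dimensions and random weights are assumptions, and a real implementation would of course use a framework like PyTorch):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
hidden, inputs = 4, 3
# One weight matrix per gate, acting on the concatenated [h_prev; x_t]
Wz = rng.normal(size=(hidden, hidden + inputs)) * 0.1   # update gate
Wr = rng.normal(size=(hidden, hidden + inputs)) * 0.1   # reset gate
Wc = rng.normal(size=(hidden, hidden + inputs)) * 0.1   # candidate state

def gru_step(x_t, h_prev):
    hx = np.concatenate([h_prev, x_t])
    z = sigmoid(Wz @ hx)            # update gate: how much to overwrite the old state
    r = sigmoid(Wr @ hx)            # reset gate: how much history feeds the candidate
    h_tilde = np.tanh(Wc @ np.concatenate([r * h_prev, x_t]))
    # Gated blend: important old information is kept, the rest is "forgotten"
    return (1 - z) * h_prev + z * h_tilde

h = np.zeros(hidden)
for t in range(50):                 # even over a long sequence, the state stays bounded
    h = gru_step(rng.normal(size=inputs), h)
```

Because the state is updated by a convex blend `(1 - z) * h_prev + z * h_tilde` rather than being fully rewritten each step, information can survive many time steps when the update gate stays near zero.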
Several typical applications of RNN are as follows:
Text generation
Speech recognition
Machine translation
Image caption generation
Video tagging


GANs
Generative adversarial networks (GANs) consist of two important parts:

Generator – generates data (images in most cases), with the purpose of "fooling" the discriminator.
Discriminator – determines whether an image is real or machine-generated; its purpose is to catch the "fake data" produced by the generator.
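
The two roles and their opposing objectives can be sketched on a toy 1-D problem. Everything here is an assumed, deliberately simplified setup (a scalar-shift "generator", a logistic "discriminator", real data drawn from N(3, 1)) just to show what each side's loss measures; no training loop is included.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

theta = 0.0                               # generator parameter (a simple shift)
w, b = 1.0, 0.0                           # discriminator parameters

def generator(z):
    return z + theta                      # maps noise to "fake" samples

def discriminator(x):
    return sigmoid(w * x + b)             # estimated probability that x is real

real = rng.normal(3.0, 1.0, size=64)                  # "real" data
fake = generator(rng.normal(0.0, 1.0, size=64))       # generator's output

# Discriminator wants D(real) -> 1 and D(fake) -> 0 (catch the fakes);
# the generator wants D(fake) -> 1 (fool the discriminator).
d_loss = -np.mean(np.log(discriminator(real)) + np.log(1 - discriminator(fake)))
g_loss = -np.mean(np.log(discriminator(fake)))
```

Training alternates gradient steps on `d_loss` (discriminator) and `g_loss` (generator); the adversarial game is exactly the tension between these two losses.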
3 advantages

Can better model data distributions (generated images are sharper and clearer).
In theory, GANs can train any kind of generator network; other frameworks require the generator network to have some specific functional form, such as a Gaussian output layer.
There is no need for repeated sampling with Markov chains, no inference during the learning process, and no complex variational lower bounds, avoiding the difficult problem of approximating intractable probabilities.
2 flaws

Difficult to train and unstable. Good synchronization is required between the generator and the discriminator, but in actual training it is easy for D to converge while G diverges; D/G training requires careful design.
Mode collapse. The learning process of GANs may suffer from mode loss: the generator begins to degenerate, always producing the same sample points and becoming unable to continue learning.
 


Origin blog.csdn.net/qq_27246521/article/details/132491777