A deep learning overview for novices

1. Introduction

Deep learning methods use multiple layers to learn data features at multiple levels of abstraction. Deep Learning (DL), also known as hierarchical learning, is about accurately assigning credit across many stages of computation, each of which transforms the aggregate activation of the network. To learn complex functions, deep architectures stack many levels of abstraction, i.e. non-linear operations; artificial neural networks (ANNs) with many hidden layers are the standard example. In short, deep learning is a subfield of machine learning that uses multiple levels of non-linear information processing and abstraction for supervised or unsupervised feature learning, representation, classification, and pattern recognition.

2. Contents

1. Deep Learning (DL) Methods

2. Deep Architecture (i.e. Deep Neural Network (DNN))

3. Deep Generative Model (DGM)

3. Excellent paper

4. Deep Learning Methods

1. Deep supervised learning

Supervised learning is applied when the data is labeled and the task is classification or numerical prediction.

2. Deep unsupervised learning

When the input data is not labeled, unsupervised learning methods can be applied to extract features from the data and to group or label it.

3. Deep reinforcement learning

Reinforcement learning uses a system of rewards and penalties to choose the learning model's next action. It is mainly used in games and robotics, typically to solve decision-making problems.

5. Deep Neural Networks

5.1 Deep Autoencoder

An autoencoder (AE) is a neural network (NN) whose target output is its own input. An AE takes the original input, encodes it into a compressed representation, and then decodes that representation to reconstruct the input. In a deep AE, the lower hidden layers are used for encoding, the higher hidden layers are used for decoding, and error backpropagation is used for training.
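The encode, decode, and compare cycle can be sketched in a few lines. This is a minimal illustration only, assuming a single linear encoding layer and a single linear decoding layer with random, untrained weights; the names `encode`, `decode`, and `reconstruction_error` are hypothetical and not from any library.

```python
import random

def matvec(weights, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]

def encode(x, w_enc):
    # Compress the input into a lower-dimensional code (linear for brevity).
    return matvec(w_enc, x)

def decode(code, w_dec):
    # Map the code back into the input space.
    return matvec(w_dec, code)

def reconstruction_error(x, x_hat):
    # Mean squared error between the input and its reconstruction.
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

rng = random.Random(0)
input_dim, code_dim = 4, 2  # the code is narrower than the input
w_enc = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)] for _ in range(code_dim)]
w_dec = [[rng.uniform(-0.5, 0.5) for _ in range(code_dim)] for _ in range(input_dim)]

x = [1.0, 0.0, -1.0, 0.5]
code = encode(x, w_enc)
x_hat = decode(code, w_dec)
loss = reconstruction_error(x, x_hat)  # the quantity backpropagation would minimize
```

Training would repeatedly adjust `w_enc` and `w_dec` to reduce `loss`; that step is left out here to keep the sketch short.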

5.1.1 Variational Autoencoders

A variational autoencoder (VAE) pairs a probabilistic encoder with a decoder, and the trained decoder can be used on its own as a generative model. VAEs are built on standard neural networks and can be trained by stochastic gradient descent (Doersch, 2016).
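One reason stochastic gradient descent works for VAEs is the reparameterization trick, which the text above does not spell out. The sketch below is an illustration under assumptions: the latent distribution is a diagonal Gaussian, and the function names are hypothetical. It shows the sampling step and the closed-form KL term that is usually added to the reconstruction loss.

```python
import math
import random

def reparameterize(mu, log_var, rng):
    # z = mu + sigma * eps with eps ~ N(0, 1). The randomness lives in eps,
    # so gradients can flow through mu and log_var.
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions.
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)
mu, log_var = [0.5, -0.2], [0.0, -1.0]  # would come from the encoder network
z = reparameterize(mu, log_var, rng)     # would be fed to the decoder network
kl = kl_to_standard_normal(mu, log_var)
```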

5.1.2 Stacked denoising autoencoder

In early autoencoders (AE), the encoding layer has a smaller (narrower) dimension than the input layer. In a stacked denoising autoencoder (SDAE), the encoding layer is wider than the input layer (Deng and Yu, 2014).
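What makes such an autoencoder "denoising" is that it is trained to reconstruct the clean input from a corrupted copy. The sketch below shows only the corruption step, assuming masking noise (randomly zeroing components); the function name `corrupt` and the parameter `drop_prob` are hypothetical.

```python
import random

def corrupt(x, drop_prob, rng):
    # Masking noise: set each input component to zero with probability drop_prob.
    return [0.0 if rng.random() < drop_prob else v for v in x]

rng = random.Random(0)
clean = [0.2, 0.9, 0.4, 0.7]
noisy = corrupt(clean, drop_prob=0.5, rng=rng)
# A denoising autoencoder would receive `noisy` as input
# and be trained to output `clean`.
```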

5.1.3 Transforming autoencoders

Deep autoencoders (DAEs) can be transformation-variant, that is, the features extracted by multiple layers of non-linear processing can change depending on the learner. Transforming autoencoders (TAEs) work with both the input vector and the target output vector to enforce transformation invariance and steer the codes in a desired direction (Deng and Yu, 2014).

5.2 Deep Convolutional Neural Networks

Four basic ideas make up a convolutional neural network (CNN): local connections, shared weights, pooling, and the use of many layers. The first part of a CNN consists of convolutional and pooling layers, and the latter part consists mainly of fully connected layers. Convolutional layers detect local conjunctions of features, and pooling layers merge similar features into one. In its convolutional layers, a CNN uses convolution in place of general matrix multiplication.
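Local connections and shared weights are easiest to see in code: the same small kernel is slid over every position of the input. The following is a minimal sketch in plain Python, assuming no padding and a stride of 1; like most deep learning libraries, it computes cross-correlation and calls it convolution. The function name `conv2d` and the example kernel are illustrative.

```python
def conv2d(image, kernel):
    """'Valid' 2D cross-correlation of a 2D image with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Each output value depends only on a small local patch,
            # and every patch is weighted by the same kernel.
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edge_kernel = [[-1, 1]]  # responds where intensity increases from left to right
feature_map = conv2d(image, edge_kernel)
# feature_map is [[0, 1, 0], [0, 1, 0], [0, 1, 0]]: the vertical edge is detected
```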

5.2.1 Deep Max Pooling Convolutional Neural Network (MPCNN)

A max-pooling convolutional neural network (MPCNN) operates mainly on convolution and max pooling, especially in digital image processing. An MPCNN usually consists of three types of layers besides the input layer. A convolutional layer takes the input image, generates feature maps, and then applies a non-linear activation function. A max-pooling layer downsamples the feature maps by keeping the maximum value of each sub-region. Fully connected layers perform a linear multiplication. In a deep MPCNN, convolutional and max-pooling layers alternate periodically after the input layer, followed by fully connected layers.
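The activation and max-pooling steps described above can be sketched as follows. This is an illustration under assumptions: ReLU is used as the non-linear activation, pooling windows are non-overlapping 2x2 squares, and the function names are hypothetical.

```python
def relu(feature_map):
    # Non-linear activation applied element-wise after the convolution.
    return [[max(0.0, v) for v in row] for row in feature_map]

def max_pool2d(feature_map, size=2):
    # Keep the maximum of each non-overlapping size x size sub-region.
    # Rows or columns that do not fill a complete window are dropped.
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, w - size + 1, size)
        ]
        for i in range(0, h - size + 1, size)
    ]

fm = [
    [1.0, -2.0, 3.0, 0.0],
    [4.0, 0.5, -1.0, 2.0],
    [-3.0, -1.0, 0.0, 0.0],
    [-2.0, -5.0, 1.0, -4.0],
]
pooled = max_pool2d(relu(fm))
# pooled is [[4.0, 3.0], [0.0, 1.0]]: a 4x4 map downsampled to 2x2
```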

5.2.2 Very Deep Convolutional Neural Networks

Very deep CNN architectures use very small convolution filters and reach depths of 16-19 layers. A very deep CNN was also the first architecture of this kind to be applied to text processing, where it works at the character level.

6. Training and Optimization Techniques

6.1 Dropout

Dropout is used to prevent neural networks from overfitting. It is a model-averaging regularization method that adds noise to the hidden units of a network: during training, it randomly drops units, together with their connections, from the network. Dropout can be used in graphical models such as RBMs (Srivastava et al., 2014) and in any type of neural network. A recently proposed improvement on Dropout is Fraternal Dropout for recurrent neural networks (RNNs).
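A minimal sketch of Dropout applied to one layer's activations is shown below. It assumes the common "inverted" variant, which rescales the surviving units during training so that nothing needs to change at test time; this is a widely used formulation, not necessarily the exact one in Srivastava et al. (2014). The function name and parameters are illustrative.

```python
import random

def dropout(activations, drop_prob, rng, training=True):
    # At test time (or with drop_prob == 0) the layer is the identity.
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob  # assumes drop_prob < 1
    # Zero each unit with probability drop_prob; scale survivors by 1/keep_prob
    # so the expected activation stays the same.
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)
hidden = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(hidden, drop_prob=0.5, rng=rng)
test_out = dropout(hidden, drop_prob=0.5, rng=rng, training=False)
```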

6.2 Maxout

Maxout is an activation function designed for use with Dropout. The output of a Maxout unit is the maximum of a set of inputs, which benefits Dropout's model averaging.
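As a sketch, a single Maxout unit can be written as the maximum over several affine functions of the input. The weights below are chosen by hand purely for illustration, and the function name is hypothetical.

```python
def maxout(x, weights, biases):
    """One Maxout unit: the maximum over k affine pieces w_k . x + b_k."""
    return max(
        sum(w * v for w, v in zip(row, x)) + b
        for row, b in zip(weights, biases)
    )

# With the two pieces x and -x, the unit computes the absolute value.
abs_weights = [[1.0], [-1.0]]
abs_biases = [0.0, 0.0]
y = maxout([-3.0], abs_weights, abs_biases)  # 3.0
```

In a trained network the pieces are learned, so the unit can approximate different convex activation shapes.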

6.3 Zoneout

Zoneout is a regularization method for recurrent neural networks (RNNs). Like Dropout, Zoneout injects random noise during training, but instead of dropping hidden units it makes them keep their previous values.
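The difference from Dropout is easiest to see in the update of a hidden state from one time step to the next. The sketch below is illustrative: the function name and arguments are assumptions, `h_new` stands for whatever the RNN cell computed at the current step, and the test-time branch uses the expected value of the random update.

```python
import random

def zoneout(h_prev, h_new, zoneout_prob, rng, training=True):
    if training:
        # Each unit keeps its previous value with probability zoneout_prob;
        # otherwise it takes the newly computed value.
        return [p if rng.random() < zoneout_prob else n
                for p, n in zip(h_prev, h_new)]
    # At test time, use the expectation of the stochastic update.
    return [zoneout_prob * p + (1.0 - zoneout_prob) * n
            for p, n in zip(h_prev, h_new)]

rng = random.Random(0)
h_prev = [1.0, 2.0]
h_new = [5.0, 6.0]
h_train = zoneout(h_prev, h_new, zoneout_prob=0.5, rng=rng)
h_test = zoneout(h_prev, h_new, zoneout_prob=0.5, rng=rng, training=False)
```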

6.4 Deep Residual Learning

He et al. (2015) proposed a deep residual learning framework for deep neural networks; the resulting networks, called ResNets, achieve low training error.
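The core idea of residual learning is that a block learns a residual F(x) and adds the input back through a shortcut connection, so its output is F(x) + x. A minimal sketch follows, assuming F preserves the dimension of x; `residual_block` and the two example functions are hypothetical stand-ins for real layers.

```python
def residual_block(x, residual_fn):
    # Output is F(x) + x: the shortcut carries the input past the layers.
    return [f + xi for f, xi in zip(residual_fn(x), x)]

def zero_residual(x):
    # If the layers learn F(x) = 0, the block reduces to the identity.
    return [0.0 for _ in x]

def double(x):
    # Stand-in for a learned transformation.
    return [2.0 * v for v in x]

identity_out = residual_block([1.0, 2.0], zero_residual)  # [1.0, 2.0]
shifted_out = residual_block([1.0, 2.0], double)          # [3.0, 6.0]
```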

6.5 Batch Normalization

Ioffe and Szegedy (2015) proposed batch normalization, a method that speeds up the training of deep neural networks by reducing internal covariate shift. Ioffe (2017) proposed batch renormalization, which extends the earlier method.
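The training-time computation of batch normalization can be sketched as below: each feature is normalized with the mean and variance taken across the mini-batch, then scaled and shifted by learned parameters. This is a simplified illustration; the running statistics used at inference are omitted, and the names `gamma`, `beta`, and `eps` follow common convention rather than the text above.

```python
import math

def batch_norm(batch, gamma, beta, eps=1e-5):
    """batch: list of examples, each a list of features (training mode only)."""
    n = len(batch)
    num_features = len(batch[0])
    # Statistics are computed per feature, across the examples in the batch.
    means = [sum(ex[j] for ex in batch) / n for j in range(num_features)]
    variances = [sum((ex[j] - means[j]) ** 2 for ex in batch) / n
                 for j in range(num_features)]
    return [
        [gamma[j] * (ex[j] - means[j]) / math.sqrt(variances[j] + eps) + beta[j]
         for j in range(num_features)]
        for ex in batch
    ]

batch = [[1.0, 10.0], [3.0, 30.0]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
# Both features now have roughly zero mean and unit variance over the batch,
# even though their original scales differed by a factor of ten.
```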

6.6 Distillation

Hinton et al. (2015) proposed a method for transferring knowledge from an ensemble of highly regularized models (i.e., neural networks) into a smaller, compressed model.
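In this approach the small model is trained to match the large model's softened output probabilities, obtained by dividing the logits by a temperature before the softmax. The sketch below is a minimal illustration; the function names and example logits are made up, and a full training setup would usually combine this term with the ordinary loss on the true labels.

```python
import math

def softmax(logits, temperature=1.0):
    # A higher temperature gives a softer (more uniform) distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature):
    # Cross-entropy between the teacher's soft targets and the student's
    # predictions, both computed at the same temperature.
    soft_targets = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(soft_targets, student_probs))

teacher_logits = [5.0, 1.0, 0.0]
hard = softmax(teacher_logits, temperature=1.0)
soft = softmax(teacher_logits, temperature=4.0)  # reveals more about classes 2 and 3
loss = distillation_loss([2.0, 1.0, 0.5], teacher_logits, temperature=4.0)
```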

6.7 Layer Normalization

Ba et al. (2016) proposed layer normalization to speed up the training of deep neural networks, especially RNNs; it addresses limitations of batch normalization.
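Unlike batch normalization, layer normalization computes its statistics over the features of a single example, so it does not depend on the batch size and can be applied at every time step of an RNN. A minimal sketch, with `gamma`, `beta`, and `eps` named by common convention rather than taken from the text above:

```python
import math

def layer_norm(features, gamma, beta, eps=1e-5):
    """Normalize one example across its own features."""
    n = len(features)
    mean = sum(features) / n
    variance = sum((f - mean) ** 2 for f in features) / n
    return [g * (f - mean) / math.sqrt(variance + eps) + b
            for f, g, b in zip(features, gamma, beta)]

hidden = [1.0, 2.0, 3.0]
normalized = layer_norm(hidden, gamma=[1.0, 1.0, 1.0], beta=[0.0, 0.0, 0.0])
# The output has roughly zero mean and unit variance across the three features.
```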

7. Applications of deep learning

  • Image classification and recognition
  • Video classification
  • Sequence generation
  • Defect classification
  • Text, speech, image and video processing
  • Text classification
  • Speech processing
  • Speech recognition and spoken language understanding
  • Text-to-speech generation
  • Query classification
  • Sentence classification
  • Sentence modeling
  • Word processing
  • Premise selection
  • Document and sentence processing
  • Image caption generation
  • Photographic style transfer
  • Natural image manifold
  • Image colorization
  • Image question answering
  • Texture generation and image stylization
  • Visual and textual question answering
  • Visual recognition and description
  • Object recognition
  • Document processing
  • Character motion synthesis and editing
  • Singing synthesis
  • Person identification
  • Face recognition and verification
  • Action recognition in videos
  • Human action recognition
  • Action recognition
  • Classification and visualization of motion capture sequences
  • Handwriting generation and prediction
  • Automated and machine translation
  • Named entity recognition
  • Mobile vision
  • Conversational agents
  • Genetic variant calling
  • Cancer detection
  • X-ray CT reconstruction
  • Seizure prediction
  • Hardware acceleration
  • Robotics

8. Conclusion

Even though deep learning (DL) is advancing the world faster than ever, many aspects are still worth exploring. We still do not fully understand deep learning, how to make machines as smart as or smarter than humans, or how to make them learn the way humans do. DL has been solving many problems while bringing technology into every field. Yet humanity still faces many problems: people still die from hunger and food crises, from cancer and other deadly diseases, and so on. We hope that deep learning and artificial intelligence will be devoted more to improving the quality of human life by taking on the most difficult scientific research. Last but not least, may our world become a better place.

Origin blog.csdn.net/weixin_43720666/article/details/128044156