CNN 01 (Introduction to CNN)

1. Development of convolutional neural networks

In computer vision, the usual goal is to have a program identify target images in place of human eyes. Both neural networks and convolutional neural networks are algorithms that already existed in the last century; what changed is that computers today have far more computing power than back then, and far more training data is available. As a result, neural network algorithms came back into fashion, and convolutional neural networks along with them.

  • In 1974, Paul Werbos proposed error backpropagation for training artificial neural networks, making it feasible to train multi-layer networks.
  • In 1979, Kunihiko Fukushima proposed the Neocognitron, in which the concepts of convolution and pooling essentially took shape.
  • In 1986, Geoffrey Hinton co-authored the paper "Learning representations by back-propagating errors."
  • In 1989, Yann LeCun proposed a convolutional neural network trained with backpropagation, called LeNet.
  • In 1998, Yann LeCun improved the original convolutional network into LeNet-5.

Three major areas of deep learning: Computer Vision (CV), Natural Language Processing (NLP), and Speech Recognition (ASR).

2. Why are convolutional neural networks needed?

2.1 The burden that the number of image features places on neural networks

Assume the picture below is a 28 × 28 black-and-white image. Each pixel has a single value (one channel), so the image has 28 × 28 = 784 feature values in total.

Now suppose the picture is in color. A color picture consists of three RGB channels, so there are 28 × 28 × 3 = 2352 values in total.


From the above, a single picture provides 2352 input feature values, and in a fully connected network every input is connected to every neuron. Assuming the first hidden layer has 10 neurons, that is 2352 × 10 = 23,520 weight parameters.

If the picture is larger, say 1000 × 1000 × 3, there are 3 million values in total; connecting them to the same 10 neurons already gives 30 million weight parameters. At such a scale, updating the network's parameters requires enormous computation and it is hard to achieve good results, so people were reluctant to use multi-layer fully connected networks on images.
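The parameter counts above are easy to verify with a few lines of Python (a minimal sketch; the layer sizes are the ones used in the text, and biases are ignored):

```python
def fc_weight_count(height, width, channels, hidden_neurons):
    """Number of weights in one fully connected layer on a flattened image (biases ignored)."""
    return height * width * channels * hidden_neurons

print(fc_weight_count(28, 28, 1, 10))      # grayscale 28x28 image: 7,840 weights
print(fc_weight_count(28, 28, 3, 10))      # RGB 28x28 image: 23,520 weights
print(fc_weight_count(1000, 1000, 3, 10))  # RGB 1000x1000 image: 30,000,000 weights
```

The count grows linearly with the pixel count, which is why fully connected layers become impractical for large images.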

This is where convolutional neural networks gained their popularity. So why do people choose convolutional neural networks? To answer that, we first introduce the concepts of the receptive field and edge detection.

2.2 Receptive field

In 1962, Hubel and Wiesel proposed the concept of the receptive field through their research on cells in the cat visual cortex. The Neocognitron that Fukushima later proposed, based on the receptive-field concept, can be regarded as the first implementation of a convolutional neural network.

A single receptor is connected to many sensory nerve fibers, and sensory information is transmitted as a spatial and temporal summation of impulses across those fibers, which is equivalent to transmitting the signal in encoded form.


2.3 Edge detection

To detect more information with fewer parameters, we build on the idea of receptive fields described above. A neural network usually needs to detect the most salient vertical and horizontal edges of objects in order to distinguish them.


Consider an example: a 6×6 image is convolved with a 3×3 filter (also called a kernel), an operation written with the symbol *. Since * can also denote multiplication, it is usually stated explicitly when it means convolution.

  • This is equivalent to placing the filter on the image, sliding it across the whole image from left to right and top to bottom (one pixel at a time by default, i.e. stride 1), and at each position computing the sum of the element-wise products between the filter and the patch of the image it covers.


In this 6×6 image, the pixels in the left half all have value 10 and the pixels in the right half all have value 0, so there is a very obvious vertical edge in the middle. In the result of convolving this image with the filter, the values in the middle two columns are all 30 and the columns on either side are all 0; that is, the vertical edge in the original 6×6 image has been detected.
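This example can be reproduced in a few lines of NumPy (a minimal sketch; the vertical-edge filter [[1, 0, -1], ...] is assumed, as it is the standard choice that produces the numbers described above):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation with stride 1 (what 'convolution' means in deep learning)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# 6x6 image: left half all 10s, right half all 0s -> vertical edge in the middle
image = np.zeros((6, 6))
image[:, :3] = 10

# vertical-edge filter
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])

print(conv2d(image, kernel))
# every row of the 4x4 output is [0, 30, 30, 0]: the middle columns flag the edge
```

The filter responds only where its left and right columns cover different pixel values, which is exactly where the edge sits.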

Note: the detected edge looks very thick only because our image is tiny, just 6 pixels on each side, so the response occupies two pixel-wide columns. In a 500 × 500 image, the same filter would mark the edge as a thin vertical line.


With the development of deep learning, we need to detect edges in more complex images. Instead of using manually designed filters, we can treat the values in the filter as parameters and learn them through backpropagation. The algorithm can then select appropriate detectors from the actual data, whether for horizontal edges, vertical edges, or edges at other angles, learning the low-level features of the image.
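The idea of learning the filter values can be sketched with plain NumPy gradient descent (a toy illustration, not a full framework: the training images are random, the "unknown" target is the vertical-edge filter from before, and the loss is mean squared error between the learned filter's output and the target filter's output):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid cross-correlation with stride 1."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

rng = np.random.default_rng(0)
images = rng.normal(size=(32, 6, 6))                  # random training images
true_k = np.array([[1., 0., -1.]] * 3)                # "unknown" vertical-edge filter
targets = np.stack([conv2d(im, true_k) for im in images])

k = rng.normal(scale=0.1, size=(3, 3))                # randomly initialised learnable filter
lr = 0.05
for step in range(1000):
    grad = np.zeros_like(k)
    for im, t in zip(images, targets):
        err = conv2d(im, k) - t                       # dLoss/d(output) for squared error
        for i in range(err.shape[0]):
            for j in range(err.shape[1]):
                grad += err[i, j] * im[i:i+3, j:j+3]  # chain rule back to the filter weights
    k -= lr * grad / (len(images) * targets[0].size)  # averaged gradient step

print(np.round(k, 2))  # converges toward the vertical-edge filter
```

This is the same mechanism a convolutional layer uses during training, just written out by hand for a single 3×3 kernel.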


Origin blog.csdn.net/peng_258/article/details/132528359