Computer Vision and Deep Learning

1. The relationship between computer vision and deep learning

For a long time, making computers able to see and hear has been a goal that computer scientists have pursued tirelessly. The most basic part of that goal is to let the computer see the world: to give it eyes like a human's so that it can understand what is around it.

1.1 Inspiration from the human optic nerve

1.1.1 Animal vision experiments

In 1958, David Hubel and Torsten Wiesel at Johns Hopkins University studied the correspondence between stimuli reaching the pupil and neurons in the cerebral cortex. They opened a 3 mm hole in a cat's skull and inserted electrodes through it to measure how active the neurons were. They then showed the cat objects of various shapes and brightness, and for each object they also varied its position and angle. In this way they hoped to expose the cat's pupil to stimuli of different types and intensities. The purpose of the experiment was to test a conjecture: that there is a correspondence between different neurons in the visual cortex and the stimuli received by the pupil, so that once the pupil receives a certain kind of stimulus, a certain group of cortical neurons becomes active.

After many days of tedious, repeated trials, and at the cost of several poor cats, Hubel and Wiesel discovered a kind of neuron called the "orientation selective cell". When the pupil detects the edge of an object in front of it, and that edge points in a particular direction, these neurons become active. The discovery prompted further thinking about the nervous system: the working process from the nerves through the central system to the brain may be one of continual iteration and continual abstraction.

1.1.2 Views from visual neuroscience

After the experiments of David Hubel and Torsten Wiesel, visual neuroscience was formally established as a field. To date, the widely accepted views about the visual system are the following:

(1) The brain processes visual information hierarchically. Lower-level brain areas may handle things such as edges, while higher-level areas handle more abstract things such as faces, houses, objects, and motion. Information is extracted and passed upward layer by layer.
(2) The brain also processes visual information in parallel. Different brain areas extract different information and do different jobs: some deal with what an object is, others with how it moves.
(3) There are extensive connections between brain areas, and higher cortical areas send a large number of feedback projections to lower-level ones.
(4) Information processing is generally modulated by both top-down and bottom-up attention. That is, the brain may selectively process certain features or certain regions of space in finer detail.

Further research found that when a particular object appears anywhere in the field of vision, certain visual neurons in the brain stay continuously active. From the viewpoint of visual neuroscience, this means that human visual recognition runs from the retina to the cortex, and that the nervous system progresses from recognizing small, fine-grained features to recognizing whole targets. If a computer had such a "cortex" to transform the signal, computer vision modeled on human vision would become a reality.

1.2 Difficulties of computer vision and artificial neural networks

Although a great deal of research has gradually revealed the secrets of human vision, applying these ideas and findings on a computer is not so simple. Computer recognition is mechanical, so even for the same image the recognition result is quite likely to change when the lighting changes. For a computer, telling two different objects apart is easy, but recognizing the same object in different environments is much harder. Only when the latter problem is solved can a vision system be considered reasonably complete.

The core of computer vision is how to ignore the differences within the same object while strengthening the distinction between different objects: instances of the same object should look similar, and different objects should look very different.

Artificial neural networks first emerged in the 1960s, but because computer hardware resources were limited at the time, the theory could only be developed on simple models and could not be fully verified.

In the 1980s came a landmark in the theoretical foundation of artificial neural networks: the invention of the "back-propagation algorithm". It uses the chain rule to break an originally very complicated computation into separate relationships that involve only adjacent layers, and it distributes the error according to the weight of each connection.

Back-propagation algorithm:

The BP algorithm (i.e., the back-propagation algorithm) is a learning algorithm for multi-layer neural networks and is based on gradient descent. The input-output relationship of a BP network is essentially a mapping: a BP network with n inputs and m outputs realizes a continuous mapping from n-dimensional Euclidean space to a bounded region of m-dimensional Euclidean space, and this mapping is highly non-linear. Its information-processing capacity comes from composing simple non-linear functions several times over, which gives it a strong ability to reproduce functions. This is the basis on which the BP algorithm is applied.
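To make the chain-rule idea concrete, here is a minimal sketch in plain Python. It is not taken from the source: the network shape (one tanh hidden layer, one linear output), the squared-error loss, and all function names are assumptions chosen for illustration. The `backward` function passes the output error back one layer at a time, and `train` applies plain gradient descent.

```python
import math
import random

def init_params(n_in, n_hidden, seed=0):
    """Random weights for a net with one tanh hidden layer and one linear output."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    return [W1, b1, W2, b2]

def forward(params, x):
    """Return the hidden activations h and the scalar output y."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hj for w, hj in zip(W2, h)) + b2
    return h, y

def loss(params, x, t):
    """Squared error between the network output and the target t."""
    return 0.5 * (forward(params, x)[1] - t) ** 2

def backward(params, x, t):
    """Gradients of the loss, obtained layer by layer with the chain rule."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    dy = y - t                                   # dL/dy at the output
    gW2 = [dy * hj for hj in h]                  # output-layer weights
    gb2 = dy
    dh = [dy * w for w in W2]                    # error shared out by weight
    dz = [d * (1 - hj ** 2) for d, hj in zip(dh, h)]  # back through tanh
    gW1 = [[d * xi for xi in x] for d in dz]     # hidden-layer weights
    gb1 = dz
    return gW1, gb1, gW2, gb2

def train(params, samples, lr=0.1, epochs=200):
    """Plain gradient descent, one (input, target) sample at a time."""
    for _ in range(epochs):
        for x, t in samples:
            gW1, gb1, gW2, gb2 = backward(params, x, t)
            W1, b1, W2, b2 = params
            for j in range(len(W1)):
                for i in range(len(W1[j])):
                    W1[j][i] -= lr * gW1[j][i]
                b1[j] -= lr * gb1[j]
                W2[j] -= lr * gW2[j]
            params[3] = b2 - lr * gb2
    return params
```

Note that each gradient only needs quantities from the neighbouring layer (`dy`, then `dh`, then `dz`), which is the "only adjacent layers" property described above.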

As research went on, in 2006 Geoffrey Hinton made a breakthrough in training deep neural networks. He demonstrated that artificial neural networks with more hidden layers have a better ability to learn. The basic idea is to first use the distribution of the data to give the network a good initialization, then to train the well-initialized network on labelled (supervised) data, using back-propagation to optimize and adjust the neurons.

1.3 Applying deep learning to computer vision problems

The "deep convolutional neural network (CNN)" is widely used in computer vision. It is an algorithm modeled on the step-by-step decomposition performed by biological vision, assigning the processing of an image to different levels.

What is convolution? Convolution is an operation that combines two functions to produce a new value: in continuous space it is computed by integration, and in discrete space by summation. In computer vision, convolution can be viewed as an abstraction process that summarizes the statistical information of a small region.
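The discrete case can be sketched in a few lines of plain Python. This is an illustrative example rather than code from the source: the function name `conv2d` and the "valid" (no padding) border handling are assumptions. The kernel is flipped, as the mathematical definition of convolution requires; many deep learning libraries skip the flip and compute cross-correlation instead.

```python
def conv2d(image, kernel):
    """'Valid' 2-D discrete convolution on nested lists (kernel is flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    # Flip the kernel in both directions, as the definition of convolution requires.
    flipped = [row[::-1] for row in kernel[::-1]]
    out = []
    for i in range(ih - kh + 1):
        out_row = []
        for j in range(iw - kw + 1):
            # Weighted sum over one small region of the image.
            s = 0
            for a in range(kh):
                for b in range(kw):
                    s += image[i + a][j + b] * flipped[a][b]
            out_row.append(s)
        out.append(out_row)
    return out

# A dark-to-bright vertical edge and a simple difference kernel.
edge_image = [[0, 0, 1, 1],
              [0, 0, 1, 1]]
edge_response = conv2d(edge_image, [[1, -1]])   # non-zero only at the edge

# An averaging kernel summarizes a small region as a single number.
mean_value = conv2d([[1, 2], [3, 4]], [[0.25, 0.25], [0.25, 0.25]])
```

Each output value depends only on a small window of the input, which is the sense in which convolution "abstracts" the information of a small area.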

CNNs in particular, with their basic principles and algorithms, are regarded as the preferred solution in computer vision. Using deep learning in computer vision has many advantages:

(1) Deep learning algorithms are highly general. Traditional approaches need a different custom algorithm for each kind of object. By comparison, algorithms based on deep learning are more general: for example, Faster R-CNN, developed on the basis of the traditional CNN, achieves very good results on face, pedestrian, and general object detection tasks.
(2) Features learned by deep learning transfer well. Feature transfer means that features learned on task A can also give very good results when used on task B.
(3) Engineering development, optimization, and maintenance costs are low. The computation in deep learning is mainly convolution and matrix multiplication, so optimizing these operations improves the performance of all deep learning algorithms.
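Point (3) rests on the fact that a convolution layer can itself be written as a matrix multiplication. The sketch below shows one common way to do this, often called "im2col"; it is an illustration under assumptions, not something described in the source, and the function names are hypothetical. It computes the sliding-window product without flipping the kernel (cross-correlation), which is what CNN layers typically do.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of the image into one row of a matrix."""
    ih, iw = len(image), len(image[0])
    rows = []
    for i in range(ih - kh + 1):
        for j in range(iw - kw + 1):
            rows.append([image[i + a][j + b]
                         for a in range(kh) for b in range(kw)])
    return rows

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def correlate_via_matmul(image, kernels):
    """Apply several same-sized kernels with a single matrix product."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    patches = im2col(image, kh, kw)              # (n_patches, kh*kw)
    # One column per kernel, flattened in the same order as the patches.
    weights = [[k[a][b] for k in kernels]
               for a in range(kh) for b in range(kw)]   # (kh*kw, n_kernels)
    return matmul(patches, weights)              # (n_patches, n_kernels)

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],    # sums the main diagonal of each patch
           [[1, 1], [1, 1]]]    # sums the whole patch
responses = correlate_via_matmul(image, kernels)
```

Because all patches and all kernels end up in one product, any speed-up of matrix multiplication carries over directly to the convolution.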

2. Basics of computer vision and learning

Computer vision is the discipline devoted to teaching computers how to "see". More specifically, it uses machines in place of biological eyes to recognize targets and, on that basis, performs whatever image processing is needed to handle the objects as required.

2.1 Structure diagram of computer vision

The problems in computer vision that deep learning can solve can be grouped into a structure diagram, as shown below:

[Figure: computer vision structure diagram]

When learning computer vision, choosing a good training platform matters most, because for the vast majority of learners the ease of use and convenience of the platform often determine whether learning succeeds or fails. Next comes the choice of model. Beyond these very important factors, speed and training cycle also need to be considered: how to train faster, and how to make the model recognize objects faster, are very important issues in computer vision.

2.2 How computer vision learns

"Connect a camera to a computer and have the computer describe what it sees." This was the goal set when computer vision was born as a discipline. Take a picture of a dog or a cat and ask a person to identify it: whatever the animal in the picture looks like and whatever its breed, humans can always tell accurately whether it is a cat or a dog. Feeding pictures with such labels into a neural network model for it to learn from is called "supervised learning".

Deep learning has achieved significant results in supervised learning for computer vision. Compared with biological visual learning, however, "semi-supervised learning" and "unsupervised learning" hold problems that are more urgent and more important to solve. For example, the movement of objects in a video follows specific laws, and an animal in a picture has a specific structure. By exploiting such structure in images or video, an unsupervised problem can be turned into a supervised one and then learned with supervised methods. This is how computer vision learns.
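One way to picture "turning an unsupervised problem into a supervised one" is a pretext task that builds labels out of the data itself. The sketch below uses rotation prediction, which is an assumption chosen for illustration and is not named in the source; the function names are hypothetical. Each unlabelled image is rotated by 0, 90, 180, and 270 degrees, and the number of quarter turns becomes the label a model would be trained to predict.

```python
def rotate90(image):
    """Rotate a 2-D nested list by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

def make_rotation_dataset(images):
    """Build (image, label) pairs from unlabelled images.

    The label is the number of 90-degree clockwise turns applied, so the
    labels come for free from the structure of the images themselves.
    """
    dataset = []
    for img in images:
        rotated = [list(row) for row in img]   # copy, leave the input untouched
        for k in range(4):
            dataset.append((rotated, k))
            rotated = rotate90(rotated)
    return dataset

unlabelled = [[[1, 2],
               [3, 4]]]
pairs = make_rotation_dataset(unlabelled)
```

The resulting pairs can be fed to any supervised learner; no human labelling was needed to produce them.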

3. Afterword

I have only just entered this field and am recording my learning process as notes. This is the first article in the series, and I will keep updating it. The content above is based on Wang Xiaohua's book on deep learning and computer vision in practice. Perhaps in the near future computer vision will be able to handle many more problems, and like-minded learners are welcome to explore these questions together. If anything in this article is inappropriate, I hope readers will point it out and I will correct it. If the article was helpful to you, I hope you will follow me.

Origin blog.csdn.net/weixin_43071717/article/details/104244291