Beginner's Lesson - What Is Deep Learning

First, the development of deep learning

1.1 The Turing Test

The Turing test is a measure of whether artificial intelligence has truly succeeded. The British mathematician Alan Turing, often called the "father of artificial intelligence" and the "father of computer science", proposed the concept in his 1950 paper "Computing Machinery and Intelligence", which opens with the question "Can machines think?". A person and a computer are placed in two isolated rooms, and someone outside the rooms puts the same questions to both of them. If the questioner cannot tell which one is the person and which one is the computer, the computer can be said to have artificial intelligence.

1.2 Medical discoveries

The 1981 Nobel Prize in Physiology or Medicine was awarded to David Hubel, Torsten Wiesel, and Roger Sperry. Hubel and Wiesel found that the visual system processes information hierarchically.

Signals travel from the retina to area V1, which extracts low-level edge features, then to area V2, which extracts basic shapes or local parts of the target, then to higher levels that represent the entire target (for example, deciding that something is a human face), and finally to the prefrontal cortex (PFC), which makes classification judgments. In other words, high-level features are combinations of low-level features. From low levels to high levels the representation becomes more abstract and conceptual, and expresses more and more of the semantics or intent.

Edge features -> basic shapes and local features of the target -> the whole target. This process matches our common sense, because complex figures are usually composed of a few basic structures. We can also see that the brain has a deep structure, and that the cognitive process is deep as well.

1.3 The emergence of deep learning

Low-level features --(combined)--> abstract high-level features

Deep learning does exactly this: it forms more abstract high-level features (or attribute categories) by combining low-level features. For example, in computer vision, a deep learning algorithm learns low-level representations from the raw image, such as edge detectors and wavelet filters, and then builds high-level representations from linear or non-linear combinations of those low-level representations. This pattern is not limited to images; sound behaves similarly. For example, researchers used an algorithm to automatically discover 20 basic sound structures in a sound library, and the remaining sounds could be synthesized from these 20 basic structures.
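To make the idea of a low-level "edge detector" concrete, here is a minimal sketch in plain Python. The pixel values and the kernel are made up for illustration, and the kernel is written by hand here, whereas a deep learning model would learn such filters from data.

```python
def correlate_1d(signal, kernel):
    """Slide the kernel over the signal and return the weighted sums."""
    width = len(kernel)
    return [
        sum(k * s for k, s in zip(kernel, signal[i:i + width]))
        for i in range(len(signal) - width + 1)
    ]

# Hypothetical row of pixel brightness values: dark on the left, bright on the right.
row = [0, 0, 0, 10, 10, 10]

# Hand-written difference kernel (an assumption for this sketch).
# It responds only where neighbouring pixels differ, i.e. at an edge.
edge_kernel = [-1, 1]

edges = correlate_1d(row, edge_kernel)
print(edges)  # the only non-zero response is at the dark-to-bright boundary
```

Higher layers would then combine many such responses (horizontal edges, vertical edges, and so on) into shapes and, eventually, whole objects.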

Second, machine learning

Machine learning is one way of achieving artificial intelligence, and it is currently considered one of the more effective ways. It is used prominently in many areas of industry, such as computer vision, natural language processing, and recommendation systems. Everyday examples include license plate recognition for ETC (electronic toll collection) on highways, news recommendations on Toutiao, and review descriptions on Tmall. Machine learning is a branch of artificial intelligence, and in many contexts it has almost become synonymous with artificial intelligence. Put simply, a machine learning algorithm lets a machine learn patterns from a large amount of historical data, so that it can intelligently identify new samples or make predictions about the future.

2.1 Machine learning vs. artificial intelligence

Artificial intelligence is a branch of computer science that studies the simulation of intelligent behavior in computers.

Whenever a machine completes a task based on a set of predefined rules for solving a problem, this behavior is called artificial intelligence. Developers introduce a number of rules that the computer needs to follow. The computer holds a list of possible specific actions and makes its decisions based on that list. Today, artificial intelligence is an umbrella term that covers everything from advanced algorithms to actual robots.

AI can be divided into four levels. Here we explain the first two:

  • Weak artificial intelligence, also known as narrow artificial intelligence, is an AI system designed and trained for a specific task. One form of weak AI is the virtual personal assistant, such as Apple's Siri.
  • Strong artificial intelligence, also known as artificial general intelligence, is an AI system with general human cognitive abilities. When it encounters an unfamiliar task, it is smart enough to find a solution.

Machine learning is the ability of a computer to learn from large data sets rather than from hard-coded rules.
Machine learning allows computers to learn on their own. This kind of learning takes advantage of the processing power of modern computers, which can easily handle large data sets.

Basically, machine learning is a subset of artificial intelligence. More specifically, it is one technique for realizing AI: a way of training algorithms so that the computer learns how to make decisions. In a sense, a machine learning program adjusts itself according to the data it is exposed to.

2.2 Supervised learning vs. unsupervised learning

Supervised learning requires a labeled data set that contains inputs together with their expected outputs.

When you train an AI with supervised learning, you give it an input and tell it the expected output. If the output the AI produces is wrong, it readjusts its calculations. This process is repeated over the data set until the AI stops making mistakes (in practice, until its error is acceptably low).

An example of supervised learning is a weather-forecasting AI. It learns to predict the weather from historical data. The training data contains inputs (past pressure, humidity, and wind speed) and outputs (past temperature).

Put another way, imagine that you are giving a computer program labeled data. For example, if the task is to use an image classification algorithm to sort images into boys and girls, each image of a boy needs a "boy" label and each image of a girl needs a "girl" label. These data form the "training" data set. Once the program can classify images at an acceptable rate, the labels are no longer needed.

It is called supervised learning because the process of learning from the training data set resembles learning under a teacher's supervision. We know the correct answers in advance. The algorithm repeatedly makes predictions on the training data, and the "teacher" keeps correcting those predictions. When the algorithm reaches an acceptable level of performance, learning stops.
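The predict-and-correct loop described above can be sketched in a few lines of Python. This is only an illustration: it uses the classic perceptron update rule, which the text does not name, and the features, labels, and learning rate are invented for the example.

```python
# Hypothetical labeled data: (features, label). Label 1 and label 0 stand in
# for two classes; the feature values are made up.
training_data = [
    ((0.9, 0.1), 1),
    ((0.8, 0.3), 1),
    ((0.2, 0.9), 0),
    ((0.1, 0.7), 0),
]

def predict(weights, bias, features):
    """Return 1 if the weighted sum of the features is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0

def train(data, learning_rate=0.1, max_epochs=100):
    """Repeat over the data set, correcting the weights after each mistake."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in data:
            error = label - predict(weights, bias, features)  # the "teacher" signal
            if error != 0:
                mistakes += 1
                weights = [w + learning_rate * error * x
                           for w, x in zip(weights, features)]
                bias += learning_rate * error
        if mistakes == 0:  # acceptable performance reached: stop learning
            break
    return weights, bias

weights, bias = train(training_data)
predictions = [predict(weights, bias, features) for features, _ in training_data]
print(predictions)
```

The labels play the role of the teacher: every wrong prediction nudges the weights, and training stops once a full pass over the data produces no mistakes.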

Unsupervised learning is machine learning that uses information that is neither classified nor labeled, and lets the algorithm operate on that information without guidance.

When you train an AI with unsupervised learning, you let the AI classify the data logically on its own. The machine's task is to group unsorted information according to similarities, differences, and patterns, without any prior training on labeled data.

An example of unsupervised learning is a behavior-predicting AI for Amazon and other e-commerce sites. It creates its own classification of the input data, helping Amazon identify which kinds of users are most likely to buy different products (a cross-selling strategy).

As another example, a program could use either of two algorithms to complete the boy/girl image classification task. One algorithm is called "clustering": it groups similar objects together according to features such as hair length, chin size, and eye position. The other is called "association": it creates its own if/then rules based on the similarities it finds. In other words, it identifies common patterns among the images and sorts them accordingly.
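As a rough illustration of clustering, the sketch below groups unlabeled points with k-means, one common clustering algorithm chosen here for the example (the text does not name a specific one). The two features per point, the data values, and the starting centroids are all hypothetical.

```python
def distance_sq(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: distance_sq(point, centroids[i]))

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points to centroids and moving the centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return centroids, [nearest(p, centroids) for p in points]

# Hypothetical unlabeled data: (hair length in cm, some second measured feature).
points = [(5, 1.0), (6, 1.2), (4, 0.9), (30, 3.0), (35, 3.2), (28, 2.9)]

# Start from two of the data points; no labels are used anywhere.
centroids, assignments = k_means(points, [points[0], points[3]])
print(assignments)
```

The algorithm is never told what the groups mean. It only discovers that the points fall into two groups of similar items.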

Third, how deep learning works

Deep learning is a machine learning method. It allows us to train an AI to predict outputs given a set of inputs (information sent to or received from a computer). Both supervised learning and unsupervised learning can be used to train the AI.

Andrew Ng: "The analogy to deep learning is that the rocket engine is the deep learning models and the fuel is the huge amounts of data we can feed to these algorithms."

We will see how deep learning works by building an online bus fare estimation service. To train it, we will use supervised learning.

We want our bus fare estimator to predict the price from the following inputs:

  • Departure station
  • Destination station
  • Departure date
  • Bus company

3.1 Neural networks

A neural network is a set of pattern-recognition algorithms loosely modeled on the human brain. The term comes from the inspiration behind these architectures: the basic structure of an artificial neural network imitates the biological neural networks of the brain itself, so that a computer can perform specific tasks.

Like a human brain, our "AI fare estimator" is made up of neurons (usually drawn as circles), and these neurons are connected to one another.

Neurons are organized into three types of layers:

  • The input layer receives the input data. In our example, the input layer has four neurons: departure station, destination station, departure date, and bus company. The input layer passes the input data to the first hidden layer.
  • The hidden layers perform mathematical calculations on the input data. One of the challenges in building a neural network is deciding the number of hidden layers and the number of neurons in each layer.
  • The output layer is the last layer of neurons in the network. Its main role is to produce the output of the program, which in this example is the predicted price.


Each connection between neurons has a weight. The weight indicates the importance of the input value. What the model does is learn how much each feature contributes to the price; these "contributions" are the model's weights. The higher a feature's weight, the more important that feature is compared with the others. In predicting bus fares, the departure date is one of the most important factors affecting the final fare, so the connections from the departure-date neuron carry larger weights.

Each neuron has an activation function, which determines the neuron's output from its input. Once a set of input data has passed through all the layers of the network, the output layer returns the output data.
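The flow from input layer to output layer can be sketched as follows. Everything numeric here is an assumption for illustration: the four inputs are taken to be already encoded as numbers, the network has one hidden layer with three neurons, the weights are arbitrary rather than trained, and ReLU is used as the activation function simply because it is a common choice.

```python
def relu(x):
    """A common activation function: pass positive values, clamp negatives to 0."""
    return max(0.0, x)

def identity(x):
    """No activation; used on the output neuron so it can return any price."""
    return x

def layer(inputs, weights, biases, activation):
    """Each neuron computes activation(weighted sum of inputs + bias)."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def forward(inputs, hidden_w, hidden_b, output_w, output_b):
    """Pass the inputs through the hidden layer, then the output layer."""
    hidden = layer(inputs, hidden_w, hidden_b, relu)
    return layer(hidden, output_w, output_b, identity)[0]

# Hypothetical numeric encoding of: departure station, destination station,
# departure date, bus company.
inputs = [1.0, 2.0, 3.0, 0.5]

# Arbitrary, untrained weights: 3 hidden neurons with 4 weights each,
# and 1 output neuron with 3 weights.
hidden_w = [
    [0.1, 0.2, 0.5, 0.0],
    [0.0, -0.3, 0.1, 0.2],
    [0.4, 0.0, -0.2, 0.1],
]
hidden_b = [0.0, 0.0, 0.0]
output_w = [[2.0, 1.0, 1.0]]
output_b = [1.0]

price = forward(inputs, hidden_w, hidden_b, output_w, output_b)
print(price)
```

With untrained weights the predicted price is meaningless. Training, described next, is what turns these numbers into useful ones.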

3.2 Improving the neural network through training

To improve the accuracy of the "AI fare estimator", we need to compare its predictions with past results. To do this, we need two things:

  1. A lot of computing power
  2. A massive amount of data

When training the AI, it is important to give it inputs from the data set (a data set is a collection of data that can be accessed individually, in combination, or as a whole) and to compare its outputs with the outputs in the data set. Because the AI is still untrained, its outputs will probably be wrong.

For our bus fare model, we have to find historical fare data. Because of the large number of possible combinations of bus stations and departure dates, we need a very large list of fares.

Once we have gone through the entire data set, we can create a function that measures how far the AI's outputs are from the actual outputs (the historical data). This function is called the cost function. The cost function measures the model's error, that is, how poorly the model estimates the relationship between X and Y.

The goal of training is to bring the cost function to zero, that is, to the point where the AI's outputs match the outputs in the data set. In practice the cost rarely reaches exactly zero, so training aims to make it as small as possible.
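One common choice of cost function is the mean squared error, sketched below. The text does not say which cost function the fare model would use, so treat this as an assumption; the fares are made-up numbers.

```python
def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and true values."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

# Hypothetical fares: what the AI predicted vs. what the historical data says.
predicted_fares = [10.0, 12.0, 9.0]
actual_fares = [10.0, 10.0, 10.0]

cost = mean_squared_error(predicted_fares, actual_fares)
perfect_cost = mean_squared_error(actual_fares, actual_fares)
print(cost, perfect_cost)
```

The cost is zero only when every prediction matches the historical value exactly, and it grows quickly as predictions drift further away.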

3.3 How can we reduce the cost function?

By using a method called gradient descent. The gradient measures how much the output of a function changes when you change the input a little.

Gradient descent is a method for finding the minimum of a function. In this case, the goal is to minimize the cost function. It optimizes the model's weights iteratively, changing them in small steps after each pass over the data set. By computing the gradient of the cost function at the current set of weights, we can see in which direction the minimum lies.

To reduce the value of the cost function, it is important to pass through the data set many times. This is why a lot of computing power is needed. Once we have improved the AI through training, we can use it to predict future prices from the four input elements.
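Here is a minimal sketch of gradient descent on a deliberately tiny model. It is not the four-input fare network: to keep the arithmetic visible, it assumes a hypothetical one-input model, fare = w * distance + b, a mean squared error cost, and invented data generated from w = 2 and b = 1. The learning rate and number of passes are also assumptions.

```python
def cost(w, b, xs, ys):
    """Mean squared error of the model y = w * x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def gradient(w, b, xs, ys):
    """Partial derivatives of the cost with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db

def gradient_descent(xs, ys, learning_rate=0.05, epochs=2000):
    """Repeatedly step the weights against the gradient to lower the cost."""
    w, b = 0.0, 0.0  # untrained starting point
    for _ in range(epochs):  # each epoch is one pass over the data set
        dw, db = gradient(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b

# Hypothetical data: distances and the fares charged for them (fare = 2 * distance + 1).
distances = [1.0, 2.0, 3.0, 4.0]
fares = [3.0, 5.0, 7.0, 9.0]

initial_cost = cost(0.0, 0.0, distances, fares)
w, b = gradient_descent(distances, fares)
final_cost = cost(w, b, distances, fares)
print(w, b, initial_cost, final_cost)
```

Each pass moves the weights a small step in the direction that lowers the cost, which is why many passes, and therefore a lot of computation, are needed before the weights settle near their best values.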

Origin blog.csdn.net/qq_42582489/article/details/105244744