Understanding the Inner Meaning of the Neural Network Propagation Function

I. Preamble

  I have previously written about "single neurons", "shallow neural networks" and "deep neural networks" (feel free to look through those posts if you are interested). They were written a bit messily and left many things unsaid. Here we try to answer one question by describing the perceptron: why does the propagation function look the way it does?

  The perceptron is a prediction model similar to a neural network, and all of today's powerful neural networks were born on the foundation of the perceptron, so once you have read about the perceptron, neural networks are not hard to pick up. Moreover, both the perceptron and early neural networks were used to handle "binary classification" problems, so we will start from binary classification and derive the perceptron model in reverse. As for how the perceptron evolved into the neural network, we will leave the details for a future post; you are welcome to follow "zero-based love learning" and learn AI together.

II. Binary Classification

  "Binary" is just a name for, you can imagine handle binary classification model into a black box, no matter what the content of the input, the output is 1 or 0 (yes or no). A typical binary classification as follows:

  1) Does the picture contain a cat?

  2) Does the picture contain a car?

  3) Does the picture contain a person?

  As "the figure of what color the cat is," this problem does not belong to binary, because the cat may be white, black, yellow, figure they've not even a cat.

  Here we consider a very simple binary classification problem:

  "Just four pixel image white is not white."

  Below is a randomly generated four-pixel image (each pixel magnified N times for easier viewing):

  Its corresponding grayscale values (the grayscale range is 0-255, where 0 is black and 255 is pure white):

  X=(240,200,100,10)

  Intuitively, judging whether this image is "white or not" is easy to handle: we sum the grayscale values of all the pixels; when the sum is greater than some threshold we call it a white image, and when it is less we call it not white. Example pseudocode:

  def is_white(x1, x2, x3, x4, a):
      # white if the sum of all grayscale values exceeds the threshold a
      if (x1 + x2 + x3 + x4) > a:
          return 1
      else:
          return 0
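
  For example, with the pixel values above and a threshold picked purely for illustration (a = 500 is not a value from the original post):

  is_white(240, 200, 100, 10, a=500)   # 240+200+100+10 = 550 > 500, returns 1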

  But this approach has a problem: it ignores the fact that how each individual pixel looks also has some influence on the overall impression, as in the following two images:

  In fact, the sums of the grayscale values of the two images are the same, but subjectively we may feel that the left image looks "whiter" overall than the right one. So let's change the approach and take the characteristics of each pixel into account:

  def is_white_per_pixel(x1, x2, x3, x4, a1, a2, a3, a4):
      # white only if every pixel individually exceeds its own feature value
      if x1 > a1 and x2 > a2 and x3 > a3 and x4 > a4:
          return 1
      else:
          return 0

  Here a1, a2, a3, a4 can each be called a "feature" value of its pixel, but this method generalizes poorly: if, for instance, we rearrange the positions of the pixels, we may need to add another set of feature values:

 

  def is_white_two_sets(x1, x2, x3, x4, a1, a2, a3, a4, b1, b2, b3, b4):
      # white if the pixels match either of two sets of feature values
      if (x1 > a1 and x2 > a2 and x3 > a3 and x4 > a4) or \
         (x1 > b1 and x2 > b2 and x3 > b3 and x4 > b4):
          return 1
      else:
          return 0

  Above there are only two sets of feature values; in fact the feature values of four pixels may require as many as six sets of combinations. If what you need to handle is a 500x500 image, the complexity of the code becomes unimaginable.

  From the previous example we find that even a very simple binary classification is very hard to handle with a computer. Is there really no way to do it? Of course there is. Whether we use a threshold or feature values to make the "white or not white" judgment, in essence we are building a "deterministic system". However, the real world is often not so clear-cut: the "white versus not white" judgment is really a "chaotic system", influenced by all sorts of factors. So is there a way to combine the individual and the whole? The wise predecessors needed only a couple of small adjustments:

  def is_white_weighted(x1, x2, x3, x4, w1, w2, w3, w4, a):
      # weighted sum of the pixels compared against the threshold a
      if x1*w1 + x2*w2 + x3*w3 + x4*w4 > a:
          return 1
      else:
          return 0

  Here a is still a threshold, and w1, w2, w3, w4 are feature values in another form, which we call weights. Multiplying each input by its weight and then accumulating takes into account that each "individual" should carry a different amount of weight in the binary decision, while still respecting the decisive role of the "whole". But exactly how much weight each individual should carry in the decision, and what value the final threshold should be set to, are hard questions to answer directly. That is where the perceptron comes in.
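
  As a rough numeric sketch (the weights and threshold below are made up purely for illustration, they are not values from the post): suppose every pixel gets the same weight 0.25, so the weighted sum is just the average grayscale, and take a = 128 (mid-gray) as the threshold.

  x = (240, 200, 100, 10)
  w = (0.25, 0.25, 0.25, 0.25)               # hypothetical weights: a plain average
  a = 128                                    # hypothetical threshold: mid-gray
  s = sum(xi * wi for xi, wi in zip(x, w))   # 0.25 * (240+200+100+10) = 137.5
  print(1 if s > a else 0)                   # 137.5 > 128, prints 1

  Giving a larger weight to the pixels the eye cares more about lets the same formula capture the earlier intuition that the left image looks "whiter" than the right one.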

III. The Perceptron

  A simple perceptron model is shown below (two inputs x1 and x2 feeding a single output y):

 

  Mathematical formula:

  x1w1 + x2w2 <= a, then y = 0 (no)

  x1w1 + x2w2 > a, then y = 1 (yes)

  x1 and x2 are the inputs, w1 and w2 are the weights corresponding to the inputs, a is the threshold, and y is the output.

  For convenience, we can move a to the left-hand side of the formula and substitute b = -a, so the formulas become:

  x1w1 + x2w2 + b <= 0, then y = 0 (no)

  x1w1 + x2w2 + b > 0, then y = 1 (yes)
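
  In code, this two-input perceptron is just the weighted sum plus the bias pushed through a step function (a minimal sketch; NumPy is used only for convenience and is not required by the post):

  import numpy as np

  def perceptron(x, w, b):
      # x and w are vectors of the same length, b is the bias (b = -a)
      z = np.dot(w, x) + b
      return 1 if z > 0 else 0     # step function: yes / no

  The same function covers the four-pixel case below simply by passing four-element vectors for x and w.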

  Now let's apply the binary classification problem we posed earlier, "is an image with only four pixels white or not", to this mathematical form:

  x1w1 + x2w2 + x3w3 + x4w4 + b <= 0, then y = 0 (no)

  x1w1 + x2w2 + x3w3 + x4w4 + b > 0, then y = 1 (yes)

  We have now successfully turned the binary classification problem into a math problem: answering the binary classification question comes down to finding suitable values of w and b, and the process of searching for w and b is what we call training. It should be noted that we can only ever find "good enough" values of w and b; that is, the final perceptron model will not perfectly fit all of the training data.

  Now let's look at the neural network's propagation function:
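
  (The original image of the formula is not reproduced here; the sketch below is the common single-neuron form with a sigmoid activation, which may differ slightly in notation from the earlier posts.)

  import numpy as np

  def propagate(w, b, X):
      # linear part: the same weighted sum plus bias as the perceptron
      Z = np.dot(w, X) + b
      # sigmoid squashes Z into (0, 1): a smooth version of the hard step
      A = 1 / (1 + np.exp(-Z))
      return A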

  Doesn't it seem much less difficult to understand now?

  We will not implement a full perceptron here; the code is actually similar to that of the single neuron. The whole process is forward propagation, loss calculation, and backward optimization, finally yielding the values of w and b.
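
  As a minimal sketch of that whole loop for a single neuron (assuming a sigmoid output and a cross-entropy loss with plain gradient descent, details the post does not spell out):

  import numpy as np

  def train(X, Y, lr=0.1, steps=1000):
      # X: shape (n_features, m_examples); Y: shape (1, m_examples) with 0/1 labels
      n, m = X.shape
      w, b = np.zeros((1, n)), 0.0
      for _ in range(steps):
          # forward propagation
          A = 1 / (1 + np.exp(-(np.dot(w, X) + b)))
          # loss calculation (cross-entropy); in practice you would print or log this
          loss = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
          # backward optimization: one gradient descent step on w and b
          dZ = A - Y
          w -= lr * np.dot(dZ, X.T) / m
          b -= lr * np.mean(dZ)
      return w, b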

IV. Summary

  This post has tried to understand the propagation function from the perspective of the binary classification problem, but its real mathematical meaning has not been made clear; we will leave that for a later post.

  Please follow the public account "zero-based love learning" and learn AI together.
