The perceptron, proposed by Rosenblatt in 1957, is the foundation of neural networks and support vector machines. It is a linear classification model for binary classification: the input is the feature vector of an instance, and the output is the instance's class, taking the two values +1 and -1. The perceptron is a discriminative model, and its goal is to find a separating hyperplane that linearly divides the training data.
1. Model:
Assuming the dataset is linearly separable, the decision function from the input space to the output space is:

f(x) = sign(w · x + b)

where w is the weight (or weight vector), b is the bias, w · x denotes the inner product of w and x, and sign is the sign function, namely:

sign(x) = +1 if x ≥ 0, and -1 if x < 0.
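The decision function above can be sketched in a few lines of NumPy; the weights and bias here are illustrative values, not learned ones:

```python
import numpy as np

def sign(x):
    """Sign function: +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def predict(w, b, x):
    """Perceptron decision function f(x) = sign(w . x + b)."""
    return sign(np.dot(w, x) + b)

# Illustrative parameters, chosen by hand for demonstration.
w = np.array([1.0, -1.0])
b = 0.5
print(predict(w, b, np.array([2.0, 1.0])))  # 1   (1*2 - 1*1 + 0.5 = 1.5 >= 0)
print(predict(w, b, np.array([0.0, 2.0])))  # -1  (0 - 2 + 0.5 = -1.5 < 0)
```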
1. Inner product (dot product, scalar product): multiply the two vectors element by element at corresponding positions and sum the results; the result of a dot product is a scalar.
2. Outer product (cross product, vector product): the result of a cross product is a vector, perpendicular to the plane spanned by the two vectors being multiplied.
(See the referenced blog post.)
Three kinds of matrix multiplication: see the referenced blog post.
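The difference between the two products can be checked directly with NumPy; the vectors below are arbitrary examples:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# Inner (dot) product: elementwise multiply then sum -> a scalar.
dot = np.dot(a, b)      # 1*4 + 2*5 + 3*6 = 32.0

# Cross product of two 3-D vectors -> a vector perpendicular
# to the plane spanned by a and b.
cross = np.cross(a, b)  # [-3.  6. -3.]

print(dot)
print(cross)
print(np.dot(cross, a))  # 0.0: the cross product is orthogonal to a
```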
2. Strategy:
The perceptron's loss function is the total distance from all misclassified points to the separating hyperplane S. The distance from a point x₀ to the hyperplane is (1/‖w‖) · |w · x₀ + b|, and for a misclassified point (x_i, y_i) this equals -(1/‖w‖) · y_i (w · x_i + b).
Norm: a measure of the length or size of a vector (or matrix) in a space.
(1) L0 norm: the number of nonzero elements of the vector;
(2) L1 norm: the sum of the absolute values of the vector's elements;
(3) L2 norm: the square root of the sum of the squares of the vector's elements. (See the referenced blog post.)
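The three norms listed above can be computed as follows (the vector is an arbitrary example):

```python
import numpy as np

v = np.array([3.0, 0.0, -4.0])

l0 = np.count_nonzero(v)      # L0 "norm": number of nonzero elements -> 2
l1 = np.sum(np.abs(v))        # L1 norm: sum of absolute values -> 7.0
l2 = np.sqrt(np.sum(v ** 2))  # L2 norm: sqrt of sum of squares -> 5.0

print(l0, l1, l2)
# np.linalg.norm(v) computes the same L2 norm.
```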
Dropping the factor 1/‖w‖ from the total distance above (it does not affect which points are misclassified, and it simplifies optimization over the parameters w and b) yields the perceptron loss function:

L(w, b) = -Σ_{x_i ∈ M} y_i (w · x_i + b)

where M is the set of misclassified points.
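The loss function above can be sketched directly; the dataset and parameters below are toy values for illustration:

```python
import numpy as np

def perceptron_loss(w, b, X, y):
    """L(w, b) = -sum over misclassified points of y_i * (w . x_i + b).

    A point (x_i, y_i) is treated as misclassified when
    y_i * (w . x_i + b) <= 0 (the set M).
    """
    margins = y * (X @ w + b)
    misclassified = margins <= 0
    return -np.sum(margins[misclassified])

# Toy linearly separable data (illustrative).
X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])

print(perceptron_loss(np.array([-1.0, -1.0]), 0.0, X, y))  # 13.0: two points misclassified
print(perceptron_loss(np.array([1.0, 1.0]), -3.0, X, y))   # 0.0: all points correct
```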
3. Algorithm:
The optimization method is stochastic gradient descent. The perceptron algorithm has two forms: the primal form and the dual form.
(1) Primal form
Stochastic gradient descent is used to repeatedly minimize the objective function. Note: each step does not perform gradient descent on all misclassified points in M at once; instead, one misclassified point is chosen at a time and a gradient-descent update is made for that point alone:

w ← w + η y_i x_i
b ← b + η y_i

where η is the learning rate.
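A minimal sketch of the primal form, scanning the training set in order rather than sampling at random (the data and the function name are illustrative):

```python
import numpy as np

def train_perceptron(X, y, eta=1.0, max_epochs=100):
    """Primal-form perceptron: on each misclassified point, update
    w <- w + eta * y_i * x_i and b <- b + eta * y_i, repeating until
    no point is misclassified (data assumed linearly separable)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:  # misclassified
                w += eta * yi * xi
                b += eta * yi
                errors += 1
        if errors == 0:
            break
    return w, b

X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
w, b = train_perceptron(X, y)
# With eta=1 and this scan order, the result is w = [1., 1.], b = -3.0,
# and every point then satisfies y_i * (w . x_i + b) > 0.
```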
(2) Dual form
In the dual form, training instances appear only in the form of inner products, so the inner products between all pairs of training instances can be computed in advance and stored in a matrix, the Gram matrix:

G = [x_i · x_j]_{N×N}
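A minimal sketch of the dual form under the same assumptions as above (toy data, in-order scanning, illustrative names); the decision value touches the instances only through the precomputed Gram matrix:

```python
import numpy as np

def train_perceptron_dual(X, y, eta=1.0, max_epochs=100):
    """Dual-form perceptron. alpha_i accumulates eta once per update
    made on point i; instances enter only via the Gram matrix."""
    n = X.shape[0]
    G = X @ X.T              # Gram matrix G[i, j] = x_i . x_j
    alpha = np.zeros(n)
    b = 0.0
    for _ in range(max_epochs):
        errors = 0
        for i in range(n):
            # Decision value sum_j alpha_j * y_j * (x_j . x_i) + b
            # uses only inner products stored in G.
            if y[i] * (np.sum(alpha * y * G[:, i]) + b) <= 0:
                alpha[i] += eta
                b += eta * y[i]
                errors += 1
        if errors == 0:
            break
    # Recover the primal parameters: w = sum_i alpha_i * y_i * x_i.
    w = (alpha * y) @ X
    return alpha, w, b

X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
alpha, w, b = train_perceptron_dual(X, y)
# This makes the same updates as the primal form on this data,
# recovering w = [1., 1.], b = -3.0.
```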
Machine learning algorithms Summary 2: Perceptron
Source: blog.csdn.net/qq_35946628/article/details/104353527