The perceptron is a linear classification model for two-class (binary) classification. Its input is an instance's feature vector, and its output is the instance's class, taking the values +1 or -1. The perceptron corresponds to a separating hyperplane that divides instances into positive and negative classes in the input space (feature space), and it is a discriminative model. The perceptron is the basis of neural networks and support vector machines.
Perceptron learning aims to find a separating hyperplane that linearly separates the training data.
The idea of perceptron learning:
1. Define a loss function based on misclassification;
2. Use gradient descent to minimize the loss function;
3. Substitute the learned parameters into the model to obtain the perceptron.
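The loss in step 1 can be sketched in code. This is a minimal sketch assuming the standard misclassification loss L(w, b) = -Σ_{(x_i, y_i) ∈ M} y_i(w·x_i + b) over the set M of misclassified points; the function name `perceptron_loss` and the numpy representation are illustrative choices, not names from the text:

```python
import numpy as np

# Misclassification-based loss: only points with y_i (w . x_i + b) <= 0
# (i.e., misclassified or on the hyperplane) contribute to the sum.
def perceptron_loss(w, b, X, y):
    margins = y * (X @ w + b)            # y_i (w . x_i + b) for each point
    return -margins[margins <= 0].sum()  # L(w, b) over misclassified points
```

The loss is zero exactly when every training point is strictly correctly classified, which is why "no misclassified points" is the stopping condition later in the algorithm.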
Forms of the perceptron learning algorithm: the original (primal) form and the dual form.
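The dual form is only named here, not derived. As a hedged sketch under the standard formulation: w is kept implicitly as w = Σ_i α_i y_i x_i, each mistake on point i raises α_i by η, and the Gram matrix of inner products x_j·x_i is precomputed. `train_perceptron_dual` and its parameters are illustrative names, not from the text:

```python
import numpy as np

# Sketch of the dual-form perceptron (illustrative implementation).
def train_perceptron_dual(X, y, eta=1.0, max_epochs=100):
    n = len(X)
    alpha = np.zeros(n)                 # alpha_i counts eta-weighted mistakes on x_i
    b = 0.0
    gram = X @ X.T                      # Gram matrix: gram[j, i] = x_j . x_i
    for _ in range(max_epochs):
        errors = 0
        for i in range(n):
            # w . x_i = sum_j alpha_j y_j (x_j . x_i), computed via the Gram matrix
            if y[i] * ((alpha * y) @ gram[:, i] + b) <= 0:
                alpha[i] += eta         # dual update: alpha_i <- alpha_i + eta
                b += eta * y[i]         # b <- b + eta * y_i
                errors += 1
        if errors == 0:                 # stop when no misclassified points remain
            break
    w = (alpha * y) @ X                 # recover w = sum_i alpha_i y_i x_i
    return w, b
```

The advantage of the dual form is that instances appear only through inner products, so the Gram matrix can be computed once up front.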
Algorithm: The original form of the perceptron learning algorithm
Input: training data set T = {(x_1, y_1), (x_2, y_2), ..., (x_N, y_N)}, where x_i ∈ X = R^n, y_i ∈ Y = {+1, -1}, i = 1, 2, ..., N; learning rate η (0 < η ≤ 1);
Output: w, b; perceptron model f(x) = sign(w·x + b).
1) Select initial values w_0, b_0;
2) Select a data point (x_i, y_i) in the training set;
3) If y_i(w·x_i + b) ≤ 0, update w ← w + ηy_i x_i and b ← b + ηy_i;
4) Go to 2) until there are no misclassified points in the training set.
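Steps 1)–4) above can be sketched as a short training loop. A minimal sketch: `train_perceptron`, `eta`, and `max_epochs` are illustrative names and choices (the epoch cap guards against non-separable data, which the algorithm itself does not handle):

```python
import numpy as np

# Original form of the perceptron learning algorithm (illustrative sketch).
def train_perceptron(X, y, eta=1.0, max_epochs=100):
    w = np.zeros(X.shape[1])               # step 1: initial values w_0 = 0
    b = 0.0                                #         and b_0 = 0
    for _ in range(max_epochs):
        errors = 0
        for xi, yi in zip(X, y):           # step 2: select a training point
            if yi * (w @ xi + b) <= 0:     # step 3: misclassified?
                w = w + eta * yi * xi      #   w <- w + eta * y_i * x_i
                b = b + eta * yi           #   b <- b + eta * y_i
                errors += 1
        if errors == 0:                    # step 4: stop when no mistakes remain
            break
    return w, b
```

For example, on the toy set X = [[3, 3], [4, 3], [1, 1]] with labels [+1, +1, -1] and η = 1, this loop converges to w = (1, 1), b = -3, which classifies all three points correctly.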
The algorithm uses stochastic gradient descent: first, arbitrarily select an initial hyperplane (w_0, b_0), then minimize the loss function by gradient descent. The minimization does not descend along the gradient over all misclassified points at once; instead, it randomly selects one misclassified point at a time and descends along that point's gradient.
Assuming the set of misclassified points M is fixed, the loss function is L(w, b) = -Σ_{(x_i, y_i) ∈ M} y_i(w·x_i + b), and its gradient is
∇_w L(w, b) = -Σ_{(x_i, y_i) ∈ M} y_i x_i,
∇_b L(w, b) = -Σ_{(x_i, y_i) ∈ M} y_i.
For a randomly chosen misclassified point (x_i, y_i), the updates w ← w + ηy_i x_i and b ← b + ηy_i move (w, b) along the negative gradient of that point's loss term, so the loss function decreases.
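This claim can be checked numerically. In the sketch below (all concrete values are illustrative), one update at a single misclassified point lowers that point's loss term -y_i(w·x_i + b) by exactly η(‖x_i‖² + 1):

```python
import numpy as np

# One stochastic-gradient step at a misclassified point (x_i, y_i):
# the update moves (w, b) along the negative gradient of this point's
# loss term, so the term strictly decreases.
eta = 0.5
w, b = np.array([0.0, 1.0]), 0.0
xi, yi = np.array([2.0, -1.0]), 1.0  # misclassified: y_i (w . x_i + b) = -1 <= 0

loss_before = -yi * (w @ xi + b)     # this point's loss term: 1.0
w = w + eta * yi * xi                # w <- w + eta * y_i * x_i
b = b + eta * yi                     # b <- b + eta * y_i
loss_after = -yi * (w @ xi + b)      # now -2.0: decreased by eta * (|x_i|^2 + 1) = 3.0
```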
Algorithm intuition: when an instance is misclassified, the values of w and b are adjusted to move the separating hyperplane toward that misclassified point, reducing the distance between the point and the hyperplane, until the hyperplane passes beyond the misclassified point so that it is correctly classified.