Machine Learning Algorithms - Perceptron summary

First, the mathematical principles of the perceptron

The perceptron is a binary linear classification model optimized with the stochastic gradient descent algorithm: its input is the feature vector of an instance, its output is the classification result, and it belongs to the family of discriminant models. The perceptron aims to find a hyperplane that divides the data linearly (note that this only works when the data set is linearly separable; in that case the number of misclassifications k has an upper bound, i.e. the perceptron algorithm converges. If the data set is not linearly separable, the algorithm does not converge).
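The upper bound on k alluded to here is Novikoff's theorem (a standard result, stated for reference rather than taken from the original post): if some hyperplane with ||w*|| = 1 separates the data with margin γ, then

k ≤ (R/γ)², where R = max_i ||x_i||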
GitHub code address (continuously updated)

Basic steps of the perceptron algorithm

Input: the training data set (x_i, y_i), where x_i is the feature vector, y_i is the category, and a is the learning rate
Output: parameters w and b, giving the perceptron model f(x) = sign(w·x + b)
1. Select initial values w_0 and b_0
2. Randomly select a data point (x_i, y_i) from the training data set
3. If y_i(w·x_i + b) ≤ 0, update the parameters:
w = w + a·y_i·x_i
b = b + a·y_i
4. Return to step 2, until there are no misclassified points left in the data set (a code sketch of these steps follows)
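As referenced above, here is a minimal from-scratch sketch of these steps in NumPy; the function name train_perceptron, the zero initial values, and the toy data are illustrative assumptions, not from the original post:

import numpy as np

def train_perceptron(x, y, a=1.0, max_epochs=1000):
    """x: (n, d) feature matrix; y: labels in {-1, +1}; a: learning rate."""
    w = np.zeros(x.shape[1])  # step 1: initial values w_0 = 0, b_0 = 0
    b = 0.0
    rng = np.random.default_rng(0)
    for _ in range(max_epochs):
        # step 4: stop once no point is misclassified
        wrong = np.where(y * (x @ w + b) <= 0)[0]
        if wrong.size == 0:
            break
        # step 2: randomly pick a (still misclassified) point (x_i, y_i)
        i = rng.choice(wrong)
        # step 3: w = w + a*y_i*x_i, b = b + a*y_i
        w += a * y[i] * x[i]
        b += a * y[i]
    return w, b

# toy linearly separable data set
x = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1, 1, -1])
print(train_perceptron(x, y))  # one possible separating hyperplane (w, b)

Restricting the random pick to currently misclassified points is equivalent to step 2 with correctly classified draws skipped.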

(1) The perceptron hyperplane function

f(x) = sign(w·x + b), where sign(x) = +1 when x ≥ 0 and -1 when x < 0.
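In code, note that np.sign(0) returns 0, so a hypothetical helper has to encode the convention sign(0) = +1 explicitly:

import numpy as np

def predict(x, w, b):
    # sign(w.x + b), with the perceptron convention that sign(0) = +1
    return np.where(x @ w + b >= 0, 1, -1)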

(2) The perceptron loss function
The loss function is chosen as the total distance from the misclassified points to the hyperplane, and it is optimized with stochastic gradient descent: take the derivatives of the loss function with respect to w and b to obtain the gradients, then update the parameters until the loss function (which is non-negative) reaches its minimum.
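Written out (the standard form of this loss and its gradients, reconstructed from the description above since the original figure is not reproduced here):

L(w, b) = -∑_{x_i ∈ M} y_i·(w·x_i + b), where M is the set of misclassified points
∂L/∂w = -∑_{x_i ∈ M} y_i·x_i
∂L/∂b = -∑_{x_i ∈ M} y_i

Applying stochastic gradient descent to one misclassified point at a time gives exactly the update rule w = w + a·y_i·x_i, b = b + a·y_i from the steps above.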

Second, implementation with sklearn (linearly separable data sets; the sklearn Perceptron also supports multi-class classification)

By default, sklearn's make_classification function generates samples for binary classification. Its main parameters:
#n_samples: the number of samples to generate
#n_features: the number of features per sample (2 here)

from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
x, y = make_classification(n_samples=1200, n_features=2, n_redundant=0, n_informative=1, n_clusters_per_class=1)
x_data_train = x[:800, :]  ## x holds the features
x_data_test = x[800:, :]
y_data_train = y[:800]  ## y holds the class labels
y_data_test = y[800:]

# Define the perceptron (max_iter is the current name of the old n_iter parameter)
clf = Perceptron(fit_intercept=False, max_iter=50, shuffle=False, eta0=0.1, random_state=0)

# Train on the training data
clf.fit(x_data_train, y_data_train)
print(clf.coef_)  # the hyperplane weights w (b stays 0 because fit_intercept=False)
y_pred=clf.predict(x_data_test)

# Evaluate the model with the score, classification_report and accuracy_score methods
acc = clf.score(x_data_test,y_data_test)
print(acc)
print(accuracy_score(y_data_test, y_pred))
classify_report = classification_report(y_data_test, y_pred)
print('classify_report : \n', classify_report)

Main parameters of the Perceptron model
max_iter: the maximum number of passes over the training data (the number of gradient descent iterations)
tol: float or None; if a float, iteration stops when loss > previous_loss - tol
eta0: float, the learning rate, usually taken in (0, 1]
Attributes of the fitted perceptron
coef_: the weights [w1, w2, ...]
intercept_: the constant b
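For instance, a minimal sketch of reading these attributes back (the printed values depend on the randomly generated data, so they are illustrative only):

from sklearn.datasets import make_classification
from sklearn.linear_model import Perceptron

x, y = make_classification(n_samples=100, n_features=2, n_redundant=0, n_informative=1, n_clusters_per_class=1, random_state=0)
clf = Perceptron(max_iter=100, tol=1e-3, eta0=0.5, random_state=0)
clf.fit(x, y)
print(clf.coef_)       # the weights [w1, w2, ...]
print(clf.intercept_)  # the constant b (fit here, unlike the fit_intercept=False example above)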
Remarks
1. The perceptron hyperplane is not unique: it depends on the chosen initial values and on the order in which the data points are selected.
2. The perceptron also has a dual form (a sketch follows this list).
3. When the data set is not linearly separable, constraints must be placed on the hyperplane; an SVM (support vector machine) is then used to find the dividing hyperplane.
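A minimal sketch of the dual form mentioned in remark 2 (the function name and stopping logic are assumptions; the standard dual representation is w = ∑_i α_i·y_i·x_i):

import numpy as np

def train_perceptron_dual(x, y, a=1.0, max_epochs=1000):
    """Dual-form perceptron: learn the multipliers alpha_i instead of w directly."""
    n = x.shape[0]
    gram = x @ x.T           # Gram matrix of inner products, computed once
    alpha = np.zeros(n)
    b = 0.0
    for _ in range(max_epochs):
        updated = False
        for i in range(n):
            # misclassification test expressed in the dual variables
            if y[i] * (np.sum(alpha * y * gram[:, i]) + b) <= 0:
                alpha[i] += a
                b += a * y[i]
                updated = True
        if not updated:      # a full pass with no update: converged
            break
    w = (alpha * y) @ x      # recover the primal weights
    return w, b

Because the data enter only through inner products (the Gram matrix), this form is the gateway to kernelized variants.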

Origin blog.csdn.net/qq_39751437/article/details/86489949