Machine Learning with Python - Perceptron

I've recently been reading books on machine learning, and I plan to write up my daily reading in posts so we can share, discuss progress, and learn together. As the first post in this machine learning series, I'll start with the perceptron and then slowly add other topics.

Before implementing the perceptron algorithm, we need to look at how a neuron works. A neuron has many dendrites and one axon: the dendrites receive information from other neurons and carry it to the cell body, and the axon transmits information from the cell body on to other neurons. After the information arriving over the dendrites is processed in the cell body, a signal is sent along the axon to other neurons if the result exceeds a certain threshold. By modeling this process, the perceptron learning algorithm was created.

The perceptron is an artificial neural network invented by Frank Rosenblatt in 1957 while working at the Cornell Aeronautical Laboratory. It can be seen as the simplest form of feedforward neural network: a binary linear classifier. Its shortcoming is that it cannot handle linearly inseparable problems.

The following figure shows three different cases. On the left, the two classes can be separated with a straight line (i.e., a linear function), so the data is linearly separable; in the middle and right cases no linear function can separate the classes, so the data is linearly inseparable. The classic linearly inseparable example is XOR: with (0,0) and (1,1) in one class and (0,1) and (1,0) in the other, no single straight line can separate the two classes.

Let's look at an example directly. Suppose we need to classify flowers: we have a dataset containing two kinds of flowers, labeled 1 and -1 respectively, and we need to classify them according to features contained in the dataset. Here we use only two features of each flower, sepal length and petal length, represented as the feature vector:
\[x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\]
x is also called the input vector. We then define a corresponding weight vector w:
\[w = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}\]
The linear combination of x and w gives z:
\[z = w_1 x_1 + w_2 x_2\]
We stipulate that if a sample's activation value z is greater than or equal to a threshold b set in advance, the sample belongs to class 1; otherwise it belongs to class -1. Expressed as a formula:
\[\phi(z) = \begin{cases} 1, & z \ge b \\ -1, & \text{otherwise} \end{cases}\]
It can be seen that this idea works very much like a neuron. For convenience, we move the threshold b to the left side of the inequality and define an additional weight \(w_0 = -b\) (with \(x_0 = 1\)) to take its place; z is then updated to the following equation:
\[z = w_0 x_0 + w_1 x_1 + w_2 x_2\]
Then the condition z ≥ 0 is equivalent to the earlier case where z is greater than or equal to the threshold b, and we obtain:
\[\phi(z) = \begin{cases} 1, & z \ge 0 \\ -1, & \text{otherwise} \end{cases}\]
The function above is also called the activation function; it compresses z into a binary output (1 or -1).
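As a minimal sketch of these two steps in Python (the helper names here are my own for illustration; they mirror what the full implementation below will use):

import numpy as np

def net_input(x, w):
    # z = w_0*x_0 + w_1*x_1 + w_2*x_2, with x_0 fixed at 1 so that w_0 plays the role of -b
    return w[0] + np.dot(x, w[1:])

def activation(z):
    # unit step function: 1 if z >= 0, otherwise -1
    return np.where(z >= 0.0, 1, -1)

print(activation(net_input(np.array([1.5, 0.3]), np.array([-1.0, 0.4, 0.2]))))  # -> -1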

We can see that the weight vector w determines whether the classification is accurate, so how do we choose the right w? Assigning values to w one by one by hand would be far too laborious and inefficient. In fact, the perceptron can adjust w automatically from the training samples: as training progresses, the changes to w stabilize and the classification accuracy improves greatly.

The formula for updating the weight vector w is:
\[w_j = w_j + \Delta w_j\]

\[\Delta w_j = \eta \, (y^i - \hat{y}^i) \, x_j^i\]

\[\eta: \text{the learning rate} \\
w_j: \text{the } j\text{-th component of the weight vector } w \\
y^i: \text{the true class of the } i\text{-th sample} \\
\hat{y}^i: \text{the predicted class of the } i\text{-th sample} \\
x_j^i: \text{the } j\text{-th feature of the } i\text{-th sample}\]

The learning rate ranges between 0.0 and 1.0 and controls how strongly w is updated. All components of the weight vector w are updated synchronously, that is, w is only changed after the update for every component has been computed. We use a large number of training samples from the dataset to update w, gradually improving the classification accuracy.
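As a minimal numeric sketch of a single update step (all values below are made up purely for illustration):

import numpy as np

eta = 0.1                      # learning rate
w = np.array([0.0, 0.0, 0.0])  # [w_0, w_1, w_2], zero before training
x = np.array([2.0, 3.0])       # one training sample with two features
y = -1                         # true class of this sample

z = w[0] + np.dot(x, w[1:])    # net input, 0.0 here
y_hat = 1 if z >= 0 else -1    # unit step predicts 1: a misclassification
update = eta * (y - y_hat)     # 0.1 * (-1 - 1) = -0.2
w[1:] += update * x            # w_1, w_2 become -0.4, -0.6
w[0] += update                 # w_0 becomes -0.2
print(w)                       # -> [-0.2 -0.4 -0.6]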

The perceptron algorithm is guaranteed to converge only with a small learning rate and a linearly separable dataset. Given a training sample x, the perceptron combines x linearly with w to produce z, passes z to the activation function to produce the classification result as the predicted class of x, and then updates w according to the rule above. Once the perceptron converges, training is complete.

Now let's implement the perceptron algorithm and train it on the Iris dataset:

import pandas as pd

Read the dataset

df = pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', header=None)
df.tail()
       0    1    2    3               4
145  6.7  3.0  5.2  2.3  Iris-virginica
146  6.3  2.5  5.0  1.9  Iris-virginica
147  6.5  3.0  5.2  2.0  Iris-virginica
148  6.2  3.4  5.4  2.3  Iris-virginica
149  5.9  3.0  5.1  1.8  Iris-virginica

From the table above we can see that each input vector x contains four features (columns 0-3) plus the correct class label (column 4).

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Take the class labels of the first 100 training samples; set a label to -1 if it is 'Iris-setosa' and to 1 otherwise

y = df.iloc[0:100, 4].values
y = np.where(y == 'Iris-setosa', -1, 1)

Extract two features (sepal length and petal length, columns 0 and 2) of the first 100 training samples

X = df.iloc[0:100, [0, 2]].values

Plot the class distribution of the 100 training samples

plt.scatter(X[:50, 0], X[:50, 1], color='red', marker='o', label='setosa')
plt.scatter(X[50:100, 0], X[50:100, 1], color='blue', marker='x', label='versicolor')
plt.xlabel('sepal length')
plt.ylabel('petal length')
plt.legend(loc='upper left')
plt.show()
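From the scatter plot we can see that the two classes can be separated by a straight line in this feature space, i.e., they are linearly separable, so the perceptron should be able to converge on this data.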

Implement Perceptron

import numpy as np

class Perceptron(object):
    """Perceptron classifier.

    Parameters
    ----------
    eta : float
        Learning rate (between 0.0 and 1.0).
    n_iter : int
        Passes over the training dataset.

    Attributes
    ----------
    w_ : 1d-array
        Weights after fitting.
    errors_ : list
        Number of misclassifications in every epoch.

    """

    def __init__(self, eta=0.01, n_iter=10):
        self.eta = eta
        self.n_iter = n_iter

    def fit(self, X, y):
        """Fit training data.

        :param X: {array-like}, shape = [n_samples, n_features]
            Training vectors, where n_samples is the number of samples
            and n_features is the number of features.
        :param y: array-like, shape = [n_samples]
            Target values.
        :return: self : object

        """

        self.w_ = np.zeros(1 + X.shape[1])  # one extra slot for the threshold weight w_0 (x_0 = 1)
        self.errors_ = []

        for _ in range(self.n_iter):
            errors = 0
            for xi, target in zip(X, y):
                update = self.eta * (target - self.predict(xi))
                self.w_[1:] += update * xi
                self.w_[0] += update
                errors += int(update != 0.0)
            self.errors_.append(errors)
        return self

    def net_input(self, X):
        """Calculate net input"""
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def predict(self, X):
        """Return class label after unit step"""
        return np.where(self.net_input(X) >= 0.0, 1, -1)  # analogous to the ternary operator ?: in C++

ppn = Perceptron(eta=0.1, n_iter=10)
ppn.fit(X, y)
<__main__.Perceptron at 0x16680906978>

Draw the training curve

plt.plot(range(1, len(ppn.errors_) + 1), ppn.errors_, marker='o')
plt.xlabel('Epochs')
plt.ylabel('Number of misclassifications')
plt.show()
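Since the two classes are linearly separable, the number of misclassifications should drop to zero within a few epochs, at which point the perceptron has converged and training is complete.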

Draw the decision boundary

from matplotlib.colors import ListedColormap
def plot_decision_region(X, y, classifier, resolution=0.02):
    # setup marker generator and color map
    markers = ('s', 'x', 'o', '^', 'v')
    colors = ('red', 'blue', 'lightgreen', 'gray', 'cyan')
    cmap = ListedColormap(colors[:len(np.unique(y))])
    
    # plot the decision surface
    x1_min, x1_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    x2_min, x2_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    
    xx1, xx2 = np.meshgrid(np.arange(x1_min, x1_max, resolution),
                          np.arange(x2_min, x2_max, resolution))
    Z = classifier.predict(np.array([xx1.ravel(), xx2.ravel()]).T)
    Z = Z.reshape(xx1.shape)
    
    plt.contourf(xx1, xx2, Z, alpha=0.4, cmap=cmap)
    plt.xlim(xx1.min(), xx1.max())
    plt.ylim(xx2.min(), xx2.max())
    
    #plot class samples
    for idx, cl in enumerate(np.unique(y)):
        plt.scatter(x=X[y == cl, 0], y=X[y == cl, 1],
                   alpha=0.8, c=cmap(idx), marker=markers[idx],
                   label=cl)
plot_decision_region(X, y, classifier=ppn)
plt.xlabel('sepal length [cm]')
plt.ylabel('petal length [cm]')
plt.legend(loc='upper left')
plt.show()
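plot_decision_region works by asking the trained classifier to predict the class of every point on a dense grid (built with np.meshgrid at step size resolution=0.02) and coloring each region by its predicted class; the border between the two colored regions is the linear decision boundary the perceptron has learned. A smaller resolution yields a smoother-looking plot at the cost of more predictions.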

