Iris dataset classification with logistic regression

The Iris dataset is a classic dataset that is often used as an example in machine learning and statistics. It contains 150 records in three classes, 50 per class, and each record has four features: sepal length, sepal width, petal length, and petal width. The task is to predict, from these four features, which of the three species (iris-setosa, iris-versicolour, iris-virginica) a flower belongs to.

First, let's load the dataset and take a quick look at its structure and a summary of the data.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('./data/iris.csv')
print(data.head())
print(data.info())
print(data['Species'].unique())
   Unnamed: 0  Sepal.Length  Sepal.Width  Petal.Length  Petal.Width Species
0           1           5.1          3.5           1.4          0.2  setosa
1           2           4.9          3.0           1.4          0.2  setosa
2           3           4.7          3.2           1.3          0.2  setosa
3           4           4.6          3.1           1.5          0.2  setosa
4           5           5.0          3.6           1.4          0.2  setosa
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
Unnamed: 0      150 non-null int64
Sepal.Length    150 non-null float64
Sepal.Width     150 non-null float64
Petal.Length    150 non-null float64
Petal.Width     150 non-null float64
Species         150 non-null object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB
None
['setosa' 'versicolor' 'virginica']
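
If the CSV file is not available locally, an equivalent DataFrame can be built from the copy of the dataset bundled with scikit-learn. This is only a sketch and assumes scikit-learn is installed; the column names are chosen to match the CSV used above.

# Optional: build an equivalent DataFrame from scikit-learn's bundled iris data
from sklearn.datasets import load_iris

iris = load_iris()
data = pd.DataFrame(iris.data, columns=['Sepal.Length', 'Sepal.Width', 'Petal.Length', 'Petal.Width'])
data['Species'] = iris.target_names[iris.target]
data.insert(0, 'Unnamed: 0', range(1, len(data) + 1))  # mimic the CSV's leading index column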

From the simple summary above, we can see that the irises fall into three categories in total:

  1. setosa
  2. versicolor
  3. virginica

We denote 'setosa', 'versicolor', and 'virginica' by 0, 1, and 2 respectively.

Data preparation

First, we do some simple preparation of the dataset: we need to replace the class names with 0, 1, and 2.

Second, we split the data into two sets: one to train the parameters of our logistic regression, and another to test the results of the training.

The code is as follows:

# Replace the species names with numeric labels

data.loc[data['Species']=='setosa','Species']=0
data.loc[data['Species']=='versicolor','Species']=1
data.loc[data['Species']=='virginica','Species']=2
print(data)
     Unnamed: 0  Sepal.Length  Sepal.Width  Petal.Length  Petal.Width  Species
0             1           5.1          3.5           1.4          0.2        0
1             2           4.9          3.0           1.4          0.2        0
2             3           4.7          3.2           1.3          0.2        0
3             4           4.6          3.1           1.5          0.2        0
4             5           5.0          3.6           1.4          0.2        0
..          ...           ...          ...           ...          ...      ...
145         146           6.7          3.0           5.2          2.3        2
146         147           6.3          2.5           5.0          1.9        2
147         148           6.5          3.0           5.2          2.0        2
148         149           6.2          3.4           5.4          2.3        2
149         150           5.9          3.0           5.1          1.8        2

[150 rows x 6 columns]
# Split into training and test sets
train_data = data.sample(frac=0.6,random_state=0,axis=0)
test_data = data[~data.index.isin(train_data.index)]

train_data = np.array(train_data)
test_data = np.array(test_data)

train_label = train_data[:,5:6].astype(int)
test_label = test_data[:,5:6].astype(int)

print(train_label[:1])
print(test_label[:1])

train_data = train_data[:,1:5]
test_data = test_data[:,1:5]

print(np.shape(train_data))
print(np.shape(train_label))
print(np.shape(test_data))
print(np.shape(test_label))
[[2]]
[[0]]
(90, 4)
(90, 1)
(60, 4)
(60, 1)
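
An equivalent split could also be done with scikit-learn's train_test_split (a sketch, assuming scikit-learn is installed; note it produces a different random split than data.sample above):

# Alternative 60/40 split using scikit-learn
from sklearn.model_selection import train_test_split

train_df, test_df = train_test_split(data, train_size=0.6, random_state=0)
print(train_df.shape, test_df.shape)  # (90, 6) (60, 6)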

We also need to encode the labels in one-of-N (one-hot) form.

After the two steps above, the dataset has been split into two parts. Next we classify the data with logistic regression.

train_label_onhot = np.eye(3)[train_label]
test_label_onhot = np.eye(3)[test_label]
train_label_onhot = train_label_onhot.reshape((90,3))
test_label_onhot =  test_label_onhot.reshape((60,3))
print(train_label_onhot[:3])
[[0. 0. 1.]
 [0. 1. 0.]
 [1. 0. 0.]]
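
The np.eye trick works because indexing the 3x3 identity matrix with a label k returns its k-th row, which is exactly the one-hot vector for k. A minimal illustration:

# Indexing the identity matrix with labels picks out the matching one-hot rows
labels = np.array([2, 0, 1])
print(np.eye(3)[labels])
# [[0. 0. 1.]
#  [1. 0. 0.]
#  [0. 1. 0.]]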

Classification

Approach

I chose to start from the simpler problem and work up to the full task:



If there are only two categories, 0 and 1, we need to determine whether a feature vector X (N-dimensional) belongs to the category. The steps are as follows (a minimal sketch follows the list):

  1. Initialize the parameters w (shape (1, N)) and b (a scalar)
  2. Compute \(z=\sum_{i=0}^{n} w_i x_i + b\)
  3. Feed z into the sigmoid function \(\sigma\) to get \(\hat{y}=\sigma(z)\)
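
A minimal sketch of these three steps for a single binary classifier (the values of w and b here are arbitrary placeholders, not trained parameters):

# Steps 1-3 for one binary classifier, with placeholder parameters
x = np.array([5.1, 3.5, 1.4, 0.2])    # one 4-dimensional feature vector
w = np.array([0.1, -0.2, 0.3, 0.05])  # step 1: initialize w (1 x N) ...
b = 0.5                               # ... and b (a scalar)
z = np.dot(w, x) + b                  # step 2: z = sum(w_i * x_i) + b
y_hat = 1. / (1 + np.exp(-z))         # step 3: y_hat = sigmoid(z)
print(y_hat)                          # probability of belonging to the class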

Since this is a multi-class problem, we use a one-vs-rest approach. In this problem there are three classes, so we compute \(\hat{y}_1\) for the probability that a sample does or does not belong to class 1, \(\hat{y}_2\) for class 2, and \(\hat{y}_3\) for class 3, and then compare the three and take the class with the largest probability.

To compare which class has the higher probability, we use softmax: compute \(exp(\hat{y}_1)\), \(exp(\hat{y}_2)\), \(exp(\hat{y}_3)\); the probabilities of belonging to the three classes are then (a small sketch of this computation follows the list):

  1. \(p_1=\frac{exp(\hat{y}_1)}{\sum_{i=1}^{3}exp(\hat{y}_i)}\)
  2. \(p_2=\frac{exp(\hat{y}_2)}{\sum_{i=1}^{3}exp(\hat{y}_i)}\)
  3. \(p_3=\frac{exp(\hat{y}_3)}{\sum_{i=1}^{3}exp(\hat{y}_i)}\)
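
A small sketch of the softmax computation (note that the training code later in this post keeps the raw per-class sigmoid outputs and simply takes the argmax, so this snippet only illustrates the formula):

# Softmax over three per-class scores (illustrative values)
def softmax(y_hat):
    e = np.exp(y_hat)
    return e / e.sum()

scores = np.array([0.9, 0.2, 0.4])
p = softmax(scores)
print(p, p.sum())  # the three probabilities sum to 1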



Following this idea, let's first compute the result for a single record. The code is as follows:

def sigmoid(s):
    return 1. / (1 + np.exp(-s))

# Randomly initialize the weights (4 features x 3 classes) and the biases (one per class)
w = np.random.rand(4,3)
b = np.random.rand(3)

def get_result(w,b):
    # sigmoid(x·w + b) for the first training record only
    z = np.matmul(train_data[0],w) +b
    y = sigmoid(z)
    return y

y = get_result(w,b)

print(y)
[0.99997447 0.99966436 0.99999301]

The code above handles just one record; let's modify it into a matrix computation of \(\hat{y}\) for the whole training set:

def get_result_all(data,w,b):
    z = np.matmul(data,w)+ b
    y = sigmoid(z)
    return y
y=get_result_all(train_data,w,b)
print(y[:10])
[[0.99997447 0.99966436 0.99999301]
 [0.99988776 0.99720719 0.9999609 ]
 [0.99947512 0.98810796 0.99962362]
 [0.99999389 0.99980632 0.999999  ]
 [0.9990065  0.98181945 0.99931113]
 [0.99999094 0.9998681  0.9999983 ]
 [0.99902719 0.98236513 0.99924728]
 [0.9999761  0.99933525 0.99999313]
 [0.99997542 0.99923594 0.99999312]
 [0.99993082 0.99841774 0.99997519]]

Next, we need a loss function to measure the deviation between the predictions produced by our current parameters and the actual labels (link here).


The loss function for an individual category is as follows:

\[loss=-\sum_{i=0}^{n}[y_i\ln\hat{y}_i+(1-y_i)\ln(1-\hat{y}_i)]\]
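
As a quick numerical check of this formula applied to one sample's one-hot label and prediction (the predicted values are made up for illustration; the training code below applies the same expression to the whole label matrix):

# Cross-entropy loss for one one-hot labelled sample (illustrative values)
y = np.array([0., 0., 1.])
y_hat = np.array([0.2, 0.3, 0.9])
loss = -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat)).sum()
print(loss)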

The derivatives of the loss function are derived as follows.

When \(y_i = 0\):

The derivative with respect to w is:

\[ \frac{dloss}{dw}=(1-y_i)*\frac{1}{1-\hat{y}_i}*\hat{y}_i*(1-\hat{y}_i)*x_i \]
Simplifying, we get
\[ \frac{dloss}{dw}=\hat{y}*x_i=(\hat{y}-y)*x_i \]

The derivative with respect to b is:

\[ \frac{dloss}{db}=(1-y_i)*\frac{1}{1-\hat{y}_i}*\hat{y}_i*(1-\hat{y}_i) \]
Simplifying, we get
\[\frac{dloss}{db}=\hat{y}-y\]

When \(y_i = 1\):

The derivative with respect to w is:

\[ \frac{dloss}{dw}=-y_i*\frac{1}{\hat{y}_i}*\hat{y}_i(1-\hat{y}_i)*x_i \]
Simplifying, we get
\[ \frac{dloss}{dw}=(\hat{y}-1)*x_i=(\hat{y}-y)*x_i \]

The derivative with respect to b is:

\[\frac{dloss}{db}=\hat{y}-y\]

Combining the two cases and summing over all samples:
\[ \frac{dloss}{dw}=\sum_{i=0}^{n}(\hat{y}-y)*x_i \]

\[ \frac{dloss}{db}=\sum_{i=0}^{n}(\hat{y}-y) \]

We just need to keep adjusting w and b according to the following update rules; this iteration is the learning process:
\[ w = w - learning\_rate * dw \]
\[ b = b - learning\_rate * db \]
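
Before writing the code, note the vectorized form used below: stacking the training features into a matrix \(X\) of shape (90, 4) and the one-hot labels and predictions into matrices \(Y\) and \(\hat{Y}\) of shape (90, 3), the sums above become

\[ \frac{dloss}{dw}=X^T(\hat{Y}-Y), \qquad \frac{dloss}{db}=\sum_{i=0}^{n}(\hat{y}_i-y_i) \]

which is exactly what np.matmul(np.transpose(train_data), y - train_label_onhot) and (y - train_label_onhot).sum(axis=0) compute.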

Let's write down the code:

learning_rate = 0.0001



def eval(data,label, w,b):
    # Accuracy: predicted class = argmax of the per-class scores,
    # compared against the one-hot labels
    y = get_result_all(data,w,b)
    y = y.argmax(axis=1)
    y = np.eye(3)[y]
    count = np.shape(data)[0]
    acc = (count - np.power(y-label,2).sum()/2)/count
    return acc

def train(step,w,b):
    # One gradient-descent step: forward pass, cross-entropy loss, gradients, update
    y = get_result_all(train_data,w,b)
    loss = -1*(train_label_onhot * np.log(y) +(1-train_label_onhot)*np.log(1-y)).sum()

    dw = np.matmul(np.transpose(train_data),y - train_label_onhot)
    db = (y - train_label_onhot).sum(axis=0)

    w = w - learning_rate * dw
    b = b - learning_rate * db
    return w, b,loss


loss_data = {'step':[],'loss':[]}
train_acc_data = {'step':[],'acc':[]}
test_acc_data={'step':[],'acc':[]}

for step in range(3000):
    w,b,loss = train(step,w,b)
    train_acc = eval(train_data,train_label_onhot,w,b)
    test_acc = eval(test_data,test_label_onhot,w,b)
    
    loss_data['step'].append(step)
    loss_data['loss'].append(loss)
    
    train_acc_data['step'].append(step)
    train_acc_data['acc'].append(train_acc)
    
    test_acc_data['step'].append(step)
    test_acc_data['acc'].append(test_acc)
    
plt.plot(loss_data['step'],loss_data['loss'])
plt.show()

plt.plot(train_acc_data['step'],train_acc_data['acc'],color='red')
plt.plot(test_acc_data['step'],test_acc_data['acc'],color='blue')
plt.show()
print(test_acc_data['acc'][-1])

[Figure: training loss curve over steps; training accuracy (red) and test accuracy (blue) over steps]

0.9666666666666667

From the results above, the run reaches 96.67% prediction accuracy on the test set. Not bad!
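
Finally, a small sketch of how the learned parameters could be used to classify a new measurement (the feature values here are made up for illustration):

# Classify one new sample with the trained w and b (illustrative feature values)
sample = np.array([[6.0, 2.9, 4.5, 1.5]])   # sepal length/width, petal length/width
scores = get_result_all(sample, w, b)       # per-class sigmoid scores, shape (1, 3)
species = ['setosa', 'versicolor', 'virginica']
print(species[scores.argmax(axis=1)[0]])    # predicted species name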

Origin www.cnblogs.com/bbird/p/11544410.html