5.2 - Neural Network - Handwritten Digit Recognition

  1. The neural network in this part only uses pre-trained parameters; the purpose is to get a feel for the forward propagation process of a neural network
# Packages required by the code below
import matplotlib.pyplot as plt
import numpy as np
import scipy.io as sio
import matplotlib
import scipy.optimize as opt
from sklearn.metrics import classification_report  # this package produces the evaluation report

1 Import data

  1. The data file contains the pixel data of the images and the label information (that is, which digit each image shows)
  2. Important details:
    1. labels contains the labels for the 5000 images
    2. The image data is a 5000x400 matrix (each original image is 20x20 pixels, so each image contributes 400 features). Because of how the original data is organized (MATLAB unrolls images column by column), a transpose is required when reading the data (see code)
# Define the data-loading function
def load_data(path, transpose=True):
    data = sio.loadmat(path)
    y = data.get('y')  # (5000, 1)
    y = y.reshape(y.shape[0])  # flatten to a 1-D vector, shape (5000,)

    X = data.get('X')  # (5000, 400)

    if transpose:
        # for this dataset, you need a transpose to get the orientation right
        X = np.array([im.reshape((20, 20)).T for im in X])

        # then flatten each image again to preserve the vector representation
        X = np.array([im.reshape(400) for im in X])

    return X, y

# Load the data
X, y = load_data('ex3data1.mat')
print('X.shape:', X.shape)  # X.shape: (5000, 400)
print('y.shape:', y.shape)  # y.shape: (5000,), a 1-D vector


2 Drawing digit images

  1. Use matplotlib.axes.Axes.matshow() to plot an image matrix as an image
    1. The first parameter: the matrix to be displayed
    2. The second parameter cmap: the color map to use
  2. Randomly select one image and draw it, as shown in Figure 1.
# Plotting function
def plot_an_image(image):
    """
    image : (400,)
    """
    fig, ax = plt.subplots(figsize=(1, 1))
    ax.matshow(image.reshape((20, 20)), cmap=matplotlib.cm.binary)
    plt.xticks(np.array([]))  # just get rid of ticks
    plt.yticks(np.array([]))
    
pick_one = np.random.randint(0, 5000)
image_to_draw = X[pick_one, :]
print("image_to_draw.shape:", image_to_draw.shape)
plot_an_image(image_to_draw)
plt.show()
print('this should be {}'.format(y[pick_one]))

[Figure 1: the randomly selected digit image]
  1. Draw a 10x10 grid of 100 digit images, as shown in Figure 2.
def plot_100_image(X):
    """sample 100 images and show them
    assume the images are square

    X : (5000, 400)
    """
    size = int(np.sqrt(X.shape[1]))  # pixel width of each original image

    # sample 100 images, reshape and reorganize them
    sample_idx = np.random.choice(np.arange(X.shape[0]), 100)  # randomly pick 100 rows, (100, 400)
    sample_images = X[sample_idx, :]

    fig, ax_array = plt.subplots(nrows=10, ncols=10, sharey=True, sharex=True, figsize=(8, 8))

    for r in range(10):
        for c in range(10):
            ax_array[r, c].matshow(sample_images[10 * r + c].reshape((size, size)),
                                   cmap=matplotlib.cm.binary)
            plt.xticks(np.array([]))
            plt.yticks(np.array([]))
            
plot_100_image(X)
plt.show()

[Figure 2: 100 randomly sampled digit images]

3 Prepare data

  1. Prepare data X: as in linear regression, a bias term $x_0 = 1$ needs to be added to the data. Insert this bias term as the first column of the data.
  2. Prepare data y: the loaded y is a 1-D vector of shape (5000,), where each element is the digit shown in the corresponding image. To set up the multi-class task as ten one-vs-all logistic regression problems, each label is mapped to a 10-element indicator vector: after the reordering done in the code below, the element at index $k$ (counting from 0) is 1 exactly when the image shows digit $k$. For example, raw label 10 (which encodes digit 0) becomes $(1, 0, \dots, 0)^T$. The transformation of the original labels is shown in Figure 3.

[Figure 3: vectorization of the original labels]
# Prepare data X
raw_X, raw_y = load_data('ex3data1.mat')
print('raw_X.shape:', raw_X.shape)
print('raw_y.shape:', raw_y.shape)

# as in linear regression, add the bias term x_0 = 1 (intercept)
X = np.insert(raw_X, 0, values=np.ones(raw_X.shape[0]), axis=1)  # insert a first column of all ones
print('X.shape:', X.shape)

# Prepare data y
# y has 10 categories, 1..10; digit 0 is stored as category 10 because MATLAB indexing starts at 1
# make digit 0 correspond to index 0 again
y_matrix = []

for k in range(1, 11):
    y_matrix.append((raw_y == k).astype(int))  # see Figure 3, "vectorized labels"

# the last row is k == 10, i.e. digit 0, so move it to the first position
y_matrix = [y_matrix[-1]] + y_matrix[:-1]
y = np.array(y_matrix)

print('y.shape:', y.shape)  # (10, 5000)
print('The expanded label set:')
print(y)
print(type(y))
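The same transformation can also be written without the explicit loop. A minimal vectorized sketch (the name y_onehot and the use of np.roll are illustrative additions, not part of the original code):

# Vectorized sketch of the same label transformation (illustrative alternative):
# compare each raw label against 1..10 via broadcasting, then move the
# k == 10 row (digit 0) to the front, matching [y_matrix[-1]] + y_matrix[:-1]
y_onehot = (raw_y[None, :] == np.arange(1, 11)[:, None]).astype(int)  # (10, 5000)
y_onehot = np.roll(y_onehot, 1, axis=0)  # the row for digit 0 goes first
assert (y_onehot == y).all()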

4 Building Multiple Logistic Regression Classifiers

4.1 Construction of the cost function and gradient function

  1. First build the versions without a regularization term; the gradient has the same form as in linear regression (the equations are summarized below)
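For reference, with $g$ the sigmoid function, the hypothesis, cost, and gradient implemented below are:

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1+e^{-z}}$$

$$J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left[ -y^{(i)} \log h_\theta(x^{(i)}) - (1-y^{(i)}) \log\left(1-h_\theta(x^{(i)})\right) \right], \qquad \frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$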
# Define the cost function and the corresponding gradient function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost(theta, X, y):
    ''' cost fn is -l(theta) for you to minimize'''
    return np.mean(-y * np.log(sigmoid(X @ theta)) - (1 - y) * np.log(1 - sigmoid(X @ theta)))


def gradient(theta, X, y):
    '''just 1 batch gradient'''
    return (1 / len(X)) * X.T @ (sigmoid(X @ theta) - y)
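As a quick sanity check (a sketch, not part of the original code): with $\theta = 0$, every prediction is $g(0) = 0.5$, so the cost is $-\log 0.5 = \log 2 \approx 0.6931$ for any 0/1 labels. Once X and y have been prepared as in section 3, this can be verified directly:

# sanity check (sketch): the cost at theta = 0 should be log(2) ~= 0.6931
theta_zero = np.zeros(X.shape[1])
print(cost(theta_zero, X, y[0]))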
  1. Then build the versions with a regularization term; the corresponding equations are given below:
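The regularized cost and gradient implemented below are (the bias parameter $\theta_0$ is not penalized):

$$J_{reg}(\theta) = J(\theta) + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2, \qquad \frac{\partial J_{reg}(\theta)}{\partial \theta_j} = \frac{\partial J(\theta)}{\partial \theta_j} + \frac{\lambda}{m} \theta_j \quad (j \geq 1)$$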
# On this basis, define the regularized cost function and gradient function
def regularized_cost(theta, X, y, l=1):
    '''you don't penalize theta_0'''
    theta_j1_to_n = theta[1:]
    regularized_term = (l / (2 * len(X))) * np.power(theta_j1_to_n, 2).sum()

    return cost(theta, X, y) + regularized_term


def regularized_gradient(theta, X, y, l=1):
    '''still, leave theta_0 alone'''
    theta_j1_to_n = theta[1:]
    regularized_theta = (l / len(X)) * theta_j1_to_n

    # by doing this, no offset is on theta_0
    regularized_term = np.concatenate([np.array([0]), regularized_theta])

    return gradient(theta, X, y) + regularized_term
  1. Build the prediction function:
def predict(x, theta):
    prob = sigmoid(x @ theta)
    return (prob >= 0.5).astype(int)
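Note that thresholding the probability at 0.5 is equivalent to a sign test on the linear score: since $g(z) \geq 0.5 \iff z \geq 0$, the classifier predicts 1 exactly when $\theta^T x \geq 0$.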

4.2 Building the main function of logistic regression

  1. The logistic regression function will serve as the classifier for each digit class
# Define the logistic regression main function
def logistic_regression(X, y, l=1):
    """generalized logistic regression
    args:
        X: feature matrix, (m, n+1) # with intercept x0=1
        y: target vector, (m, )
        l: lambda constant for regularization

    return: trained parameters
    """
    # init theta
    theta = np.zeros(X.shape[1])

    # train it
    res = opt.minimize(fun=regularized_cost,
                       x0=theta,
                       args=(X, y, l),
                       method='TNC',
                       jac=regularized_gradient,
                       options={'disp': True})
    # get trained parameters
    final_theta = res.x

    return final_theta
  1. Test logistic regression:
print('y[0].shape:', y[0].shape)  # y[0].shape: (5000,)
t0 = logistic_regression(X, y[0])  # test logistic regression on the first set of 5000 labels (digit 0 vs. the rest)
print('t0.shape:', t0.shape)  # t0.shape: (401,)
y_pred = predict(X, t0)
print('Accuracy={}'.format(np.mean(y[0] == y_pred)))  # Accuracy=0.9974 (on the training set)

4.3 Training

  1. Train the 10-class model directly, that is, train all ten sets of parameters in one go
# Train the 10-class model
k_theta = np.array([logistic_regression(X, y[k]) for k in range(10)])  # call logistic regression in a loop to train all ten sets of parameters
print('k_theta.shape:', k_theta.shape)  # k_theta.shape: (10, 401)
  1. Prediction: since predictions for all ten classifiers are needed at once, the parameter matrix cannot be passed into the prediction function directly; it must be transposed first so the shapes align. The process is shown in formula (1).
    1. Meaning of the resulting (5000, 10) matrix: the prediction of every sample under every set of parameters

$$X\theta^T = (X)_{(5000,401)}\,(\theta^T)_{(401,10)} = (\,)_{(5000,10)} \tag{1}$$

# Prediction
prob_matrix = sigmoid(X @ k_theta.T)
np.set_printoptions(suppress=True)  # suppress scientific notation in numpy output for readability
print('prob_matrix:\n', prob_matrix)
y_pred = np.argmax(prob_matrix, axis=1)  # index of the maximum along axis=1, i.e. per row
print('y_pred:', y_pred)  # y_pred: [0 0 0 ... 9 9 7]
print('y_pred.shape:', y_pred.shape)  # y_pred.shape: (5000,)

4.4 Evaluation

  1. Use the evaluation utilities provided by scikit-learn to evaluate the predictions; the results are shown in Figure 4.
y_answer = raw_y.copy()
y_answer[y_answer == 10] = 0  # digit 0 is stored as 10 in the raw data, so map it back first
print(classification_report(y_answer, y_pred))

[Figure 4: classification report for the one-vs-all logistic regression classifiers]

5 Classification using neural networks

1. Logistic regression is a linear classifier and cannot fit more complex hypothesis functions

2. Neural networks can fit nonlinear hypothesis functions and thus represent more complex models

3. Therefore, a neural network is used for prediction below; for now, pre-trained network weights are used, and only the forward propagation step is carried out to make predictions.

5.1 Neural Network Structure

  1. There are 3 layers in total: the input layer (a 400-dimensional vector, plus a bias unit), a hidden layer, and an output layer. The network structure is shown in Figure 5.
    1. The hidden layer has 25 neurons
    2. The output layer has 10 neurons, corresponding to the 10 digit categories

[Figure 5: structure of the three-layer neural network]
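In symbols, the forward pass implemented in section 5.4 is (with $g$ the sigmoid function and a bias unit prepended to $a^{(1)}$ and $a^{(2)}$):

$$a^{(1)} = x, \quad z^{(2)} = \Theta^{(1)} a^{(1)}, \quad a^{(2)} = \begin{bmatrix} 1 \\ g(z^{(2)}) \end{bmatrix}, \quad z^{(3)} = \Theta^{(2)} a^{(2)}, \quad h_\Theta(x) = g(z^{(3)})$$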

5.2 Read network weights

  1. The loading code is as follows:
# Load the pre-trained neural network weights
def load_weight(path):
    data = sio.loadmat(path)
    return data['Theta1'], data['Theta2']


theta1, theta2 = load_weight('ex3weights.mat')
print('theta1.shape:', theta1.shape) # theta1.shape: (25, 401)
print('theta2.shape:', theta2.shape) # theta2.shape: (10, 26)
  1. As can be seen from Figure 5, after the computation with $\theta^{(1)}$, a bias unit must be added to the hidden layer (one extra unit on top of the original 25), which is why theta1.shape[0] = 25 while theta2.shape[1] = 26.
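The weight shapes can be checked against the architecture directly (an illustrative check, not part of the original code):

# sanity check (sketch): layer sizes implied by the weight shapes
assert theta1.shape == (25, 401)  # 400 pixels + bias -> 25 hidden units
assert theta2.shape == (10, 26)   # 25 hidden units + bias -> 10 output units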

5.3 Reading data

# Load the data - the network weights were trained without transposing the data, so no transpose is applied here
X, y = load_data('ex3data1.mat', transpose=False)

X = np.insert(X, 0, values=np.ones(X.shape[0]), axis=1)  # intercept
print('X.shape:', X.shape) # X.shape: (5000, 401)
print('y.shape:', y.shape) # y.shape: (5000,)

5.4 Forward propagation process

# Forward propagation
# 1 - output of the first layer (the input itself)
a1 = X
# 2 - multiply the first layer by the first set of parameters to get the input of the second layer
z2 = a1 @ theta1.T  # (5000, 401) @ (25, 401).T = (5000, 25)
print('z2.shape:', z2.shape)
# 3 - apply the activation function, then add the hidden layer's bias unit to get the output of the second layer
#     (note: the bias unit must stay 1, so it is inserted after the sigmoid, not before)
a2 = sigmoid(z2)
a2 = np.insert(a2, 0, values=np.ones(a2.shape[0]), axis=1)
print('a2.shape:', a2.shape)  # a2.shape: (5000, 26)
# 4 - input of the third layer
z3 = a2 @ theta2.T
print('z3.shape:', z3.shape)  # z3.shape: (5000, 10)
# 5 - output of the third layer
a3 = sigmoid(z3)
print('a3.shape:', a3.shape)
print(a3)

5.5 Comparison of prediction results

# Evaluate the prediction results
# the raw data was saved from MATLAB, where classes are indexed 1-10; argmax returns the index (0-9)
# of the maximum along axis=1 (per row), hence the +1
y_pred = np.argmax(a3, axis=1) + 1
print('y_pred.shape:', y_pred.shape)  # y_pred.shape: (5000,)
print(classification_report(y, y_pred))
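In addition to the full report, an overall accuracy figure can be computed directly (a one-line sketch, not in the original code):

# overall accuracy (sketch): both y and y_pred use labels 1..10 here
print('Accuracy={}'.format(np.mean(y_pred == y)))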

[Figure: classification report for the neural network predictions]
