Artificial Intelligence Series Experiments (1): Binary Classification with a Single-Layer Neural Network for Recognizing Cats

This experiment uses Python to build a single-neuron neural network for recognizing cats, ultimately achieving an accuracy of more than 70% on the test set.

Experimental environment: the numpy, matplotlib, h5py, and skimage libraries in Python.
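If any of these are missing, they can usually be installed with pip (note that the skimage package is distributed under the name scikit-image; exact package names may vary with your environment):

pip install numpy matplotlib h5py scikit-image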

import numpy as np
import matplotlib.pyplot as plt  # for plotting
import h5py  # for loading the training dataset
import skimage.transform as tf  # for resizing images

Training samples: 209 labeled 64×64 images

Test samples: 50 labeled 64×64 images

For the dataset and the complete code used in this experiment, see:
https://github.com/PPPerry/AI_projects/tree/main/1.cats_identification

The constructed neural network model is as follows:

[Figure: structure of the single-neuron network model]

Based on the neural network model, the code implementation is as follows:

First, preprocess the dataset.

def load_dataset():
    """
    Load the dataset
    :return: parameters of the training and test data
    """
    train_dataset = h5py.File("datasets/train_catvnoncat.h5", "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # feature data of the training set, shape (num_samples, image width, image height, 3 RGB channels)
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # label data of the training set, shape (num_samples,)

    test_dataset = h5py.File("datasets/test_catvnoncat.h5", "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # feature data of the test set
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # label data of the test set

    classes = np.array(test_dataset["list_classes"][:])  # class labels: 1 means cat, 0 means non-cat

    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))  # unify the label array shape to (1, num_samples)
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes


train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()  # load the dataset

m_train = train_set_x_orig.shape[0]  # number of training samples
m_test = test_set_x_orig.shape[0]  # number of test samples
num_px = test_set_x_orig.shape[1]  # width/height of the square images

train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T  # flatten and transpose the samples into shape (image data, num_samples)
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T

train_set_x = train_set_x_flatten / 255.  # normalize so all values lie in [0, 1]
test_set_x = test_set_x_flatten / 255.
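A quick shape check helps avoid dimension mistakes later (a minimal sanity-check sketch; the expected values follow from the 209 training and 50 test 64×64 images described above):

print(train_set_x_orig.shape)  # (209, 64, 64, 3)
print(train_set_x.shape)       # (12288, 209), since 64 * 64 * 3 = 12288 features per column
print(test_set_x.shape)        # (12288, 50)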

Second, construct the functions used by the neural network.
The forward-propagation formulas are

A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, \cdots, a^{(m-1)}, a^{(m)})

J = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)})\right]

and the backward-propagation formulas are

\frac{\partial J}{\partial w} = \frac{1}{m}X(A-Y)^T

\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}(a^{(i)}-y^{(i)})

def sigmoid(z):
    """
    Implementation of the sigmoid function
    :param z: a number or a numpy array
    :return: a value in the range [0, 1]
    """
    s = 1 / (1 + np.exp(-z))
    return s
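As a quick check of the function (approximate values; sigmoid(0) is exactly 0.5, and the function saturates toward 0 and 1 at the tails):

print(sigmoid(np.array([-10., 0., 10.])))  # approximately [4.54e-05, 0.5, 0.99995]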


def initialize_with_zeros(dim):
    """
    Initialize the weight array w and the bias b to zero
    (zero initialization is fine for a single neuron, since there is no symmetry between units to break)
    :param dim: number of weights
    :return:
    w: weight array
    b: bias
    """
    w = np.zeros((dim, 1))
    b = 0
    return w, b


def propagate(w, b, X, Y):
    """
    Implement forward and backward propagation, computing the cost and the gradients
    :param w: weight array
    :param b: bias
    :param X: feature data of the images
    :param Y: label data of the images
    :return:
    cost: the cost
    dw: gradient of w
    db: gradient of b
    """
    m = X.shape[1]

    # forward propagation
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m

    # backward propagation
    dZ = A - Y
    dw = np.dot(X, dZ.T) / m
    db = np.sum(dZ) / m

    # store the gradients in a dictionary
    grads = {
        "dw": dw,
        "db": db
    }

    return grads, cost
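Before moving on, the backpropagation formulas can be verified numerically with a finite-difference check (a minimal sketch on random toy data; the shapes, seed, and epsilon below are arbitrary choices, not part of the experiment):

np.random.seed(0)
w_t = np.random.randn(4, 1) * 0.01
b_t = 0.0
X_t = np.random.randn(4, 5)
Y_t = (np.random.rand(1, 5) > 0.5).astype(float)

grads_t, _ = propagate(w_t, b_t, X_t, Y_t)

# approximate dJ/dw_j by central differences and compare with the analytic gradient
eps = 1e-7
dw_num = np.zeros_like(w_t)
for j in range(w_t.shape[0]):
    w_plus, w_minus = w_t.copy(), w_t.copy()
    w_plus[j, 0] += eps
    w_minus[j, 0] -= eps
    _, cost_plus = propagate(w_plus, b_t, X_t, Y_t)
    _, cost_minus = propagate(w_minus, b_t, X_t, Y_t)
    dw_num[j, 0] = (cost_plus - cost_minus) / (2 * eps)

print(np.max(np.abs(dw_num - grads_t["dw"])))  # should be tiny, e.g. on the order of 1e-10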


def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    """
    Update the parameters with gradient descent
    :param w: weight array
    :param b: bias
    :param X: feature data of the images
    :param Y: label data of the images
    :param num_iterations: number of optimization iterations
    :param learning_rate: learning rate
    :param print_cost: if True, print the cost every 100 iterations
    :return:
    params: the optimized w and b
    costs: the cost recorded every 100 iterations
    """
    costs = []

    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)

        dw = grads["dw"]
        db = grads["db"]

        # gradient descent step
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # record how the cost changes
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print("Cost after %d iterations: %f" % (i, cost))

    params = {
        "w": w,
        "b": b
    }

    return params, costs


def predict(w, b, X):
    """
    Prediction function: decide whether an image is a cat
    :param w: weight array
    :param b: bias
    :param X: feature data of the images
    :return:
    Y_prediction: the predicted class, 0 or 1
    p: the predicted probability of being a cat
    """
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))

    p = sigmoid(np.dot(w.T, X) + b)

    for i in range(p.shape[1]):
        if p[0, i] >= 0.5:
            Y_prediction[0, i] = 1

    return Y_prediction, p
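As a side note, the thresholding loop above can also be written as a single vectorized line, which is equivalent (an alternative sketch, not the code used in this experiment):

Y_prediction = (p >= 0.5).astype(float)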

Finally, the above functions are combined to construct the final neural network model function.

def model(X_train, Y_train, X_test, Y_test, num_iterations=2001, learning_rate=0.5, print_cost=False):
    """
    The final neural network model function
    :param X_train: feature data of the training samples
    :param Y_train: label data of the training samples
    :param X_test: feature data of the test samples
    :param Y_test: label data of the test samples
    :param num_iterations: number of optimization iterations
    :param learning_rate: learning rate
    :param print_cost: if True, print the cost every 100 iterations
    :return:
    d: a dictionary with the relevant information
    """
    w, b = initialize_with_zeros(X_train.shape[0])  # initialize the parameters

    parameters, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)  # train the parameters
    w = parameters["w"]
    b = parameters["b"]

    Y_prediction_train, p_train = predict(w, b, X_train)
    Y_prediction_test, p_test = predict(w, b, X_test)

    print("Prediction accuracy on the training data: {}%".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("Prediction accuracy on the test data: {}%".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    d = {
        "costs": costs,
        "Y_prediction_test": Y_prediction_test,
        "Y_prediction_train": Y_prediction_train,
        "w": w,
        "b": b,
        "learning_rate": learning_rate,
        "num_iterations": num_iterations,
        "p_train": p_train,
        "p_test": p_test
    }

    return d

Now call the above model function to train on the data we loaded initially.

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2001, learning_rate=0.005, print_cost=True)

Output: the cost after 2000 iterations is 0.1356; the accuracy on the training data is about 99%, and on the test data about 70%.

Test the model prediction function:

  1. View the pictures in the training and test sets together with their predicted results
def show_predict(index, prediction=np.hstack((d["Y_prediction_train"], d["Y_prediction_test"])),
                 data=np.hstack((train_set_x, test_set_x)), origin=np.hstack((train_set_y, test_set_y)),
                 px=num_px, p=np.hstack((d["p_train"], d["p_test"]))):
    if index >= prediction.shape[1]:
        print("index is out of range")
        return
    plt.imshow(data[:, index].reshape((px, px, 3)))
    plt.show()
    print("The label of this picture is " + str(origin[0, index]) + ", the predicted class is " + str(int(prediction[0, index])) + ", and the predicted probability is " + str(p[0, index]))

    return

Take the 19th picture as an example:
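That is (assuming the function above has been defined):

show_predict(19)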
[Figure: the picture at index 19]
Output: the label of this picture is 1, the predicted class is 1, and the predicted probability is 0.93.

  2. View a picture of your own and its predicted result
# predict your own picture
# create a folder named images in the same directory, rename any picture to my_image1.jpg, and put it in the folder
my_image = "images/my_image1.jpg"

image = np.array(plt.imread(my_image))
my_image = tf.resize(image, (num_px, num_px), mode='reflect').reshape((1, num_px*num_px*3)).T  # skimage's resize converts the image to floats in [0, 1], matching the normalization of the training data
my_prediction, my_p = predict(d["w"], d["b"], my_image)

plt.imshow(image)
plt.show()
print("The predicted class is " + str(int(my_prediction[0, 0])) + ", and the predicted probability is " + str(my_p[0, 0]))

The following figure is an example:
[Figure: an example input picture]
Output: the predicted class is 1, and the predicted probability is 0.89.

Conclusions from further experiments:

  1. How the cost changes as the number of iterations increases
# plot how the cost changes as the number of iterations increases
costs = np.squeeze(d['costs'])  # convert the array of recorded costs to a rank-1 array for plotting with matplotlib
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

[Figure: cost versus number of iterations]
Conclusion: as the number of iterations grows, the cost decreases and the predictions become more accurate.

  2. How the cost changes with the number of iterations under different learning rates
# plot how the cost changes with the number of iterations under different learning rates
learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2001, learning_rate=i,
                           print_cost=False)

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label=str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
legend = plt.legend(loc='upper right', shadow=True)
plt.show()

[Figure: cost curves under different learning rates]
Conclusion: choosing a suitable learning rate is very important, and bigger is not necessarily better. If it is set poorly, the network may never descend to the minimum of the loss function; if it is set well, the network converges to a good result.

Conclusion:

  1. During the experiment it is necessary to keep the dimensions of the variables clear; otherwise problems easily arise when processing data and writing array operations.
  2. The datasets, pictures, etc. need to be in the same directory as the code.
  3. This experiment builds the neural network model almost entirely with pure Python and the numpy library, without any machine learning framework, focusing on understanding the underlying mathematics. In follow-up experiments I will learn and use classic frameworks such as TensorFlow.
  4. The code may be continuously updated and improved; the code on GitHub is authoritative.
  5. This is the first artificial intelligence program I implemented while systematically studying artificial intelligence, and I will keep updating this series of experiments. Writing these posts pushes me not to slack off, and I hope to go as far as possible on the road of neural networks.
