Cat Image Recognition with Logistic Regression

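This article is based on the Week 2 exercise of Course 1 of Andrew Ng's "Deep Learning" series. The goal is to use logistic regression to recognize cat pictures. Building the model involves five steps: data processing, model building, model testing, parameter tuning, and model application. Each is described below.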

I. Data Processing

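To simplify the process, Andrew Ng provides the training and test sets as .h5 files (HDF5 data files, which can be read and written with the h5py library). The data files can be downloaded from https://pan.baidu.com/s/1KZks2c1TkIxtaWJrBtrYYQ (password: i1l0). Loading them returns five arrays: the training set X array (train_set_x_orig), the training set Y array (train_set_y_orig), the test set X array (test_set_x_orig), the test set Y array (test_set_y_orig), and the class names (classes: cat and non-cat).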

import numpy as np
import h5py
      
def load_dataset():
    train_dataset = h5py.File('datasets\\train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:]) # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:]) # your train set labels

    test_dataset = h5py.File('datasets\\test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:]) # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:]) # your test set labels

    classes = np.array(test_dataset["list_classes"][:]) # the list of classes
    
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

numpy and h5py are both well-established libraries, so they are not introduced further here.

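The training set contains 209 samples. Each sample is an array built from a 3-channel (RGB) image of size 64*64*3. A 3-D array per sample is awkward to work with, so we convert the data to a 2-D array in which each column holds one sample. To do this we flatten and transpose the data, then scale the pixel values to the range [0, 1] by dividing by 255.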

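train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
m_train, m_test = train_set_x_orig.shape[0], test_set_x_orig.shape[0]  # sample counts
num_px = train_set_x_orig.shape[1]  # image side length (64)
# Flatten each image into one column: shape (num_px*num_px*3, m)
train_set_x_flatten = train_set_x_orig.reshape(m_train,num_px**2*3).T
test_set_x_flatten = test_set_x_orig.reshape(m_test,-1).T

# Scale pixel values from [0, 255] to [0, 1]
train_set_x = train_set_x_flatten / 255
test_set_x = test_set_x_flatten / 255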
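
To make the flatten-and-transpose step concrete, here is a minimal sketch on a made-up array (toy_images is a hypothetical stand-in for the real image data, not part of the dataset). It shows that reshaping to (m, -1) and then transposing keeps each sample's pixels together in one column, whereas reshaping directly to (-1, m) would interleave pixels from different positions.

```python
import numpy as np

# Hypothetical toy data: 2 "images" of 2x2 pixels with 3 channels each.
toy_images = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)

# Flatten each image into a row, then transpose so each column is one sample.
toy_flat = toy_images.reshape(toy_images.shape[0], -1).T

# Scale as if the values were 8-bit pixel intensities.
toy_scaled = toy_flat / 255

# A tempting but incorrect shortcut: this does NOT put one image per column.
toy_wrong = toy_images.reshape(-1, toy_images.shape[0])

print(toy_flat.shape)    # (12, 2)
print(toy_flat[:, 0])    # the 12 values of the first image: 0 .. 11
print(toy_wrong[:, 0])   # every second value: 0, 2, 4, ...
```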

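Of course, this is only very simple preprocessing. Many other data processing techniques can be applied to improve the accuracy and robustness of the algorithm.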

II. Model Building

Next, the logistic regression algorithm can be broken down into several functions.

(1) Activation function: the classic sigmoid function.

def sigmoid(z):
    s = 1.0 / (1.0 + np.exp(-z))
    return s
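
As a quick sanity check (a sketch, not part of the original exercise), the sigmoid should return exactly 0.5 at z = 0 and satisfy sigmoid(-z) = 1 - sigmoid(z). The function is repeated here only so the snippet runs on its own.

```python
import numpy as np

def sigmoid(z):
    # Same definition as above, repeated so this snippet is self-contained.
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-2.0, 0.0, 2.0])
s = sigmoid(z)
print(s)  # approximately [0.1192 0.5 0.8808]
```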

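(2) Initialization function: initializes w and b. It is a good habit to use assert statements liberally in Python programs to check array shapes, which helps catch errors early.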

def initialize_with_zeros(dim):
    w = np.zeros((dim,1))
    b = 0

    assert(w.shape == (dim,1))
    assert(isinstance(b,float) or isinstance(b,int))

    return w,b

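(3) Propagation function: computes the two gradients used by gradient descent, dw and db, together with the cost function. For logistic regression the cost function is:

$$J(w,b)=\frac{1}{m}\sum_{i=1}^{m}L\left(\hat{y}^{(i)},y^{(i)}\right)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\hat{y}^{(i)}+\left(1-y^{(i)}\right)\log\left(1-\hat{y}^{(i)}\right)\right]$$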
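
For reference, the gradients that the propagation function computes follow from differentiating this cost. The notation below is an assumption made for this note: X holds one sample per column, A is the row vector of predictions, and Y is the row vector of labels, matching the array shapes used in the code.

```latex
\hat{y}^{(i)} = \sigma\left(w^{T}x^{(i)} + b\right), \qquad \sigma(z) = \frac{1}{1+e^{-z}}

\frac{\partial J}{\partial w} = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}^{(i)}-y^{(i)}\right)x^{(i)} = \frac{1}{m}X\,(A-Y)^{T}

\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\left(\hat{y}^{(i)}-y^{(i)}\right)
```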

def propagate(w,b,X,Y):
    m = X.shape[1]
    A = sigmoid(np.dot(w.T,X)+b)
    cost = -np.sum(np.multiply(Y,np.log(A))+np.multiply(1-Y,np.log(1-A)))/m

    dw = 1/m * np.dot(X,(A-Y).T)
    db = 1/m * np.sum(A-Y)
    assert(dw.shape == w.shape)
    
    cost = np.squeeze(cost)
    assert(cost.shape == ())
    grads = {'dw' : dw,
             'db' : db}
    #print(type(grads))
    return grads,cost
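
One way to gain confidence in dw and db is a numerical gradient check. The sketch below is an optional addition, not part of the original exercise: it repeats the same cost and gradient formulas in a helper named cost_and_grads (a hypothetical name) and compares the analytic gradients with central finite differences on small random data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_and_grads(w, b, X, Y):
    """Same formulas as propagate, repeated so this sketch runs on its own."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    return cost, dw, db

# Small random problem: 3 features, 5 samples (made-up data).
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))
Y = (rng.random((1, 5)) > 0.5).astype(float)
w = rng.normal(size=(3, 1)) * 0.1
b = 0.2

_, dw, db = cost_and_grads(w, b, X, Y)

# Central finite differences, one parameter at a time.
eps = 1e-6
num_dw = np.zeros_like(w)
for j in range(w.shape[0]):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[j, 0] += eps
    w_minus[j, 0] -= eps
    cost_plus = cost_and_grads(w_plus, b, X, Y)[0]
    cost_minus = cost_and_grads(w_minus, b, X, Y)[0]
    num_dw[j, 0] = (cost_plus - cost_minus) / (2 * eps)
num_db = (cost_and_grads(w, b + eps, X, Y)[0] - cost_and_grads(w, b - eps, X, Y)[0]) / (2 * eps)

max_dw_error = np.max(np.abs(dw - num_dw))
db_error = abs(db - num_db)
print(max_dw_error, db_error)  # both should be very small
```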

(4) Optimization function: runs gradient descent iteratively to obtain the final optimized parameters w and b.

def optimize(w,b,X,Y,num_iterations,learning_rate,print_cost):
    costs = []
    for i in range(num_iterations):
        grads,cost = propagate(w,b,X,Y)
        dw = grads['dw']
        db = grads['db']

        w = w - learning_rate * dw
        b = b - learning_rate * db

        if i % 100 == 0:
            costs.append(cost)
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i:%f"%(i,cost))

    params = {'w' : w,
              'b' : b}
    grads = {'dw' : dw,
             'db' : db}

    return params,grads,costs
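
To see the update rule at work on something small, here is a minimal sketch using made-up one-dimensional data (the data, learning rate, and iteration count are assumptions chosen for illustration). The cost starts at log 2, because w = 0 and b = 0 give a prediction of 0.5 for every sample, and then decreases as w grows in the direction that separates the two classes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: one feature, four samples, label 1 when the feature is positive.
X = np.array([[-2.0, -1.0, 1.0, 2.0]])
Y = np.array([[0.0, 0.0, 1.0, 1.0]])
m = X.shape[1]

w = np.zeros((1, 1))
b = 0.0
learning_rate = 0.1
costs = []

for i in range(200):
    # Forward pass and cost, same formulas as in propagate.
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    # Gradients and parameter update, same rule as in optimize.
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m
    w = w - learning_rate * dw
    b = b - learning_rate * db
    if i % 50 == 0:
        costs.append(float(cost))

print(costs)  # recorded every 50 iterations, decreasing
```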

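(5) Prediction function: uses the w and b obtained from the optimization function to predict labels for an input sample set. A sample is classified as a cat when its predicted probability is 0.5 or higher.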

def predict(w,b,X):
    m = X.shape[1]
    Y_prediction = np.zeros((1,m))
    w = w.reshape(X.shape[0],1)
    A = sigmoid(np.dot(w.T,X)+b)

    for i in range(A.shape[1]):
        if A[0,i] >= 0.5:
            Y_prediction[0,i] = 1
        else:
            Y_prediction[0,i] = 0
    assert(Y_prediction.shape == (1,m))
    
    return Y_prediction
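
The per-sample loop can also be written as a single vectorized comparison. The sketch below is an alternative formulation, not the author's code, and predict_vectorized is a hypothetical name. It takes the probabilities A directly and applies the same 0.5 threshold.

```python
import numpy as np

def predict_vectorized(A):
    """Threshold a (1, m) array of probabilities at 0.5 without a Python loop."""
    return (A >= 0.5).astype(float)

A = np.array([[0.1, 0.5, 0.9, 0.49]])
Y_prediction = predict_vectorized(A)
print(Y_prediction)  # [[0. 1. 1. 0.]]
```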

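(6) Model function: combines the functions above and uses the mean absolute error between predictions and labels as a loss function to evaluate model accuracy.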

def model(X_train,Y_train,X_test,Y_test ,num_iterations,learning_rate,print_cost):
       
    dim = X_train.shape[0]
    w,b = initialize_with_zeros(dim)
    parameters,grads,costs = optimize(w,b,X_train,Y_train,num_iterations,learning_rate,print_cost)
    w = parameters['w']
    b = parameters['b']

    Y_prediction_test = predict(w,b,X_test)
    Y_prediction_train = predict(w,b,X_train)

    print('train accuracy:{}%'.format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print('test accuracy:{}%'.format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {'costs' : costs,
         'Y_prediction_test' : Y_prediction_test,
         'Y_prediction_train' : Y_prediction_train,
         'w' : w,
         'b' : b,
         'learning_rate' : learning_rate,
         'num_iterations' : num_iterations
         }
    return d
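
The accuracy expression in model relies on both predictions and labels being 0 or 1: the absolute difference is then 1 for a wrong prediction and 0 for a correct one, so 100 minus the mean times 100 is the percentage of correct predictions. A small sketch with made-up labels illustrates the equivalence.

```python
import numpy as np

# Made-up labels and predictions: 3 of the 5 predictions are correct.
Y_true = np.array([[1, 0, 1, 1, 0]])
Y_pred = np.array([[1.0, 0.0, 0.0, 1.0, 1.0]])

# Formula used in model().
accuracy_mae = 100 - np.mean(np.abs(Y_pred - Y_true)) * 100
# Direct count of matching entries, as a percentage.
accuracy_direct = np.mean(Y_pred == Y_true) * 100

print(accuracy_mae, accuracy_direct)
```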

III. Model Testing

We choose 1500 iterations (this parameter serves as the stopping condition) and a learning rate of 0.005.

d = model(train_set_x,train_set_y,test_set_x,test_set_y,num_iterations = 1500,learning_rate = 0.005,print_cost = False)

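Once the model has run, we can do some visual verification: use the imshow() function to display an image from the test set together with its classification result, and judge by eye.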

import matplotlib.pyplot as plt

index = num  # num: index of the test image to inspect, chosen by the reader
a_image = test_set_x[:,index].reshape((num_px,num_px,3))
plt.imshow(a_image)
plt.show()
print('y='+str(test_set_y[0,index])+",you predict that it is a \""+classes[int(d['Y_prediction_test'][0,index])].decode('utf-8')+"\"picture.")

y=1,you predict that it is a "cat"picture.

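Training accuracy reaches 99%, well above the test accuracy of 70%, which indicates serious overfitting. We therefore need to adjust the model, and a common approach is parameter tuning.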

IV. Parameter Tuning

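Taking the learning rate as an example, we observe how tuning a parameter affects training. The learning rate is set to 0.01, 0.005, 0.001, 0.0005, and 0.0001 in turn, and the five cost curves are plotted for comparison.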

import matplotlib.pyplot as plt

learning_rates = [0.01,0.005,0.001,0.0005,0.0001]
models = {}
# Train one model per learning rate
for i in learning_rates:
    print("learning_rate is :"+str(i))
    models[str(i)] = model(train_set_x,train_set_y,test_set_x,test_set_y,num_iterations = 1500,learning_rate = i,print_cost = False)
    print('\n'+"---------------------------"+'\n')
# Plot the cost curve of each model; costs are recorded every 100 iterations
for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]['costs']),label=str(models[str(i)]['learning_rate']))

plt.ylabel('cost')
plt.xlabel('iterations (hundreds)')
legend = plt.legend(loc = 'upper center',shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()

V. Model Application

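To recognize a picture with the trained model, the imresize function from the scipy library (available in older SciPy versions) can be used to resize the image to 64*64*3.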

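import scipy.misc
from scipy import ndimage
import matplotlib.pyplot as plt

# Note: ndimage.imread and scipy.misc.imresize have been removed from recent SciPy releases,
# so this snippet requires an older SciPy installation.
my_image = "cat_image.jpg"
fname = "images\\" + my_image
image = np.array(ndimage.imread(fname,flatten=False))
# Resize to the training resolution, flatten to a column, and scale to [0, 1] like the training data
my_image = scipy.misc.imresize(image, size=(num_px,num_px)).reshape((1,num_px ** 2 * 3)).T / 255
my_predicted_image = predict(d['w'],d['b'],my_image)
plt.imshow(image)
plt.show()
print('y='+str(np.squeeze(my_predicted_image))+",you predict that it is a \""+
      classes[int(np.squeeze(my_predicted_image))].decode('utf-8')+"\"picture.")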

y=1.0,you predict that it is a "cat"picture.
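
On installations where the SciPy image helpers are not available, one possible substitute is the Pillow library. The sketch below is an assumption-laden alternative rather than the author's method: load_image_as_column is a hypothetical helper name, and it assumes Pillow is installed. It loads an image, resizes it to num_px by num_px, and returns a column vector scaled to [0, 1] so that it matches the training data.

```python
import numpy as np
from PIL import Image

def load_image_as_column(path, num_px):
    """Load an image, resize it to num_px x num_px RGB, and return a scaled column vector."""
    with Image.open(path) as img:
        # Force 3 channels, then resize to the resolution used for training.
        resized = img.convert("RGB").resize((num_px, num_px))
        pixels = np.asarray(resized, dtype=np.float64)
    # Flatten to shape (num_px*num_px*3, 1) and scale from [0, 255] to [0, 1].
    return pixels.reshape(1, -1).T / 255

# Hypothetical usage with the trained model from above:
# x = load_image_as_column("images/cat_image.jpg", 64)
# my_predicted_image = predict(d['w'], d['b'], x)
```

Pillow's default resampling filter may differ from the one imresize used, so the predicted probability for a given picture could differ slightly between the two approaches.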

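This completes a simple implementation of logistic regression on a classification problem. Building on it, the accuracy and robustness of the algorithm can be improved further through better data processing, function optimization, and similar methods.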