TensorFlow (2) - Practice 1 - A Three-Layer Neural Network

This post explains TensorFlow through a worked example: a three-layer neural network for multiclass classification.

Creating the graph

First, create placeholders for X and Y, the values that will be fed into the network repeatedly. Since we cannot know in advance how many examples a mini-batch will contain, the second dimension of each placeholder's shape is set to None.

import tensorflow as tf

def create_placeholders(n_x, n_y):
    # n_x: size of one input vector; n_y: number of classes.
    # The second dimension is None so any batch size can be fed in.
    X = tf.placeholder(tf.float32, [n_x, None])
    Y = tf.placeholder(tf.float32, [n_y, None])
    return X, Y
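
As a quick illustration (a minimal sketch; the random matrix below is just a stand-in for real input data), values are bound to the placeholders through feed_dict when the graph runs:

import numpy as np

X, Y = create_placeholders(12288, 6)
with tf.Session() as sess:
    # Any batch size works because the second dimension is None.
    x_batch = np.random.randn(12288, 32).astype(np.float32)
    print(sess.run(tf.shape(X), feed_dict={X: x_batch}))  # [12288    32]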

Initializing our parameters

In the previous post we initialized parameters with tf.Variable(); here we use tf.get_variable() instead. Variables created with get_variable() are what we call shared variables: the same variable name cannot be defined twice, and doing so raises an error unless reuse is enabled in the enclosing variable scope. tf.Variable(), by contrast, can be "redefined" freely, and TensorFlow resolves the name clash automatically (by uniquifying the name).
Here we also use a higher-level initialization API, tf.contrib.layers.xavier_initializer():
https://www.tensorflow.org/api_docs/python/tf/contrib/layers/xavier_initializer
This is a weight-initialization scheme (Xavier initialization); its rationale was covered in "Deep Neural Network Optimization (1)", so it is not repeated here.
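
To make the sharing behavior concrete, here is a minimal sketch (the scope name 'demo' is arbitrary, used only for illustration):

with tf.variable_scope('demo'):
    w = tf.get_variable('w', [2, 2], initializer=tf.zeros_initializer())
with tf.variable_scope('demo', reuse=True):
    w_again = tf.get_variable('w')  # fetches the existing variable
assert w is w_again
# Without reuse=True, calling tf.get_variable('w', ...) a second time in the
# same scope raises a ValueError instead of silently creating a 'w_1'.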

def initialize_parameters(): 
    tf.set_random_seed(1)  # fixed seed so runs are reproducible

    # Layer sizes: 12288 inputs -> 25 -> 12 -> 6 output classes.
    # Weights get Xavier initialization, biases start at zero.
    W1 = tf.get_variable('W1', [25,12288], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b1 = tf.get_variable('b1', [25,1], initializer = tf.zeros_initializer())
    W2 = tf.get_variable('W2', [12,25], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b2 = tf.get_variable('b2', [12,1], initializer = tf.zeros_initializer())
    W3 = tf.get_variable('W3', [6,12], initializer = tf.contrib.layers.xavier_initializer(seed = 1))
    b3 = tf.get_variable('b3', [6,1], initializer = tf.zeros_initializer())

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2,
                  "W3": W3,
                  "b3": b3}

    return parameters

Defining the forward-propagation function

For 2-D tensors, tf.matmul() is the TensorFlow counterpart of np.dot() (matrix multiplication).
The ReLU activation function is tf.nn.relu().
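
A quick sanity check of that correspondence (a throwaway sketch with two hand-written 2x2 matrices):

import numpy as np

a = np.array([[1., 2.], [3., 4.]])
b = np.array([[5., 6.], [7., 8.]])
with tf.Session() as sess:
    tf_prod = sess.run(tf.matmul(a, b))
np.testing.assert_allclose(tf_prod, np.dot(a, b))  # same matrix product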

def forward_propagation(X, parameters):

    W1 = parameters['W1']
    b1 = parameters['b1']
    W2 = parameters['W2']
    b2 = parameters['b2']
    W3 = parameters['W3']
    b3 = parameters['b3']

    Z1 = tf.matmul(W1, X) + b1
    A1 = tf.nn.relu(Z1)
    Z2 = tf.matmul(W2, A1) + b2
    A2 = tf.nn.relu(Z2)
    # No softmax on the last layer: compute_cost() below applies it
    # inside tf.nn.softmax_cross_entropy_with_logits().
    Z3 = tf.matmul(W3, A2) + b3

    return Z3
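
Chaining the pieces built so far shows the output shape (a sketch; nothing is computed yet, we are only constructing the graph):

tf.reset_default_graph()
X, Y = create_placeholders(12288, 6)
parameters = initialize_parameters()
Z3 = forward_propagation(X, parameters)
print(Z3.shape)  # (6, ?) -- 6 logits per example, batch size still unknown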

Defining the cost function

Since this is a classification problem, we use the cross-entropy loss. The API is
tf.nn.softmax_cross_entropy_with_logits()
https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits
and the mean over the batch is then taken with
tf.reduce_mean()
https://www.tensorflow.org/api_docs/python/tf/reduce_mean
Note that softmax_cross_entropy_with_logits expects its inputs with shape (batch, classes), while our tensors are laid out as (classes, batch), so both the logits and the labels are transposed first.

def compute_cost(Z3, Y):

    # softmax_cross_entropy_with_logits expects (batch, classes);
    # our tensors are (classes, batch), hence the transposes.
    logits = tf.transpose(Z3)
    labels = tf.transpose(Y)

    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = logits, labels = labels))

    return cost
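
To see the shape convention in action, a minimal sketch with a fabricated 6-class batch of 4 examples (the label indices are arbitrary):

import numpy as np

Z3_demo = tf.constant(np.random.randn(6, 4), dtype=tf.float32)  # logits, (classes, batch)
Y_demo = tf.one_hot([0, 3, 5, 1], depth=6, axis=0)              # one-hot labels, (classes, batch)
with tf.Session() as sess:
    print(sess.run(compute_cost(Z3_demo, Y_demo)))  # a single scalar cost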

Creating and initializing the Session, and running the graph

The Adam optimization algorithm API is
tf.train.AdamOptimizer()
https://www.tensorflow.org/api_docs/python/tf/train/AdamOptimizer
minimize() is a method defined on every optimizer class; it is used to minimize the cost.
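
Under the hood, minimize() just chains two steps; here is a sketch of the equivalent explicit form (cost stands for the tensor returned by compute_cost), which becomes useful when you want to inspect or clip gradients:

opt = tf.train.AdamOptimizer(learning_rate = 0.0001)
grads_and_vars = opt.compute_gradients(cost)      # list of (gradient, variable) pairs
train_step = opt.apply_gradients(grads_and_vars)  # equivalent to opt.minimize(cost)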

Initialize all global variables:
tf.global_variables_initializer()
https://www.tensorflow.org/versions/r1.0/api_docs/python/tf/global_variables_initializer

Element-wise equality test:
tf.equal()
https://www.tensorflow.org/api_docs/python/tf/equal

Index of the maximum value; by default it is taken along axis 0 (down each column):
tf.argmax()
https://www.tensorflow.org/api_docs/python/tf/argmax

Type conversion:
tf.cast()
https://www.tensorflow.org/api_docs/python/tf/cast
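
These four APIs combine into the accuracy computation used at the end of model() below; a standalone sketch with made-up logits (columns are examples, matching the (classes, batch) layout used throughout):

logits = tf.constant([[2., 0.], [0., 1.], [1., 3.]])      # 3 classes, 2 examples
labels = tf.constant([[1., 0.], [0., 0.], [0., 1.]])      # one-hot, same layout
correct = tf.equal(tf.argmax(logits), tf.argmax(labels))  # per-example booleans
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
with tf.Session() as sess:
    print(sess.run(accuracy))  # 1.0 -- both columns predicted correctly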

import numpy as np
import matplotlib.pyplot as plt

def model(X_train, Y_train, X_test, Y_test, learning_rate = 0.0001,
          num_epochs = 1500, minibatch_size = 32, print_cost = True):

    tf.reset_default_graph()  # start from a clean graph on every call
    tf.set_random_seed(1)
    seed = 3  # seed for shuffling the mini-batches
    (n_x, m) = X_train.shape  # n_x: input size, m: number of examples
    n_y = Y_train.shape[0]    # number of classes
    costs = []

    # Build the graph: placeholders -> parameters -> forward pass -> cost -> optimizer
    X, Y = create_placeholders(n_x, n_y)
    parameters = initialize_parameters()
    Z3 = forward_propagation(X, parameters)
    cost = compute_cost(Z3, Y)
    optimizer = tf.train.AdamOptimizer(learning_rate = learning_rate).minimize(cost)

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        # Run the initialization
        sess.run(init)
        # Do the training loop
        for epoch in range(num_epochs):

            epoch_cost = 0.
            num_minibatches = int(m / minibatch_size)  # number of full mini-batches
            seed = seed + 1  # reshuffle differently on every epoch
            minibatches = random_mini_batches(X_train, Y_train, minibatch_size, seed)

            for minibatch in minibatches:

                (minibatch_X, minibatch_Y) = minibatch

                # One optimization step on this mini-batch
                _ , minibatch_cost = sess.run([optimizer, cost], feed_dict = {X: minibatch_X, Y: minibatch_Y})

                epoch_cost += minibatch_cost / num_minibatches

            # Print the cost every 100 epochs, record it every 5
            if print_cost and epoch % 100 == 0:
                print ("Cost after epoch %i: %f" % (epoch, epoch_cost))
            if print_cost and epoch % 5 == 0:
                costs.append(epoch_cost)

        # plot the cost
        plt.plot(np.squeeze(costs))
        plt.ylabel('cost')
        plt.xlabel('epochs (per fives)')
        plt.title("Learning rate =" + str(learning_rate))
        plt.show()

        # let's save the parameters in a variable
        parameters = sess.run(parameters)
        print ("Parameters have been trained!")

        # Calculate the correct predictions
        correct_prediction = tf.equal(tf.argmax(Z3), tf.argmax(Y))

        # Calculate accuracy on the test set
        accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

        print ("Train Accuracy:", accuracy.eval({X: X_train, Y: Y_train}))
        print ("Test Accuracy:", accuracy.eval({X: X_test, Y: Y_test}))

        return parameters
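
One dependency is not shown in the post: random_mini_batches comes from the accompanying course materials. Here is a minimal sketch of what it has to do, namely shuffle the example columns with the given seed and cut them into batches:

import math
import numpy as np

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    # X: (n_x, m), Y: (n_y, m); returns a list of (mini_X, mini_Y) pairs.
    np.random.seed(seed)
    m = X.shape[1]
    permutation = np.random.permutation(m)
    shuffled_X, shuffled_Y = X[:, permutation], Y[:, permutation]
    batches = []
    for k in range(math.ceil(m / mini_batch_size)):
        sl = slice(k * mini_batch_size, (k + 1) * mini_batch_size)
        batches.append((shuffled_X[:, sl], shuffled_Y[:, sl]))
    return batches

With that helper in scope, training reduces to parameters = model(X_train, Y_train, X_test, Y_test), where X_train has shape (12288, m) and Y_train is a one-hot matrix with 6 rows.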



Reposted from blog.csdn.net/mike112223/article/details/78173497