TensorFlow learning (5) - solving a regression problem by training a neural network

Foreword

Previous post: TensorFlow learning (4) - training the slope and intercept of a linear function with TensorFlow.
This chapter uses a neural network to solve a regression problem.

Importing the required packages

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
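
The code in this post uses the TensorFlow 1.x API (placeholders and sessions). If you happen to be on TensorFlow 2.x, my understanding is that it can still be run through the v1 compatibility layer, roughly like this:

# Only needed on TensorFlow 2.x: use the v1 compatibility API and disable eager execution
import tensorflow.compat.v1 as tf
tf.disable_eager_execution()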

Neural network structure

(Figure: network structure diagram taken from a textbook, showing the input layer, a hidden layer, and the output layer.)
In the figure, the leftmost column is the network's input layer; in our example the input is the randomly generated sample (x_data, y_data) produced with numpy, and W denotes the weights.

Preparing the sample data

# Use numpy to generate 200 evenly spaced points (an arithmetic sequence)
x_data = np.linspace(-0.5,0.5,200)[:,np.newaxis]
noise = np.random.normal(0,0.02,x_data.shape)
y_data = np.square(x_data) + noise

Tips:

  • [:, np.newaxis] adds a new dimension to the array: the shape-(200,) vector returned by linspace becomes a (200, 1) column matrix, i.e. 200 samples with one feature each. The expression before the comma selects which elements to take (e.g. arr[1:5, np.newaxis] first slices arr[1:5] and then adds the new axis).
  • linspace(-0.5, 0.5, 200): the first argument is the start, the second is the end, and the third is the number of points. It generates an evenly spaced (arithmetic) sequence.
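
A quick sanity check of what these two calls produce (my own illustration, not part of the original code):

demo = np.linspace(-0.5,0.5,200)        # shape (200,): 200 evenly spaced values
demo_col = demo[:,np.newaxis]           # shape (200, 1): one sample per row, one feature per column
print(demo.shape, demo_col.shape)       # prints (200,) (200, 1)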

Designing the hidden (middle) layer

# Define two placeholders: any number of rows (samples), one column (feature)
x = tf.placeholder(tf.float32,[None,1])
y = tf.placeholder(tf.float32,[None,1])

# Define the hidden layer of the network
Weights_L1 = tf.Variable(tf.random_normal([1,10]))
biases_L1 = tf.Variable(tf.zeros([1,10]))
Wx_plus_b_L1 = tf.matmul(x,Weights_L1) + biases_L1
# Activation function
L1 = tf.nn.tanh(Wx_plus_b_L1)

Tips:

  • Weights_L1: the weights of the hidden layer. The input x_data has only one node (one feature), so the first argument of tf.random_normal([1,10]) is 1; we want ten neurons in the hidden layer, so the second argument is 10.
  • biases_L1: the bias parameters. As with Weights_L1, the input has one node and the hidden layer has ten neurons, so it is set to tf.Variable(tf.zeros([1,10])).
  • Wx_plus_b_L1: this corresponds to a11, a12, a13 in the figure above: the input matrix multiplied by the weight matrix, plus the bias matrix.
  • L1 = tf.nn.tanh(Wx_plus_b_L1): this is the core of the network. "Input matrix times weight matrix plus bias matrix" is clearly a linear relationship between input and output, but our sample follows a quadratic function, so a straight line y = kx + b cannot fit the curve well. The tanh activation function is applied here to turn the linear mapping into a nonlinear one (see the sketch after this list).
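
To make the point about linearity concrete, here is a small numpy sketch (mine, not from the original post) showing that two layers without an activation in between collapse into a single linear layer, so the extra layer adds nothing:

W1 = np.random.randn(1,10); b1 = np.zeros((1,10))
W2 = np.random.randn(10,1); b2 = np.zeros((1,1))
two_layers = (x_data.dot(W1) + b1).dot(W2) + b2           # two stacked layers, no activation
one_layer  = x_data.dot(W1.dot(W2)) + (b1.dot(W2) + b2)   # one equivalent linear layer
print(np.allclose(two_layers, one_layer))                 # True: still just a straight line in x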

Designing the output layer

# Define the output layer
Weights_L2 = tf.Variable(tf.random_normal([10,1]))
biases_L2 = tf.Variable(tf.zeros([1,1]))
Wx_plus_b_L2 = tf.matmul(L1,Weights_L2) + biases_L2
# Activation function
prediction = tf.nn.tanh(Wx_plus_b_L2)

Tips:

  • Weights_L2: because the hidden layer we designed earlier has ten neurons, the first argument of the output-layer weights tf.Variable(tf.random_normal([10,1])) is 10; since there is only one output y, the second argument is 1.
  • biases_L2: the bias matrix is shaped to match the output of the layer, tf.Variable(tf.zeros([1,1])).
  • Wx_plus_b_L2: the input matrix (here the hidden-layer output L1) multiplied by the weight matrix, plus the bias matrix.
  • prediction = tf.nn.tanh(Wx_plus_b_L2): the activation function, with the same meaning as above (see the quick check after this list).
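
One thing worth checking (my own note, not from the original post): tanh outputs values in (-1, 1), while the targets y_data lie roughly between 0 and 0.26, so a tanh output layer can still reach every target value here:

print(x_data.shape, y_data.shape)    # (200, 1) (200, 1): shapes match the placeholders
print(y_data.min(), y_data.max())    # roughly 0 (possibly slightly negative from noise) and about 0.26, well inside (-1, 1)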

Training

# Quadratic cost function (mean squared error)
loss = tf.reduce_mean(tf.square(y - prediction))
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

with tf.Session() as sess:
    # Initialize the variables
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        sess.run(train_step,feed_dict={x:x_data,y:y_data})
        
    # Get the predicted values
    prediction_value = sess.run(prediction,feed_dict = {x:x_data})
    # Plot the results
    plt.figure()
    plt.scatter(x_data,y_data)
    plt.plot(x_data,prediction_value,'r-',lw=5)
    plt.show()
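
If you want to watch convergence, one small variation (mine, not in the original code) is to print the loss every few hundred steps inside the training loop:

    for i in range(2000):
        sess.run(train_step,feed_dict={x:x_data,y:y_data})
        if i % 200 == 0:
            # Report the current mean squared error over the whole sample
            print(i, sess.run(loss, feed_dict={x:x_data, y:y_data}))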

Training results

(Figure: scatter plot of the sample points with the fitted curve drawn in red.)

Activation function test

I have read a lot of discussion of activation functions online; the commonly used ones are relu, tanh, sigmoid, and so on.
They are chosen with the following considerations in mind (a small plotting sketch follows this list):

  • Nonlinear
  • A bounded output range, which makes a function suitable for the output layer
  • Output is centered at 0
  • Easy to calculate
  • Output that is never negative: inputs below zero map to zero or to a constant (as with relu)
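
A minimal plotting sketch (my own, not from the original post) to see these properties side by side:

z = np.linspace(-4, 4, 200)
plt.plot(z, np.tanh(z), label='tanh')                # zero-centered, output in (-1, 1)
plt.plot(z, np.maximum(0, z), label='relu')          # never negative, unbounded above
plt.plot(z, 1/(1 + np.exp(-z)), label='sigmoid')     # output in (0, 1), not zero-centered
plt.legend()
plt.show()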

As a beginner in neural networks, I honestly did not fully understand the theoretical explanations, so here I simply change a few parameters and compare how the different activation functions behave.

Using relu as the activation function

Activation function: relu

# Use numpy to generate 200 evenly spaced points
x_data = np.linspace(-0.5,0.5,200)[:,np.newaxis]
noise = np.random.normal(0,0.02,x_data.shape)
y_data = np.square(x_data) + noise

# Define two placeholders
x = tf.placeholder(tf.float32,[None,1])
y = tf.placeholder(tf.float32,[None,1])

# Define the hidden layer of the network
Weights_L1 = tf.Variable(tf.random_normal([1,10]))
biases_L1 = tf.Variable(tf.zeros([1,10]))
Wx_plus_b_L1 = tf.matmul(x,Weights_L1) + biases_L1
# Activation function
L1 = tf.nn.relu(Wx_plus_b_L1)

# Define the output layer
Weights_L2 = tf.Variable(tf.random_normal([10,1]))
biases_L2 = tf.Variable(tf.zeros([1,1]))
Wx_plus_b_L2 = tf.matmul(L1,Weights_L2) + biases_L2
# Activation function
prediction = tf.nn.relu(Wx_plus_b_L2)

# Quadratic cost function (mean squared error)
loss = tf.reduce_mean(tf.square(y - prediction))
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)  # change the learning rate here!!!

with tf.Session() as sess:
    # Initialize the variables
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        sess.run(train_step,feed_dict={x:x_data,y:y_data})
        
    # Get the predicted values
    prediction_value = sess.run(prediction,feed_dict = {x:x_data})
    # Plot the results
    plt.figure()
    plt.scatter(x_data,y_data)
    plt.plot(x_data,prediction_value,'r-',lw=5)
    plt.show()

Output

(Figures: fitted curves for learning rates 0.04, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6; the plot images are not reproduced here.)

relu summary

The higher the learning rate, the more easily the relu activation fails and the fit collapses.
This is not very rigorous: the samples are randomly generated and differ on every run, so the variables are not properly controlled, but it still gives a rough picture of how the learning rate affects the relu activation function.
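
My reading of this result (an assumption on my part, not stated in the original post) is the "dying relu" effect: a large learning rate can push a unit's pre-activation permanently below zero, after which both its output and its gradient are zero and it never recovers. One variation worth trying is to keep relu in the hidden layer but leave the output layer linear, so the single output unit cannot die:

# Hidden layer keeps relu; the output layer has no activation
L1 = tf.nn.relu(Wx_plus_b_L1)
prediction = tf.matmul(L1,Weights_L2) + biases_L2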

Using tanh as the activation function

# Use numpy to generate 200 evenly spaced points
x_data = np.linspace(-0.5,0.5,200)[:,np.newaxis]
noise = np.random.normal(0,0.02,x_data.shape)
y_data = np.square(x_data) + noise

# Define two placeholders
x = tf.placeholder(tf.float32,[None,1])
y = tf.placeholder(tf.float32,[None,1])

# Define the hidden layer of the network
Weights_L1 = tf.Variable(tf.random_normal([1,10]))
biases_L1 = tf.Variable(tf.zeros([1,10]))
Wx_plus_b_L1 = tf.matmul(x,Weights_L1) + biases_L1
# Activation function
L1 = tf.nn.tanh(Wx_plus_b_L1)

# Define the output layer
Weights_L2 = tf.Variable(tf.random_normal([10,1]))
biases_L2 = tf.Variable(tf.zeros([1,1]))
Wx_plus_b_L2 = tf.matmul(L1,Weights_L2) + biases_L2
# Activation function
prediction = tf.nn.tanh(Wx_plus_b_L2)

# Quadratic cost function (mean squared error)
loss = tf.reduce_mean(tf.square(y - prediction))
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.04).minimize(loss)

with tf.Session() as sess:
    # Initialize the variables
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        sess.run(train_step,feed_dict={x:x_data,y:y_data})
        
    # Get the predicted values
    prediction_value = sess.run(prediction,feed_dict = {x:x_data})
    # Plot the results
    plt.figure()
    plt.scatter(x_data,y_data)
    plt.plot(x_data,prediction_value,'r-',lw=5)
    plt.show()

Output

(Figures: fitted curves for learning rates 0.01, 0.04, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6; the plot images are not reproduced here.)

tanh summary

When the learning rate is too low, the result is not accurate (with only 2000 training steps the network has not had enough updates to converge).
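
One way to check whether the problem is really tanh or simply too little training (my own suggestion): keep the small learning rate but increase the number of gradient steps, for example:

    # With learning rate 0.01, train for more steps instead of raising the rate
    for i in range(20000):
        sess.run(train_step,feed_dict={x:x_data,y:y_data})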

Using sigmoid as the activation function

# Use numpy to generate 200 evenly spaced points
x_data = np.linspace(-0.5,0.5,200)[:,np.newaxis]
noise = np.random.normal(0,0.02,x_data.shape)
y_data = np.square(x_data) + noise

# Define two placeholders
x = tf.placeholder(tf.float32,[None,1])
y = tf.placeholder(tf.float32,[None,1])

# Define the hidden layer of the network
Weights_L1 = tf.Variable(tf.random_normal([1,10]))
biases_L1 = tf.Variable(tf.zeros([1,10]))
Wx_plus_b_L1 = tf.matmul(x,Weights_L1) + biases_L1
# Activation function
L1 = tf.nn.sigmoid(Wx_plus_b_L1)

# Define the output layer
Weights_L2 = tf.Variable(tf.random_normal([10,1]))
biases_L2 = tf.Variable(tf.zeros([1,1]))
Wx_plus_b_L2 = tf.matmul(L1,Weights_L2) + biases_L2
# Activation function
prediction = tf.nn.sigmoid(Wx_plus_b_L2)

# Quadratic cost function (mean squared error)
loss = tf.reduce_mean(tf.square(y - prediction))
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.04).minimize(loss)

with tf.Session() as sess:
    # Initialize the variables
    sess.run(tf.global_variables_initializer())
    for i in range(2000):
        sess.run(train_step,feed_dict={x:x_data,y:y_data})
        
    # Get the predicted values
    prediction_value = sess.run(prediction,feed_dict = {x:x_data})
    # Plot the results
    plt.figure()
    plt.scatter(x_data,y_data)
    plt.plot(x_data,prediction_value,'r-',lw=5)
    plt.show()

Output

(Figures: fitted curves for learning rates 0.04, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6; the plot images are not reproduced here.)

sigmoid summary

Sigmoid is clearly not an appropriate activation function for this model.
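
A likely explanation (my own, not from the original post): sigmoid squashes every prediction into (0, 1), is not zero-centered, and its gradient becomes very small away from zero, so plain gradient descent converges slowly at these learning rates. A variation worth trying is sigmoid in the hidden layer with a linear output layer:

# Sigmoid hidden layer, linear output layer
L1 = tf.nn.sigmoid(Wx_plus_b_L1)
prediction = tf.matmul(L1,Weights_L2) + biases_L2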

Origin blog.csdn.net/qq_37668436/article/details/104859522