Application of TensorFlow: nonlinear regression

Linear regression trains the two parameters k and d of a straight line, while nonlinear regression here uses a neural network and trains its weights w. TensorFlow 2.0 has many pitfalls for code written in the 1.x style. The complete code is given at the end.

First, take the nonlinear relationship $y = x^2$, add normally distributed noise, and train a neural network to learn the input-output relationship.
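To see why a straight line is not enough for this kind of data, here is a minimal NumPy sketch that fits k and d by gradient descent on the mean squared error. It assumes k is the slope and d the intercept of y = kx + d. The seed, the starting values, and the helper name `fit_line` are illustrative choices, not part of the original code.

```python
import numpy as np

def fit_line(x, y, lr=0.1, steps=3000):
    """Fit y ~ k*x + d by plain gradient descent on the MSE (illustrative helper)."""
    k, d = 1.0, 0.0                         # arbitrary starting values
    for _ in range(steps):
        err = k * x + d - y                 # residuals of the current line
        grad_k = 2.0 * np.mean(err * x)     # d(MSE)/dk
        grad_d = 2.0 * np.mean(err)         # d(MSE)/dd
        k -= lr * grad_k
        d -= lr * grad_d
    return k, d

rng = np.random.default_rng(0)              # fixed seed, assumed for reproducibility
x = np.linspace(-0.5, 0.5, 200)
y = np.square(x) + rng.normal(0, 0.02, x.shape)

k, d = fit_line(x, y)
mse_line = np.mean((k * x + d - y) ** 2)
print(k, d, mse_line)
```

Because the data is symmetric around x = 0, the best line is nearly flat (k close to 0, d close to the mean of y), and its error stays well above the noise level. That gap is what the hidden layer is meant to close.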

  • tf.compat.v1.disable_eager_execution() is added because the 1.x-style graph and session code used below is not compatible with TensorFlow 2.x, where eager execution is on by default.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
tf.compat.v1.disable_eager_execution()
# np.linspace(-0.5,0.5,200) generates 200 points between -0.5 and 0.5; [:,np.newaxis] reshapes the data to 200 rows and 1 column
x_data = np.linspace(-0.5,0.5,200)[:,np.newaxis]
noise = np.random.normal(0,0.02,x_data.shape)
y_data = np.square(x_data) + noise

plt.scatter(x_data,y_data)

[Figure: scatter plot of x_data against y_data]
The input-output relationship is shown in the figure above. The model is trained with a neural network as follows.
1. First define the computation, then run it in a session. x and y are the input and output data of the computation and are defined as n rows and 1 column. The number of rows of a placeholder is not fixed, so it is written as None.
2. The neural network has a 1-20-1 structure, that is, the hidden layer has 20 neurons. The weight from the input layer to the hidden layer, $w_1$, is therefore defined as 1×20, and the weight from the hidden layer to the output layer, $w_2$, as 20×1. The biases $b_1$ and $b_2$ are initialized to 0. Note that $b_1$ and $b_2$ are one-dimensional here, not two-dimensional arrays. A one-dimensional array has no rows or columns and is broadcast along the last axis, so in xw1_plus_b1 the 20 elements of b1 are added to every row of tf.matmul(x,w1), as if b1 were 1 row by 20 columns. A two-dimensional bias of 20 rows and 1 column would not match and the addition would fail.
3. The activation function is tanh or relu, but the result is not very good. Factors that affect the training result include the number of training iterations, the activation function, the number of neurons, and the number of hidden layers.
4. The cost function is the mean squared error (MSE): loss = tf.reduce_mean(tf.square(y-L2)). Using loss = tf.losses.mean_squared_error(y, L2) instead goes wrong: in TensorFlow 1.x the two operations give the same result, but in 2.x the results differ, as the case below shows. In tf 2.0 you can use tf.compat.v1.losses.mean_squared_error(y,L2).

a = tf.constant([[4.0, 4.0, 4.0], [3.0, 3.0, 3.0], [1.0, 1.0, 1.0]])
b = tf.constant([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])
print(a)
print(b)
with tf.compat.v1.Session() as sess:
    print(sess.run(a))
    print(sess.run(b))
    print("\n")
c = tf.square(a - b)
mse1 = tf.reduce_mean(c) 
with tf.compat.v1.Session() as sess:
    #print(sess.run(c))
    print(sess.run(mse1))    
mse2 = tf.losses.mean_squared_error(a,b)
with tf.compat.v1.Session() as sess:
    print(sess.run(mse2))  

The output is:

Tensor("Const_70:0", shape=(3, 3), dtype=float32)
Tensor("Const_71:0", shape=(3, 3), dtype=float32)
[[4. 4. 4.]
 [3. 3. 3.]
 [1. 1. 1.]]
[[1. 1. 1.]
 [1. 1. 1.]
 [2. 2. 2.]]


4.6666665
[9. 4. 1.]

The output shows that in 2.x tf.losses.mean_squared_error(a,b) returns a vector with one mean squared error per row, while tf.reduce_mean(tf.square(a-b)) returns the mean over all elements, which is also the mean of that vector. Using a vector rather than a scalar as the cost function to minimize is clearly not what is intended here, and I am not sure exactly how the optimizer handles it. In TensorFlow 1.x, tf.losses.mean_squared_error(y,L2) can be used directly.
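The two reductions can be reproduced with plain NumPy, which makes the difference easy to see without a session. This is only a sketch of the arithmetic, using the same a and b as above; it does not call TensorFlow.

```python
import numpy as np

a = np.array([[4.0, 4.0, 4.0], [3.0, 3.0, 3.0], [1.0, 1.0, 1.0]])
b = np.array([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]])

sq = np.square(a - b)

# Mean over every element: the arithmetic behind tf.reduce_mean(tf.square(a - b)).
mse_all = sq.mean()

# Mean over the last axis only: one value per row, matching the [9. 4. 1.] printed above.
mse_per_row = sq.mean(axis=-1)

print(mse_all)       # 14/3, about 4.667
print(mse_per_row)   # [9. 4. 1.]
```

Taking the mean of the per-row vector gives back the scalar, so wrapping the 2.x result in tf.reduce_mean should be another way to recover a scalar loss for this example.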

5. During training, the data x_data and y_data are fed in as a dictionary (feed_dict), and the value of loss is printed every 150 training steps.

# Define two placeholders
# [None,1] means the data has n rows and 1 column
x = tf.compat.v1.placeholder(tf.float32,[None,1])
y = tf.compat.v1.placeholder(tf.float32,[None,1])

# Define the neural network structure 1-20-1
# Define the operations
# Weight w1, bias b1
w1 = tf.Variable(tf.random.normal([1,20]))
# b1 is a one-dimensional array with shape = (20,), i.e. 20 elements, so tf.matmul(x,w1) + b1 below can be added.
# If b1 were a two-dimensional array with 20 rows and 1 column, the addition would fail. A one-dimensional array
# has no rows or columns; here the addition treats b1 as 1 row and 20 columns
b1 = tf.Variable(tf.zeros([20]))
xw1_plus_b1 = tf.matmul(x,w1) + b1
# The activation function is tanh
L1 = tf.nn.tanh(xw1_plus_b1)

w2 = tf.Variable(tf.random.normal([20,1]))
b2 = tf.Variable(tf.zeros([1]))
xw2_plus_b2 = tf.matmul(L1,w2) + b2
L2 = tf.nn.tanh(xw2_plus_b2)

# Build the cost function
# loss = tf.losses.mean_squared_error(y,L2) should be the same cost function as the one below,
# but in tensorflow 2.0 I found that the results differ; see the comparison case in this article
loss = tf.reduce_mean(tf.square(y - L2))

# Gradient descent
train = tf.compat.v1.train.GradientDescentOptimizer(0.1).minimize(loss)

# Create the session
with tf.compat.v1.Session() as sess:
    # Initialize the variables
    sess.run(tf.compat.v1.global_variables_initializer())
    for i in range(3000):
        sess.run(train,feed_dict = {x:x_data,y:y_data})
        # Print the loss every 150 training steps
        if i%150 == 0:
            print(sess.run(loss,feed_dict = {x:x_data,y:y_data}))

    # Predict on the training inputs and plot the predictions over the data
    prediction_values = sess.run(L2,feed_dict = {x:x_data})
    plt.scatter(x_data,y_data)
    plt.scatter(x_data,prediction_values)
    plt.show()
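The shape remark about b1 can be checked with NumPy, whose broadcasting rules TensorFlow's addition follows. The sketch below runs one forward pass of the 1-20-1 network with random weights. The seed and the helper name `forward` are illustrative assumptions, and no training happens here.

```python
import numpy as np

rng = np.random.default_rng(0)             # fixed seed, chosen only for reproducibility
x_data = np.linspace(-0.5, 0.5, 200)[:, np.newaxis]   # shape (200, 1)

w1 = rng.normal(size=(1, 20))
b1 = np.zeros(20)                          # shape (20,), as in the article
w2 = rng.normal(size=(20, 1))
b2 = np.zeros(1)

def forward(x):
    """One forward pass of the 1-20-1 tanh network (illustrative helper)."""
    hidden = np.tanh(x @ w1 + b1)          # (n, 1) @ (1, 20) + (20,) -> (n, 20)
    return np.tanh(hidden @ w2 + b2)       # (n, 20) @ (20, 1) + (1,) -> (n, 1)

out = forward(x_data)
print(out.shape)                           # (200, 1)

# A (20, 1) bias cannot be broadcast against the (200, 20) product.
try:
    x_data @ w1 + np.zeros((20, 1))
    column_bias_ok = True
except ValueError:
    column_bias_ok = False
print(column_bias_ok)                      # False
```

With exactly 20 input rows the (20, 1) bias would broadcast without an error and add a per-row offset instead of a per-neuron one, which is another reason to keep the bias one-dimensional.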
    

The training result is shown below.
[Figure: scatter plot of the training data with the network's predictions overlaid]

The complete code is as follows.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
tf.compat.v1.disable_eager_execution()
# np.linspace(-0.5,0.5,200) generates 200 points between -0.5 and 0.5; [:,np.newaxis] reshapes the data to 200 rows and 1 column
x_data = np.linspace(-0.5,0.5,200)[:,np.newaxis]
noise = np.random.normal(0,0.02,x_data.shape)
y_data = np.square(x_data) + noise

plt.scatter(x_data,y_data)
# Define two placeholders
# [None,1] means the data has n rows and 1 column
x = tf.compat.v1.placeholder(tf.float32,[None,1])
y = tf.compat.v1.placeholder(tf.float32,[None,1])

# Define the neural network structure 1-20-1
# Define the operations
# Weight w1, bias b1
w1 = tf.Variable(tf.random.normal([1,20]))
# b1 is a one-dimensional array with shape = (20,), i.e. 20 elements, so tf.matmul(x,w1) + b1 below can be added.
# If b1 were a two-dimensional array with 20 rows and 1 column, the addition would fail. A one-dimensional array
# has no rows or columns; here the addition treats b1 as 1 row and 20 columns
b1 = tf.Variable(tf.zeros([20]))
xw1_plus_b1 = tf.matmul(x,w1) + b1
# The activation function is tanh
L1 = tf.nn.tanh(xw1_plus_b1)

w2 = tf.Variable(tf.random.normal([20,1]))
b2 = tf.Variable(tf.zeros([1]))
xw2_plus_b2 = tf.matmul(L1,w2) + b2
L2 = tf.nn.tanh(xw2_plus_b2)

# Build the cost function
# loss = tf.losses.mean_squared_error(y,L2) should be the same cost function as the one below,
# but in tensorflow 2.0 I found that the results differ; see the comparison case earlier in this article
loss = tf.reduce_mean(tf.square(y - L2))

# Gradient descent
train = tf.compat.v1.train.GradientDescentOptimizer(0.1).minimize(loss)

# Create the session
with tf.compat.v1.Session() as sess:
    # Initialize the variables
    sess.run(tf.compat.v1.global_variables_initializer())
    for i in range(3000):
        sess.run(train,feed_dict = {x:x_data,y:y_data})
        # Print the loss every 150 training steps
        if i%150 == 0:
            print(sess.run(loss,feed_dict = {x:x_data,y:y_data}))

    # Predict on the training inputs and plot the predictions over the data
    prediction_values = sess.run(L2,feed_dict = {x:x_data})
    plt.scatter(x_data,y_data)
    plt.scatter(x_data,prediction_values)
    plt.show()
    

Origin blog.csdn.net/weixin_44823313/article/details/112478390