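1 Linear Regression
Linear regression is one of the simpler models for supervised learning and can be used for prediction tasks such as house-price or stock-price forecasting. The idea is to find the linear function that best fits the data points, as shown in the figure below: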
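As a formula, it is generally written as:
$$ y = w_1 x_1 + w_2 x_2 + \dots + w_k x_k + b $$
Here x is the horizontal coordinate of a data point in the figure above and y is its vertical coordinate. On a concrete dataset, the $x_i$ are the feature values ($x_k$ is the value of the k-th feature) and y is the label.
In tensor form this becomes:
$$ Y = XW + b $$
where $X = (x_1, x_2, \dots, x_k)$ and $W = (w_1, w_2, \dots, w_k)^T$.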
W and b are the parameters the model has to learn; training finds the linear function we want by learning W and b.
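As a minimal sketch of the linear function described above, the plain-Python snippet below computes the prediction for a single sample. It does not use TensorFlow; the name `predict` and the weight and bias values are made up for illustration.

```python
def predict(x, w, b):
    """Return w_1*x_1 + ... + w_k*x_k + b for one sample."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same number of entries")
    return sum(x_i * w_i for x_i, w_i in zip(x, w)) + b


# Hypothetical parameters for a model with k = 2 features.
w = [2.0, 3.0]
b = 1.0
print(predict([4.0, 5.0], w, b))  # 2*4 + 3*5 + 1 = 24.0
```

Learning then amounts to choosing `w` and `b` so that these predictions land close to the labels.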
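2 A Modular Template for TF Programs
① # initialize variables and model parameters
② inference(X)  # compute the model's output on data X and return the result
③ loss(X, Y)  # compute the loss from training data X and expected output Y
④ inputs()  # read or generate training data X and its expected output Y
⑤ train(total_loss)  # train or adjust the model parameters based on the computed total loss
⑥ evaluate(sess, X, Y)  # evaluate the trained model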
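⑦ # launch the graph in a session and run the training loop
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    X, Y = inputs()
    total_loss = loss(X, Y)
    train_op = train(total_loss)
    # actual number of training iterations
    training_steps = 1000
    for step in range(training_steps):
        sess.run(train_op)
        # for debugging and learning, watch the loss decrease during training
        if step % 10 == 0:
            print("loss:", sess.run(total_loss))
    evaluate(sess, X, Y)
    # no explicit sess.close() is needed: the with block closes the session on exit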
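To make the division of labor between these functions concrete, here is a sketch of the same pattern in plain Python with a hand-written gradient-descent step. It is not TensorFlow code: the toy data (generated from y = 2x + 1), the `fit` helper, the learning rate, and the step count are all assumptions chosen for illustration.

```python
def inputs():
    """Return made-up training data generated from y = 2x + 1."""
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]
    return xs, ys


def inference(xs, w, b):
    """Compute the model output w*x + b for every sample."""
    return [w * x + b for x in xs]


def loss(xs, ys, w, b):
    """Sum of squared errors between labels and predictions."""
    return sum((y - p) ** 2 for y, p in zip(ys, inference(xs, w, b)))


def train(xs, ys, w, b, learning_rate):
    """Apply one gradient-descent step and return the updated (w, b)."""
    preds = inference(xs, w, b)
    # Gradients of sum((y - (w*x + b))^2) with respect to w and b.
    grad_w = sum(-2.0 * x * (y - p) for x, y, p in zip(xs, ys, preds))
    grad_b = sum(-2.0 * (y - p) for y, p in zip(ys, preds))
    return w - learning_rate * grad_w, b - learning_rate * grad_b


def fit(training_steps=2000, learning_rate=0.01):
    """Run the training loop and return the learned parameters and final loss."""
    xs, ys = inputs()
    w, b = 0.0, 0.0
    for _ in range(training_steps):
        w, b = train(xs, ys, w, b, learning_rate)
    return w, b, loss(xs, ys, w, b)


w, b, final_loss = fit()
print(w, b, final_loss)
```

On this toy data the learned `w` and `b` should approach 2 and 1. In the TensorFlow version the optimizer derives the gradients automatically, so `train` only has to name the optimizer and the loss.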
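Finally, here is a practical example. It uses a dataset that relates age and weight to blood fat content; because the dataset is very small, it is written directly into the code.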
# -*- coding: utf-8 -*-
"""
Created on Sat Aug 18 20:16:38 2018
@author: zhengyuv
"""
import tensorflow as tf

# Initialize variables and model parameters; define the operations used in the training loop
W = tf.Variable(initial_value=tf.zeros(shape=[2, 1]), name="weights")
b = tf.Variable(initial_value=0.0, name="bias")

def inference(X):
    # Compute the model's output on data X and return the result
    return tf.matmul(X, W) + b


def loss(X, Y):
    # Compute the loss from training data X and its expected output Y
    # Note: Y has shape [25] and Y_predicted has shape [25, 1], so the
    # difference broadcasts to [25, 25] before it is summed
    Y_predicted = inference(X)
    return tf.reduce_sum(tf.squared_difference(Y, Y_predicted))


def inputs():
    # Read or generate training data X and its expected output Y
    weight_age = [[84, 46], [73, 20], [65, 52], [70, 30],
                  [76, 57], [69, 25], [63, 28], [72, 36],
                  [79, 57], [75, 44], [27, 24], [89, 31],
                  [65, 52], [57, 23], [59, 60], [69, 48],
                  [60, 34], [79, 51], [75, 50], [82, 34],
                  [59, 46], [67, 23], [85, 37], [55, 40],
                  [63, 30]]
    blood_fat_content = [354, 190, 405, 263, 451, 302, 288, 385,
                         402, 365, 209, 290, 346, 254, 395, 434,
                         220, 374, 308, 220, 311, 181, 274, 303, 244]
    return tf.to_float(weight_age), tf.to_float(blood_fat_content)


def train(total_loss):
    # Train or adjust the model parameters based on the computed total loss
    learning_rate = 0.0000001
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(total_loss)


def evaluate(sess, X, Y):
    # Evaluate the trained model on two [weight, age] inputs
    print(sess.run(inference([[80.0, 25.0]])))
    print(sess.run(inference([[65.0, 25.0]])))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    X, Y = inputs()
    total_loss = loss(X, Y)
    train_op = train(total_loss)
    # Actual number of training iterations
    training_steps = 1000
    for step in range(training_steps):
        sess.run(train_op)
        # For debugging and learning, watch the loss decrease during training
        if step % 10 == 0:
            print("loss:", sess.run(total_loss))
    evaluate(sess, X, Y)
    # No explicit sess.close() is needed: the with block closes the session on exit
The output is as follows:
runfile('E:/pyCode/practice1/practice1-1.py', wdir='E:/pyCode/practice1')
loss: 7608772.0
loss: 5352849.5
loss: 5350043.5
loss: 5347919.0
loss: 5346300.5
loss: 5345061.5
loss: 5344105.0
loss: 5343361.0
loss: 5342774.5
loss: 5342305.5
loss: 5341925.0
loss: 5341610.5
loss: 5341344.5
loss: 5341115.5
loss: 5340913.5
loss: 5340733.0
loss: 5340567.5
loss: 5340413.0
loss: 5340267.0
loss: 5340127.5
loss: 5339993.0
loss: 5339861.5
loss: 5339733.0
loss: 5339606.0
loss: 5339481.0
loss: 5339357.5
loss: 5339234.5
loss: 5339111.5
loss: 5338989.5
loss: 5338867.5
loss: 5338746.5
loss: 5338626.0
loss: 5338504.5
loss: 5338384.0
loss: 5338263.0
loss: 5338142.0
loss: 5338021.5
loss: 5337900.5
loss: 5337780.0
loss: 5337659.5
loss: 5337538.5
loss: 5337418.0
loss: 5337298.0
loss: 5337177.0
loss: 5337057.0
loss: 5336936.0
loss: 5336815.5
loss: 5336695.0
loss: 5336575.0
loss: 5336454.5
loss: 5336334.0
loss: 5336214.0
loss: 5336093.0
loss: 5335972.0
loss: 5335852.0
loss: 5335732.0
loss: 5335611.5
loss: 5335491.5
loss: 5335370.5
loss: 5335250.0
loss: 5335130.0
loss: 5335010.0
loss: 5334888.5
loss: 5334769.0
loss: 5334649.0
loss: 5334528.0
loss: 5334408.0
loss: 5334288.0
loss: 5334167.5
loss: 5334047.0
loss: 5333926.0
loss: 5333806.5
loss: 5333686.0
loss: 5333565.5
loss: 5333445.5
loss: 5333326.0
loss: 5333205.5
loss: 5333085.0
loss: 5332964.5
loss: 5332844.0
loss: 5332724.5
loss: 5332604.0
loss: 5332484.0
loss: 5332363.5
loss: 5332243.5
loss: 5332123.5
loss: 5332003.5
loss: 5331883.5
loss: 5331763.5
loss: 5331643.0
loss: 5331523.0
loss: 5331402.0
loss: 5331282.5
loss: 5331162.5
loss: 5331042.0
loss: 5330923.0
loss: 5330802.5
loss: 5330682.5
loss: 5330562.5
loss: 5330442.0
[[320.6497]]
[[267.78183]]
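One detail of the code affects the printed loss values. `blood_fat_content` becomes a tensor of shape [25], while `inference(X)` returns a tensor of shape [25, 1]. `tf.squared_difference` broadcasts its arguments, so the difference has shape [25, 25] and `tf.reduce_sum` adds up all 625 label/prediction pairs rather than 25 per-sample errors. The plain-Python sketch below mimics both sums on made-up numbers; the function names are hypothetical and no TensorFlow is involved.

```python
def per_sample_sse(labels, predictions):
    """Sum of squared errors, pairing each label with its own prediction."""
    return sum((y - p) ** 2 for y, p in zip(labels, predictions))


def all_pairs_sse(labels, predictions):
    """Sum over every (label, prediction) pair, which is what broadcasting
    shape [n] against shape [n, 1] produces."""
    return sum((y - p) ** 2 for p in predictions for y in labels)


labels = [1.0, 2.0, 3.0]
predictions = [1.0, 2.0, 3.0]  # a perfect fit
print(per_sample_sse(labels, predictions))  # 0.0
print(all_pairs_sse(labels, predictions))   # 12.0
```

Even a perfect fit leaves the all-pairs sum above zero. Reshaping the labels to [25, 1] would give the per-sample sum instead, though the numbers printed by the program would then differ from the ones shown here.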
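While debugging the program I ran into a small problem: FailedPreconditionError: Attempting to use uninitialized value weights_5
After searching online, it turned out that I had called the variable initializer tf.global_variables_initializer() directly, which does not work. Tensor computation in TF has two steps: designing the graph and launching it. Calling tf.global_variables_initializer() on its own only creates the initialization op; without sess.run() the program stays in the "design" stage and the variables are never actually initialized. The variables must therefore be initialized with sess.run(tf.global_variables_initializer()).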
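The difference between building an operation and running it can be illustrated with a toy model in plain Python. This is only an analogy, not how TensorFlow is implemented: `ToyVariable`, `make_initializer`, `run`, and `read` are hypothetical names invented for this sketch.

```python
class ToyVariable:
    """Hypothetical stand-in for a graph variable; not a TensorFlow class."""

    def __init__(self, initial_value):
        self.initial_value = initial_value
        self.value = None  # None means "not initialized yet"


def make_initializer(variables):
    """Build (but do not run) an operation that initializes the variables."""
    def init_op():
        for v in variables:
            v.value = v.initial_value
    return init_op


def run(op):
    """Toy counterpart of sess.run: actually execute an operation."""
    return op()


def read(variable):
    """Return the variable's value, failing if it was never initialized."""
    if variable.value is None:
        raise RuntimeError("Attempting to use uninitialized value")
    return variable.value


w = ToyVariable(0.0)
init = make_initializer([w])  # "design" stage only: nothing has run yet
try:
    read(w)
except RuntimeError as err:
    print("before run:", err)
run(init)                     # "launch" stage: the initializer executes
print("after run:", read(w))
```

Creating `init` changes nothing by itself; only `run(init)` gives the variable its value, which mirrors why the initializer must be passed to sess.run().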
Reference: TensorFlow for Machine Intelligence (Chinese edition: 《面向机器智能的TensorFlow实践》)