Artificial Intelligence in Practice_Assignment 2_Wang Tao

I. Assignment requirements

  1. Implement linear backpropagation
  2. Problem link

II. Solution

1) Recompute the contributions of Δw and Δb in each iteration:
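The update rule used in the code below follows directly from the partial derivatives of z = (2w + 3b)(2b + 1). Writing x = 2w + 3b and y = 2b + 1, we have ∂z/∂w = 2y and ∂z/∂b = 2x + 3y; half of the remaining error Δz is assigned to each parameter, so Δw = (Δz/2) / (2y) and Δb = (Δz/2) / (2x + 3y).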

import numpy as np

def func_z(w, b):
    return (2*w + 3*b)*(2*b + 1)

z_true = 150
w = 3
b = 4
count = 0
while (func_z(w, b) - z_true) >= 1e-5:   # z approaches z_true from above
    count += 1
    z = func_z(w, b)
    dz = np.abs(z - z_true)/2            # half of the remaining error goes to each parameter
    y = 2*b + 1
    x = 2*w + 3*b
    dw = dz/(2*y)                        # dz / (∂z/∂w), with ∂z/∂w = 2y
    db = dz/(2*x + 3*y)                  # dz / (∂z/∂b), with ∂z/∂b = 2x + 3y
    print("w=%f, b=%f, z=%f, delta_z=%f, delta_b=%f" % (w, b, z, 2*dz, db))
    w = w - dw
    b = b - db
print("w=%f, b=%f, z=%f, delta_z=%f" % (w, b, func_z(w, b), func_z(w, b) - z_true))
print(f"Iteration count:{count}, final_w:{w}, final_b:{b}")
#output:******************************************************
w=3.000000, b=4.000000, z=162.000000, delta_z=12.000000, delta_b=0.095238
w=2.666667, b=3.904762, z=150.181406, delta_z=0.181406, delta_b=0.001499
w=2.661519, b=3.903263, z=150.000044, delta_z=0.000044, delta_b=0.000000
w=2.661517, b=3.903263, z=150.000000, delta_z=0.000000
Iteration count:3, final_w:2.661517402927456, final_b:3.9032629057674404

2) Gradient descent

In practice, with a very low learning rate the precision is hard to improve, although the loss does keep decreasing.

w = 3
b = 4
loss = []
eta = 1e-6
max_steps = 10**7   # safeguard: the update follows the gradient of z itself, so z can
steps = 0           # overshoot z_true and the tolerance may never be met without a cap
while np.abs(func_z(w, b) - z_true) >= 1e-3 and steps < max_steps:
    steps += 1
    y = 2*b + 1
    x = 2*w + 3*b
    gradient_w = 2*y           # ∂z/∂w
    gradient_b = 2*x + 3*y     # ∂z/∂b
    w = w - eta*gradient_w
    b = b - eta*gradient_b
    loss.append(func_z(w, b) - z_true)
    if np.abs(func_z(w, b) - z_true) <= 1e-3:
        print(func_z(w, b) - z_true)
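For comparison, here is a minimal sketch (my own addition, reusing func_z and z_true from above) that descends the squared error 0.5*(z - z_true)**2 instead of z itself. Its gradient is (z - z_true)*∇z, which shrinks as z approaches z_true, so a much larger learning rate converges to high precision without overshooting:

# Gradient descent on the squared error 0.5*(z - z_true)**2 (illustration only)
w, b = 3.0, 4.0
eta = 1e-4
for step in range(100000):
    z = func_z(w, b)
    r = z - z_true                  # residual
    if abs(r) < 1e-8:
        break
    y = 2*b + 1
    x = 2*w + 3*b
    w -= eta * r * 2*y              # d/dw of 0.5*r**2 = r * ∂z/∂w
    b -= eta * r * (2*x + 3*y)      # d/db of 0.5*r**2 = r * ∂z/∂b
print(f"steps={step}, w={w}, b={b}, z={func_z(w, b)}")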

3) TensorFlow automatic differentiation

import numpy as np
import tensorflow as tf   # written against the TensorFlow 1.x graph API

w = tf.Variable(3.0, dtype=tf.float64, name='w')
b = tf.Variable(4.0, dtype=tf.float64, name='b')
f = (2*w + 3*b)*(2*b + 1)
loss = tf.abs(f - 150.0)

# Manual alternative (unused); note that tf.gradients returns a list of tensors:
#grads_w = tf.gradients(f, [w])
#grads_b = tf.gradients(f, [b])
#training_op1 = tf.assign(w, w - learning_rate*grads_w[0])
los = np.inf
training_op = tf.train.AdamOptimizer(0.0001).minimize(loss)
init = tf.global_variables_initializer()

with tf.Session() as sess:
    init.run()
    while los >= 1e-5:
        # loss is fetched in the same run as the update op
        _, los = sess.run([training_op, loss])

    print("Final w, b", sess.run([w, b]))
    print(sess.run(loss))   # re-evaluated after the final update
    print(los)
    print(los)
#output:******************************************************
Final w, b [2.8489226246747603, 3.8490755625488835]
0.0002418711751772662
7.760580075455437e-06

In theory, loss and los should be identical. The most likely reason they differ is that los is fetched in the same sess.run call as the training op, so it reflects the variables before that update takes effect, while the later sess.run(loss) re-evaluates after the final update has been applied. Either way, plain gradient descent struggles to reach an exact solution here. If the loss is changed to tf.square(f - 150), the solution becomes more precise. Is the squared (RMSE-style) loss function convex here??
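A minimal sketch of that explanation (my own addition, reusing the graph defined above): fetch loss in a separate run after each update, so the value tracked in the loop reflects the already-updated variables and matches a later sess.run(loss). (The tf.square variant mentioned above would simply replace the loss definition with loss = tf.square(f - 150.0).)

with tf.Session() as sess:
    init.run()
    los = np.inf
    while los >= 1e-5:
        sess.run(training_op)      # apply one Adam step
        los = sess.run(loss)       # evaluate loss with the updated w, b
    print("Final w, b", sess.run([w, b]))
    print(los, sess.run(loss))     # the two values now agree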


Reposted from www.cnblogs.com/ownt/p/10514242.html