Machine Learning: Multivariate Linear Regression

Copyright notice: 聂CC~ https://blog.csdn.net/ncc1995/article/details/86347773

Linear regression:

Here x_{1} is the first feature and x_{2} is the second feature (features are also called attributes); y is the true value and h is the predicted value.

 h_{\theta }(x) = \theta _{0} + \theta_{1}x_{1} + \theta_{2}x_{2}

h(x) = \sum_{i=0}^{n}\theta_{i}x_{i} = \theta^{T}x

where x_{0} = 1 by convention, so that \theta_{0} acts as the intercept term.
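For concreteness, a minimal NumPy sketch of evaluating this hypothesis (the numbers and variable names are made up for illustration):

import numpy as np

theta = np.array([0.5, 2.0, 3.0])   # [theta_0, theta_1, theta_2]
x = np.array([1.0, 1.4, 0.7])       # [x_0 = 1, x_1, x_2]

h = theta @ x                       # h(x) = theta^T x
print(h)                            # 0.5 + 2.0*1.4 + 3.0*0.7 = 5.4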

The loss function over the m training samples is therefore defined as:

J(\theta) = \frac{1}{2}\sum_{i=1}^{m}(h_{\theta}(x^{(i)}) - y^{(i)})^{2}

where x^{(i)} denotes the i-th training sample.
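A small sketch of evaluating this loss (assuming, as in the demo code further below, a feature matrix X of shape (n_features, m) and targets y of shape (m,)):

import numpy as np

def loss(theta, X, y):
    # residual h_theta(x^(i)) - y^(i) for every sample i
    residuals = theta.T @ X - y
    # J(theta) = 1/2 * sum of squared residuals
    return 0.5 * np.sum(residuals ** 2)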

The parameters are updated by gradient descent:

\theta_{j} = \theta_{j} - \alpha\frac{\partial}{\partial\theta_{j}}J(\theta)

For a single training sample, the gradient works out to:

\frac{\partial}{\partial\theta_{j}}J(\theta) = \frac{\partial}{\partial\theta_{j}}\frac{1}{2}(h_{\theta}(x)-y)^{2}

= (h_{\theta}(x)-y)\frac{\partial}{\partial\theta_{j}}(h_{\theta}(x)-y)

= (h_{\theta}(x)-y)\frac{\partial}{\partial\theta_{j}}\left(\sum_{i=0}^{n}\theta_{i}x_{i}-y\right)

= (h_{\theta}(x)-y)x_{j}
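In code, the per-sample gradient is simply the prediction error scaled by each feature (a sketch; here x is one sample's feature vector and the names are placeholders):

import numpy as np

def gradient_single(theta, x, y):
    # error = h_theta(x) - y, a scalar
    error = theta @ x - y
    # component j of the gradient is (h_theta(x) - y) * x_j
    return error * x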

  • Here x_{i} denotes not the i-th sample but the i-th feature (attribute) of a sample.
  • In the update \theta_{j}=\theta_{j}-\alpha(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_{j}, the factor x^{(i)}_{j} is the j-th feature of the i-th sample.

There are two rules for updating the parameters:

1. Compute the gradient of each of the m samples with respect to \theta_{j}, sum them, and take one step:

                  \theta_{j} = \theta_{j}-\alpha\sum_{i=1}^{m}(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_{j}

2. Each update uses the gradient of a single sample; loop m times so that all m samples are used once (both rules are sketched in code below):

                 \theta_{j}=\theta_{j}-\alpha(h_{\theta}(x^{(i)})-y^{(i)})x^{(i)}_{j} \qquad \text{for } i \text{ in range}(m)
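The two rules might be written as follows (a sketch under the same shape conventions as above: X is (n_features, m), y is (m,), theta is (n_features,)):

import numpy as np

def batch_update(theta, X, y, alpha):
    # rule 1: sum the per-sample gradients over all m samples, then step once
    errors = theta @ X - y          # shape (m,): h_theta(x^(i)) - y^(i)
    grad = X @ errors               # shape (n_features,): sum_i error_i * x^(i)
    return theta - alpha * grad

def stochastic_update(theta, X, y, alpha):
    # rule 2: step once per sample, sweeping through all m samples
    for i in range(X.shape[1]):
        error = theta @ X[:, i] - y[i]
        theta = theta - alpha * error * X[:, i]
    return theta

Rule 1 takes one stable step per full pass over the data, while rule 2 updates m times per pass at the cost of noisier steps. The demo below fits y = 2*x1 + 3*x2 with rule 2 on synthetic data: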

import numpy as np
import matplotlib.pyplot as plt


#x = np.array([1, 1.3, 1.4, 2, 2.4, 3.6, 4, 6, 7, 10, 13, 18])
x = np.sort(10 * np.random.randn(30))   # 30 random inputs, sorted so the plots come out ordered
print(x)
m = x.shape[0]
x1 = x.reshape(1, m)                 # first feature: x
x2 = np.square(x).reshape(1, m)      # second feature: x^2
x = x.reshape(1, m)

X = np.concatenate((x1, x2), axis=0)                    # design matrix, shape (2, m)
y = 2 * x1 + 3 * x2 + 0.1 * np.random.normal(0, 1, m)   # targets: 2*x1 + 3*x2 plus Gaussian noise
theta = np.ones((2, 1), dtype=np.float32)               # initialize both parameters to 1
# the learning rate matters a lot here: the squared feature can be large,
# so a bigger alpha easily makes the updates diverge
alpha = 0.00001
# regularization strength (declared but not used in this example)
reg = 0.001
print(X.shape)
iters = 1
# stochastic gradient descent (rule 2): one update per sample
for k in range(iters):
    for i in range(m):
        # compute the prediction error once per sample, so that every
        # theta_j is updated with the same gradient
        error = np.dot(theta.T, X[:, i]) - y[:, i]
        for j in range(theta.shape[0]):
            theta[j, :] -= alpha * error * X[j, i]
print(theta)    # should drift toward the generating coefficients [2, 3]
h_x = np.dot(theta.T, X)    # predictions on the training inputs
print(y)
print(h_x)
plt.plot(x.reshape(m), y.reshape(m), '*',  label='y')
plt.plot(x.reshape(m), h_x.reshape(m), label='h_x')
plt.legend()
plt.show()
