A Python implementation of single-variable linear regression

Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. Please include the original source link and this statement when reposting.
Original link: https://blog.csdn.net/shipsail/article/details/89304413

A quick self-introduction first.

This article covers single-variable linear regression!

Hello, you can call me "relax Aberdeen"; this can be considered the first formal blog post of my life. I am a college freshman with an interest in machine learning (Machine Learning).
Haha, I'm also new to this area, so I'd like to share my experience for fellow beginners and point out some of the pits I stepped into.

Also, I'm no expert! I welcome everyone to learn together!
[Figure: the final regression result!]

Some concepts

What is fitting?

In my view, fitting means connecting a series of points on the plane with a smooth curve.

Because the points can be complicated, we can combine various functions to achieve this goal.

What is generalization ability?

After training on the training set, it is the model's ability to make predictions on previously unseen data.

A simple, intuitive understanding is enough.

What is overfitting?

Overfitting can happen when the model we choose is too complex: it performs excellently in training, but falls apart in practice ~

What is underfitting?

The opposite of overfitting: performance is poor even on the training data, never mind in practice!

Linear Regression

The hypothesis function (Hypothesis)

The hypothesis function (Hypothesis) may be a simple linear function, or a complex combination of polynomials. (I think of it as the mathematical model; correct me if I'm wrong!)
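
In the single-variable case of this post, the hypothesis is just a straight line (this matches the comment `h(x)=theta(0)+theta(1)*x1` in the code below):

$$h_\theta(x) = \theta_0 + \theta_1 x$$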

The cost function (Cost Function)

[Figure: my sketch of the cost function]
The sketch I drew should explain the idea: for every point, we take the difference between the actual value y and the value predicted by the hypothesis function h(x), square it (squaring turns negatives positive), and average over all points. Andrew Ng's course then divides by a further 2; this does not change where the minimum is, and it cancels the factor of 2 that appears when you differentiate the square.
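
Written out, the standard form used in Andrew Ng's course (with m training examples) is:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

For example, if h(x) predicts 0 at two points whose actual values are 1 and 2, then J = (1² + 2²) / (2 · 2) = 1.25.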

The goal of linear regression (Goal)

Our goal: make the error between the predicted values and the actual values as small as possible.

That is, minimize J(θ).

To do this, we introduce gradient descent (Gradient Descent).

Gradient descent is also simple to understand, and the multidimensional case follows naturally; let me first walk through the most basic form.
[Figure: a one-dimensional example of gradient descent on J]
This is just an example: taking the derivative of the function J, we find that moving against the derivative always takes us downhill, toward a minimum.
This gives the gradient descent update rule:

$$\theta_j := \theta_j - \alpha \, \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$$
This uses partial derivatives from college calculus; if you have studied ordinary derivatives, partial derivatives are very simple.
For the specific derivation, you can work through the math yourself! (A sketch follows below.)
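
For reference, a quick sketch of that derivation (just the standard calculation): differentiating J with respect to θⱼ, the 1/2 cancels the 2 from the chain rule:

$$\frac{\partial J}{\partial \theta_j} = \frac{1}{2m} \sum_{i=1}^{m} 2\left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)} = \frac{1}{m} \sum_{i=1}^{m} \left(h_\theta(x^{(i)}) - y^{(i)}\right) x_j^{(i)}$$

using the fact that $\partial h_\theta(x)/\partial \theta_j = x_j$ (with $x_0 = 1$ for the constant term).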

The Code

"""
Single-variable linear regression basics.
Single variable: one feature!
"""

# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so running this script as-is needs an older scikit-learn version.
from sklearn.datasets import load_boston
import numpy as np

# Print the structure of the dataset
def print_data(data):
    print("Shape:", data.data.shape, "\n")
    print("Feature:", data['feature_names'], "\n")
    print("Keys:", data.keys(), "\n")

"""
#   计算CostFunction 代价函数
#   X为特征矩阵
#   y为目标
#   theta为theta系数 1 X n 的矩阵
"""
def computerCost(X ,y ,theta):
    num = len(y)
    y = y.reshape(num,1)
    result = np.power((X.dot(theta.transpose()) - y),2)
    return np.sum(result) /(2* num)
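
# A quick sanity check with made-up numbers: for X = [[1, 1], [1, 2]],
# y = [1, 2] and theta = [[0, 0]], every prediction is 0, so
# J = (1^2 + 2^2) / (2 * 2) = 1.25 and computerCost returns 1.25.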


"""
    梯度下降函数
#   Param X train Feature
#   Param y train Target
#   Param theta h(x)=theta(0)+theta(1)*x1
#   Param times 迭代的次数
"""
def gridientDescent(X,y,theta,times,learn_rate):
    number = len(y) #获取训练数据的数量
    X[:,1] = X[:,1]/100 #特征缩放
    temp = np.zeros(theta.shape) #用于更新theta
    cost = np.zeros(times) #记录每次迭代后代价函数的值
    paramters = len(theta[0]) #参数的个数

    for i in range(times):
        y = y.reshape(number,1)
        H = X.dot(theta.transpose())
        E = H - y
        #print("Y SHAPE:",y.shape)
        #print("H SHAPE:",H.shape)
        #print("E SHAPE:",E.shape)
        for j in range(paramters):#更新theta
            XT = X.transpose()
            XT = XT[j,:]
            XT = XT.reshape(1,len(XT))
            #print("XT",XT.shape)
            term = XT.dot(E)
            temp[0, j] = theta[0, j] - ((1 / number) * learn_rate * term)

        theta = temp
        print(computerCost(X,y,theta))
    return theta
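
# For reference (an equivalent I worked out, not from the original post):
# the j-loop above collapses into one vectorized update, since
# X.transpose().dot(E) stacks all the per-parameter sums at once:
#
#     theta = theta - (learn_rate / number) * X.transpose().dot(E).transpose()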


if __name__ == '__main__':
    """
    Ready? Let's get started!
    First, load the data.
    """
    Boston = load_boston()
    BostonData = Boston.data              ## feature data
    BostonTarget = Boston.target          ## house prices
    BostonFeatureName = Boston.feature_names
    print(BostonFeatureName)
    """We use the AGE feature for this single-feature regression"""
    BostonAge = BostonData[:10, 6]
    BostonAgeTarget = BostonTarget[:10]
    print('TARGET SHAPE :', BostonAgeTarget.shape)
    print("DATA SHAPE:", BostonAge.shape)
    """The dataset has 506 rows in total; we take the first 10"""
    print("DATA:\n", BostonAge)

    # Add a column of ones to X, because theta0 is the constant term
    X = np.column_stack((np.ones(len(BostonAgeTarget)), BostonAge))
    theta = np.array([[0.0, 0.0]])  # parameter matrix
    times = 100000                  # number of iterations
    step = 0.02                     # learning rate (step size)
    g = gridientDescent(X, BostonAgeTarget, theta, times, step)

    """将数据可视化"""
    import matplotlib.pyplot as plt

    font = {
    'family': 'SimHei',
    'weight': 'bold',
    'size': '10'
    }
    plt.rc('font', **font)
    plt.rc('axes', unicode_minus=False)
    plt.xlabel("Feature Age")
    plt.ylabel("House Price")
    plt.title("Boston Price Data By Age")
    plt.scatter(x=BostonAge,
                y=BostonAgeTarget,
                c='b',
                alpha=.8,
                label='训练数据')

    #画出拟合直线
    x = np.linspace(BostonAge.min(), BostonAge.max(), 100)
    f = g[0, 0] + (g[0, 1] * x /100) #注意除以100,因为我们做过特征缩放
    plt.plot(x, f, 'r', label='预测函数')
    plt.legend(loc=2)
    plt.show()
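
As a quick cross-check, here is a minimal sketch assuming the variables from the script above are still in scope: scikit-learn's LinearRegression solves the same problem in closed form, so after undoing the feature scaling its intercept and slope should roughly match g[0, 0] and g[0, 1] / 100.

from sklearn.linear_model import LinearRegression

reg = LinearRegression()
reg.fit(BostonAge.reshape(-1, 1), BostonAgeTarget)  # one feature, 10 samples
print("sklearn intercept:", reg.intercept_, "slope:", reg.coef_[0])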


Pitfalls encountered while coding

  1. To make every step easy to follow, the code adds a lot of unnecessary intermediate variables; in real practice it does not need to be this complicated.
  2. Watch matrix shapes: after reading the target y, its shape is one-dimensional, but the matrices are two-dimensional, so operations between them fail until y is reshaped (see the sketch after this list).
  3. Try printing the Shape of every matrix the algorithm passes through; this quickly locates where the code may be going wrong.
  4. The * operator and the matrix operation np.dot() are different: * is element-wise, np.dot() is matrix multiplication (also in the sketch below).
  5. The final matrix update: this is a matrix operation! Derive it with the linear algebra we learned and you will understand (hint: go from the complicated to the simple; derive one element, then extend to the whole matrix).
    $$\theta := \theta - \frac{\alpha}{m} X^{T} \left( X\theta - y \right)$$
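
A tiny sketch of pitfalls 2 and 4, with made-up arrays:

import numpy as np

y = np.array([1.0, 2.0, 3.0])  # shape (3,): one-dimensional, not yet a matrix
y = y.reshape(len(y), 1)       # shape (3, 1): now a proper column matrix

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.0, 0.0], [0.0, 1.0]])
print(A * B)         # element-wise product: [[1. 0.], [0. 4.]]
print(np.dot(A, B))  # matrix product:       [[1. 2.], [3. 4.]]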

Thanks

Thank you for reading ~ keep on studying!
