Andrew Ng Machine Learning Course Assignment (1): a detailed analysis in Python

Univariate linear regression

Preface

Andrew Ng's machine learning course at Stanford University is almost required for anyone interested in artificial intelligence. There is plenty of Python code for the assignments online, but most of it is presented in the IPython interactive interpreter. Here I use PyCharm and give the full source code together with my own understanding of it. Some of the code draws on other people's work; if anything infringes your rights, please send me a private message.

I. Problem description

The task is to use univariate linear regression to predict the profit of opening a snack bar in a city from that city's population. The data (ex1data1.txt) can be obtained from Coursera. The first column is the population of the city, and the second column is the profit of a snack bar in that city.

II. Code analysis

1. Introduce the library

The code is as follows (example):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

NumPy, pandas and matplotlib provide linear algebra, data processing and plotting for Python, respectively. We will need all three nearly every time we implement a machine learning algorithm.

2. Read in the data

The code is as follows (example):

path = r'D:\machine learning data\ex1data1.txt'  # raw string, so the backslashes are not read as escape sequences
data = pd.read_csv(path, header=None, names=['Population', 'Profit'])

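The header parameter of read_csv gives the row number that holds the column names; the data starts on the row after it. By default pandas infers it: with no names argument it behaves like header=0 and takes the column names from the first row, and when names is passed it behaves like header=None. ex1data1.txt has no header row, so we set header=None and supply the column names through names. If a file does have a header row and you want to replace it, pass header=0 together with names.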
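To make the two cases concrete, here is a small sketch that reads from in-memory strings instead of a file. The sample rows and the header names "pop,profit" are made up for illustration; only the read_csv arguments mirror the call above.

```python
import io
import pandas as pd

# Hypothetical two-row sample in the same comma-separated layout as ex1data1.txt
raw_no_header = "6.1101,17.592\n5.5277,9.1302\n"

# No header row in the file: header=None keeps row 0 as data, names supplies the labels
df_a = pd.read_csv(io.StringIO(raw_no_header), header=None,
                   names=['Population', 'Profit'])

# Hypothetical variant of the same data that does have a header row
raw_with_header = "pop,profit\n6.1101,17.592\n5.5277,9.1302\n"

# header=0 consumes the first row as the header; names then replaces those labels
df_b = pd.read_csv(io.StringIO(raw_with_header), header=0,
                   names=['Population', 'Profit'])

print(df_a.shape, df_b.shape)
print(list(df_b.columns))
```

Both frames end up with two data rows and the same column labels. Reading the second string with header=None instead would turn "pop,profit" into a data row.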

3. Data processing

The code is as follows (example):

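data.insert(0, 'Ones', 1)  # insert a column of ones at position 0
cols = data.shape[1]  # shape[1] is the column count, shape[0] the row count; cols = 3
X = data.iloc[:, 0:cols-1]  # all rows, columns 0 and 1 (slice end is exclusive); X is still a DataFrame with row and column labels
y = data.iloc[:, cols-1:cols]  # all rows, last column only
# print(X.head())  # check X and y
# print(y.head())
X = np.matrix(X.values)  # convert X and y to NumPy matrices
y = np.matrix(y.values)
theta = np.matrix([0, 0])  # initialise theta as a 1 x 2 matrix of zeros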

Note the column of ones inserted at position 0. It multiplies theta0, the intercept term. Without it, the fitted model is forced to pass through the origin (a line, a plane, and so on). Also note the difference between np.matrix and np.array: a matrix is always two-dimensional, while an array can have any number of dimensions. The main convenience of matrix is its multiplication notation: if a and b are two matrices, a*b is the matrix product, so there is no need to call np.dot().
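The difference in multiplication is easy to check on a 2 x 2 example. The values below are arbitrary. As a side note, the NumPy documentation no longer recommends np.matrix for new code; with plain arrays the @ operator gives the same matrix product.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# With np.matrix, * is the matrix product
prod_matrix = np.matrix(a) * np.matrix(b)

# With plain arrays, * is element-wise; use @ (or np.dot) for the matrix product
prod_elementwise = a * b
prod_array = a @ b

print(prod_matrix)       # matrix product
print(prod_elementwise)  # element-by-element product
print(prod_array)        # matrix product again, same values as prod_matrix
```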

4. Calculate the cost function

The code is as follows (example):

def computeCost(X, y, theta):  # compute the cost function J(theta)
    inner = np.power(((X * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))  # len(X) is the number of rows, i.e. m in the formula
print(computeCost(X, y, theta))

Running this gives the initial cost with theta set to zero: 32.072733877455676. Note that theta.T is the transpose of theta. X is an m x 2 matrix and theta is 1 x 2, so theta must be transposed to 2 x 1 before X*theta.T yields the m x 1 vector of predictions. Without the transpose, the shapes do not match and the multiplication fails.
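The cost can be checked by hand on a tiny data set. The sketch below uses plain arrays rather than np.matrix, and the three data points are invented so that they lie exactly on the line y = x; compute_cost is an illustrative name, not the function from the assignment.

```python
import numpy as np

def compute_cost(X, y, theta):
    """Cost J(theta) = sum((X @ theta - y)^2) / (2m) for 1-D theta and y."""
    residual = X @ theta - y              # shape (m,)
    return np.sum(residual ** 2) / (2 * len(X))

# Hypothetical toy data lying exactly on the line y = x
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # first column is the intercept term
y = np.array([1.0, 2.0, 3.0])

cost_zero = compute_cost(X, y, np.array([0.0, 0.0]))  # (1 + 4 + 9) / (2 * 3)
cost_fit = compute_cost(X, y, np.array([0.0, 1.0]))   # perfect fit, so no error
print(cost_zero, cost_fit)
```

With theta at zero every prediction is 0, so the cost is 14/6; with theta = (0, 1) the predictions match y exactly and the cost is 0.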

5. Finding the optimal solution with gradient descent

The code is as follows (example):

def gradientDescent(X, y, theta, alpha, iters):  # alpha: learning rate, iters: number of iterations
    temp = np.matrix(np.zeros(theta.shape))  # zero matrix with the same shape as theta
    parameters = int(theta.ravel().shape[1])  # number of parameters; ravel() flattens theta to 1 x n
    cost = np.zeros(iters)  # cost after each iteration
    for i in range(iters):
        error = (X * theta.T) - y
        for j in range(parameters):
            term = np.multiply(error, X[:, j])
            temp[0, j] = theta[0, j] - np.sum(term) * (alpha / len(X))
        theta = temp
        cost[i] = computeCost(X, y, theta)
    return theta, cost
alpha = 0.01
iters = 1500
g, cost = gradientDescent(X, y, theta, alpha, iters)
print(g)
print(computeCost(X, y, g))  # cost (error) at the fitted parameters

Here alpha and iters are the learning rate and the number of iterations. Note that np.multiply multiplies two matrices or arrays element by element; it is not matrix multiplication in the usual sense. Every parameter is updated from the same error vector, which is computed once before the inner loop, so the update is simultaneous.
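The inner loop over j can also be written as a single matrix product. The following is a minimal sketch with plain arrays, not the code from the assignment: the function name gradient_descent, the synthetic data generated from y = 1 + 2x, and the values of alpha and iters are all assumptions chosen so the result is easy to check.

```python
import numpy as np

def gradient_descent(X, y, theta, alpha, iters):
    """Batch gradient descent that updates all parameters at once."""
    m = len(X)
    theta = theta.astype(float)  # astype returns a copy, so the caller's array is untouched
    cost = np.zeros(iters)
    for i in range(iters):
        error = X @ theta - y                  # shape (m,)
        # X.T @ error equals the sum of np.multiply(error, X[:, j]) for each column j
        theta -= (alpha / m) * (X.T @ error)
        cost[i] = np.sum((X @ theta - y) ** 2) / (2 * m)
    return theta, cost

# Hypothetical noiseless data generated from y = 1 + 2x
x = np.linspace(0.0, 5.0, 50)
X = np.column_stack([np.ones_like(x), x])      # intercept column plus the feature
y = 1.0 + 2.0 * x

theta_fit, cost = gradient_descent(X, y, np.zeros(2), alpha=0.05, iters=5000)
print(theta_fit)
```

Because the data has no noise, the fitted parameters should approach an intercept of 1 and a slope of 2, and the recorded cost should fall from one iteration to the next. If the cost grows instead, the learning rate is too large for the data.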

6. Drawing

The code is as follows (example):

x = np.linspace(data.Population.min(), data.Population.max(), 100)  # 100 evenly spaced x values between the minimum and maximum population
f = g[0, 0] + (g[0, 1] * x)
fig, ax = plt.subplots(figsize=(12,8))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(data.Population, data.Profit, label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
plt.show()

Finally, we plot the result with matplotlib.pyplot. If you can follow the code above without help, you have essentially met the requirements of the assignment.

Summary

I am only a sophomore, studying machine learning algorithms in my dormitory, so I cannot publish on a regular schedule, but I will do my best to finish new articles in my favorite field as soon as possible. Onward!

Origin blog.csdn.net/cc512613/article/details/115309287