1 Model Representation
1. Housing price prediction training set
Size in feet² (x) | Price in $1000s (y) |
---|---|
2104 | 460 |
1416 | 232 |
1534 | 315 |
852 | 178 |
… | … |
In the housing price training set, each input comes with its output, that is, the "correct answers" labeled by humans are provided. The quantity being predicted is continuous, so this is a regression problem in supervised learning.
2. Problem Solving Model
2 Cost Function
3 Cost Function - Intuition I
4 Cost Function - Intuition II
5 Gradient Descent
6 Gradient Descent Intuition
Finally, gradient descent can be used not only for the cost function in linear regression, but also for minimizing other cost functions.
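As a small illustration that the same update rule applies beyond linear regression, the sketch below minimizes a simple one-variable quadratic. The function `f`, its derivative `df`, the starting point, and the step size are all made up for this example and do not come from the course material.

```python
def gradient_descent_1d(df, w0, alpha=0.1, iters=200):
    """Repeat w := w - alpha * df(w) and return the final w."""
    w = w0
    for _ in range(iters):
        w = w - alpha * df(w)
    return w


def f(w):
    # Hypothetical cost: a parabola with its minimum at w = 3
    return (w - 3.0) ** 2


def df(w):
    # Derivative of f
    return 2.0 * (w - 3.0)


w_min = gradient_descent_1d(df, w0=0.0)
print(w_min, f(w_min))
```

Only the derivative changes from one cost function to another; the loop itself stays the same.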
7 Gradient Descent For Linear Regression
In addition, solving with explicit loops makes the code more verbose. Later we will discuss how to use **Vectorization** to simplify the code and speed up the computation, so that gradient descent runs faster.
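As a preview of the idea, here is a minimal sketch of a vectorized batch gradient descent step, assuming plain NumPy arrays with theta stored as a column vector. The function name `gradient_descent_vectorized` and the demo data are hypothetical and chosen only so the result is easy to check.

```python
import numpy as np


def gradient_descent_vectorized(X, y, theta, alpha, iters):
    """Batch gradient descent with one matrix product per iteration.

    X: (m, n) array, y: (m, 1) array, theta: (n, 1) array.
    """
    m = X.shape[0]
    theta = theta.copy()
    for _ in range(iters):
        error = X @ theta - y             # (m, 1) residuals
        gradient = (X.T @ error) / m      # (n, 1) gradient of J
        theta = theta - alpha * gradient  # update all parameters at once
    return theta


# Made-up data lying exactly on y = 1 + 2x, with a leading column of ones
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
X_demo = np.column_stack([np.ones_like(x_demo), x_demo])
y_demo = (1.0 + 2.0 * x_demo).reshape(-1, 1)

theta_demo = gradient_descent_vectorized(
    X_demo, y_demo, np.zeros((2, 1)), alpha=0.1, iters=5000
)
print(theta_demo.ravel())
```

The inner loop over parameters disappears because `X.T @ error` computes every partial derivative in a single product.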
8 Code Implementation
All of part 2 is about predicting the profit of opening a snack bar from the population of a city. The data is in ex1data1.txt: the first column is the population of the city, and the second column is the profit of the snack bar in that city.
8.1 Plotting the Data
Read in the data, then display it.
In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
In [2]:
path = '../ex1data1.txt'
data = pd.read_csv(path, header=None, names=['Population', 'Profit'])
data.head()
Out [2]:
In [3]:
data.plot(kind='scatter', x='Population', y='Profit', figsize=(12,8))
plt.show()
8.2 Gradient descent
In this part, you need to fit the linear regression parameters θ to the existing data set.
8.2.1 Formulas
# This part computes J(θ); X is a matrix
def computeCost(X, y, theta):
    inner = np.power(((X * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))
# Call it
computeCost(X, y, theta)
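For reference, the quantities this code works with can be written out as follows. This is the standard single-variable linear regression formulation, restated here as an assumption about what the section intends, with m the number of training examples, α the learning rate, and x_0 = 1 for the intercept term.

```latex
h_\theta(x) = \theta_0 + \theta_1 x

J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2

\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}
```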
8.2.2 Implementation
In [4]:
data.insert(0, 'Ones', 1)
Now let's do some variable initialization.
In [5]:
# Initialize X and y
cols = data.shape[1]
X = data.iloc[:,:-1]  # X is every column of data except the last
y = data.iloc[:,cols-1:cols]  # y is the last column of data
Check that X (training set) and y (target variable) are correct.
In [6]:
X.head()  # head() shows the first 5 rows
Out [6]:
In [7]:
y.head()
Out [7]:
The cost function expects numpy matrices, so we need to convert X and y before we can use them. We also need to initialize theta.
In [8]:
X = np.matrix(X.values)
y = np.matrix(y.values)
theta = np.matrix(np.array([0,0]))
In [9]:
X.shape, theta.shape, y.shape
Out [9]:
8.2.3 Computing J(θ)
Compute the cost function with theta initialized to 0. The answer should be about 32.07.
In [10]:
# This part computes J(θ); X is a matrix
def computeCost(X, y, theta):
    inner = np.power(((X * theta.T) - y), 2)
    return np.sum(inner) / (2 * len(X))
computeCost(X, y, theta)
Out [10]:
32.072733877455676
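NumPy's documentation no longer recommends `np.matrix` for new code, so the following is a minimal sketch of the same cost using plain arrays and the `@` operator. The name `compute_cost_ndarray` and the tiny data set are hypothetical, and here theta is assumed to be a column vector of shape (n, 1) rather than the (1, n) row matrix used above.

```python
import numpy as np


def compute_cost_ndarray(X, y, theta):
    """J(theta) = sum((X @ theta - y)^2) / (2m) for 2-D ndarrays."""
    m = X.shape[0]
    residual = X @ theta - y
    return float(np.sum(residual ** 2) / (2 * m))


# Tiny made-up example: two samples, intercept column plus one feature
X_small = np.array([[1.0, 1.0], [1.0, 2.0]])
y_small = np.array([[1.0], [2.0]])
theta_zero = np.zeros((2, 1))

# With theta = 0 the residuals are -1 and -2, so J = (1 + 4) / (2 * 2)
print(compute_cost_ndarray(X_small, y_small, theta_zero))
```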
8.2.4 Gradient descent
In [11]:
# This part implements the update of θ
def gradientDescent(X, y, theta, alpha, iters):
    temp = np.matrix(np.zeros(theta.shape))
    parameters = int(theta.ravel().shape[1])
    cost = np.zeros(iters)
    for i in range(iters):
        error = (X * theta.T) - y
        for j in range(parameters):
            term = np.multiply(error, X[:,j])
            temp[0,j] = theta[0,j] - ((alpha / len(X)) * np.sum(term))
        theta = temp
        cost[i] = computeCost(X, y, theta)
    return theta, cost
Initialize some additional variables - the learning rate α and the number of iterations to perform, already mentioned in 2.2.2.
In [12]:
alpha = 0.01
iters = 1500
Now let's run the gradient descent algorithm to fit our parameter θ to the training set.
In [13]:
g, cost = gradientDescent(X, y, theta, alpha, iters)
g
Out [13]:
matrix([[-3.63029144, 1.16636235]])
In [14]:
# Predict snack bar profit for cities with populations of 35,000 and 70,000
predict1 = [1,3.5]*g.T
print("predict1:",predict1)
predict2 = [1,7]*g.T
print("predict2:",predict2)
predict1: [[0.45197679]]
predict2: [[4.53424501]]
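The two predictions above can be wrapped in a small helper. This is a sketch only: the function name `predict_profit` is hypothetical, the parameter values are copied from Out [13], and it assumes, as the inputs 3.5 and 7 suggest, that population is expressed in units of 10,000 people.

```python
# Fitted parameters copied from Out [13]: intercept and slope
THETA_0 = -3.63029144
THETA_1 = 1.16636235


def predict_profit(population, theta_0=THETA_0, theta_1=THETA_1):
    """Evaluate h(x) = theta_0 + theta_1 * x for a raw head count."""
    x = population / 10000.0  # convert people to units of 10,000
    return theta_0 + theta_1 * x


print(predict_profit(35000))
print(predict_profit(70000))
```

The returned values are in the same units as the Profit column of the data set.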
In [15]:
x = np.linspace(data.Population.min(), data.Population.max(), 100)
f = g[0, 0] + (g[0, 1] * x)
fig, ax = plt.subplots(figsize=(12,8))
ax.plot(x, f, 'r', label='Prediction')
ax.scatter(data.Population, data.Profit, label='Training Data')
ax.legend(loc=2)
ax.set_xlabel('Population')
ax.set_ylabel('Profit')
ax.set_title('Predicted Profit vs. Population Size')
plt.show()
# The original data and the fitted line
8.3 Visualizing J(θ)
This part is not reproduced in Python here; a screenshot stands in for it.
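For readers who do want to reproduce the picture, a minimal sketch of how the cost surface could be computed is given below. It uses a small made-up data set rather than ex1data1.txt, the grid ranges are arbitrary, and the helper name `cost_grid` is hypothetical.

```python
import numpy as np


def cost_grid(x, y, theta0_vals, theta1_vals):
    """Return J[i, j] = cost at (theta0_vals[i], theta1_vals[j])."""
    m = len(x)
    J = np.zeros((len(theta0_vals), len(theta1_vals)))
    for i, t0 in enumerate(theta0_vals):
        for j, t1 in enumerate(theta1_vals):
            residual = t0 + t1 * x - y
            J[i, j] = np.sum(residual ** 2) / (2 * m)
    return J


# Made-up data on the line y = 1 + 2x (an assumption, not the exercise data)
x_demo = np.array([0.0, 1.0, 2.0, 3.0])
y_demo = 1.0 + 2.0 * x_demo

theta0_vals = np.linspace(-10, 10, 101)
theta1_vals = np.linspace(-1, 4, 101)
J_vals = cost_grid(x_demo, y_demo, theta0_vals, theta1_vals)

# Location of the smallest cost on the grid
i_min, j_min = np.unravel_index(np.argmin(J_vals), J_vals.shape)
print(theta0_vals[i_min], theta1_vals[j_min], J_vals[i_min, j_min])
```

With matplotlib, `ax.contour(theta0_vals, theta1_vals, J_vals.T)` would draw the contour lines; the transpose is needed because `contour` expects the rows of its last argument to run along the y-axis.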