Machine Learning and Deep Learning - Using the Stochastic Gradient Descent (SGD) Algorithm to Perform Linear Regression on the Boston Housing Price Data

This time we use the stochastic gradient descent (SGD) algorithm to train a linear regression model on the Boston housing price data, print the weights, loss, and gradient at each iteration, and plot how the loss changes with epoch.

Steps

1. Import the necessary libraries and modules: numpy, pandas, matplotlib, load_boston, and StandardScaler. load_boston loads the Boston housing price dataset, and StandardScaler standardizes the features.
2. Load the dataset and standardize the data. Also add a column of 1s to the data as the intercept term, and convert y to a column vector.
3. Define the SGD function for training. In each epoch, we randomly draw mini-batches from the samples to compute the gradient and loss, and update the weights (see the gradient derivation after this list).
4. Use SGD to train the model and print the results of each iteration: the weights w, the loss, and the gradient grad. At the same time, store the average loss of each epoch in the list losses, and return the final weights and losses.
5. Plot how the loss value changes with epoch during stochastic gradient descent training.
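
For reference, the gradient used in step 3 follows from the mean squared error loss over a mini-batch $(X_b, y_b)$ of size $m$:

$$L(w) = \frac{1}{m}\lVert X_b w - y_b \rVert^2, \qquad \nabla_w L(w) = \frac{2}{m} X_b^\top (X_b w - y_b)$$

The code below drops the constant factor of 2, which only rescales the learning rate, so each update is $w \leftarrow w - \eta \cdot \frac{1}{m} X_b^\top (X_b w - y_b)$.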

Code

1. Use the stochastic gradient descent (SGD) algorithm to train a linear regression model on the Boston housing price data, print the weights, loss, and gradient of each iteration, and plot how the loss changes with epoch.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import load_boston
from sklearn.preprocessing import StandardScaler
# Load the dataset and standardize it. Also add a column of 1s as the intercept term, and convert y to a column vector.
# Load the data
boston = load_boston()
X, y = boston.data, boston.target

# Standardize the data
scaler = StandardScaler()
X = scaler.fit_transform(X)

# Add a column of 1s to the data (the intercept term)
X = np.hstack((np.ones((X.shape[0], 1)), X))

# Convert y to a column vector
y = y.reshape(-1, 1)
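
Note: load_boston was removed in scikit-learn 1.2 because of ethical concerns about the dataset. If the import above fails on a newer version, one workaround (adapted from scikit-learn's own deprecation notice) is to load the data directly from the original source:

import numpy as np
import pandas as pd

# Workaround for scikit-learn >= 1.2: load the Boston data from the
# original CMU StatLib source instead of sklearn.datasets.load_boston
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
X = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
y = raw_df.values[1::2, 2]

After this, the standardization, intercept column, and reshape steps above apply unchanged.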
# 3. Define the SGD function for training. In each epoch, we randomly draw a batch of samples to compute the gradient and loss, and update the weights.
def sgd(X, y, lr=0.01, epochs=100, batch_size=32):
    n_samples, n_features = X.shape
    w = np.zeros((n_features, 1))
    losses = []
    
    for epoch in range(epochs):
        epoch_loss = 0
        n_batches = 0
        
        # Randomly shuffle the samples
        permutation = np.random.permutation(n_samples)
        
        for i in range(0, n_samples, batch_size):
            # Take one batch of samples (the last batch may be smaller)
            indices = permutation[i:i+batch_size]
            X_batch = X[indices]
            y_batch = y[indices]
            
            # Compute the gradient and loss (divide by the actual batch
            # size, since the last batch may hold fewer than batch_size samples)
            grad = X_batch.T.dot(X_batch.dot(w) - y_batch) / len(indices)
            loss = np.mean((X_batch.dot(w) - y_batch) ** 2)
            epoch_loss += loss
            n_batches += 1
            
            # Update the weights
            w -= lr * grad
            
            # Print the w, grad, and loss values for this iteration
            print('w:', w.flatten())
            print('grad:', grad.flatten())
            print('loss:', loss)
        
        # Store the average loss over all batches in this epoch
        losses.append(epoch_loss / n_batches)
        
    return w, losses
# 4. Train the model with SGD, printing the weights w, loss, and gradient grad at each iteration. The average loss of each epoch is stored in the list losses, and the final weights and losses are returned.
w, losses = sgd(X, y, lr=0.01, epochs=100, batch_size=32)
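
As a quick sanity check (not part of the original post), the SGD weights can be compared against the closed-form ordinary least squares solution; with enough epochs the two should be close:

# Hypothetical check: compare against the closed-form OLS solution
w_ols, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
print('max |w_sgd - w_ols|:', np.abs(w - w_ols).max())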

# 5. Plot how the loss value changes with epoch during SGD training
plt.plot(losses)
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.show()

Results

[Figures: per-iteration printouts of w, grad, and loss, and the loss-vs-epoch curve]
Stochastic gradient descent (SGD) is a simple but very effective method, mostly used to learn linear classifiers under convex loss functions, such as support vector machines and logistic regression (LR). SGD has also been applied successfully to the large-scale, sparse machine learning problems often encountered in text classification and natural language processing. It can be used for both classification and regression.
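
For comparison, scikit-learn ships this algorithm as SGDRegressor (and SGDClassifier for the classification case mentioned above). A minimal sketch on the same standardized data might look like the following; the hyperparameter choices here are illustrative, not from the original post:

from sklearn.linear_model import SGDRegressor

# Minimal sketch: scikit-learn's built-in SGD linear regression.
# X[:, 1:] drops the manual intercept column, since SGDRegressor
# fits its own intercept; hyperparameters are illustrative only.
reg = SGDRegressor(loss="squared_error", learning_rate="constant",
                   eta0=0.01, max_iter=100, tol=None)
reg.fit(X[:, 1:], y.ravel())
print(reg.intercept_, reg.coef_)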

Origin: blog.csdn.net/Myx74270512/article/details/131622056