Python programming fundamentals for machine learning

1. Regression

The training data set click.csv is as follows:

x,y
235,591
216,539
148,413
35,310
85,308
204,519
49,325
25,332
25,332
173,498
191,498
134,392
99,334
117,385
112,387
162,425
272,659
159,427
59,319
198,522

Our goal is to discover a pattern in this dataset that lets us predict the value of y for any given value of x. This process is also known as learning.
First, we plot these points in a two-dimensional coordinate system so that the distribution of the data can be seen more intuitively.

import numpy as np
import matplotlib.pyplot as plt

# Load the training data
train = np.loadtxt('click.csv', delimiter=',', skiprows=1)
train_x = train[:, 0]
train_y = train[:, 1]

# Plot the data points
plt.plot(train_x, train_y, 'o')
plt.show()

(Figure: scatter plot of the training data)
We refer to the function to be learned from this dataset as $f_\theta(x)$.

Let's first implement $f_\theta(x)$ as a linear function. We want to implement the following $f_\theta(x)$ and objective function $E(\theta)$:

$$f_\theta(x) = \theta_0 + \theta_1 x$$

$$E(\theta) = \frac{1}{2}\sum_{i=1}^{n}\left(y^{(i)} - f_\theta\left(x^{(i)}\right)\right)^2$$

For the initialization of $\theta_0$ and $\theta_1$, random values are used.

# Initialize the parameters
theta0 = np.random.rand()
theta1 = np.random.rand()

# Prediction function
def f(x):
    return theta0 + theta1 * x

# Objective function
def E(x, y):
    return 0.5 * np.sum((y - f(x)) ** 2)

Preprocessing the training data: transform it so that it has a mean of 0 and a variance of 1.

This preprocessing is not strictly necessary, but it makes the parameters converge faster. The practice is also known as standardization or z-score normalization, and the transformation is:

$$z^{(i)} = \frac{x^{(i)} - \mu}{\sigma}$$

where µ is the mean of the training data and σ is its standard deviation.

# Standardization
mu = train_x.mean()
sigma = train_x.std()

def standardize(x):
    return (x - mu) / sigma

train_z = standardize(train_x)

plt.plot(train_z, train_y, 'o')
plt.show()

(Figure: scatter plot after standardization)
Notice that the scale of the horizontal axis has shrunk after standardization.
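As a quick sanity check (a small addition, not in the original), we can confirm that the standardized data really has mean 0 and standard deviation 1:

# Sanity check: the standardized data should have mean ~0 and std ~1
print(train_z.mean())  # approximately 0.0 (up to floating-point error)
print(train_z.std())   # approximately 1.0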

The next step is to find the parameters of $f_\theta(x)$. Since we assumed the dataset follows a linear (first-degree) function, we need to learn the two parameters $\theta_0$ and $\theta_1$ from the data so that the resulting line fits the data points as closely as possible.

Use the gradient descent method to find the two parameters that minimize the objective function $E(\theta)$. Differentiating $E(\theta)$ with respect to each parameter gives the update rules:

$$\theta_0 := \theta_0 - \eta\sum_{i=1}^{n}\left(f_\theta\left(x^{(i)}\right) - y^{(i)}\right)$$

$$\theta_1 := \theta_1 - \eta\sum_{i=1}^{n}\left(f_\theta\left(x^{(i)}\right) - y^{(i)}\right)x^{(i)}$$

where η is the learning rate.

# Learning rate
ETA = 1e-3
# Difference between successive errors
diff = 1
# Update count
count = 0
# Repeat learning until the error stops improving
error = E(train_z, train_y)
while diff > 1e-4:
    # Save the update results in temporary variables
    tmp0 = theta0 - ETA * np.sum(f(train_z) - train_y)
    tmp1 = theta1 - ETA * np.sum((f(train_z) - train_y) * train_z)
    # Update the parameters
    theta0 = tmp0
    theta1 = tmp1
    # Compute the difference from the previous error
    current_error = E(train_z, train_y)
    diff = error - current_error
    error = current_error
    # Log output
    count += 1
    log = 'Iteration {}: theta0 = {:.3f}, theta1 = {:.3f}, diff = {:.4f}'
    print(log.format(count, theta0, theta1, diff))

(Figure: training log output)

# Plot to check the result
x = np.linspace(-3, 3, 100)
plt.plot(train_z, train_y, 'o')
plt.plot(x, f(x))
plt.show()

(Figure: fitted line over the standardized data)
Earlier we assumed that the pattern in the dataset was a linear function, and we fitted it successfully.
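As a small usage sketch (an addition to the original), the learned parameters can predict y for a new raw x, as long as the input is standardized with the same µ and σ as the training data; the value 150 below is just a made-up example:

# Predict y for a new, raw x value (150 is a made-up example)
new_x = 150
print(f(standardize(new_x)))  # predicted y for x = 150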

In fact, we can also assume that the pattern is a quadratic function:

$$f_\theta(x) = \theta_0 + \theta_1 x + \theta_2 x^2$$

Treating both the parameters and the training data as vectors makes the computation easier:

$$\boldsymbol{\theta} = \begin{bmatrix}\theta_0\\\theta_1\\\theta_2\end{bmatrix}, \qquad \boldsymbol{x} = \begin{bmatrix}1\\x\\x^2\end{bmatrix}, \qquad f_\theta(\boldsymbol{x}) = \boldsymbol{\theta}^{\mathrm T}\boldsymbol{x}$$

And since there are many training examples, it is better still to treat each row as one training example and process them all at once as a matrix:

$$\boldsymbol{X} = \begin{bmatrix}1 & x^{(1)} & \bigl(x^{(1)}\bigr)^2\\1 & x^{(2)} & \bigl(x^{(2)}\bigr)^2\\\vdots & \vdots & \vdots\\1 & x^{(n)} & \bigl(x^{(n)}\bigr)^2\end{bmatrix}$$

Then the predictions are the product of this matrix and the parameter vector θ:

$$f_\theta(\boldsymbol{X}) = \boldsymbol{X}\boldsymbol{\theta}$$

# Initialize the parameters
theta = np.random.rand(3)

# Create a matrix of the training data
def to_matrix(x):
    return np.vstack([np.ones(x.shape[0]), x, x ** 2]).T

X = to_matrix(train_z)

# Prediction function
def f(x):
    return np.dot(x, theta)

# Difference between successive errors
diff = 1

# Repeat learning
error = E(X, train_y)
while diff > 1e-3:
    # Update the parameters
    theta = theta - ETA * np.dot(f(X) - train_y, X)
    # Compute the difference from the previous error
    current_error = E(X, train_y)
    diff = error - current_error
    error = current_error

# Plot to check the result
x = np.linspace(-3, 3, 100)
plt.plot(train_z, train_y, 'o')
plt.plot(x, f(to_matrix(x)))
plt.show()

(Figure: fitted quadratic curve over the standardized data)
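Above, the change in E(θ) served as the stopping criterion. We can also track the mean squared error, whose value is easier to interpret because it is averaged over the number of examples n:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y^{(i)} - f_\theta\left(x^{(i)}\right)\right)^2$$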

# Mean squared error
def MSE(x, y):
    return (1 / x.shape[0]) * np.sum((y - f(x)) ** 2)

# Initialize the parameters with random values
theta = np.random.rand(3)

# History of the mean squared error
errors = []

# Difference between successive errors
diff = 1

# Repeat learning
errors.append(MSE(X, train_y))
while diff > 1e-3:
    theta = theta - ETA * np.dot(f(X) - train_y, X)
    errors.append(MSE(X, train_y))
    diff = errors[-2] - errors[-1]

# Plot how the error changes
x = np.arange(len(errors))
plt.plot(x, errors)
plt.show()

(Figure: mean squared error decreasing over the iterations)

2. Classification - Perceptron

The training data set images1.csv is as follows:

x1,x2,y
153,432,-1
220,262,-1
118,214,-1
474,384,1
485,411,1
233,430,-1
396,361,1
484,349,1
429,259,1
286,220,1
399,433,-1
403,340,1
252,34,1
497,472,1
379,416,-1
76,163,-1
263,112,1
26,193,-1
61,473,-1
420,253,1

Implementation steps:
1. First, initialize the perceptron's weight vector w, then implement the discriminant function f_w(x):

$$f_w(\boldsymbol{x}) = \begin{cases} 1 & (\boldsymbol{w} \cdot \boldsymbol{x} \ge 0) \\ -1 & (\boldsymbol{w} \cdot \boldsymbol{x} < 0) \end{cases}$$

2. Next, implement the weight update rule, which is applied only when an example is misclassified:

$$\boldsymbol{w} := \boldsymbol{w} + y^{(i)} \boldsymbol{x}^{(i)} \qquad \text{if } f_w(\boldsymbol{x}^{(i)}) \ne y^{(i)}$$

3. The decision boundary is the line to which the weight vector is normal, i.e. the set of points x whose inner product with w is 0. Rearranging w · x = 0 gives the expression to plot:

$$x_2 = -\frac{w_1}{w_2} x_1$$

import numpy as np
import matplotlib.pyplot as plt

train = np.loadtxt("data/images1.csv", delimiter=",", skiprows=1)
train_x = train[:, 0:2]
train_y = train[:, 2]

# Initialize the weights: two random values drawn uniformly from [0, 1)
w = np.random.rand(2)


# Discriminant function
def f(x_f):
    if np.dot(w, x_f) >= 0:
        return 1
    else:
        return -1


# Number of epochs
epoch = 10

# Update count
count = 0

# Learn the weights
for _ in range(epoch):
    for x, y in zip(train_x, train_y):
        if f(x) != y:
            w = w + y * x
            # Log output
            count += 1
            print('Update {}: w = {}'.format(count, w))

# Plot
x1 = np.arange(0, 500)
plt.title("classification_result")
plt.xlabel('x1')
plt.ylabel('x2')
plt.plot(train_x[train_y == 1, 0], train_x[train_y == 1, 1], "o")
plt.plot(train_x[train_y == -1, 0], train_x[train_y == -1, 1], "x")
plt.plot(x1, -w[0] / w[1] * x1, linestyle='dashed')
plt.show()

(Figures: training log and the learned decision boundary separating the two classes)
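As a usage sketch (not in the original), the trained discriminant function can classify a new point directly; the coordinates below are made-up example values:

# Classify a hypothetical new point with the learned weights
print(f(np.array([400, 200])))  # returns 1 or -1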

3. Classification - Logistic Regression

Changing the -1 labels in images1.csv to 0 gives the training data set images2.csv:

x1,x2,y
153,432,0
220,262,0
118,214,0
474,384,1
485,411,1
233,430,0
396,361,1
484,349,1
429,259,1
286,220,1
399,433,0
403,340,1
252,34,1
497,472,1
379,416,0
76,163,0
263,112,1
26,193,0
61,473,0
420,253,1

Implementation steps:
1. First initialize the parameters, then standardize the training data; x1 and x2 are standardized separately. Also don't forget to add an x0 column of ones.
2. Implement the prediction function, the sigmoid applied to θᵀx:

$$f_\theta(\boldsymbol{x}) = \frac{1}{1 + e^{-\boldsymbol{\theta}^{\mathrm T}\boldsymbol{x}}}$$

3. Next, implement the parameter update rule:

$$\theta_j := \theta_j - \eta\sum_{i=1}^{n}\left(f_\theta\left(\boldsymbol{x}^{(i)}\right) - y^{(i)}\right)x_j^{(i)}$$

4. Setting θᵀx = 0 and rearranging gives the decision boundary to plot:

$$x_2 = -\frac{\theta_0 + \theta_1 x_1}{\theta_2}$$

import numpy as np
import matplotlib.pyplot as plt

# Load the training data
train = np.loadtxt("data/images2.csv", delimiter=",", skiprows=1)
train_x = train[:, 0:2]
train_y = train[:, 2]

# Initialize the parameters
theta = np.random.rand(3)

# Standardization
mu = train_x.mean(axis=0)
sigma = train_x.std(axis=0)


def standardize(x):
    return (x - mu) / sigma


train_z = standardize(train_x)


# Add a column of x0
def to_matrix(x):
    x0 = np.ones([x.shape[0], 1])
    return np.hstack([x0, x])


X = to_matrix(train_z)


# Sigmoid function
def f(x):
    return 1 / (1 + np.exp(-np.dot(x, theta)))


# Classification function
def classify(x):
    return (f(x) >= 0.5).astype(int)


# Learning rate
ETA = 1e-3

# Number of epochs
epoch = 5000

# Update count
count = 0

# Repeat learning
for _ in range(epoch):
    theta = theta - ETA * np.dot(f(X) - train_y, X)
    # Log output
    count += 1
    print('Iteration {}: theta = {}'.format(count, theta))

# Plot to check the result
x0 = np.linspace(-2, 2, 100)
plt.title("classification_result")
plt.xlabel('x1')
plt.ylabel('x2')
plt.plot(train_z[train_y == 1, 0], train_z[train_y == 1, 1], 'o')
plt.plot(train_z[train_y == 0, 0], train_z[train_y == 0, 1], 'x')
plt.plot(x0, -(theta[0] + theta[1] * x0) / theta[2], linestyle='dashed')
plt.show()

(Figures: training log and the learned linear decision boundary)
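As a usage sketch (not in the original), classify() can label new points once they are standardized and extended with the x0 column; the coordinates below are made-up examples:

# Classify hypothetical new points: standardize, add x0, then classify
new_points = np.array([[200, 100], [100, 400]])
print(classify(to_matrix(standardize(new_points))))  # array of 0/1 labels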
The data above can be separated by a straight line; we call this a linearly separable classification problem. The following data, however, cannot be separated by a straight line alone.
The training data set data3.csv is as follows:

x1,x2,y
0.54508775,2.34541183,0
0.32769134,13.43066561,0
4.42748117,14.74150395,0
2.98189041,-1.81818172,1
4.02286274,8.90695686,1
2.26722613,-6.61287392,1
-2.66447221,5.05453871,1
-1.03482441,-1.95643469,1
4.06331548,1.70892541,1
2.89053966,6.07174283,0
2.26929206,10.59789814,0
4.68096051,13.01153161,1
1.27884366,-9.83826738,1
-0.1485496,12.99605136,0
-0.65113893,10.59417745,0
3.69145079,3.25209182,1
-0.63429623,11.6135625,0
0.17589959,5.84139826,0
0.98204409,-9.41271559,1
-0.11094911,6.27900499,0
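To let the decision boundary curve, we add a quadratic feature: with $x_3 = x_1^2$, the model becomes

$$f_\theta(\boldsymbol{x}) = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_1^2)}}$$

and setting $\boldsymbol{\theta}^{\mathrm T}\boldsymbol{x} = 0$ rearranges to $x_2 = -(\theta_0 + \theta_1 x_1 + \theta_3 x_1^2)/\theta_2$, which is the curve the code below plots.
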
import numpy as np
import matplotlib.pyplot as plt

# Load the data set
train = np.loadtxt("data/data3.csv", delimiter=",", skiprows=1)
train_x = train[:, 0:2]
train_y = train[:, 2]

# Initialize the parameters
theta = np.random.rand(4)

# Standardization
mu = train_x.mean(axis=0)
sigma = train_x.std(axis=0)


def standardize(x):
    return (x - mu) / sigma


train_z = standardize(train_x)


# Add x0 and x3
def to_matrix(x):
    x0 = np.ones([x.shape[0], 1])
    x3 = x[:, 0, np.newaxis] ** 2
    return np.hstack([x0, x, x3])


X = to_matrix(train_z)


# Sigmoid function
def f(x):
    return 1 / (1 + np.exp(-np.dot(x, theta)))


# Classification function
def classify(x):
    return (f(x) >= 0.5).astype(int)


# Learning rate
ETA = 1e-3

# Number of epochs
epoch = 2000

# Update count
count = 0

# History of accuracy
accuracies = []

# Repeat learning
for _ in range(epoch):
    theta = theta - ETA * np.dot(f(X) - train_y, X)
    # Log output
    count += 1
    print('Iteration {}: theta = {}'.format(count, theta))
    # Compute the current accuracy
    result = classify(X) == train_y
    accuracy = result.sum() / len(result)
    accuracies.append(accuracy)

x1 = np.linspace(-2, 2, 100)
x2 = -(theta[0] + theta[1] * x1 + theta[3] * x1 ** 2) / theta[2]
plt.title("classification_result")
plt.xlabel('x1')
plt.ylabel('x2')
plt.plot(train_z[train_y == 0, 0], train_z[train_y == 0, 1], "o")
plt.plot(train_z[train_y == 1, 0], train_z[train_y == 1, 1], "x")
plt.plot(x1, x2, linestyle='dashed')
plt.show()

x = np.arange(len(accuracies))
plt.plot(x, accuracies)
plt.show()

(Figures: training log, the curved decision boundary, and the accuracy history)
Instead of updating the parameters with all of the training data at once, we can update them with stochastic gradient descent.
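With stochastic gradient descent, each update uses a single randomly chosen training example $k$ rather than the sum over all examples:

$$\theta_j := \theta_j - \eta\left(f_\theta\left(\boldsymbol{x}^{(k)}\right) - y^{(k)}\right)x_j^{(k)}$$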

import numpy as np
import matplotlib.pyplot as plt

# Load the data set
train = np.loadtxt("data/data3.csv", delimiter=",", skiprows=1)
train_x = train[:, 0:2]
train_y = train[:, 2]

# Initialize the parameters
theta = np.random.rand(4)

# Standardization
mu = train_x.mean(axis=0)
sigma = train_x.std(axis=0)


def standardize(x):
    return (x - mu) / sigma


train_z = standardize(train_x)


# Add x0 and x3
def to_matrix(x):
    x0 = np.ones([x.shape[0], 1])
    x3 = x[:, 0, np.newaxis] ** 2
    return np.hstack([x0, x, x3])


X = to_matrix(train_z)


# Sigmoid function
def f(x):
    return 1 / (1 + np.exp(-np.dot(x, theta)))


# Classification function
def classify(x):
    return (f(x) >= 0.5).astype(int)


# Learning rate
ETA = 1e-3

# Number of epochs
epoch = 2000

# Update count
count = 0

# History of accuracy
accuracies = []

# Repeat learning
for _ in range(epoch):
    # Update the parameters with stochastic gradient descent,
    # visiting the training examples in a random order each epoch
    p = np.random.permutation(X.shape[0])
    for x, y in zip(X[p, :], train_y[p]):
        theta = theta - ETA * (f(x) - y) * x
    # Log output
    count += 1
    print('Iteration {}: theta = {}'.format(count, theta))
    # Compute the current accuracy
    result = classify(X) == train_y
    accuracy = result.sum() / len(result)
    accuracies.append(accuracy)

x1 = np.linspace(-2, 2, 100)
x2 = -(theta[0] + theta[1] * x1 + theta[3] * x1 ** 2) / theta[2]
plt.title("classification_result")
plt.xlabel('x1')
plt.ylabel('x2')
plt.plot(train_z[train_y == 0, 0], train_z[train_y == 0, 1], "o")
plt.plot(train_z[train_y == 1, 0], train_z[train_y == 1, 1], "x")
plt.plot(x1, x2, linestyle='dashed')
plt.show()

x = np.arange(len(accuracies))
plt.title("accuracy_result")
plt.xlabel('x')
plt.ylabel('accuracy')
plt.plot(x, accuracies)
plt.show()

(Figures: training log, the curved decision boundary, and the accuracy history)

4. Regularization
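L2 regularization adds a penalty on the size of the parameters to the objective function, which suppresses overfitting; the bias term θ₀ is left out of the penalty:

$$E(\theta) = \frac{1}{2}\sum_{i=1}^{n}\left(y^{(i)} - f_\theta\left(x^{(i)}\right)\right)^2 + \frac{\lambda}{2}\sum_{j=1}^{m}\theta_j^2$$

The code below fits a degree-10 polynomial to a few noisy points, first without and then with this penalty.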

import numpy as np
import matplotlib.pyplot as plt


# The true function
def g(x):
    return 0.1 * (x ** 3 + x ** 2 + x)


# Training data: the true function plus a little random noise
train_x = np.linspace(-2, 2, 8)
train_y = g(train_x) + np.random.rand(train_x.size) * 0.05

# Plot to check
plt.plot(train_x, train_y, 'o')
x = np.linspace(-2, 2, 100)
plt.plot(x, g(x), linestyle="dashed")
plt.ylim(-1, 2)
plt.show()

# Standardization
mu = train_x.mean()
sigma = train_x.std()


def standardize(x):
    return (x - mu) / sigma


train_z = standardize(train_x)


# Create a matrix of the training data (polynomial features up to degree 10)
def to_matrix(x):
    return np.vstack([np.ones(x.size),
                      x,
                      x ** 2,
                      x ** 3,
                      x ** 4,
                      x ** 5,
                      x ** 6,
                      x ** 7,
                      x ** 8,
                      x ** 9,
                      x ** 10]).T


X = to_matrix(train_z)

# Initialize the parameters
theta = np.random.randn(X.shape[1])


# Prediction function
def f(x):
    return np.dot(x, theta)


# Objective function
def E(x, y):
    return 0.5 * np.sum((y - f(x)) ** 2)


# Learning rate
ETA = 1e-4

# Difference between successive errors
diff = 1

# Repeat learning
error = E(X, train_y)
while diff > 1e-6:
    theta = theta - ETA * np.dot(f(X) - train_y, X)
    current_error = E(X, train_y)
    diff = error - current_error
    error = current_error

# Plot the result
z = standardize(x)
plt.plot(train_z, train_y, 'o')
plt.plot(z, f(to_matrix(z)))
plt.show()

# Save the non-regularized parameters, then re-initialize
theta1 = theta
theta = np.random.randn(X.shape[1])

# Regularization constant
LAMBDA = 1

# Difference between successive errors
diff = 1

# Repeat learning (with the regularization term)
error = E(X, train_y)
while diff > 1e-6:
    # Regularization term. The bias term is not regularized, so its entry is 0
    reg_term = LAMBDA * np.hstack([0, theta[1:]])
    # Apply the regularization term and update the parameters
    theta = theta - ETA * (np.dot(f(X) - train_y, X) + reg_term)
    current_error = E(X, train_y)
    diff = error - current_error
    error = current_error

# Plot the result
plt.plot(train_z, train_y, 'o')
plt.plot(z, f(to_matrix(z)))
plt.show()

theta2 = theta

plt.plot(train_z, train_y, 'o')

# Plot the non-regularized result
theta = theta1
plt.plot(z, f(to_matrix(z)), linestyle='dashed', label="non_regularization")
# Plot the regularized result
theta = theta2
plt.plot(z, f(to_matrix(z)), label="regularization")
plt.legend()
plt.show()

We generate data from a true function and add random noise to it to simulate real data.
(Figure: noisy training data and the true function)
The result of learning without regularization:
(Figure: the non-regularized, degree-10 polynomial fit)
The result of learning with regularization:
(Figure: the regularized fit, which is visibly smoother)
A comparison of the two:
(Figure: both fits plotted together)
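As an optional numeric check (not in the original), comparing the magnitudes of the two parameter vectors shows the effect of the penalty; the regularized weights should generally be smaller:

# Compare parameter magnitudes: regularization shrinks the weights
print(np.abs(theta1).sum())  # without regularization
print(np.abs(theta2).sum())  # with regularization, typically smaller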
