[Artificial Intelligence] Introduction to Machine Learning (1): Supervised Learning

Introduction

Machine learning algorithms are mainly divided into supervised learning and unsupervised learning, and supervised learning is one of the most common kinds. This article mainly records the classification and regression algorithms in supervised learning, with the regression algorithm as the main content.

This note corresponds to the video: Alibaba Cloud Developer Community Learning Center - Artificial Intelligence Learning Route - Phase 1: Overview of Machine Learning and Common Algorithms

Corresponding video address: Machine Learning Overview and Common Algorithms - Alibaba Cloud Developer Community

Machine learning

Definition: Train on samples of known categories to obtain an optimal model that meets the required performance, then use the trained model to map every input to a corresponding output and make a simple judgment on that output. In this way unknown data can be classified.

Machine learning is divided into: supervised learning, unsupervised learning, and semi-supervised learning.

This article mainly introduces algorithms related to supervised learning and unsupervised learning.

supervised learning

Supervised learning: Use a set of samples of known categories to train the model to meet the performance requirements.

Features: There is a clear identification or result (label) for the input data (training data)

Menglang's version 1: Give it the questions and the answers, let it work through the questions by itself, and then let it check the answers by itself.

Menglang's version 2: Give it sample questions with answers and let it learn from them.

classification algorithm

Classification: Learn an objective function f (the model) from an existing data set (the training set) that maps each attribute set x to a target attribute y (the class). y must be discrete; if y is continuous, the problem belongs to regression. By analyzing a training set of known categories, classification rules are found and then used to predict the category of new data.

Menglang in plain words: Suppose I have a pile of pictures, each labeled with whether the person in it is wearing a mask. I show them to the AI: look, this one is wearing a mask, this one is not... (the AI is learning). From what I taught it, it builds a model. When you then give it a picture it has never seen, it can use the model to predict whether the person in the picture is wearing a mask.

[Related Reading] Computer Vision Technology and Application: Identifying Whether a Person Wears a Mask

Article address (including code): Computer Vision Technology and Application: Identifying whether a person is wearing a mask

There are many classification algorithms

Classified by principle:

  • Statistics-based: e.g. Bayesian classification
  • Rule-based: e.g. decision tree algorithms
  • Neural-network-based: e.g. neural network algorithms
  • Distance-based: e.g. KNN (K nearest neighbors)
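
As an illustration of the distance-based family, here is a minimal sketch of K nearest neighbors in plain Python. The function name `knn_predict`, the toy points, and the labels are made up for this example and do not come from the original notes; a real project would more likely use a library implementation.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D toy data: two well-separated groups
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["no_mask", "no_mask", "no_mask", "mask", "mask", "mask"]

print(knn_predict(points, labels, (1.1, 1.0)))  # query near the first group
print(knn_predict(points, labels, (5.1, 5.0)))  # query near the second group
```

The choice of k = 3 and of Euclidean distance is arbitrary here; both are parameters one would normally tune.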

Common Evaluation Indicators

  • Precision: of the samples predicted to belong to a class, the proportion that actually belong to it
  • Recall: of the samples that actually belong to a class, the proportion that are correctly predicted
  • F1-Score: the harmonic mean of precision and recall, a single overall score for a classification model, with a value between 0 and 1
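
To make these indicators concrete, the sketch below computes them for a binary problem using the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 as the harmonic mean of the two. The function name `precision_recall_f1` and the label lists are assumptions made for this example.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = wearing a mask, 0 = not wearing a mask
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

print(precision_recall_f1(y_true, y_pred))
```

In this toy data there are 3 true positives, 1 false positive, and 1 false negative, so all three values come out as 0.75.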

regression algorithm

Regression

The target attribute y (class) of the classification algorithm is discrete, while the y obtained by the regression algorithm is continuous.

Since it is continuous, it can be represented by a function.

So the essence of a regression algorithm is to fit a function to the existing data as closely as possible.

For example, suppose I have a data set of the quantities of a commodity sold at different prices. After a model is built from these data, it can predict the quantity sold from the price we input. In fact, it fits a function to the data we give it, such as the linear equation Y = aX + b.

The coefficients a and b can be obtained with the least squares method.

For example, I give you the following data

price    sales volume
1        1000
2        900
3        800
4        700
5        600
6        500
7        400
8        300
9        200
10       100

Looking at this data, you can see that price and sales volume follow a linear relationship such as y = kx + b. The least squares slope is

\[k = \frac{\sum xy - n \overline{x}\,\overline{y}}{\sum x^2 - n \overline{x}^2} \]

After calculating the slope k, use the fact that the fitted line passes through the point of means

\[(\overline{x},\overline{y}) \]

and solve for the intercept by the method of undetermined coefficients:

\[b = \overline{y} - k\,\overline{x} \]
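
As a worked check with the ten rows of the table above (the intermediate sums are computed here, they are not given in the original notes): n = 10, and

\[\overline x = 5.5, \quad \overline y = 550, \quad \sum xy = 22000, \quad \sum x^2 = 385 \]

so that

\[k = \frac{22000 - 10 \cdot 5.5 \cdot 550}{385 - 10 \cdot 5.5^2} = \frac{-8250}{82.5} = -100, \qquad b = \overline y - k \overline x = 550 + 100 \cdot 5.5 = 1100 \]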

Let's first calculate it mathematically. The code is as follows:

import pandas as pd
def getK(data):
    avgx = data["x"].values.mean() # mean of x
    avgy = data["y"].values.mean() # mean of y
    fenzi = 0
    fenmu = 0
    for i in range(len(data["x"])):
        x = data["x"][i]
        y = data["y"][i]
        fenzi += x * y
        fenmu += x * x
    fenzi = fenzi -len(data["x"]) * avgx * avgy
    fenmu = fenmu - len(data["x"]) * avgx *avgx
    k = fenzi / fenmu
    b = avgy - k *avgx
    return "y="+str(k)+"x+"+str(b)


data = pd.DataFrame(data = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],[1000, 900, 800, 700, 600, 500, 400, 300, 200, 100]],index=["x","y"])
data = data.T # transpose so that x and y become columns
result = getK(data)
print(result)

[Output result] y=-100.0x+1100.0
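
The same line can also be obtained from the equivalent centered form of the least squares slope, k = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)². The sketch below uses plain Python lists and returns the coefficients as numbers rather than as a string; `fit_line` is a name chosen for this example, not part of the original code.

```python
def fit_line(xs, ys):
    """Least squares fit of y = k*x + b, returning (k, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Centered sums: covariance-like numerator and variance-like denominator
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k = sxy / sxx
    b = mean_y - k * mean_x  # the fitted line passes through (mean_x, mean_y)
    return k, b


prices = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
sales = [1000, 900, 800, 700, 600, 500, 400, 300, 200, 100]

k, b = fit_line(prices, sales)
print(k, b)
print(k * 5.5 + b)  # predicted sales volume at a price of 5.5
```

For this data the fit reproduces k = -100 and b = 1100, so the prediction at a price of 5.5 is 550. Returning numbers instead of a formatted string makes the result directly reusable for predictions.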

This line slopes downward, so y eventually becomes negative. Sales volume cannot actually be negative, but this is only a rough example: I made the data up casually, so some unreasonable behavior is to be expected.

Our goal is for the linear regression equation fitted by TensorFlow to be as close as possible to the one calculated mathematically.

Next, use TensorFlow to derive this model (fit the linear regression equation).

First, prepare the data. The data above is not enough, so we generate more. The code for generating the data is as follows:

import math
import random

import pandas as pd


def getCheck(data):
    avgx = data["x"].values.mean()  # mean of x
    avgy = data["y"].values.mean()  # mean of y
    fenzi = 0
    fenmu = 0
    for i in range(len(data["x"])):
        x = data["x"][i]
        y = data["y"][i]
        fenzi += x * y
        fenmu += x * x
    fenzi = fenzi - len(data["x"]) * avgx * avgy
    fenmu = fenmu - len(data["x"]) * avgx * avgx
    k = fenzi / fenmu
    b = avgy - k * avgx
    testX=[]
    testY=[]
    # return k,b
    for i in range(1000):
        x = random.uniform(0, 10)
        y = k * x + b
        testX.append(x)
        testY.append(math.floor(y)) # round down
    print(testX)
    print(testY)
    d = pd.DataFrame(data = [testX,testY])
    d = d.T
    d.to_csv("train_data.csv")
    # d.to_csv("test_data.csv")

if __name__ == '__main__':
    data = pd.DataFrame(data=[[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [1000, 900, 800, 700, 600, 500, 400, 300, 200, 100]],
                        index=["x", "y"])
    data = data.T  # transpose so that x and y become columns
    getCheck(data)

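Run the above code to generate the data sets. Note lines 33 and 34, the two d.to_csv(...) calls: line 33 writes the training set and line 34 (commented out) writes the test set, so run the script once as is, then swap which of the two lines is commented out and run it again.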

【Related Reading】

Activation function: [Artificial Intelligence] Neural Network Optimization: Complexity Learning Rate, Activation Function, Loss Function, Alleviating Overfitting, Optimizer- Menglang Lantian- Blog Park

Optimizer: [Artificial Intelligence] Neural Network Stereotypes- Menglang Blue Sky- Blog Garden

read dataset

import matplotlib.pyplot as plt
import pandas as pd
import tensorflow as tf
# Read the datasets
train_data = pd.read_csv('./train_data.csv')
test_data = pd.read_csv('./test_data.csv')

build model

First of all, we must be clear about what we are doing: making predictions with a regression algorithm.

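# Build the model
model = tf.keras.Sequential([
    # tf.keras.layers.Dense() is a fully connected layer; the hidden fully connected layers act as the network's "feature extractor"
    # --- first argument: output dimension
    # --- activation: relu. For activation functions see https://www.cnblogs.com/mllt/p/sjwlyh.html#%E6%BF%80%E6%B4%BB%E5%87%BD%E6%95%B0
    tf.keras.layers.Dense(128, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(1)
])
# Set the optimizer. Related link: https://www.cnblogs.com/mllt/p/sjwlbg.html#2modelcompile
# https://www.cnblogs.com/mllt/p/sjwlyh.html#%E4%BC%98%E5%8C%96%E5%99%A8
optimizer = tf.keras.optimizers.Adam(learning_rate=0.002)
"""
learning_rate: the learning rate, which determines how fast learning proceeds (it can also be seen as the step size).
If the learning rate is too large, training is likely to overshoot the optimum;
if the learning rate is too small, optimization may be very inefficient and take too long to run.

The keras.optimizers.Adam() optimizer is one way to address this problem.
Roughly, it adapts the effective step size of each parameter as training proceeds, to balance efficiency and quality.
"""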

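The effect of the learning rate described in the comment above can be seen on a toy problem. The sketch below runs plain gradient descent (not Adam) on f(w) = w², whose minimum is at w = 0; the function name `gradient_descent`, the starting point, and the three rates are chosen purely for illustration.

```python
def gradient_descent(lr, steps=50, w0=10.0):
    """Minimise f(w) = w**2 with a fixed learning rate and return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2 * w        # derivative of w**2
        w = w - lr * grad   # one gradient descent update
    return w


# Too small, reasonable, and too large learning rates
for lr in (0.001, 0.1, 1.1):
    print(lr, gradient_descent(lr))
```

With the smallest rate, w has barely moved from 10 after 50 steps; with 0.1 it is very close to 0; with 1.1 every step overshoots the minimum and |w| keeps growing.
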
training model

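model.compile(loss="mse", optimizer=optimizer, metrics=['mse']) # Regression evaluation metrics: https://blog.csdn.net/guolindonggld/article/details/87856780
# Mean squared error (MSE) is the most commonly used regression loss; it is the mean of the squared differences between the predicted and true values
# Related link: https://www.cnblogs.com/mllt/p/sjwlbg.html#2modelcompile
# The neural network model object
print(model)
# The structure of the neural network model
model.summary()
# Train the neural network (the same call as in the full code below)
history = model.fit(train_data["0"], train_data["1"], batch_size=100, epochs=100, validation_split=0.3, verbose=0)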

Visualization of model training status

# Visualize the training process
hist = pd.DataFrame(history.history)
print(hist)
hist['epoch'] = history.epoch
plt.figure()
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.plot(hist['epoch'], hist['loss'],label="training loss")
plt.plot(hist['epoch'], hist['val_loss'],label='validation loss')
"""
Best case: loss and val_loss both keep decreasing
Overfitting: loss keeps decreasing while val_loss levels off. Remedy: reduce the learning rate or the batch size
Abnormal dataset: loss levels off while val_loss keeps decreasing
Learning bottleneck: loss and val_loss both level off. Remedy: reduce the learning rate or the batch size
Problem with the network design: loss and val_loss both keep rising. Remedy: redesign the model structure and rebuild the dataset
"""
plt.legend()
plt.show()
plt.figure()
plt.xlabel("Epoch")
plt.ylabel("MSE")
plt.plot(hist['epoch'], hist['mse'],label="training MSE")
plt.plot(hist['epoch'], hist['val_mse'],label="validation MSE")
plt.legend()
plt.show()

Visualization of predictions

# Visualize the predictions
plt.figure()
y = model.predict(test_data["0"])
plt.plot(test_data["0"],y,label="model prediction")
plt.plot(test_data['0'],test_data['1'],label="true value")
plt.legend()
plt.show()

predict

print(model.predict(tf.constant([[5.5]])))  # one sample with one feature: price = 5.5

result

[Image: plots of the loss, the MSE, and the model predictions]

MSE: Mean Squared Error

Range [0, +∞). It equals 0 when the predicted values are exactly the same as the true values, which is the perfect model; the larger the error, the larger the value and the worse the model performs.
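
Written out, MSE = (1/n) Σ (yᵢ − ŷᵢ)², where yᵢ is the true value and ŷᵢ the predicted value. The following is a small sketch of that formula in plain Python; the helper name `mse` and the sample numbers are assumptions made for this example.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


actual = [1000, 900, 800]

print(mse(actual, actual))           # perfect predictions
print(mse(actual, [990, 910, 800]))  # two predictions are off by 10
```

The first call gives 0, the perfect-model case. In the second call the squared errors are 100, 100, and 0, so the result is 200 / 3.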

full code

The complete code below is the code above with the comments removed and the plots tidied up into a single figure.

import tensorflow as tf
import pandas as pd
import matplotlib.pyplot as plt
train_data = pd.read_csv('./train_data.csv')
test_data = pd.read_csv('./test_data.csv')

# Build the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(1,)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(1)
])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.002)
model.compile(loss="mse", optimizer=optimizer, metrics=['mse'])
history = model.fit(train_data["0"], train_data["1"], batch_size=100, epochs=100, validation_split=0.3, verbose=0)
# Visualize the training process
hist = pd.DataFrame(history.history)
print(hist)
hist['epoch'] = history.epoch
y = model.predict(test_data["0"])
# plt.figure(figsize=(10,5),dpi=300) # create the canvas
fig,axes = plt.subplots(nrows=1,ncols=3,figsize=(20,5),dpi=300)
# Add titles
axes[0].set_title("Loss",fontsize=24)
axes[1].set_title("MSE metric",fontsize=24)
axes[2].set_title("Model predictions",fontsize=24)
# Set axis labels
axes[0].set_ylabel("loss")
axes[1].set_ylabel("mse")
axes[2].set_ylabel("sales volume")
axes[0].set_xlabel("epoch")
axes[1].set_xlabel("epoch")
axes[2].set_xlabel("price")
axes[0].plot(hist['epoch'], hist['loss'],label="training loss",color="r",linestyle="-")
axes[0].plot(hist['epoch'],hist['val_loss'],label="validation loss",color="g",linestyle="--")
axes[1].plot(hist['epoch'], hist['mse'],label="training MSE",color="r",linestyle="-")
axes[1].plot(hist['epoch'],hist['val_mse'],label="validation MSE",color="g",linestyle="--")
axes[2].plot(test_data['0'],test_data['1'],label="true value",color="b",linestyle="--")
axes[2].plot(test_data['0'],y,label="predicted value",color="y",linestyle="--")
axes[0].legend(loc="upper right") # the legend only shows labels that were set when plotting
axes[1].legend(loc="upper right") # the legend only shows labels that were set when plotting
axes[2].legend(loc="upper right") # the legend only shows labels that were set when plotting
# Add grids
# plt.grid(True,linestyle="--",alpha=0.5) # add a grid
axes[0].grid(True,linestyle="--",alpha=1)
axes[1].grid(True,linestyle="-.",alpha=1)
axes[2].grid(True,linestyle="-",alpha=0.5)
plt.show()
print(model.predict(tf.constant([[5.5]])))  # one sample with one feature: price = 5.5

Other Supervised Learning Algorithms

Classification algorithms:

  • KNN (K-Nearest Neighbor)

  • NB (Naive Bayes)

  • DT (Decision Tree): C4.5, CART

  • SVM (Support Vector Machine)

Regression prediction:

  • Linear Regression

  • Logistic Regression (despite its name, mainly used for classification)

  • Ridge Regression

  • Lasso Regression

Origin blog.csdn.net/ks2686/article/details/127780838