A preliminary understanding of machine learning (Machine Learning)

1. Overview

1. Artificial Intelligence > Machine Learning > Deep Learning | Reinforcement Learning
2. Machine Learning (ML): a discipline that mines regularities from data through optimization methods (linear regression, logistic regression, decision trees, support vector machines, Bayesian models, etc.).
3. Machine learning: input ==> function ==> output. With the input and output data known, machine learning fits a function that matches the inputs to the outputs as well as possible.
4. The essence of machine learning is statistical model training. Its main job is to train the model, also called fitting the model; that is, fitting the data is the main work of machine learning. One word sums it up: "guess".
5. After each calculation a deviation is fed back, the model is adjusted according to that deviation, and a new value is output. This cycle repeats until the deviation is small enough.
6. Hypothesis function: the hypothesis function is filled with data as "fuel"; it produces the output that keeps the learning process running.
7. Loss function: provides the driving force for learning. The deviation value is obtained by comparing the predicted result of the hypothesis function with the actual value.
8. The basic mode of machine learning.
9. The optimization method adjusts the parameters of the hypothesis function based on the deviation value so that the function approximates and fits the data.
10. Supervised learning: can be understood as learning with reference answers; specifically, the data set contains the expected prediction results.
11. Commonly used machine learning algorithms

1. Linear regression algorithm: the simplest machine learning algorithm, which uses a linear method to solve regression problems
2. Logistic regression classification algorithm: the "twin brother" of linear regression. Its core idea is still the linear method, but it can solve classification problems
3. KNN classification algorithm: an algorithm that does not rely on a mathematical or statistical model but purely on "life experience". It solves classification problems through the idea of "finding the nearest neighbors"
4. Naive Bayes classification algorithm: the result is probabilistic rather than deterministic; it solves classification problems
5. Decision tree classification algorithm: classifies with logic similar to if-else
6. Support vector machine classification algorithm: maps linearly inseparable data points so that they become linearly separable, and then classifies them with a simple linear method
7. K-means clustering algorithm
8. Neural network classification algorithm
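The roles of the hypothesis function and the loss function described in the overview can be made concrete with a small sketch. The linear form `w * x + b`, the mean squared error, and the names `hypothesis` and `loss` are assumptions chosen for illustration; the overview itself does not fix a particular form.

```python
import numpy as np


def hypothesis(x, w, b):
    """Hypothetical linear hypothesis function: turns input data into a "guess"."""
    return w * x + b


def loss(y_pred, y_true):
    """Mean squared error: the deviation between the guess and the actual value."""
    return float(np.mean((y_pred - y_true) ** 2))


# Known inputs and their "reference answers" (generated from y = 2x + 1)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0

# A poor guess yields a large deviation, a good guess a small one
bad_guess = loss(hypothesis(x, w=0.0, b=0.0), y)
good_guess = loss(hypothesis(x, w=2.0, b=1.0), y)
print(bad_guess, good_guess)
```

An optimization method would use this deviation to move `w` and `b` from the poor guess toward the good one.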

2. Environment

1. The three essential libraries for machine learning

Support library NumPy: a professional support library designed for scientific computing
Algorithm library Scikit-Learn: a machine learning algorithm library
Data processing library Pandas: has many practical functions built in, such as sorting and statistics

2. NumPy

command line installation

pip install -U numpy

If the pip download is too slow, append -i https://pypi.douban.com/simple to the end of the command

import

import numpy as np

use
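As a small sketch of everyday NumPy usage, the following creates an array and applies a few common operations. The array contents and variable names are made up for illustration.

```python
import numpy as np

# Build a 2 x 3 array from nested lists
a = np.array([[1, 2, 3], [4, 5, 6]])

print(a.shape)        # (2, 3)
print(a.T.shape)      # (3, 2), the transpose swaps rows and columns
print(a * 2)          # element-wise multiplication
print(a.sum(axis=0))  # [5 7 9], column sums
print(a.mean())       # 3.5

# Turn a 1-D sequence into a column matrix, the shape Scikit-Learn expects for features
col = np.linspace(-3, 3, 30).reshape(-1, 1)
print(col.shape)      # (30, 1)
```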

3. Scikit-Learn

command line installation

pip install -U scikit-learn -i https://pypi.douban.com/simple

import

import sklearn

use
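A typical Scikit-Learn workflow is: load data, split it, fit a model, then score it. The following is a minimal sketch of that pattern; the choice of `LogisticRegression`, the 30% test split and `random_state=0` are assumptions made only for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Load the iris data set: 150 samples, 4 features, 3 classes
X, y = load_iris(return_X_y=True)

# Hold out 30% of the samples so the model is scored on data it has not seen
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Every Scikit-Learn estimator follows the same fit / predict / score pattern
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print(test_accuracy)
```

Scoring on a held-out test set usually gives a more honest estimate of performance than scoring on the training data.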

4. Pandas

command line installation

pip install -U pandas -i https://pypi.douban.com/simple

import

import pandas as pd

use
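The sorting and statistics functions mentioned above can be sketched on a tiny table. The column names and values here are invented for illustration.

```python
import pandas as pd

# A small hypothetical table
df = pd.DataFrame(
    {
        "name": ["a", "b", "c", "d"],
        "group": ["x", "y", "x", "y"],
        "score": [3.0, 1.0, 4.0, 2.0],
    }
)

# Sorting: highest score first
sorted_df = df.sort_values("score", ascending=False)
print(sorted_df)

# Statistics: count, mean, std, min, quartiles and max in one call
stats = df["score"].describe()
print(stats)

# Grouped statistics: mean score per group
group_mean = df.groupby("group")["score"].mean()
print(group_mean)
```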

3. Linear Regression Algorithm (Linear Regression)

1. Uses a linear model to solve regression problems
2. Regression problems: fit historical continuous data and predict future continuous data
3. Learning from mistakes: deviation measurement + weight adjustment
4. Mathematical expression of the hypothesis function
5. Mathematical expression of the loss function
6. Mathematical expression of the optimization method
7. Linear regression algorithm information table
8. Three steps to solve a linear regression problem
9. Using the linear regression algorithm in Python

import matplotlib.pyplot as plt  # 2-D plotting
import numpy as np  # scientific computing library
from sklearn import linear_model  # machine learning algorithm library

# generate the data set
x = np.linspace(-3, 3, 30)
y = 2 * x + 1
# add random perturbation
x = x + np.random.rand(30)
y = y + np.random.rand(30)

# convert the data set: 1-D sequence ==> column matrix
x = x.reshape(-1, 1)
y = y.reshape(-1, 1)

# train the linear regression model
model = linear_model.LinearRegression()
model.fit(x, y)

# test input
x_ = [[1], [2]]

# predicted output
y_ = model.predict(x_)
print(y_)

# weight w (slope) and intercept b
w = model.coef_
b = model.intercept_
print(w, b)

# plot the data set
y2 = w[0][0] * x + b[0]  # fitted line
plt.scatter(x, y)
plt.plot(x, y2)
plt.show()

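The "deviation measurement + weight adjustment" loop from item 3 can also be written by hand. The sketch below assumes a hypothesis of the form `w * x + b`, a mean squared error loss and plain gradient descent, with an illustrative learning rate and iteration count. Scikit-Learn's `LinearRegression` solves the same least-squares problem directly rather than iterating like this.

```python
import numpy as np

# Noisy samples around the line y = 2x + 1 (fixed seed so the run is repeatable)
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 30)
y = 2 * x + 1 + rng.normal(scale=0.1, size=x.shape)

w, b = 0.0, 0.0        # initial guess for the parameters
learning_rate = 0.05   # assumed step size

for _ in range(2000):
    error = (w * x + b) - y          # deviation measurement
    grad_w = 2 * np.mean(error * x)  # gradient of the MSE with respect to w
    grad_b = 2 * np.mean(error)      # gradient of the MSE with respect to b
    w -= learning_rate * grad_w      # weight adjustment
    b -= learning_rate * grad_b

print(w, b)  # close to the true slope 2 and intercept 1
```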

4. Logistic Regression Classification Algorithm (Logistic Regression)

1. Classification problem: compared with a regression problem, its predicted value is discrete rather than continuous. Binary classification is the basis of multi-class classification
2. Logistic function: the left side of its curve approaches 0 and the right side approaches 1
Through the Logistic function, continuous values can be mapped to discrete values with a smooth transition in between, so it is a bridge connecting the continuous and the discrete.
The mathematical expression of the Logistic function is as follows: $\sigma(z) = \frac{1}{1 + e^{-z}}$
Core ideas of using Logistic regression to solve classification problems:
First, use a linear equation to draw a straight line.
Second, "bend" the straight line with the Logistic function so that it fits the discretely distributed data points of the classification problem. This is equivalent to first mapping the classification problem into a regression problem through the Logistic function, and then solving it with a linear model that can solve regression problems.
3. The idea of using the Logistic function to map continuous values to discrete values
4. The form of classification categories in machine learning
5. The hypothesis function of Logistic regression
6. The loss function of Logistic regression
7. Logistic regression classification algorithm information table
8. Logistic regression classification algorithm steps
9. Using the Logistic regression algorithm in Python

from sklearn.linear_model import LogisticRegression
from sklearn.datasets import load_iris  # import the iris classification data set

X, y = load_iris(return_X_y=True)  # load the iris data set
clf = LogisticRegression(max_iter=1000).fit(X, y)  # train the model
y_ = clf.predict(X)  # predict classes with the model
print(y_)  # classification result
print(clf.score(X, y))  # performance evaluation (accuracy on the training data)

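How the Logistic function "bends" a continuous value into a class label can be shown with a few numbers. This is a minimal sketch; the 0.5 decision threshold and the sample inputs are assumptions for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Continuous outputs of some linear model w * x + b (values made up for illustration)
z = np.array([-6.0, -1.0, 0.0, 1.0, 6.0])

p = sigmoid(z)                   # far left approaches 0, far right approaches 1
labels = (p >= 0.5).astype(int)  # threshold the probability to get a discrete class

print(p)
print(labels)
```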

5. KNN classification algorithm (K-Nearest Neighbor)

1. Algorithm principles

Birds of a feather flock together: the question of which group a new sample should be assigned to is turned into the question of which group of samples it has the most in common with. The new sample is assigned to the group it most resembles, that is, to that category.

Majority voting: look at the value of the new sample in each dimension, see which kinds of points are adjacent to it, and assign it to the class that holds the majority among them.

Proximity voting: with the point to be classified as the center of a circle, find the points that are close to it; these form its "circle of friends". Only the points inside the circle may vote on the class of this point, rather than the entire sample set.

2. Take the sample point to be classified as the center and find the K nearest points. The class that accounts for the largest share of those K points is the class assigned to the sample.

3. How to determine the number of nearest neighbors K?
K is a parameter that must be tuned to the actual situation to get a better fit. It can be set by experiment, for example with cross-validation, combined with practical experience. K is generally between 3 and 10.

4. How to determine the nearest neighbors?
The key is how to measure "nearest". This is the first problem that KNN and its derivative algorithms must solve; it is both a difficulty and a source of innovation. The Minkowski distance can be used as the measure.

5. KNN algorithm classification process
6. Minkowski distance: $d(x, y) = \left(\sum_{i=1}^{n} |x_i - y_i|^p\right)^{1/p}$
When p=1 it is the Manhattan distance: $d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$
When p=2 it is the Euclidean distance: $d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
7. KNN classification algorithm information table
8. KNN classification algorithm implementation steps
9. Using the KNN classification algorithm in Python

from sklearn.datasets import load_iris  # import the iris data set
from sklearn.neighbors import KNeighborsClassifier  # KNN classifier from Scikit-Learn's nearest-neighbor models

X, y = load_iris(return_X_y=True)  # load the iris data set
clf = KNeighborsClassifier().fit(X, y)  # train the model
y_ = clf.predict(X)  # predict classes with the model
print(y_)  # predicted classes
print(clf.score(X, y))  # performance evaluation

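The "find the K nearest points, then let them vote" procedure can be sketched without Scikit-Learn. The function names `minkowski` and `knn_predict` and the toy data are hypothetical; `KNeighborsClassifier` is considerably more optimized than this.

```python
from collections import Counter

import numpy as np


def minkowski(a, b, p=2):
    """Minkowski distance; p=1 is Manhattan distance, p=2 is Euclidean distance."""
    return float(np.sum(np.abs(a - b) ** p) ** (1.0 / p))


def knn_predict(X_train, y_train, x_new, k=3, p=2):
    """Classify x_new by majority vote among its k nearest training points."""
    distances = [minkowski(x, x_new, p) for x in X_train]
    nearest = np.argsort(distances)[:k]  # indices of the "circle of friends"
    votes = Counter(int(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]


# Two small, well separated groups of points (made up for illustration)
X_train = np.array(
    [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [6.0, 6.0], [6.0, 7.0], [7.0, 6.0]]
)
y_train = np.array([0, 0, 0, 1, 1, 1])

pred_a = knn_predict(X_train, y_train, np.array([1.5, 1.5]))
pred_b = knn_predict(X_train, y_train, np.array([6.5, 6.5]))
print(pred_a, pred_b)
```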

6. Naive Bayes classification algorithm (Naive Bayes)

1. The core of the Naive Bayes classification algorithm is Bayes' formula, and the core of Bayes' formula is conditional probability
2. Naive Bayes: apply "Bayes' formula" under the "naive" assumption
3. Probability and conditional probability
4. The essence of conditional probability is to quantify the correlation between X and Y
5. The difference between logic and correlation: logic is causality, while correlation is based on statistical data
6. The core idea of prediction with Bayes' formula fits in one short phrase: "it looks more like"
7. Bayes' formula aims to use known experience to make judgments. Using "experience" to make a "judgment": where does the experience come from, and how is it used to judge? That one sentence actually contains two rounds of process.
8. Prior probability, posterior probability and likelihood function
That is, the posterior probability is obtained by correcting the prior probability with the likelihood function.
If the probability that A occurs is the prior probability, then after some event B that affects the probability of A has occurred, the probability that A occurs is called the posterior probability
9. The posterior probability of the category and the likelihood of the feature
Representation of the posterior probability of the category
Representation of the likelihood of a certain feature
10. Mathematical analysis of the Naive Bayes classification algorithm
The "naive" assumption: features are independent of one another and do not affect each other. This assumption addresses the sparsity and incompleteness of collected data: the more features x there are, the more pronounced these two problems become, and the harder it is to count the probability of those features appearing together.
Under this assumption, the likelihood of the features can be simplified to the product of the likelihoods of the individual features.
The posterior probability is proportional to the likelihood.
The Naive Bayes algorithm predicts with the posterior probability. Its core method is to estimate the posterior probability from the likelihood, and the learning process is the process of continuously increasing the likelihood.
If the equality is used instead of the proportionality, the probability of the features occurring together still has to be counted.
The optimization method of Naive Bayes
11. Naive Bayes classification algorithm information table
12. Implementation steps of the Naive Bayes classification algorithm
13. Using the Naive Bayes classification algorithm in Python

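from sklearn.datasets import load_iris  # import the iris data set
from sklearn.naive_bayes import MultinomialNB  # multinomial Naive Bayes classifier from Scikit-Learn's Naive Bayes models

X, y = load_iris(return_X_y=True)  # load the iris data set
clf = MultinomialNB().fit(X, y)  # train the model
y_ = clf.predict(X)  # predict classes with the model

print(y_)  # predicted classes
print(clf.score(X, y))  # performance evaluation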

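The step from prior to posterior under the "naive" independence assumption can be worked through with a few numbers. Everything in this sketch is hypothetical: the class names, the probabilities and the helper `naive_bayes_posterior` are invented to illustrate the arithmetic, not taken from a real data set.

```python
import math


def naive_bayes_posterior(priors, likelihoods):
    """Return P(class | features) for each class.

    priors:      {class: P(class)}
    likelihoods: {class: [P(feature_1 | class), ..., P(feature_n | class)]}
    Features are assumed independent, so their likelihoods are multiplied.
    """
    unnormalized = {c: priors[c] * math.prod(likelihoods[c]) for c in priors}
    evidence = sum(unnormalized.values())  # probability of the observed features
    return {c: value / evidence for c, value in unnormalized.items()}


# Hypothetical experience gathered from past data
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": [0.5, 0.8],  # P(feature 1 | spam), P(feature 2 | spam)
    "ham": [0.1, 0.3],   # P(feature 1 | ham),  P(feature 2 | ham)
}

posterior = naive_bayes_posterior(priors, likelihoods)
prediction = max(posterior, key=posterior.get)  # the class it "looks more like"
print(posterior, prediction)
```

Because the evidence term is the same for every class, comparing the unnormalized products alone already gives the same prediction, which is why the posterior only needs to be known up to proportionality.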

7. Decision Tree classification algorithm (Decision Tree)

1. Programmer's point of view: if-else matching, layer by layer
2. How to choose the judgment conditions that generate the branches is the core of the decision tree algorithm
3. The judgment conditions of a decision tree are generated from the set of feature dimensions
4. What makes a good decision condition: ideally, once the condition is selected, a single if-else divides the data set exactly into the positive class and the negative class. Failing that, the fewer impurities in the classification results the better; that is, the purer the classification results, the better.
5. Measuring rules for node purity
6. The pruning problem of decision trees

In practice, for reasons such as one-sided data collection or random disturbances, the data may show spurious correlations, and the decision tree algorithm will treat these actually invalid attribute dimensions as effective branching conditions. A decision tree trained on such data over-learns: it learns classification conditions that have no general significance. This is overfitting, and it reduces the classification performance of the model.

According to when the pruning operation is triggered, pruning falls into two types: pre-pruning and post-pruning.

In both cases pruning has two steps: the pruning judgment and the pruning operation. The actual pruning is performed only when the judgment finds it necessary.

7. Basic idea of the decision tree classification algorithm

Where do the discriminant conditions come from?
This problem is solved in two steps. The first step is the source. The data in the data set is organized by feature dimensions. These feature dimensions can be treated as a set, called the feature dimension set or attribute set. We want to discover possible relationships between the feature dimensions and the categories, so the discriminant conditions come from this set.

Which feature dimension should be selected as the discriminant condition of the current if-else?
This requires comparison, and comparison requires a standard, so the concept of "purity" is introduced: whichever feature dimension "purifies" best is selected as the discriminant condition.

When should the decision tree stop splitting nodes?

A core task of the decision tree classification algorithm is to select decision conditions from the feature set of the data in sequence, that is, to divide the if-else branches.

How to measure the purity of the classification results under different feature conditions is the core issue of the decision tree classification algorithm.

8. Decision tree classification algorithm information table
9. Decision tree classification algorithm implementation steps
10. Using the decision tree classification algorithm in Python

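from sklearn.datasets import load_iris  # import the iris data set
from sklearn.tree import DecisionTreeClassifier  # decision tree classifier from Scikit-Learn's tree models

X, y = load_iris(return_X_y=True)  # load the iris data set
clf = DecisionTreeClassifier().fit(X, y)  # train the model
y_ = clf.predict(X)  # predict classes with the model

print(y_)  # predicted classes
print(clf.score(X, y))  # performance evaluation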

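The idea of comparing the "purification" effect of candidate conditions can be sketched with two widely used purity measures, Gini impurity and information entropy. Choosing these two measures is an assumption; the function names and the toy labels are invented for illustration.

```python
import numpy as np


def gini(labels):
    """Gini impurity: 0 for a pure node, larger for a more mixed node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))


def entropy(labels):
    """Information entropy in bits: 0 for a pure node, 1 for a 50/50 binary node."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))


def information_gain(parent, left, right):
    """How much an if-else split reduces entropy, weighted by branch size."""
    n = len(parent)
    weighted = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    return entropy(parent) - weighted


parent = [0, 0, 1, 1]

# A condition that separates the classes perfectly versus one that does not help
perfect_split = information_gain(parent, [0, 0], [1, 1])
useless_split = information_gain(parent, [0, 1], [0, 1])
print(gini(parent), entropy(parent), perfect_split, useless_split)
```

The condition with the larger gain purifies the node better and would be chosen as the current if-else.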

8. Support Vector Machine Classification Algorithm (Support Vector Machine)

1. Margin: the distance between different classes. A linearly separable problem can be classified with a straight line inside the margin. For linearly inseparable problems, a high-dimensional mapping is required first.
2. Support vectors: the data points on the edge of the margin are called support vectors; they are very important for correct classification
3. High-dimensional mapping: data that is linearly inseparable in low dimensions can become separable after being mapped to a higher dimension
4. Kernel function: the function that carries out the high-dimensional mapping in a support vector machine
5. Algorithm classification steps
6. Algorithm information table
7. Using the support vector machine classification algorithm in Python

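from sklearn.datasets import load_iris  # import the iris data set
from sklearn.svm import SVC  # support vector machine classifier from Scikit-Learn

X, y = load_iris(return_X_y=True)  # load the iris data set
clf = SVC().fit(X, y)  # train the model; the default kernel is the radial basis function (rbf)
print(clf.predict(X))  # predicted classes
print(clf.kernel)  # show which kernel is in use
print(clf.score(X, y))  # performance evaluation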

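The effect of a high-dimensional mapping can be seen on data that no straight line can separate. The sketch below uses concentric circles from `make_circles` and adds the squared radius as a hand-made third dimension; the data set parameters and this particular mapping are assumptions chosen for illustration. An rbf kernel reaches a similar result without building the extra feature explicitly.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings: linearly inseparable in two dimensions
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear SVM on the original 2-D data
linear_2d = SVC(kernel="linear").fit(X, y)
acc_linear_2d = linear_2d.score(X, y)

# Hand-made mapping to 3-D: append x1^2 + x2^2, which separates the inner and outer ring
radius_sq = (X ** 2).sum(axis=1, keepdims=True)
X_3d = np.hstack([X, radius_sq])
linear_3d = SVC(kernel="linear").fit(X_3d, y)
acc_linear_3d = linear_3d.score(X_3d, y)

# The rbf kernel handles the original 2-D data directly
rbf_2d = SVC(kernel="rbf").fit(X, y)
acc_rbf = rbf_2d.score(X, y)

print(acc_linear_2d, acc_linear_3d, acc_rbf)
```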

9. K-means clustering algorithm

1. The most basic principle of clustering problems: find similarities
2. If two samples have many similarities they belong to the same class; if they have many differences they do not.
3. Clusters: the clustering algorithm finally gathers the sample data into individual "classes". In machine learning terminology these classes are called "clusters".
4. The clustering process can be regarded as the process of continuously finding the centroids of the clusters.
5. The number of clusters that clustering will eventually produce can be preset as K, that is, the data is divided into K categories
6. Centroid: randomly select K points in the data set as centroids and cluster the data around them. The mean can then be used to adjust the centroids, so that the K randomly selected centroids eventually reach the desired result.
7. Majority voting: in the K-means algorithm the question being voted on is "are we in the same cluster?". Every point is waiting to be identified and no sample point can serve as the center by default, so some points must be selected as centroids
8. The K centroids that make the total distance from the samples to their centroids a "minimum" are the centroids we are looking for.
9. Algorithm information table and implementation steps
10. Using the K-means clustering algorithm in Python

# import the plotting library
import matplotlib.pyplot as plt
# import the K-means clustering algorithm from Scikit-Learn's clustering models
from sklearn.cluster import KMeans
# import the clustering data generator
from sklearn.datasets import make_blobs

# generate clustering test data with Scikit-Learn's make_blobs
# the data set has 1500 samples in total
n_samples = 1500
X, y = make_blobs(n_samples=n_samples)
# cluster the data; n_clusters is set to 3, that is, 3 clusters
y_pred = KMeans(n_clusters=3).fit_predict(X)
# show the clustering result as a scatter plot
plt.scatter(X[:, 0], X[:, 1], c=y_pred)
plt.show()

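The process of "continuously finding the centroids" can be written out step by step. The following is a bare-bones sketch of the usual assign-then-average iteration; the function name `kmeans`, the stopping rule and the toy data are assumptions, and Scikit-Learn's `KMeans` adds better initialization and repeated restarts on top of this.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Minimal K-means: returns centroids, labels and the objective per iteration."""
    rng = np.random.default_rng(seed)
    # Randomly select K points in the data set as the initial centroids
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    history = []
    for _ in range(n_iter):
        # Distance from every sample to every centroid, shape (n_samples, k)
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)  # each sample joins its nearest centroid
        history.append(float((dists.min(axis=1) ** 2).sum()))
        # Adjust each centroid to the mean of its cluster (keep it if the cluster is empty)
        new_centroids = np.array(
            [
                X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
                for j in range(k)
            ]
        )
        if np.allclose(new_centroids, centroids):
            break  # the centroids no longer move
        centroids = new_centroids
    return centroids, labels, history


# Three hypothetical blobs of 50 points each
rng = np.random.default_rng(42)
X = np.vstack(
    [rng.normal(loc=c, scale=1.0, size=(50, 2)) for c in ([0, 0], [10, 10], [-10, 10])]
)

centroids, labels, history = kmeans(X, k=3)
print(centroids)
print(history[0], history[-1])  # the total squared distance never increases
```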

10. Artificial Neural Network (ANN)

1. The neural network algorithm has "three treasures": the neuron, the activation function and the backpropagation mechanism.
2. Neurons
3. Excitation transmission
4. Activation function
5. Backpropagation mechanism
6. The core working mechanism of a neuron is to decide whether to activate according to the stimulus it receives. If it activates, it passes the stimulus forward; otherwise the stimulus stops there and does not affect the final output
7. Neural network structure
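The "three treasures" can be seen together in the smallest possible network: a single neuron. The sketch below assumes a sigmoid activation, a mean squared error loss and the logical AND function as training data; all of these, along with the learning rate and iteration count, are illustrative choices rather than anything prescribed above.

```python
import numpy as np


def sigmoid(z):
    """Activation function: squashes the weighted stimulus into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


# Training data: the logical AND of two inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1], dtype=float)

w = np.zeros(2)  # one weight per input
b = 0.0          # bias
lr = 1.0         # assumed learning rate
losses = []

for _ in range(5000):
    # Forward pass: the neuron sums its weighted inputs and applies the activation
    z = X @ w + b
    a = sigmoid(z)
    losses.append(float(np.mean((a - y) ** 2)))

    # Backpropagation: apply the chain rule from the loss back to the parameters
    dL_da = 2 * (a - y) / len(y)  # derivative of the loss with respect to the output
    da_dz = a * (1 - a)           # derivative of the sigmoid
    dL_dz = dL_da * da_dz
    w -= lr * (X.T @ dL_dz)
    b -= lr * dL_dz.sum()

pred = (sigmoid(X @ w + b) >= 0.5).astype(int)
print(pred, losses[0], losses[-1])
```

A full network stacks many such neurons in layers and propagates the same chain rule backwards through every layer.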

Origin blog.csdn.net/m0_46692607/article/details/126666740