[Machine Learning-0] Introduction, PLA

Preface

Machine learning is a natural product of artificial intelligence research reaching a certain stage of development. It has a deep background and many branches, stretching from the 1950s to the present. In the past two decades, humanity's ability to collect and process data has improved greatly, creating an urgent need for algorithms that can analyze data efficiently; machine learning emerged in response to the needs of this era. Some machine learning books and important conferences in related fields: Machine Learning (Zhou Zhihua, Tsinghua University Press), Learning from Data, Optimization Methods for Large-Scale Machine Learning; ICML, ECML, ACML.

Components of learning

For machine learning, formally speaking, a learning problem can be divided into the input, the output, the target function, the data set, and the prediction function.
[Figure 1]
Viewing machine learning as a process, we can regard the target function we want to learn as a black box: feeding in a group of input data and observing the corresponding outputs gives us the training data. A space of candidate prediction functions is then chosen, and, according to the learning algorithm, the final prediction function is selected from that function space.
Note that in machine learning we may know nothing at all about the target function's specific form or internal mechanism; our goal is only to approximate it, so the final trained prediction function may still be quite different from the target function.
[Figure 2]
The two parts shown above, the hypothesis set H of candidate prediction functions and the learning algorithm A, together constitute the learning model.

Perceptron

As the name implies, the perceptron is the part that perceives the input and makes a judgment about it.
For example, for a vector x = (x1, ..., xn), the perceptron computes a weighted sum of its components, compares it with a threshold, and finally outputs a judgment on this data point.
[Figure 3]
where sign denotes the sign function, which outputs +1 for positive arguments and −1 for negative ones.
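As a concrete sketch (an assumption on my part: the usual formulation h(x) = sign(w·x), with the threshold folded into the weight vector as a bias weight against a constant first input of 1; the function name is illustrative), the perceptron's judgment can be written as:

```python
import numpy as np

def perceptron_predict(w, x):
    """Return sign(w . x) as +1 or -1, treating 0 as +1 by convention."""
    return 1 if np.dot(w, x) >= 0 else -1

# x is augmented with a constant 1 so the threshold becomes the bias w[0]
x = np.array([1.0, 2.0, -1.0])   # [constant 1, x1, x2]
w = np.array([0.5, 1.0, 1.0])
print(perceptron_predict(w, x))  # w . x = 1.5 > 0, so the output is +1
```

Treating sign(0) as +1 here is just a tie-breaking convention; the choice does not affect the theory.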

PLA

PLA (the Perceptron Learning Algorithm) is a very classic algorithm in machine learning. Its premise is that the data are linearly separable, that is, they can be divided by a hyperplane.
First, we have a weight vector w, and the perceptron computes
[Figure 4]
for the training set (x1, y1), ..., (xn, yn), where each y belongs to {+1, −1}. We then find that the values computed by the perceptron differ from the corresponding y for some points: for example, if points above the x-axis of the plane are labeled +1 and points below it −1, a point above the axis may be judged as −1. The perceptron then needs to be adjusted, that is, the weight vector w must be corrected.
[Figure 5]
For example, suppose points below the line should be −1, but some point below the line is judged as +1, i.e., the current line places it on the wrong side, above the line. After updating w, the line is in effect rotated counterclockwise so that this point falls below the line.
[Figure 6]
In this way, we only need to pick out misclassified points and iterate PLA on them one by one to obtain the separating line.
Note: linear separability guarantees that PLA eventually halts, rather than every change of the weight vector w producing new misclassified points forever.
For more detailed proofs and derivations, see https://www.cnblogs.com/HappyAngel/p/3456762.html
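The pick-a-mistake-and-correct loop above can be sketched as follows (a minimal illustration, assuming labels in {+1, −1} and the threshold folded into w as a bias term; `pla` is an illustrative name, not from the original post):

```python
import numpy as np

def pla(X, y, max_iter=1000):
    """Perceptron Learning Algorithm on linearly separable data.

    X: (n, d) array of inputs; y: (n,) array of labels in {+1, -1}.
    Returns a weight vector w (bias stored in w[0]) such that
    sign(w . [1, x]) matches y for every training point.
    """
    Xa = np.hstack([np.ones((len(X), 1)), X])  # prepend constant 1 for the bias
    w = np.zeros(Xa.shape[1])
    for _ in range(max_iter):
        mistakes = [(xi, yi) for xi, yi in zip(Xa, y)
                    if np.sign(xi @ w) != yi]  # sign(0)=0 also counts as a mistake
        if not mistakes:
            break                              # no misclassified points: done
        xi, yi = mistakes[0]
        w = w + yi * xi                        # rotate w toward the mistaken point
    return w

# Four linearly separable points in the plane
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -1.0], [-2.0, 1.0]])
y = np.array([1, 1, -1, -1])
w = pla(X, y)
```

The update w ← w + y·x is the standard PLA correction rule: it turns the separating line toward the misclassified point, and linear separability guarantees the loop stops.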

Types of machine learning

At present, the scope of machine learning is very wide, full of all kinds of new terms, but it is generally agreed that machine learning can be divided into two categories: supervised learning and unsupervised learning.
Supervised learning is given sample inputs and their expected outputs, and then learns a rule connecting input to output. Unsupervised learning, on the other hand, is given no labels and lets the machine find the internal structure of the input on its own. Classification and regression are representatives of the former, while clustering is a representative of the latter.

The final tasks of learning (learning tasks)

In the end, the tasks that machine learning aims to achieve can be divided into classification, regression, clustering, density estimation, and dimensionality reduction.
Classification: the input is divided into two or more classes, and the learner must produce a model that assigns unseen inputs to one or more of these classes. This is usually handled with supervision. Spam filtering is an example of classification, where the input is an email (or other) message and the classes are "spam" and "not spam".
Regression: also a supervised problem; the output is continuous rather than discrete.
Clustering: a set of inputs is to be grouped. Unlike classification, the groups are not known in advance.
Density estimation: find the distribution of the input across the space.
Dimensionality reduction: simplify the input by mapping it to a lower-dimensional space.
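As a toy illustration of the last task (my own example: a hand-picked linear projection, rather than a method such as PCA that learns the projection from the data), mapping 3-D inputs to a 2-D space looks like:

```python
import numpy as np

# Hand-picked projection from R^3 to R^2: keep the first two coordinates.
P = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # two 3-D inputs

X_low = X @ P.T                   # shape (2, 2): each input is now 2-D
print(X_low)
```

Real dimensionality-reduction methods choose the mapping so that as much structure of the original data as possible survives in the lower-dimensional space.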

Learning methods

• Regression
• Decision trees
• k-means
• Support vector machine
• Apriori algorithm
• EM algorithm
• PageRank
• kNN
• Naive Bayes
• (Deep) Neural networks
• Gradient Descent Methods
• Online Gradient Methods
• Stochastic Gradient Methods
• Newton's method
• Quasi-Newton methods (BFGS)
• Limited memory BFGS
• Coordinate Descent
• Alternating Direction Method of Multipliers (ADMM)
• Penalty method, Augmented Lagrangian
• Gradient Projection method
• Iterative-thresholding method (IST)
• Conditional Gradient method

Summary

Machine learning has deep applications in many areas, and it draws on a wide range of mathematics, including linear algebra and probability theory. It is recommended that you first learn the corresponding mathematical background before diving in. Later posts in this series will add more introductions and explanations of the theoretical parts.

Origin blog.csdn.net/Cplus_ruler/article/details/114258531