Machine Learning Notes 1: Overview

The accepted definition of machine learning [Mitchell, 1997]:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.
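Mitchell's definition can be made concrete with a toy example. The sketch below (all data and helper names are made up for illustration) maps the three quantities: T is classifying short texts as spam, E is a small labeled corpus, and P is accuracy.

```python
# T = classifying emails as spam or not, E = a labeled corpus, P = accuracy.
def accuracy(model, samples):
    """P: fraction of samples the model labels correctly."""
    return sum(model(x) == y for x, y in samples) / len(samples)

# E: experience, as (text, label) pairs; 1 = spam.
experience = [("free money now", 1), ("meeting at noon", 0),
              ("win a free prize", 1), ("lunch tomorrow?", 0)]

# A trivial "learner": remember which words co-occur with spam.
def learn(samples):
    spam_words = {w for text, y in samples if y == 1 for w in text.split()}
    return lambda text: int(any(w in spam_words for w in text.split()))

model = learn(experience)
print(accuracy(model, experience))  # P measured after learning from E
```

The point is only the mapping of roles: the program's performance at T, as measured by P, improves once it has processed E.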

Basic Concepts

Some commonly used terms in machine learning: the data set can be subdivided into a training set, a validation set, and a test set. A single record is called an instance or a sample, and consists of features and a label. The space spanned by the features is called the feature space or sample space, and a sample is therefore often represented as a feature vector. The learned model corresponds to some underlying regularity in the data and is called a hypothesis; the underlying regularity itself is called the ground truth. The ability of a learned model to perform well on new samples is called its generalization ability. All machine learning models rest on a basic assumption: every sample in the sample space obeys an unknown distribution D, and each observed sample is drawn independently from this distribution, i.e. the samples are independent and identically distributed (i.i.d.).
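The vocabulary above can be sketched in plain Python. The split ratios and the simulated distribution D below are illustrative choices, not prescribed values.

```python
import random

random.seed(0)

# Data set: 10 instances, each a (feature_vector, label) pair,
# drawn i.i.d. from some unknown distribution D (simulated here).
dataset = [([random.random(), random.random()], random.choice([0, 1]))
           for _ in range(10)]

random.shuffle(dataset)
train_set = dataset[:6]   # used to fit the model
val_set = dataset[6:8]    # used for model selection
test_set = dataset[8:]    # used to estimate generalization ability

print(len(train_set), len(val_set), len(test_set))  # 6 2 2
```

The test set stands in for "new samples": performance on it estimates how well the learned hypothesis generalizes beyond the training data.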

The learning process can be regarded as a search through the hypothesis space, the space formed by all candidate hypotheses; the goal of the search is a hypothesis that matches the training set. Once the representation of hypotheses is fixed, the hypothesis space and its size are determined. In real problems the hypothesis space can be very large, while learning is based on a finite training set, so multiple hypotheses may all be consistent with the training data. This set of consistent hypotheses is called the version space. Deciding which of them is better requires an inductive bias: an assumption built into the learning algorithm itself about which kind of model is preferable. Whether this assumption holds for a given real-world problem, that is, whether the algorithm's inductive bias matches the problem, largely determines whether the algorithm achieves good performance. The most commonly used principle is Occam's razor: if multiple hypotheses are consistent with the observations, choose the simplest one.
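A tiny enumeration makes the version space concrete. The hypothesis family below (one-dimensional threshold classifiers on a coarse grid) is an illustrative choice; any representation would do.

```python
# Training set: (x, label) pairs with a single real-valued feature.
train = [(0.2, 0), (0.4, 0), (0.7, 1)]

# Hypothesis space: "predict 1 iff x >= t" for thresholds t on a grid.
thresholds = [i / 10 for i in range(11)]

def consistent(t):
    """A hypothesis is in the version space iff it fits every training sample."""
    return all(int(x >= t) == y for x, y in train)

version_space = [t for t in thresholds if consistent(t)]
print(version_space)  # every t with 0.4 < t <= 0.7 survives
```

All surviving thresholds agree on the training data but disagree on unseen points such as x = 0.55; picking one over another is exactly where inductive bias enters.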

However, the No Free Lunch Theorem (NFL) tells us that "better" is never absolute: if learning algorithm L_a outperforms learning algorithm L_b on some problems, there must exist other problems on which L_b outperforms L_a, and this conclusion holds for any pair of algorithms. Briefly, averaged over all possible problems, the expected performance of any two algorithms is equal:
$$\sum_f E_{ote}(\mathfrak{L}_a \mid X, f) = \sum_f E_{ote}(\mathfrak{L}_b \mid X, f)$$

where f ranges over all possible target functions, X is the training data, and E_ote denotes the expected off-training-set error.
The most important implication of the NFL theorem is that discussing "which learning algorithm is better" in the abstract, divorced from any specific problem, is meaningless: when all potential problems are considered, all learning algorithms are equally good. To talk about the relative merits of algorithms we must focus on specific learning problems; an algorithm that performs well on some problems may be unsatisfactory on others, and whether the algorithm's inductive bias matches the problem at hand often plays the decisive role.
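The averaging argument behind the NFL theorem can be checked by brute force on a tiny discrete domain. The sketch below (domain, training split, and the two deliberately opposite learners are all illustrative assumptions) averages off-training-set accuracy over every possible target function.

```python
from itertools import product

# Domain X = {0,1,2,3}; training inputs cover {0,1}; off-training-set = {2,3}.
X = [0, 1, 2, 3]
ote_x = [2, 3]

def ote_accuracy(predict, f):
    """Accuracy on the off-training-set points for target function f."""
    return sum(predict(x) == f[x] for x in ote_x) / len(ote_x)

# Two learners with opposite inductive biases: L_a always predicts 0,
# L_b always predicts 1 (training data is ignored, for simplicity).
L_a = lambda x: 0
L_b = lambda x: 1

# Enumerate ALL 2^4 possible binary target functions on X.
targets = [dict(zip(X, bits)) for bits in product([0, 1], repeat=len(X))]
avg_a = sum(ote_accuracy(L_a, f) for f in targets) / len(targets)
avg_b = sum(ote_accuracy(L_b, f) for f in targets) / len(targets)
print(avg_a, avg_b)  # 0.5 0.5 -- equal expected performance
```

Each learner is right on exactly half of the target functions at each off-training-set point, so the averages coincide; the same cancellation drives the general theorem.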

Statistical learning is the discipline in which computers build probabilistic and statistical models from data and use those models to predict and analyze data. Its object of study is data, and its basic assumption is that data of the same kind exhibit statistical regularity; prediction and analysis are carried out by constructing probabilistic and statistical models.

Three Elements of Statistical Learning

A statistical learning method consists of the model's hypothesis space, a criterion for model selection, and an algorithm for learning the model. These are known as the three elements of statistical learning, referred to for short as model, strategy, and algorithm:

Method = Model + Strategy + Algorithm
