Machine learning terminology 1.0

Disclaimer: This is an original blogger article, released under the CC 4.0 BY-SA license. Please attach the original source link and this statement when reposting.
This link: https://blog.csdn.net/qq_33189961/article/details/97642546

Table of Contents

Definition of machine learning

Model and Pattern

Data set and example (sample, feature vector)

Dimensionality

Learning (training)

Training samples and the training set

Learned model (learner)

Testing and test samples

Hypothesis and ground truth

Label and example

Label space (output space)

Classification and Regression

Positive class and negative class

Clustering

The difference between classification and clustering

Supervised learning and unsupervised learning

Generalization

Independent and identically distributed

Rote learning

Version space

Inductive bias

Occam's Razor

No Free Lunch Theorem (NFL)


Definition of machine learning

Machine learning is the discipline that studies how to use computation and experience to improve the performance of a system. "Experience" usually exists in the form of "data", so the main content of machine learning research is algorithms that produce a "model" from data on a computer, i.e., "learning algorithms" (learning algorithm). Given a learning algorithm, we can feed it empirical data and it will produce a model based on those data; when facing a new situation (for example, an uncut watermelon), the model provides an appropriate judgment (for example, whether it is a good melon). If computer science is the study of "algorithms", then machine learning can similarly be said to be the study of "learning algorithms".
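As a toy illustration of "producing a model from data", the sketch below learns a single decision threshold from labeled melons. The feature name, the data values, and the midpoint-threshold rule are all illustrative assumptions, not from the text.

```python
# Minimal sketch of a learning algorithm; all names, values, and the
# midpoint-threshold rule are illustrative assumptions.

# "Experience" in data form: (sugar_content, is_good_melon) pairs.
data = [(0.46, True), (0.37, True), (0.26, False), (0.10, False)]

def learn(samples):
    """Learning algorithm: place a decision threshold midway between
    the lowest 'good' and the highest 'bad' sugar content."""
    goods = [x for x, y in samples if y]
    bads = [x for x, y in samples if not y]
    threshold = (min(goods) + max(bads)) / 2
    # The produced model can judge new, unseen situations.
    return lambda x: x >= threshold

model = learn(data)
print(model(0.40))  # an uncut melon the learner has not seen -> True
```

Feeding the algorithm different empirical data would, of course, produce a different model.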

Model and Pattern

A "model" refers to a global result learned from data (e.g., a decision tree), while a "pattern" refers to a local result (e.g., a single rule).

Data set and example (sample, feature vector)

A "data set" is a collection of data records.

An "example" (also called a sample or feature vector) is a single record in the data set, describing an event or object.

// Related terms — attribute, attribute value, attribute space, and sample space (input space) — are only mentioned briefly here; see the watermelon book, p. 2.

Dimensionality

I.e., the number of attributes in a feature vector.
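A concrete toy data set may make these terms clearer; the attribute names and values below are assumed for illustration (in the spirit of the watermelon examples), not taken from the book.

```python
# Toy data set: each record is one example, described by a feature
# vector of attributes (names and values are illustrative assumptions).
dataset = [
    {"color": "green", "root": "curled", "knock": "dull"},
    {"color": "dark",  "root": "curled", "knock": "muffled"},
    {"color": "light", "root": "stiff",  "knock": "crisp"},
]

# The dimensionality is the number of attributes per feature vector.
dim = len(dataset[0])
print(len(dataset), dim)  # 3 examples, dimensionality 3
```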

Learning (training)

I.e., the process of learning a model from data.

Training samples and the training set

A "training sample" is an individual sample in the training data; the "training set" is the set formed by the training samples.

Learned model (learner)

A learner can be viewed as an instantiation of a learning algorithm on given data and a given parameter space. Learning algorithms usually have parameters that must be set; using different parameter values and/or different training data produces different results.

Testing and test samples

After the model has been learned, the process of using it to make predictions is called "testing" (testing), and a sample to be predicted is called a "test sample" (testing sample). For example, after learning f, for a test example x we obtain the predicted label ŷ = f(x).
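In code form, with an assumed stand-in model f (not a function actually learned from data):

```python
# Testing sketch: apply a learned model f to a test sample x to get
# the predicted label y_hat = f(x). f is an assumed stand-in rule.
f = lambda x: 2 * x + 1   # pretend this was learned from data
x = 3                     # a test sample
y_hat = f(x)              # the prediction step ("testing")
print(y_hat)              # prints 7
```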

Hypothesis and ground truth

A "hypothesis" is the potential law about the data that the learned model corresponds to; the "ground truth" is that underlying law itself. The learning process is to find the truth or approximate it.

Label and example

A "label" is information about an example's outcome; for instance, when judging whether a watermelon is good or bad, "good melon" is a label. An "example" (in the labeled sense) is an instance that carries label information.

Label space (output space)

The set of all possible labels, also called the "output space".

Classification and Regression

"Classification" is the type of learning task in which the value to be predicted is discrete, such as "good melon" vs. "bad melon".

"Regression" is the type of learning task in which the value to be predicted is continuous, such as a watermelon ripeness of 0.95 or 0.37.
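The contrast can be shown with two stand-in predictors; both decision rules and all numbers are assumptions made for illustration.

```python
# Classification: discrete output. Regression: continuous output.
# Both rules below are illustrative assumptions.

def classify(sugar_content):
    # Discrete label: "good melon" or "bad melon".
    return "good melon" if sugar_content >= 0.3 else "bad melon"

def regress(density):
    # Continuous value: a ripeness score from an assumed linear rule.
    return round(0.5 + 0.6 * density, 2)

print(classify(0.46))  # good melon
print(regress(0.75))   # 0.95
```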

Positive class and negative class

A task involving only two classes is called "binary classification" (binary classification); one of the classes is usually called the "positive class" (positive class) and the other the "negative class" (negative class).

Clustering

Clustering divides the samples in a training set into several groups, each called a "cluster" (cluster). For example, in a watermelon problem the melons might be grouped into "light-colored melons" and "dark-colored melons", or even "local melons" and "non-local melons". In clustering, concepts such as "light-colored melon" or "local melon" are not known in advance, and the training samples used in the learning process usually carry no label information.
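A minimal one-dimensional k-means sketch can illustrate this: unlabeled melons, described here by a single assumed color-intensity feature, fall into two clusters without any labels being given in advance.

```python
# Minimal 1-D k-means sketch; the feature values and the meaning of
# the two clusters ("light" vs "dark") are illustrative assumptions.
values = [0.10, 0.12, 0.15, 0.80, 0.85, 0.90]

def kmeans_1d(xs, c1, c2, iters=10):
    for _ in range(iters):
        g1 = [x for x in xs if abs(x - c1) <= abs(x - c2)]
        g2 = [x for x in xs if abs(x - c1) > abs(x - c2)]
        c1 = sum(g1) / len(g1)   # move each center to its cluster mean
        c2 = sum(g2) / len(g2)
    return g1, g2

light, dark = kmeans_1d(values, c1=0.0, c2=1.0)
print(light)  # [0.1, 0.12, 0.15]
print(dark)   # [0.8, 0.85, 0.9]
```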

The difference between classification and clustering

Classification: given samples whose class labels are known, train a learner (i.e., some objective function) so that it can classify samples of unknown class. It belongs to supervised learning.

Clustering: the class label of every sample is unknown in advance; the aim is to divide a group of samples of unknown class into several classes by some algorithm. In clustering we do not care what each class is; the goal is simply to group similar things together. It belongs to unsupervised learning.

Supervised learning and unsupervised learning

Supervised learning: learn a function (model parameters) from a given training data set, so that when new data arrive, results can be predicted with this function. In supervised learning the training set must include both inputs and outputs, that is, features and targets; the targets in the training set are labeled by humans. The most common supervised tasks are classification (note the distinction from clustering) and regression; common techniques include neural networks and decision trees.

Unsupervised learning: the input data carry no labels, and there is no known outcome. Since the classes of the samples are unknown, the sample set must be clustered (clustering) according to the similarity between samples, trying to minimize the within-class gap and maximize the between-class gap. In practical applications, in many cases the labels of the samples cannot be known in advance, meaning there are no training samples with known classes, so the classifier can only be designed by learning from sample sets without labels. For a detailed comparison of the two, see https://blog.csdn.net/zb1165048017/article/details/48579677
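The difference in data shape can be sketched directly; all feature values and labels below are illustrative assumptions.

```python
# Data-shape contrast: supervised examples pair features with a
# human-provided target; unsupervised examples are features only.
# All numbers and labels are illustrative assumptions.
supervised_set = [
    ([0.46, 0.77], "good"),  # (feature vector, target label)
    ([0.26, 0.10], "bad"),
]
unsupervised_set = [
    [0.46, 0.77],            # feature vector only, no label
    [0.26, 0.10],
    [0.31, 0.42],
]

features, targets = zip(*supervised_set)
print(list(targets))  # ['good', 'bad'] -- only the supervised set has targets
```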

Generalization

The ability of the learned model to perform well on new samples.

Independent and identically distributed

It is generally assumed that all samples in the sample space obey an unknown "distribution" (distribution) D, and that each sample we obtain is sampled independently from this distribution, i.e., the samples are independent and identically distributed. In general, the more training samples there are, the more information about D we obtain, and the more likely learning is to produce a model with strong generalization ability.
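A small sketch of this idea, assuming D is the uniform distribution on [0, 1) (true mean 0.5): the empirical mean of more i.i.d. samples tends to sit closer to the true mean.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# i.i.d. sketch: draw samples independently from one distribution D
# (assumed uniform here); with more samples, the empirical mean tends
# to approach D's true mean of 0.5.
def empirical_mean(n):
    return sum(random.random() for _ in range(n)) / n

small = empirical_mean(10)
large = empirical_mean(10000)
print(round(small, 3), round(large, 3))
```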

Rote learning

Simply "memorizing" the training samples is so-called "mechanical learning" or "rote learning".

Version space

In real problems we often face a very large hypothesis space, while the learning process is based on a finite training set; therefore, there may be multiple hypotheses consistent with the training set. The set of hypotheses consistent with the training set is what we call the "version space".
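A toy enumeration can make this concrete. Here a hypothesis fixes each attribute either to a value or to the wildcard "*" (meaning "matches anything"); the attribute values and the training set are assumptions for illustration.

```python
from itertools import product

# Version-space sketch: keep the hypotheses consistent with a toy
# training set. Attribute values and labels are assumed.
training_set = [
    (("green", "dull"), True),    # a good melon
    (("green", "crisp"), False),  # a bad melon
]

def predicts_positive(h, x):
    # h matches x if every condition is the wildcard or equals x's value
    return all(hv in ("*", xv) for hv, xv in zip(h, x))

candidates = product(["green", "light", "*"], ["dull", "crisp", "*"])
version_space = [
    h for h in candidates
    if all(predicts_positive(h, x) == y for x, y in training_set)
]
print(version_space)  # [('green', 'dull'), ('*', 'dull')]
```

Both surviving hypotheses explain the finite training set equally well, which is exactly why a version space can contain more than one hypothesis.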

Inductive bias

A machine learning algorithm's preference for certain types of hypotheses during the learning process, i.e., which kinds of hypotheses it pays more attention to.

Occam's Razor

I.e., "if multiple hypotheses are consistent with the observations, choose the simplest one." For example, when two curves are both consistent with a finite training set, the smoother curve A is generally selected because it is simpler.
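A minimal sketch of the razor, under the assumption that a hypothesis is a tuple of attribute conditions where "*" means "don't care", and that fewer concrete conditions counts as simpler; the candidate list is an assumed toy input.

```python
# Occam's-razor sketch: among hypotheses consistent with the data,
# prefer the simplest. The candidate list is an assumed toy input;
# "*" is a wildcard condition, and fewer concrete (non-wildcard)
# conditions is taken to mean "simpler".
consistent_hypotheses = [("green", "dull"), ("*", "dull")]

def complexity(h):
    return sum(1 for condition in h if condition != "*")

simplest = min(consistent_hypotheses, key=complexity)
print(simplest)  # ('*', 'dull')
```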

No Free Lunch Theorem (NFL)

For a learning algorithm a, if it is better than learning algorithm b on some problems, then there must be another set of problems on which b is better than a. Interestingly, this conclusion holds for any pair of algorithms: no matter how clever learning algorithm a is and how clumsy learning algorithm b is, their expected performance is exactly the same! However, please note: the NFL theorem has an important premise — that all "problems" occur with equal probability, or that all problems are equally important — and in reality this is not the case.

The most important implication of the NFL theorem is that it makes clear that talking abstractly about "which learning algorithm is better" without reference to a concrete problem is meaningless, because if all potential problems are considered, all learning algorithms are equally good.
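The "averaged over all problems" premise can be demonstrated on a tiny domain; the 4-point domain and the two deliberately crude learners below are illustrative assumptions.

```python
from itertools import product

# NFL sketch: averaged over EVERY possible 0/1 target function on a
# 4-point domain, two very different fixed predictors have identical
# off-training-set error. Domain and predictors are assumptions.
domain = [0, 1, 2, 3]
test_points = domain[2:]          # points outside the training set

def avg_test_error(predict):
    errors = []
    for target in product([0, 1], repeat=len(domain)):  # all 16 truths
        wrong = sum(predict(x) != target[x] for x in test_points)
        errors.append(wrong / len(test_points))
    return sum(errors) / len(errors)

always_zero = lambda x: 0
always_one = lambda x: 1
print(avg_test_error(always_zero), avg_test_error(always_one))  # 0.5 0.5
```

Once some targets are more likely than others (as in real problems), this symmetry breaks, which is exactly the theorem's premise.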

 

 
