SIGAI: Basic Concepts of Machine Learning

Outline:

Algorithm classification
supervised learning and unsupervised learning
classification and regression
generative models and discriminative models
reinforcement learning
evaluation
accuracy and regression error
ROC curve
cross-validation
model selection
overfitting and underfitting
bias and variance
regularization

Semi-supervised learning is usually grouped under supervised learning.

Most problems are supervised classification problems, and supervised classification algorithms are divided into generative models and discriminative models.

The commonly used evaluation metric for classification is accuracy; for regression it is the regression error, typically the mean squared error.

For binary classification, the ROC curve is often used for evaluation.

The general-purpose remedy for overfitting is regularization.

Algorithm classification:

The supervision signal is the label value of a sample. Depending on whether label values are available, machine learning is divided into supervised learning, unsupervised learning, and semi-supervised learning.

Supervised learning and unsupervised learning:

Supervised learning has two phases: training and prediction.

Training takes input samples (x, y) and fits a model y = f(x); prediction uses the trained model to predict the label value of a new sample.
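Not part of the original notes: a minimal sketch of the two phases, assuming scikit-learn is available. The k-nearest-neighbour classifier and the toy numbers are arbitrary stand-ins for any model with a training step and a prediction step.

# Minimal sketch of the two phases of supervised learning (assumes scikit-learn is installed).
from sklearn.neighbors import KNeighborsClassifier

# Training phase: samples (x, y) with known labels.
X_train = [[150, 1], [170, 0], [130, 1], [180, 0]]  # toy feature vectors
y_train = [0, 1, 0, 1]                              # toy label values

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)        # learn y = f(x) from the (x, y) pairs

# Prediction phase: apply the trained model to a new sample.
print(model.predict([[160, 1]]))   # predicted label for an unseen sample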

Unsupervised learning: clustering and dimensionality reduction.

Dimensionality reduction is used to avoid the curse of dimensionality: high-dimensional data is relatively difficult for algorithms to handle, and the features are often correlated with each other.
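As an aside not taken from the original text, a minimal dimensionality-reduction sketch, assuming NumPy and scikit-learn; PCA and the synthetic correlated data are just illustrative choices.

# Minimal PCA sketch: project correlated high-dimensional data onto fewer dimensions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
base = rng.normal(size=(100, 2))
# Build 5 correlated features from 2 underlying factors plus small noise.
X = base @ rng.normal(size=(2, 5)) + 0.01 * rng.normal(size=(100, 5))

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)            # 100 x 2 representation
print(X_reduced.shape, pca.explained_variance_ratio_)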

Reinforcement Learning:

Reinforcement learning grew out of the field of policy control: the algorithm decides its action from environmental data, and the goal is to maximize the reward value.
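Purely as an illustrative sketch, not from the text: a tiny epsilon-greedy bandit that chooses actions so as to maximize the total reward it receives. The bandit setting has no state, so it is only a simplified stand-in for full reinforcement learning; the reward probabilities and epsilon below are made up.

# Toy epsilon-greedy bandit: choose actions to maximize cumulative reward.
import random

true_means = [0.2, 0.5, 0.8]      # hidden expected reward of each action (made up)
counts = [0, 0, 0]
estimates = [0.0, 0.0, 0.0]
total_reward = 0.0
epsilon = 0.1

random.seed(0)
for step in range(1000):
    if random.random() < epsilon:
        action = random.randrange(3)                        # explore
    else:
        action = max(range(3), key=lambda a: estimates[a])  # exploit the best estimate
    reward = 1.0 if random.random() < true_means[action] else 0.0
    counts[action] += 1
    # Incremental update of the running mean reward for this action.
    estimates[action] += (reward - estimates[action]) / counts[action]
    total_reward += reward

print(estimates, total_reward)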

Classification and Regression:

Supervised learning is divided into classification and regression problems. Determining the category of a fruit is a classification problem; predicting someone's income from their personal information is a regression problem.

Classification:

Classification is a mapping R^n -> Z: an n-dimensional feature vector is mapped to an integer, and that integer corresponds to a class.

Face detection is a classification problem: deciding whether the region at a given location in an image is a face or not.

The simplest binary classifier is a linear classifier, which separates the two classes with a linear equation: sgn(w^T x + b), whose output is +1 or -1.
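A minimal NumPy sketch of such a linear classifier (the weights and bias below are made-up values, not learned from data):

# Linear classifier sgn(w^T x + b): outputs +1 or -1.
import numpy as np

def linear_classify(x, w, b):
    # np.sign returns 0 when w^T x + b == 0; map that edge case to +1.
    s = np.sign(w @ x + b)
    return 1 if s >= 0 else -1

w = np.array([2.0, -1.0])   # illustrative weights
b = -0.5                    # illustrative bias
print(linear_classify(np.array([1.0, 0.5]), w, b))   # prints 1
print(linear_classify(np.array([-1.0, 2.0]), w, b))  # prints -1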

Regression:

Regression is a mapping R^n -> R; the real value in R is what is to be predicted.

The simplest algorithm is linear regression, f(x) = w^T x + b; compared with the linear classifier, the sgn function is omitted.

The loss function, also called the error function, appears in almost all of supervised learning. The goal is to minimize the loss function or to maximize the likelihood function. Once the optimization objective is determined, half the work is done; what remains is to solve the optimization problem, which can be done with standard algorithms such as gradient descent or Newton's method, choosing a suitable optimizer according to the characteristics of the problem. This is a standardized process. After solving, the parameter values of f(x) are obtained and training is complete; f(x) can then be used to predict new samples for classification or regression.
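As an illustration of this pipeline (fix the objective, then run a standard optimizer), here is a sketch that fits linear regression by minimizing the MSE loss with plain gradient descent; the learning rate, iteration count, and synthetic data are arbitrary assumptions.

# Fit f(x) = w^T x + b by gradient descent on the MSE loss (toy example).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w, true_b = np.array([1.0, -2.0, 0.5]), 0.3
y = X @ true_w + true_b + 0.01 * rng.normal(size=200)

w, b = np.zeros(3), 0.0
lr = 0.1
for _ in range(500):
    pred = X @ w + b
    err = pred - y
    grad_w = 2 * X.T @ err / len(y)   # gradient of the MSE loss w.r.t. w
    grad_b = 2 * err.mean()           # gradient of the MSE loss w.r.t. b
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)   # should approach the true parameters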

 

Linear Regression:

The model f(x) = w^T x + b is a linear function.

The training goal is to minimize the mean squared error (MSE). Since there are no constraints, training is an unconstrained convex optimization problem (to prove that the MSE loss function is convex, it suffices to show that its Hessian is positive semidefinite), and a convex optimization problem can be solved for the global minimum of L.

Proof that the MSE loss function L is convex:

Take the Hessian of L with respect to w. Writing X for the matrix whose rows are the training samples x_i^T and l for the number of samples, the Hessian is proportional to (1/l) X^T X.

The quadratic form corresponding to (1/l) X^T X is (1/l) x^T X^T X x, i.e. (1/l) (Xx)^T (Xx). Since (Xx)^T is a row vector and Xx is a column vector, their product is the inner product of Xx with itself, which is greater than or equal to zero. Therefore the Hessian is positive semidefinite, the MSE loss function L is convex, and it has a global minimum.
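A quick numerical check of this argument, not in the original post: for a random data matrix X, the eigenvalues of (1/l) X^T X are non-negative (assuming NumPy).

# Numerically check that (1/l) X^T X is positive semidefinite.
import numpy as np

rng = np.random.default_rng(0)
l, n = 50, 4
X = rng.normal(size=(l, n))

H = X.T @ X / l                  # matrix appearing in the Hessian of the MSE loss
eigvals = np.linalg.eigvalsh(H)  # eigenvalues of a symmetric matrix
print(eigvals)                   # all >= 0 (up to numerical error)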

Generative models and discriminative models:

According to the approach used to solve the problem, classification algorithms can be divided into two types:

① Discriminative models: a function directly determines which class a sample belongs to.

The first kind is y = f(x): the label value y is predicted directly by a function such as sgn(w^T x + b).

The second kind is p(y|x): the posterior probability is computed, i.e. given the feature x, the probability that the sample belongs to each class is calculated, and the class is then inferred from these probabilities; this is the posterior probability.

② Generative models

Generative models model the joint distribution of x and y: p(x, y) = p(x|y) p(y), i.e. x is assumed to follow a certain distribution, and p(x|y) and p(y) are modeled.

Another definition of a generative model is an algorithm used to generate data, such as a GAN (generative adversarial network).

Differences between discriminative models and generative models:

A discriminative model estimates p(y|x); a generative model estimates p(x|y) (and p(y)).

Most classification algorithms learn a discriminative model.

Generative models: Bayesian classifiers, Gaussian mixture models, hidden Markov models, restricted Boltzmann machines, generative adversarial networks.
Discriminative models: decision trees, the kNN algorithm, artificial neural networks, support vector machines, logistic regression, the AdaBoost algorithm (although logistic regression works with probabilities, it directly computes p(y|x), i.e. the probability that a sample belongs to each class; it does not assume a probability distribution for x or model p(x|y) and p(y)).
The two approaches solve the classification problem in essentially different ways:

A discriminative model directly finds a decision boundary; it does not care what distribution the samples on either side follow, or where they are dense or sparse. A generative model first works out what distribution the samples on each side follow, and then computes the probability that a sample belongs to each class.
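To make the contrast concrete, an illustrative sketch assuming scikit-learn: logistic regression learns p(y|x) directly, while Gaussian naive Bayes models p(x|y) and p(y) and obtains the posterior via Bayes' rule; the synthetic two-blob data is made up.

# Discriminative (logistic regression) vs generative (Gaussian naive Bayes) on toy data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
# Two Gaussian blobs, one per class.
X0 = rng.normal(loc=-1.0, size=(100, 2))
X1 = rng.normal(loc=+1.0, size=(100, 2))
X = np.vstack([X0, X1])
y = np.array([0] * 100 + [1] * 100)

disc = LogisticRegression().fit(X, y)   # models p(y|x) directly
gen = GaussianNB().fit(X, y)            # models p(x|y) and p(y)

x_new = np.array([[0.2, -0.3]])
print(disc.predict_proba(x_new))        # posterior from the discriminative model
print(gen.predict_proba(x_new))         # posterior via Bayes' rule from the generative model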

Evaluation:

Evaluation metrics are introduced in order to compare algorithms. The same kind of problem can often be solved by different algorithms; to decide which algorithm is better, we need a measure of how well it performs, called accuracy (or precision). Another metric is the speed of the algorithm.

Accuracy is defined differently for classification and regression problems.

For classification problems, accuracy is the number of correctly classified samples divided by the total number of test samples. The samples are split into a training set and a test set (disjoint from the training set), and accuracy is measured on the test set, because accuracy measured on the training set is meaningless.

Regression problems are evaluated with the regression error: classification is a yes-or-no question, but regression predicts a continuous real value that cannot be answered with yes or no, so the regression error is used instead.
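A minimal NumPy sketch of the two metrics just described, using made-up prediction arrays:

# Accuracy for classification, mean squared error for regression.
import numpy as np

# Classification: fraction of test samples classified correctly.
y_true_cls = np.array([1, 0, 1, 1, 0])
y_pred_cls = np.array([1, 0, 0, 1, 0])
accuracy = np.mean(y_true_cls == y_pred_cls)

# Regression: average squared difference between prediction and true value.
y_true_reg = np.array([2.0, 3.5, 1.0])
y_pred_reg = np.array([2.1, 3.0, 1.2])
mse = np.mean((y_pred_reg - y_true_reg) ** 2)

print(accuracy, mse)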

 

Accuracy and regression error:

accuracy = (number of correctly classified test samples) / (total number of test samples)

regression error (MSE) = (1/l) * sum_{i=1}^{l} (f(x_i) - y_i)^2, where l is the number of test samples
