I recently took some time to watch Li Hongyi's machine learning course and found that my grasp of the basics was not very solid, so I am taking some notes to make review easier later~
This post covers Li Hongyi's Machine Learning - Logistic Regression lecture.
The video starts from the most basic machine learning problem, classification, and solves the binary classification problem with a generative model.
Let us first briefly introduce generative and discriminative models. Most methods currently used in artificial intelligence are supervised learning on labeled data, though semi-supervised and unsupervised learning are becoming increasingly popular.
Supervised learning learns a model from data to predict the corresponding output Y for a given input X. The model takes the form of either a decision function Y = f(X) or a conditional probability distribution P(Y|X).
Generative model: learns the joint probability distribution P(X, Y) from the data, then derives the conditional probability distribution P(Y|X) as the prediction model. Typical generative models: naive Bayes, HMM.
Discriminative model: learns the decision function Y = f(X) or the conditional probability distribution P(Y|X) directly from the data as the prediction model. Typical discriminative models: KNN, logistic regression, SVM, decision tree, CRF.
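To make the distinction concrete, here is a minimal sketch (the toy data and function names are my own, not from the lecture) contrasting the two routes to P(Y|X) on a small discrete dataset: the generative route estimates the joint P(X, Y) first, while the discriminative route estimates P(Y|X) directly.

```python
from collections import Counter

# Toy labeled data: (x, y) pairs with discrete features and labels.
data = [("sunny", "play"), ("sunny", "play"), ("rainy", "stay"),
        ("rainy", "stay"), ("sunny", "stay"), ("rainy", "play")]

# Generative route: estimate the joint P(X, Y), then apply Bayes' rule.
joint = Counter(data)
n = len(data)

def p_y_given_x_generative(x, y):
    p_xy = joint[(x, y)] / n                                   # P(X=x, Y=y)
    p_x = sum(joint[(x, yy)] for yy in ("play", "stay")) / n   # P(X=x)
    return p_xy / p_x

# Discriminative route: estimate P(Y|X) directly from conditional counts.
def p_y_given_x_discriminative(x, y):
    matching = [yy for xx, yy in data if xx == x]
    return matching.count(y) / len(matching)

print(p_y_given_x_generative("sunny", "play"))      # 2/3
print(p_y_given_x_discriminative("sunny", "play"))  # 2/3
```

On a finite counting table the two estimates coincide; the difference matters once each side is parameterized, e.g. class-conditional Gaussians versus a logistic function, as in the rest of this lecture.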
(Reprinted from https://zhuanlan.zhihu.com/p/45494891)
Generative model:
Using the Pokémon example from the lecture, let us see how a generative model solves a binary classification problem.
The 79 Water-type Pokémon are just a sample drawn from a much larger population, and the population is assumed to follow a Gaussian distribution. This is where the generative model comes in: find the two parameters μ and Σ of the Gaussian distribution that best fit the given sample, then use Bayes' rule to compute the probability that a new object belongs to each class.
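As a sketch of this procedure (the numbers below are made up, not the actual Pokémon stats from the lecture): estimate the mean of each class, estimate a shared covariance weighted by class size, and apply Bayes' rule.

```python
import numpy as np

rng = np.random.default_rng(0)
# Fake 2-D feature data for two classes (stand-ins for the Pokémon stats).
x1 = rng.normal([75, 70], 8, size=(79, 2))   # class 1, e.g. Water type
x2 = rng.normal([55, 50], 8, size=(61, 2))   # class 2, e.g. Normal type

# Maximum-likelihood estimates of the Gaussian parameters.
mu1, mu2 = x1.mean(axis=0), x2.mean(axis=0)
# Shared covariance, weighted by class size (as in the lecture).
n1, n2 = len(x1), len(x2)
cov = (n1 * np.cov(x1.T, bias=True) + n2 * np.cov(x2.T, bias=True)) / (n1 + n2)

def gauss(x, mu, cov):
    """Multivariate Gaussian density at x."""
    d = x - mu
    return np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / \
        np.sqrt((2 * np.pi) ** len(mu) * np.linalg.det(cov))

def posterior_c1(x):
    """P(C1 | x) by Bayes' rule, with priors taken from the class counts."""
    p1, p2 = n1 / (n1 + n2), n2 / (n1 + n2)
    g1, g2 = gauss(x, mu1, cov) * p1, gauss(x, mu2, cov) * p2
    return g1 / (g1 + g2)

print(posterior_c1(np.array([74.0, 71.0])))  # near 1: looks like class 1
```

A point is assigned to class 1 whenever the posterior exceeds 0.5.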
Simplifying the formula above, the problem becomes: find a w and b.
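The derivation in outline: write the posterior as a sigmoid of a log-ratio, then expand the two Gaussians with the shared covariance Σ so that the quadratic terms cancel and only a linear function of x remains.

```latex
P(C_1 \mid x)
  = \frac{P(x \mid C_1)P(C_1)}{P(x \mid C_1)P(C_1) + P(x \mid C_2)P(C_2)}
  = \frac{1}{1 + e^{-z}} = \sigma(z),
\qquad
z = \ln\frac{P(x \mid C_1)P(C_1)}{P(x \mid C_2)P(C_2)}

% With a shared covariance \Sigma, the quadratic terms in x cancel:
z = (\mu_1 - \mu_2)^{T}\Sigma^{-1}x
    - \tfrac{1}{2}\mu_1^{T}\Sigma^{-1}\mu_1
    + \tfrac{1}{2}\mu_2^{T}\Sigma^{-1}\mu_2
    + \ln\frac{N_1}{N_2}
  = w^{T}x + b
```

So the posterior of the generative model has exactly the form σ(w·x + b), which is why the problem reduces to finding a w and b.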
Then we can compare logistic regression with linear regression:
Li Hongyi presents machine learning, including the later deep learning material, in three steps. Comparing the first step, we can see that the two differ by just an activation function: logistic regression maps its output into the interval (0, 1) through an activation function (sigmoid in the binary case; softmax is the multi-class generalization), while the output of linear regression can be any real value.
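A minimal sketch of this one-line difference (the weights and input here are arbitrary examples):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w, b = np.array([2.0, -1.0]), 0.5
x = np.array([3.0, 4.0])

linear_out = w @ x + b             # linear regression: any real number
logistic_out = sigmoid(w @ x + b)  # logistic regression: squashed into (0, 1)

print(linear_out, logistic_out)
```

The same affine function w·x + b underlies both models; only the sigmoid on top turns the output into something interpretable as a class probability.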
Next comes the formula derivation for the second and third steps; the video then uses logistic regression to introduce the basic ideas of deep learning.
First, a simple but hard problem is posed for logistic regression: classifying these four points. Logistic regression alone cannot solve it, so a feature transformation is applied (my basics were weak, so this confused me at first; later I realized it is similar to feature extraction in deep learning: the data are mapped into another feature space in which the classes become separable).
In the video, (x1, x2) is transformed into x1', the distance to [0,0]^T, and x2', the distance to [1,1]^T.
In this new feature space, the original four points can be separated easily.
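A quick check of this transformation on the four XOR-style points (the distances follow the lecture's definition; the class assignment of the points is the standard XOR layout):

```python
import numpy as np

# Four points; [0,0] and [1,1] belong to one class, [0,1] and [1,0] to the other.
points = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Feature transformation: x1' = distance to [0,0], x2' = distance to [1,1].
x1p = np.linalg.norm(points - np.array([0.0, 0.0]), axis=1)
x2p = np.linalg.norm(points - np.array([1.0, 1.0]), axis=1)

for p, a, b in zip(points, x1p, x2p):
    print(p, "->", (round(a, 3), round(b, 3)))
# [0,0] -> (0, 1.414) and [1,1] -> (1.414, 0), while [0,1] and [1,0]
# both map to (1, 1); a single line such as x1' + x2' = 1.7 now
# separates the two classes, so logistic regression succeeds.
```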
This feature transformation was found by hand. What if we want the machine to find one by itself?
The method: cascading logistic regression models, which is a good introduction to deep learning.
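A sketch of the cascading idea with hand-picked weights (the specific weights are my own choice, not from the lecture): two logistic units perform the feature transformation, and a third classifies in the new space, solving the four-point problem above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cascaded(x1, x2):
    # First layer: two logistic regression units act as learned features
    # (with these weights they behave roughly like OR and AND gates).
    h1 = sigmoid(20 * x1 + 20 * x2 - 10)    # ~OR(x1, x2)
    h2 = sigmoid(20 * x1 + 20 * x2 - 30)    # ~AND(x1, x2)
    # Second layer: one more logistic regression on the new features.
    return sigmoid(20 * h1 - 20 * h2 - 10)  # ~XOR(x1, x2)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), round(cascaded(x1, x2), 3))
# (0,0) and (1,1) come out near 0, while (0,1) and (1,0) come out near 1.
```

Stacking such units and letting gradient descent find the weights is exactly a small neural network, which is where the next lecture picks up.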
These are just quick notes from watching the video, to make review easier later. If there are any mistakes, criticism and corrections are welcome~