The difference between supervised learning and unsupervised learning

 

Recently I noticed that many people still can't clearly tell apart the different learning methods in machine learning. Here I will briefly explain them, combining my own understanding with what the book says.

 

According to the learning task, machine learning can be divided into Supervised Learning, Unsupervised Learning, Semi-Supervised Learning, and Reinforcement Learning.

 

Supervised learning and unsupervised learning are the two methods used most often, so they are the focus below.

 

Supervised learning

In supervised learning, the data has been labeled in advance. For example, in spam detection, each training sample is already marked as spam or non-spam.

Spam Filter

In supervised learning, the training samples contain both features and labels.
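
To make "features and labels" concrete, here is a minimal sketch of how such samples might be laid out. The two features (number of links, number of exclamation marks) and all the values are made up for illustration; they are not taken from any real spam data set.

```python
# Hypothetical toy data: each email is described by two made-up features,
# (number of links, number of exclamation marks).
labeled_samples = [
    ((7, 5), 1),    # (features, label): 1 means spam
    ((0, 1), -1),   # -1 means non-spam
    ((9, 8), 1),
    ((1, 0), -1),
]

# A supervised learner sees both parts of every sample.
features = [x for x, _ in labeled_samples]
labels = [y for _, y in labeled_samples]

# An unsupervised learner would receive only the features.
unlabeled_samples = features

print(features)
print(labels)
```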

In supervised learning, the two most typical problems are classification (Classification), as in the spam example above, and regression (Regression).

The main difference between the two is the type of label. In classification, the labels are discrete values. In the email example above, the label set is {1, -1}, representing spam and non-spam respectively.
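
As a rough illustration of classification with labels in {1, -1}, the sketch below trains a perceptron. This is one simple classifier chosen for brevity, not an algorithm named in this post. The feature values, the function names `train_perceptron` and `predict`, and the parameters are all assumptions made for the example.

```python
def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Learn weights w and bias b so that sign(w.x + b) matches the labels."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            # A sample is misclassified when the label and the score disagree in sign.
            if y * score <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return a discrete label: 1 (spam) or -1 (non-spam)."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


# Hypothetical features: (number of links, number of exclamation marks).
X = [(7, 5), (9, 8), (6, 7), (0, 1), (1, 0), (2, 1)]
y = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(X, y)
print(predict(w, b, (8, 6)))  # a new email with many links
print(predict(w, b, (1, 1)))  # a new email with few links
```

Whatever the input, the output is always one of the two discrete labels, which is what makes this a classification problem.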

In regression, the label is generally a continuous value. For example, when predicting a person's age from features such as height, gender, and weight, the label (age) is treated as a numeric quantity rather than as one of a few fixed categories.
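
For contrast with classification, here is a minimal sketch of regression: a straight line fitted to one feature by ordinary least squares. The data points and the function name `fit_line` are invented for illustration. The point is only that the prediction is a continuous number, not a category.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: one feature value and one continuous label each.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]

slope, intercept = fit_line(xs, ys)

# The prediction for an unseen input is a real number.
prediction = slope * 6.0 + intercept
print(slope, intercept, prediction)
```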

 

Typical algorithms here include logistic regression (LR), the BP (backpropagation) neural network, and ordinary linear regression.

 

 

Unsupervised learning

Unsupervised learning is another commonly used class of machine learning methods. Unlike supervised learning, its samples carry no label information, only features. Because there are no labels, there is no way to know during learning whether a grouping is correct.

A typical example is a news aggregation website that uses crawlers to collect news and then groups related articles. Baidu News is one such site. It has no journalists of its own; it only aggregates news from across the web.

For example, suppose we search for "5G pilot cities". All the news about this keyword appears together as one group. Grouping similar items in this way is what we call the clustering problem.

The typical problem in unsupervised learning is the clustering problem described above. Representative algorithms include K-Means and DBSCAN.

 

Clustering is the most typical kind of unsupervised learning algorithm. It uses the features of the samples to put samples with similar features into the same group, without caring what that group actually represents.
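
The idea of grouping similar samples without labels can be sketched with a small K-Means implementation. This is a simplified version written for illustration. The points are made up, and taking the first k points as the initial centroids is an assumption used only to keep the example deterministic; practical implementations usually choose the starting centroids more carefully.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=10):
    # Assumption: use the first k points as the initial centroids.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sqdist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean = tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        centroids = new_centroids
    assignments = [min(range(k), key=lambda j: sqdist(p, centroids[j]))
                   for p in points]
    return centroids, assignments


# Hypothetical unlabeled samples: only features, no labels.
points = [(1.0, 1.0), (8.0, 8.0), (1.5, 2.0), (9.0, 9.5), (1.0, 0.5), (8.5, 9.0)]
centroids, assignments = kmeans(points, k=2)
print(assignments)
```

The cluster numbers 0 and 1 carry no meaning of their own. The algorithm only says which samples belong together, not what each group is.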

Besides clustering, another important family of unsupervised learning algorithms is dimensionality reduction. The idea is to map the sample points from the input space to a low-dimensional space through a linear or nonlinear transformation, so as to obtain a low-dimensional representation of the sample points.
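
As one possible illustration of a linear mapping to a lower-dimensional space, the sketch below projects 2-D points onto their first principal component (PCA). PCA is my choice for the example and is not named in this post. The data and the function name `pca_2d_to_1d` are hypothetical, and the closed-form eigenvector used here only works for the 2-D case.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto the direction of largest variance."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(p[0] - mean_x, p[1] - mean_y) for p in points]

    # Entries of the 2x2 covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / n
    b = sum(x * y for x, y in centered) / n
    c = sum(y * y for _, y in centered) / n

    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)

    # Matching eigenvector; handle the case where the axes are uncorrelated.
    if b != 0:
        vx, vy = lam - c, b
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    direction = (vx / norm, vy / norm)

    # Each 2-D sample becomes a single coordinate along that direction.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected


# Hypothetical samples that lie close to a straight line.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, projected = pca_2d_to_1d(points)
print(direction)
print(projected)
```

Each sample is now described by one number instead of two, while most of the spread in the data is preserved.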

The most commonly used clustering algorithm

 

If anything here is unclear, you can watch the relevant part of Andrew Ng's machine learning course on Coursera.

Source link https://www.coursera.org/learn/machine-learning

Link on Bilibili

https://www.bilibili.com/video/av9912938/?p=4

 

My personal blog: www.susmote.com
