[Machine Learning Series] A Detailed Introduction to Unsupervised Learning

Foreword

Machine learning falls into three main categories: supervised learning, unsupervised learning, and reinforcement learning. This article introduces the principles, common algorithms, and application areas of unsupervised learning.


1. Principles

Unsupervised learning is a major branch of machine learning and the counterpart of supervised learning. Its goal is to discover hidden structures and patterns in unlabeled data, without any predefined target variable.

The core idea of unsupervised learning is to uncover this structure by analyzing the statistical properties of the data and the similarities between samples.

Unlike supervised learning, unsupervised learning does not require labeled training data; the algorithm learns directly from the data itself, for example by grouping similar samples together.

Unsupervised learning problems fall mainly into two classes: clustering and dimensionality reduction. Clustering divides the data into groups (clusters) so that samples within the same cluster are highly similar while samples in different clusters are dissimilar. Dimensionality reduction maps high-dimensional data into a lower-dimensional space, reducing the number of features and the complexity of the data while retaining as much useful information as possible.

2. Algorithms

There are many classic algorithms in unsupervised learning. Here are some common algorithms:

1️⃣ K-means clustering

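K-means clustering is a commonly used clustering algorithm. It partitions the data into K clusters, each represented by its centroid (cluster center). The algorithm alternates between assigning every data point to its nearest centroid and recomputing each centroid as the mean of its assigned points. Each round reduces the sum of squared distances between the points and the centroids of their clusters, until the assignments stop changing (a local minimum).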
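
The assign-then-update loop can be sketched in a few lines of NumPy. This is a minimal illustration rather than a production implementation: the function name `kmeans`, the random initialization, and the synthetic two-blob data are all assumptions made for the example.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-means (Lloyd's algorithm) sketch; hypothetical helper."""
    rng = np.random.default_rng(seed)
    # Initialize the centroids with k distinct data points chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each centroid to the mean of its points
        # (keep the old centroid if a cluster ends up empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        # Stop once the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Synthetic toy data: two well-separated 2-D blobs of 50 points each.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.5, size=(50, 2)),
    rng.normal(loc=(5.0, 5.0), scale=0.5, size=(50, 2)),
])
labels, centroids = kmeans(X, k=2)
print(centroids)  # one centroid near (0, 0), the other near (5, 5)
```

On real data the result depends on the initial centroids, so it is common to run the algorithm several times and keep the solution with the smallest total squared distance.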

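2️⃣ DBSCAN

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based clustering algorithm. It classifies data points as core points (points with at least a minimum number of neighbors within a given radius), border points, or noise points, and decides whether points belong to the same cluster by density-reachability, that is, whether they are linked through a chain of neighboring core points. Unlike K-means, it does not need the number of clusters in advance.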
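
As a small sketch of the idea, the following uses scikit-learn's `DBSCAN` on synthetic data. The data and the parameter values (`eps=0.5`, `min_samples=5`) are assumptions chosen for this toy example, not recommended defaults.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Synthetic data: two dense blobs plus one isolated point far from both.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.2, size=(40, 2)),
    rng.normal(loc=(4.0, 4.0), scale=0.2, size=(40, 2)),
    [[10.0, -10.0]],
])

# eps is the neighborhood radius; min_samples is the number of neighbors
# (including the point itself) required for a point to count as a core point.
model = DBSCAN(eps=0.5, min_samples=5).fit(X)
labels = model.labels_  # cluster index per point; -1 marks noise

# Boolean mask of core points; clustered non-core points are border points.
core_mask = np.zeros(len(X), dtype=bool)
core_mask[model.core_sample_indices_] = True

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("clusters:", n_clusters)
print("noise points:", int(np.sum(labels == -1)))
```

The isolated point has no neighbors within `eps`, so it is labeled as noise instead of being forced into a cluster.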

3️⃣ Principal component analysis

Principal Component Analysis (PCA) is a commonly used dimensionality reduction algorithm. It projects high-dimensional data onto a lower-dimensional space through a linear transformation while preserving as much of the data's variance as possible. The goal of PCA is to find a set of orthogonal basis vectors (the principal components) such that the projected data has maximum variance.
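
One common way to compute such a basis is the singular value decomposition of the centered data matrix. The sketch below assumes that approach; the helper name `pca` and the synthetic 3-D data are made up for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Minimal PCA sketch via SVD of the centered data; hypothetical helper."""
    # Center each feature so the components describe variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions sorted by decreasing variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured by each direction, as a fraction of the total.
    explained_variance = (S ** 2) / (len(X) - 1)
    ratio = explained_variance / explained_variance.sum()
    # Project the centered data onto the leading components.
    return X_centered @ components.T, components, ratio[:n_components]

# Synthetic 3-D data that varies mostly along a single direction.
rng = np.random.default_rng(0)
t = rng.normal(size=200)
X = np.column_stack([
    t,
    2 * t + rng.normal(scale=0.1, size=200),
    rng.normal(scale=0.1, size=200),
])

Z, components, ratio = pca(X, n_components=2)
print(Z.shape)  # (200, 2)
print(ratio)    # share of the total variance kept by each component
```

Because the data lies almost on a line, the first component alone accounts for nearly all of the variance.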

4️⃣ t-SNE

t-SNE (t-distributed Stochastic Neighbor Embedding) is a nonlinear dimensionality reduction algorithm used mainly for visualizing high-dimensional data. It maps the data to a low-dimensional space (usually two or three dimensions) while preserving local neighborhood relationships, so that points that are close in the original space stay close in the embedding. t-SNE is well suited to visually inspecting cluster structure and the separation between classes.
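
A minimal sketch using scikit-learn's `TSNE` follows. The 10-dimensional synthetic data and the parameter choices (`perplexity=10`, `init="pca"`) are assumptions for this example; suitable values depend on the data set.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic data: two groups of 30 points each in 10 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(30, 10)),
    rng.normal(loc=8.0, scale=1.0, size=(30, 10)),
])

# perplexity roughly controls how many neighbors each point attends to;
# it must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)  # (60, 2)
```

The two columns of `embedding` can be passed to any scatter-plot function. Distances between far-apart groups in a t-SNE plot should be read with caution, since the method focuses on local neighborhoods.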

5️⃣ Association rule mining

Association rule mining discovers frequent itemsets and association rules in a data set. It finds items that tend to occur together and expresses them as rules of the form "if A, then B", typically filtered by measures such as support (how often the items occur together) and confidence (how often B occurs when A does). Association rule mining is widely used in market basket analysis, recommender systems, and other fields.
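
To make support and confidence concrete, here is a brute-force sketch in plain Python that mines rules between pairs of items. The transactions, the function names, and the thresholds are invented for illustration; practical algorithms such as Apriori or FP-Growth prune the search instead of enumerating every pair.

```python
from itertools import combinations

# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def mine_pair_rules(min_support, min_confidence):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue  # the pair is not a frequent itemset
        for antecedent, consequent in ((a, b), (b, a)):
            # confidence(A -> B) = support(A and B) / support(A)
            confidence = pair_support / support({antecedent})
            if confidence >= min_confidence:
                rules.append((antecedent, consequent, pair_support, confidence))
    return rules

rules = mine_pair_rules(min_support=0.6, min_confidence=0.8)
for antecedent, consequent, s, c in rules:
    print(f"{antecedent} -> {consequent}: support={s:.2f}, confidence={c:.2f}")
# prints: beer -> diapers: support=0.60, confidence=1.00
```

In this toy data every basket that contains beer also contains diapers, so the rule "beer -> diapers" has confidence 1.0, while the reverse rule falls below the confidence threshold.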

3. Application Areas

Unsupervised learning is applied in many fields. Some common application areas are described below:

1️⃣ Image segmentation

In computer vision, unsupervised learning is used for image segmentation. Clustering the pixels of an image, for example by color, divides the image into regions that can then be passed to further processing such as object recognition and image analysis.
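
As a rough sketch of pixel clustering, the example below runs scikit-learn's `KMeans` on the RGB values of a small synthetic image. The image, the choice of two clusters, and the use of color alone as the feature are assumptions for illustration; real segmentation pipelines usually add spatial or texture information.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 20x20 RGB image: left half reddish, right half bluish, plus noise.
rng = np.random.default_rng(0)
image = np.zeros((20, 20, 3))
image[:, :10] = (0.9, 0.1, 0.1)
image[:, 10:] = (0.1, 0.1, 0.9)
image += rng.normal(scale=0.02, size=image.shape)

# Treat every pixel as one sample with three features (R, G, B).
pixels = image.reshape(-1, 3)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# Reshape the per-pixel cluster labels back into the image grid.
segmentation = kmeans.labels_.reshape(image.shape[:2])
print(segmentation.shape)  # (20, 20)
```

Each entry of `segmentation` is the region index of the corresponding pixel, so the array can be displayed directly as a label map.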

2️⃣ Recommendation system

In recommender systems, unsupervised learning is used to discover user interests and behavior patterns. Clustering users' historical behavior data and mining association rules from it makes it possible to provide personalized recommendations.

3️⃣ Social network analysis

In social network analysis, unsupervised learning is used to discover community structures and relationships in social networks. Clustering and network analysis of the interaction data between users can reveal how a network is organized and how information spreads through it.

4️⃣ Autonomous driving

In autonomous driving, unsupervised learning is used to perceive and understand the environment and the road. Clustering sensor data and reducing its dimensionality helps extract important features such as roads, vehicles, and pedestrians, which support driving decisions and control.

4. Summary

Unsupervised learning is widely used for clustering, dimensionality reduction, and association rule mining, and it provides powerful tools for understanding data and solving practical problems. As data volumes keep growing and application scenarios keep expanding, unsupervised learning will play an increasingly important role in many fields.


Origin blog.csdn.net/m0_63947499/article/details/131529324