Classification of common algorithms in machine learning

Machine learning is undoubtedly a hot topic in data analysis today. Many people use machine learning algorithms to some degree in their daily work. This article summarizes common machine learning algorithms for reference in work and study.

There are many machine learning algorithms, and the sheer number is often confusing: many belong to the same family, and some are extensions of others. Here we introduce them from two angles: first by learning method, then by algorithmic similarity.

Learning Method

A problem can be modeled in different ways depending on the type of data. In machine learning and artificial intelligence, the first question people ask is how an algorithm learns, and there are a few main learning methods. Grouping algorithms by learning method is useful because, during modeling and algorithm selection, it helps you pick the algorithm best suited to the input data.

Supervised Learning:

 

In supervised learning, the input data is called "training data", and each training example carries an explicit label or result, such as "spam" and "non-spam" in an anti-spam system, or "1", "2", "3", "4" and so on in handwritten digit recognition. To build a prediction model, supervised learning sets up a training process that compares the model's predictions with the actual results in the training data and keeps adjusting the model until its predictions reach the desired accuracy. Common application scenarios of supervised learning are classification problems and regression problems. Common algorithms include Logistic Regression and the Back Propagation Neural Network.
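
The predict, compare, adjust loop described above can be sketched with a tiny logistic regression trained by gradient descent. This is only an illustration: the data, the function names, the learning rate and the number of epochs are all assumptions made for the example, and the single feature is assumed to be centered around zero.

```python
import math

def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=500):
    """Fit p(y=1|x) = sigmoid(w*x + b) on 1-D inputs by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            predicted = 1.0 / (1.0 + math.exp(-(w * x + b)))
            error = predicted - y            # compare the prediction with the known label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n      # adjust the model to reduce the error
        b -= learning_rate * grad_b / n
    return w, b

def predict(w, b, x):
    """Predict class 1 when the model's probability exceeds 0.5."""
    return 1 if w * x + b > 0 else 0

# Hypothetical training data: a centered score, labeled 1 ("spam") or 0 ("non-spam")
xs = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
w, b = train_logistic_regression(xs, ys)
accuracy = sum(predict(w, b, x) == y for x, y in zip(xs, ys)) / len(xs)
print(accuracy)
```

In practice training stops when accuracy on held-out data is good enough, not after a fixed number of epochs as in this sketch.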

Unsupervised Learning:

In unsupervised learning, the data is not labeled, and the model is trained to infer some intrinsic structure of the data. Common application scenarios include association rule learning and clustering. Common algorithms include the Apriori algorithm and the k-Means algorithm.

Semi-supervised learning:

In this learning method, part of the input data is labeled and part is unlabeled. The model can be used to make predictions, but it first needs to learn the internal structure of the data so that it can organize the data sensibly. Application scenarios include classification and regression. The algorithms are mostly extensions of common supervised learning algorithms that first try to model the unlabeled data and then, on that basis, make predictions for the labeled data. Examples include Graph Inference and the Laplacian SVM.
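
As a rough illustration of using the structure of unlabeled data, here is a minimal label propagation sketch on a similarity graph. It is one simple graph-based technique, not necessarily what the article means by Graph Inference; the function name, the Gaussian edge weights and the `bandwidth` parameter are assumptions made for the example.

```python
import math

def propagate_labels(points, known_labels, iterations=100, bandwidth=1.0):
    """Spread labels over a similarity graph of 1-D points.

    known_labels maps a point index to its label (0 or 1); all other points
    are unlabeled and receive the weighted average score of their neighbors.
    """
    n = len(points)
    # Edge weight between two points decays with their squared distance
    weights = [[math.exp(-((points[i] - points[j]) ** 2) / bandwidth ** 2) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    scores = [float(known_labels.get(i, 0.5)) for i in range(n)]
    for _ in range(iterations):
        updated = scores[:]
        for i in range(n):
            if i in known_labels:
                continue                      # labeled points stay fixed
            total = sum(weights[i])
            updated[i] = sum(weights[i][j] * scores[j] for j in range(n)) / total
        scores = updated
    return [1 if s > 0.5 else 0 for s in scores]

# Hypothetical data: two groups of points, with only one labeled example each
points = [0.0, 0.2, 0.4, 0.6, 3.0, 3.2, 3.4, 3.6]
labels = propagate_labels(points, {0: 0, 7: 1})
print(labels)
```

The two labeled points are enough here because the unlabeled points reveal where the groups are.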

 

Reinforcement learning:

In this learning mode, the input data serves as feedback to the model. In supervised models, the input data is only used to check whether the model is right or wrong; in reinforcement learning, the input data is fed back to the model directly, and the model must adjust immediately. Common application scenarios include dynamic systems and robot control. Common algorithms include Q-Learning and Temporal Difference learning.
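
A minimal sketch of tabular Q-Learning shows how feedback is used immediately after each action. The environment (a short chain of states with a reward at the right end), the function name and all hyperparameters are assumptions chosen for illustration.

```python
import random

def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.5, seed=0):
    """Tabular Q-learning on a chain: action 0 moves left, action 1 moves right.

    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        while state != goal:
            if rng.random() < epsilon:
                action = rng.randrange(2)                        # explore
            else:
                action = 0 if q[state][0] > q[state][1] else 1   # exploit
            next_state = max(state - 1, 0) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0
            # The reward is fed back right away and the estimate is adjusted at once
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = q_learning()
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

After training, the greedy policy prefers moving right in every non-terminal state, which is the shortest path to the reward.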

 

In enterprise data applications, the most commonly used models are probably supervised and unsupervised learning. In areas such as image recognition, where there is a large amount of unlabeled data and only a small amount of labeled data, semi-supervised learning is currently a hot topic. Reinforcement learning is used more in robot control and other fields that require system control.

 

Algorithmic Similarity

 

Algorithms can also be grouped by the similarity of their function and form, for example tree-based algorithms or neural-network-based algorithms. Of course, machine learning is a huge field, and some algorithms do not fit cleanly into a single category. Within some categories, algorithms of the same family address different types of problems. Here we try to group the common algorithms in the way that is easiest to understand.

Regression algorithm:

Regression algorithms are a class of algorithms that explore the relationship between variables by measuring and reducing prediction error. Regression algorithms are a powerful tool in statistical machine learning. In machine learning, the word "regression" sometimes refers to a class of problems and sometimes to a class of algorithms, which often confuses beginners. Common regression algorithms include Ordinary Least Squares, Logistic Regression, Stepwise Regression, Multivariate Adaptive Regression Splines (MARS), and Locally Estimated Scatterplot Smoothing (LOESS).
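
As a small example of fitting a relationship by minimizing an error measure, the sketch below computes an Ordinary Least Squares line for one input variable using the closed-form solution. The function name and the sample data are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one input variable)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope that minimises the sum of squared errors is cov(x, y) / var(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
print(slope, intercept)
```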

Instance-Based Algorithms

 

Instance-based algorithms are often used to model decision problems. Such models typically store a batch of sample data and then compare new data against the samples using some similarity measure to find the best match. For this reason, instance-based algorithms are also called "winner-take-all" learning or "memory-based learning". Common algorithms include k-Nearest Neighbor (KNN), Learning Vector Quantization (LVQ), and Self-Organizing Map (SOM).
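
The compare-with-stored-samples idea can be sketched with a minimal k-Nearest Neighbor classifier. Squared Euclidean distance is assumed as the similarity measure here, and the function name and sample points are hypothetical.

```python
from collections import Counter

def knn_predict(samples, query, k=3):
    """samples: list of ((x, y), label). Returns the majority label of the k nearest samples."""
    def squared_distance(sample):
        (x, y), _ = sample
        return (x - query[0]) ** 2 + (y - query[1]) ** 2

    nearest = sorted(samples, key=squared_distance)[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical stored samples forming two groups
samples = [((1, 1), "a"), ((1, 2), "a"), ((2, 1), "a"),
           ((6, 6), "b"), ((6, 7), "b"), ((7, 6), "b")]
print(knn_predict(samples, (1.5, 1.5)))
```

Note that there is no training step: the model is the stored data itself, which is why the approach is called memory-based.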

 

Regularization Methods

 

Regularization methods are extensions of other algorithms (usually regression algorithms) that adjust the algorithm according to model complexity. Regularization methods typically reward simple models and penalize complex ones. Common algorithms include Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), and Elastic Net.
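
To show how a complexity penalty changes a fit, the sketch below solves Ridge Regression for the simplest possible case: one weight and no intercept. The function name and data are assumptions for the example; setting the derivative of the penalized squared error to zero gives the formula used in the code.

```python
def ridge_slope(xs, ys, penalty):
    """Minimise sum((y - w*x)^2) + penalty * w^2 over a single weight w (no intercept).

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + penalty).
    """
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + penalty)

# Hypothetical data lying exactly on y = 2x
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
unpenalized = ridge_slope(xs, ys, penalty=0.0)
shrunk = ridge_slope(xs, ys, penalty=14.0)
print(unpenalized, shrunk)
```

The larger the penalty, the more the weight is pulled toward zero, trading some fit on the training data for a simpler model.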

 

Decision Tree Learning

Decision tree algorithms build a tree-structured decision model from the attributes of the data. Decision tree models are often used to solve classification and regression problems. Common algorithms include Classification and Regression Tree (CART), ID3 (Iterative Dichotomiser 3), C4.5, Chi-squared Automatic Interaction Detection (CHAID), Decision Stump, Random Forest, Multivariate Adaptive Regression Splines (MARS), and Gradient Boosting Machine (GBM).
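
The core step of building a tree is choosing an attribute test that splits the data well. The sketch below finds the best threshold on a single numeric attribute using Gini impurity, which is the criterion CART uses; this is a one-level tree (a decision stump), and the function names and data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels: 2 * p * (1 - p)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(xs, ys):
    """Find the threshold on one attribute that minimises the weighted Gini impurity."""
    best_threshold, best_score = None, float("inf")
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between consecutive distinct values
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ys)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

threshold, impurity = best_split([1, 2, 3, 10, 11, 12], [0, 0, 0, 1, 1, 1])
print(threshold, impurity)
```

A full decision tree applies the same search recursively to the left and right subsets until a stopping rule is met.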

 

Bayesian Methods

Bayesian methods are a class of algorithms based on Bayes' theorem, mainly used to solve classification and regression problems. Common algorithms include Naive Bayes, Averaged One-Dependence Estimators (AODE), and Bayesian Belief Network (BBN).
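
A minimal Naive Bayes text classifier illustrates how Bayes' theorem is applied. The sketch assumes word counts as features and add-one (Laplace) smoothing; the function names and the tiny training set are invented for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(documents):
    """documents: list of (words, label). Returns class counts, per-class word counts, vocabulary."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for words, label in documents:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocabulary.update(words)
    return class_counts, word_counts, vocabulary

def classify(model, words):
    """Return the label with the highest posterior probability for the given words."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_log_prob = None, -math.inf
    for label, count in class_counts.items():
        # Bayes' theorem in log space: log P(label) + sum of log P(word | label)
        log_prob = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in words:
            # Add-one smoothing avoids zero probability for words unseen in this class
            log_prob += math.log((word_counts[label][word] + 1) / (total_words + len(vocabulary)))
        if log_prob > best_log_prob:
            best_label, best_log_prob = label, log_prob
    return best_label

# Hypothetical training messages
model = train_naive_bayes([
    (["win", "money", "now"], "spam"),
    (["free", "money"], "spam"),
    (["meeting", "at", "noon"], "ham"),
    (["lunch", "at", "noon"], "ham"),
])
print(classify(model, ["free", "money", "now"]))
```

The "naive" part is the assumption that words are independent given the class, which is what lets the per-word probabilities simply be multiplied.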

 

Kernel-Based Algorithms

 

The best known kernel-based algorithm is the Support Vector Machine (SVM). Kernel-based algorithms map the input data into a higher-dimensional vector space in which some classification or regression problems become easier to solve. Common kernel-based algorithms include the Support Vector Machine (SVM), Radial Basis Function (RBF), and Linear Discriminant Analysis (LDA).
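
The mapping into a higher-dimensional space does not have to be computed explicitly. The sketch below checks, for a degree-2 polynomial kernel on 2-D inputs, that the kernel value equals the dot product of the explicitly mapped 3-D vectors. The function names are assumptions for the example.

```python
import math

def polynomial_kernel(a, b):
    """Degree-2 polynomial kernel on 2-D inputs: (a . b) ** 2."""
    return (a[0] * b[0] + a[1] * b[1]) ** 2

def feature_map(v):
    """Explicit 3-D feature map whose dot product equals the kernel above."""
    return (v[0] ** 2, math.sqrt(2) * v[0] * v[1], v[1] ** 2)

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

a, b = (1.0, 2.0), (3.0, 0.5)
kernel_value = polynomial_kernel(a, b)
mapped_value = dot(feature_map(a), feature_map(b))
print(kernel_value, mapped_value)
```

Both values agree up to floating-point rounding, so an algorithm that only needs dot products can work in the mapped space while computing in the original one.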

 

Clustering Algorithm

 

Clustering, like regression, sometimes describes a class of problems and sometimes a class of algorithms. Clustering algorithms usually group the input data in a centroid-based or hierarchical manner. All clustering algorithms try to find the internal structure of the data so that the data can be grouped by greatest commonality. Common clustering algorithms include the k-Means algorithm and the Expectation Maximization (EM) algorithm.
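
Centroid-based grouping can be sketched with a minimal k-Means loop on 1-D points. The function name, the fixed number of iterations and the caller-supplied starting centers are simplifying assumptions; real implementations choose the starting centers more carefully and stop when the assignments no longer change.

```python
def k_means(points, initial_centers, iterations=10):
    """k-Means (Lloyd's algorithm) on 1-D points; iterations must be at least 1."""
    centers = list(initial_centers)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest center
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster (keep it if empty)
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical data with two obvious groups
centers, clusters = k_means([1.0, 1.5, 2.0, 9.0, 9.5, 10.0], initial_centers=[0.0, 5.0])
print(centers)
```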

 

Association rule learning

 

Association rule learning finds useful association rules in large multivariate datasets by looking for the rules that best explain the relationships between variables in the data. Common algorithms include the Apriori algorithm and the Eclat algorithm.
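
The first stage of association rule learning is finding itemsets that occur together often. The sketch below does a level-wise search in the spirit of Apriori: an itemset can only be frequent if all of its subsets are frequent. It is a simplified illustration, with a hypothetical function name and shopping-basket data, and it stops before the rule generation step.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return a dict mapping each frequent itemset (a frozenset) to its support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    result = {}
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        size = len(next(iter(current))) + 1
        # Build larger candidates by joining frequent itemsets of the current size
        candidates = {a | b for a in current for b in current if len(a | b) == size}
        # Prune candidates with an infrequent subset, then check their support
        current = {c for c in candidates
                   if all(frozenset(sub) in result for sub in combinations(c, size - 1))
                   and support(c) >= min_support}
    return result

# Hypothetical shopping baskets
baskets = [{"bread", "milk"}, {"bread", "butter"}, {"bread", "milk", "butter"}, {"milk"}]
itemsets = frequent_itemsets(baskets, min_support=0.5)
print(itemsets[frozenset({"bread", "milk"})])
```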

 

Artificial neural networks

 

 

Artificial neural network algorithms simulate biological neural networks and are a type of pattern matching algorithm, often used to solve classification and regression problems. Artificial neural networks are a huge branch of machine learning, with hundreds of different algorithms (deep learning is one of them and is discussed separately). Important artificial neural network algorithms include the Perceptron Neural Network, Back Propagation, the Hopfield Network, Self-Organizing Map (SOM), and Learning Vector Quantization (LVQ).
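
The simplest of these networks, a single perceptron, can be sketched in a few lines. The example trains it on the logical AND function with the classic perceptron learning rule; the function names, learning rate and epoch count are assumptions for the illustration.

```python
def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """samples: list of ((x1, x2), target) with target 0 or 1."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0
            error = target - output
            # Perceptron rule: shift the weights toward inputs that were misclassified
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias

def perceptron_output(weights, bias, x1, x2):
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0

# Truth table of logical AND, which is linearly separable
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_samples)
print([perceptron_output(weights, bias, x1, x2) for (x1, x2), _ in and_samples])
```

A single perceptron can only learn linearly separable functions; multi-layer networks trained with Back Propagation remove that limitation.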

 

Deep Learning

 

Deep learning algorithms are a development of artificial neural networks. They have attracted a lot of attention recently, particularly in China since Baidu began investing in deep learning. As computing power becomes ever cheaper, deep learning attempts to build much larger and more complex neural networks. Many deep learning algorithms are semi-supervised learning algorithms designed for large datasets that contain only a small amount of labeled data. Common deep learning algorithms include the Restricted Boltzmann Machine (RBM), Deep Belief Networks (DBN), Convolutional Networks, and Stacked Auto-encoders.

 

Dimensionality reduction algorithm

 

Like clustering algorithms, dimensionality reduction algorithms try to analyze the internal structure of the data, but they do so by summarizing or explaining the data with less information, in an unsupervised fashion. Such algorithms can be used to visualize high-dimensional data or to simplify data for use in supervised learning. Common algorithms include Principal Component Analysis (PCA), Partial Least Squares Regression (PLS), Sammon Mapping, Multi-Dimensional Scaling (MDS), and Projection Pursuit.
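
As a small illustration of describing data with less information, the sketch below finds the first principal component of 2-D points and projects them onto it, turning two coordinates into one. Power iteration on the covariance matrix is used here for brevity; the function names are hypothetical, and the starting vector is assumed not to be orthogonal to the answer.

```python
import math

def first_principal_component(points, iterations=100):
    """Return the direction of greatest variance of 2-D points, plus the centered points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix
    cxx = sum(x * x for x, _ in centered) / (n - 1)
    cxy = sum(x * y for x, y in centered) / (n - 1)
    cyy = sum(y * y for _, y in centered) / (n - 1)
    vx, vy = 1.0, 0.0  # starting vector (assumed not orthogonal to the top eigenvector)
    for _ in range(iterations):
        # Multiply by the covariance matrix and renormalise (power iteration)
        vx, vy = cxx * vx + cxy * vy, cxy * vx + cyy * vy
        norm = math.hypot(vx, vy)
        vx, vy = vx / norm, vy / norm
    return (vx, vy), centered

def project(centered, direction):
    """Reduce each centered 2-D point to one coordinate along the given direction."""
    return [x * direction[0] + y * direction[1] for x, y in centered]

# Hypothetical points lying on the line y = 2x
direction, centered = first_principal_component([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)])
reduced = project(centered, direction)
print(direction, reduced)
```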

 

Ensemble algorithm:

Ensemble algorithms train several relatively weak learning models independently on the same samples and then combine their results to make an overall prediction. The main difficulty lies in choosing which independent weak models to combine and how to combine their results. This is a very powerful class of algorithms and also a very popular one. Common algorithms include Boosting, Bootstrapped Aggregation (Bagging), AdaBoost, Stacked Generalization (Blending), Gradient Boosting Machine (GBM), and Random Forest.
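
A minimal Bagging sketch shows the train-independently-then-combine pattern. The weak model here is a one-threshold decision stump on 1-D data, each stump is trained on a bootstrap sample, and predictions are combined by majority vote. The function names, the number of models and the data are assumptions for the example.

```python
import random

def train_stump(sample):
    """Weak learner: pick the threshold t that minimises errors of 'predict 1 if x > t'."""
    values = sorted({x for x, _ in sample})
    # Candidate thresholds are midpoints; fall back to the single value if there is only one
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])] or values
    return min(candidates, key=lambda t: sum((x > t) != (y == 1) for x, y in sample))

def bagging(data, n_models=25, seed=0):
    """Train each stump on a bootstrap sample: same size as the data, drawn with replacement."""
    rng = random.Random(seed)
    return [train_stump([rng.choice(data) for _ in data]) for _ in range(n_models)]

def ensemble_predict(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = sum(x > t for t in thresholds)
    return 1 if votes * 2 > len(thresholds) else 0

# Hypothetical 1-D training data: small values are class 0, large values are class 1
data = [(1, 0), (2, 0), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
stumps = bagging(data)
print([ensemble_predict(stumps, x) for x, _ in data])
```

Individual stumps place their thresholds differently because each sees a different resample, and the vote averages those differences out.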

