Summary of common machine learning algorithms

Machine learning algorithms are mainly divided into the following two types: supervised and unsupervised.

They can be further subdivided into four categories: classification, regression, clustering, and association algorithms.

Classification and regression are supervised learning.

Clustering and association are unsupervised learning.



Classification algorithms mainly include KNN, decision trees, naive Bayes, SVM, logistic regression, and AdaBoost.

        KNN algorithm: Classify based on distance. Select the K training samples closest to the query sample, count which category is most common among those K samples, and predict that category.
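
The voting procedure described for KNN can be sketched in a few lines. This is a minimal illustration, assuming Euclidean distance and a plain majority vote; the function name `knn_predict` and the toy data are made up for the example.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point (an assumed metric)
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest samples
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the labels of those k samples
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two small groups of 2-D points
points = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]
prediction = knn_predict(points, labels, (0.1, 0.2), k=3)
```

With this data the query sits next to the two "B" points, so two of its three nearest neighbours vote "B".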

        Decision tree: ID3: Generate a decision tree based on information gain. Cannot handle continuous numeric attributes directly.

                       C4.5: Generate a decision tree based on the information gain ratio. Can handle both discrete and continuous numeric attributes.
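
To make the two splitting criteria concrete, here is a small sketch that computes information gain (as used by ID3) and the gain ratio (as used by C4.5) for a discrete attribute. The helper names and the four-row data set are hypothetical, and handling of continuous attributes is left out.

```python
from collections import Counter
import math

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())

def information_gain(rows, labels, feature_index):
    """Entropy of the labels minus the weighted entropy after splitting on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    total = len(labels)
    remainder = sum(len(group) / total * entropy(group) for group in groups.values())
    return entropy(labels) - remainder

def gain_ratio(rows, labels, feature_index):
    """Information gain divided by the entropy of the feature's own value distribution."""
    split_info = entropy([row[feature_index] for row in rows])
    if split_info == 0:
        return 0.0  # the feature takes a single value, so it cannot split the data
    return information_gain(rows, labels, feature_index) / split_info

# Hypothetical data: feature 0 determines the label, feature 1 is uninformative
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
gain_feature_0 = information_gain(rows, labels, 0)
gain_feature_1 = information_gain(rows, labels, 1)
ratio_feature_0 = gain_ratio(rows, labels, 0)
```

A tree builder would pick the attribute with the highest score at each node, so feature 0 would be chosen here under either criterion.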

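        Naive Bayes: Built on conditional probability, the law of total probability, and Bayes' formula, with the "naive" assumption that features are conditionally independent given the class.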
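
Written out, the three ingredients combine as follows. The notation is chosen here for illustration: $c$ is a class, $x = (x_1, \dots, x_n)$ is the feature vector, and the product comes from the conditional-independence assumption.

```latex
% Bayes' formula with the naive (conditional independence) assumption
P(c \mid x) = \frac{P(c)\,\prod_{i=1}^{n} P(x_i \mid c)}{P(x)}

% Law of total probability for the denominator
P(x) = \sum_{c'} P(c')\,\prod_{i=1}^{n} P(x_i \mid c')

% Predicted class: the denominator does not depend on c, so it can be dropped
\hat{c} = \arg\max_{c}\; P(c)\,\prod_{i=1}^{n} P(x_i \mid c)
```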

        SVM: Support vector machine. Find the separating hyperplane that maximizes the minimum distance (the margin) to the training samples. Use the Lagrange multiplier method and the dual problem to solve for the Lagrange multipliers, then recover the weights from them.
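
For reference, the standard hard-margin formulation looks like this, assuming linearly separable data with labels $y_i \in \{-1, +1\}$; the symbols $w$, $b$, and $\alpha_i$ are the usual textbook names rather than anything fixed by these notes.

```latex
% Primal problem: maximizing the margin 2 / \lVert w \rVert
\min_{w,\,b}\; \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{s.t.} \quad y_i\,(w^{\top} x_i + b) \ge 1,\quad i = 1, \dots, m

% Dual problem in the Lagrange multipliers \alpha_i
\max_{\alpha}\; \sum_{i=1}^{m} \alpha_i
  - \tfrac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j\, y_i y_j\, x_i^{\top} x_j
\quad \text{s.t.} \quad \alpha_i \ge 0,\quad \sum_{i=1}^{m} \alpha_i y_i = 0

% Weights recovered from the multipliers
w = \sum_{i=1}^{m} \alpha_i\, y_i\, x_i
```

Only the samples with $\alpha_i > 0$ (the support vectors) contribute to $w$.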

        Logistic regression (LR): Use gradient descent or stochastic gradient descent to solve for the coefficients.
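
A minimal sketch of the batch gradient descent variant is shown below, assuming 0/1 targets and the average log-loss as the objective. The function names, learning rate, and epoch count are illustrative choices; the stochastic variant would update the coefficients after each sample instead of after a full pass.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_probability(weights, bias, x):
    """Estimated probability that sample x belongs to class 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def fit_logistic_regression(features, targets, learning_rate=0.5, epochs=2000):
    """Fit the coefficients by batch gradient descent on the average log-loss."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, targets):
            # Gradient of the log-loss for one sample is (prediction - target) * input
            error = predict_probability(weights, bias, x) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        # Step against the averaged gradient
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

# Hypothetical one-feature data: small values are class 0, large values are class 1
features = [[0.0], [1.0], [2.0], [3.0]]
targets = [0, 0, 1, 1]
weights, bias = fit_logistic_regression(features, targets)
```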

        AdaBoost algorithm: Train a sequence of weak classifiers, gradually increasing the weight of incorrectly classified samples so that later classifiers focus on them, then combine the classifiers by weighted vote.
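
The reweighting step can be sketched as a single round of the classic binary AdaBoost update. This assumes labels and predictions in {-1, +1} and sample weights that sum to 1; the function name `adaboost_reweight` and the example data are hypothetical.

```python
import math

def adaboost_reweight(weights, targets, predictions):
    """One AdaBoost round: return the classifier's vote weight and the updated sample weights."""
    # Weighted error: total weight of the misclassified samples
    error = sum(w for w, y, p in zip(weights, targets, predictions) if y != p)
    if not 0.0 < error < 1.0:
        raise ValueError("weighted error must lie strictly between 0 and 1")
    # Vote weight of this classifier: larger when the error is smaller
    alpha = 0.5 * math.log((1.0 - error) / error)
    # y * p is +1 for correct and -1 for wrong predictions, so wrong samples are scaled up
    updated = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, targets, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Five equally weighted samples; the classifier gets only the last one wrong
weights = [0.2] * 5
targets = [1, 1, -1, -1, 1]
predictions = [1, 1, -1, -1, -1]
alpha, new_weights = adaboost_reweight(weights, targets, predictions)
```

After this round the single misclassified sample carries half of the total weight, which is what pushes the next classifier to get it right.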


Clustering algorithms include K-means and, more generally, distance-based, density-based, hierarchical, partition-based, grid-based, and model-based clustering.

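                        K-means: A typical example of distance-based clustering. It assigns each sample to the nearest of K centroids and recomputes each centroid as the mean of its assigned samples, repeating until the assignments stop changing.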
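
A minimal K-means sketch follows, assuming Euclidean distance and, purely to keep the example deterministic, the first K points as initial centroids (real implementations usually pick them randomly or with k-means++). The function name and data are made up for illustration.

```python
import math

def kmeans(points, k, max_iterations=100):
    """Cluster points into k groups; returns (centroids, assignments)."""
    # Naive initialisation: the first k points (an assumption for this sketch)
    centroids = [list(p) for p in points[:k]]
    assignments = []
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so the algorithm has converged
        assignments = new_assignments
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(coords) / len(members) for coords in zip(*members)]
    return centroids, assignments

# Hypothetical data: two well-separated groups around (1, 1) and (5, 5)
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.2), (5.2, 4.8), (4.8, 5.2)]
centroids, assignments = kmeans(points, k=2)
```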


Regression algorithms: linear regression and CART trees.

                    Linear regression: the ordinary linear model, locally weighted linear regression, and shrinkage methods such as ridge regression, LASSO, and forward stepwise regression.
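
To show how ridge regression shrinks the ordinary least-squares solution, here is a sketch restricted to a single feature, where the closed form is short enough to write without a matrix library. It assumes the intercept is not penalised; the function name and data are hypothetical.

```python
def fit_ridge_1d(xs, ys, penalty=0.0):
    """Fit y = slope * x + intercept, minimising squared error + penalty * slope**2.

    With penalty=0.0 this reduces to ordinary least squares.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    # The penalty is added to the denominator, which pulls the slope toward zero
    slope = s_xy / (s_xx + penalty)
    intercept = y_mean - slope * x_mean
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
ols_slope, ols_intercept = fit_ridge_1d(xs, ys)
ridge_slope, ridge_intercept = fit_ridge_1d(xs, ys, penalty=5.0)
```

On this data the unpenalised fit recovers slope 2 and intercept 1, while a penalty of 5 halves the slope to 1 and moves the intercept to 3.5.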

                    CART tree (also the base learner in GBDT): regression tree, model tree.


Association rule algorithms include the Apriori algorithm and the FP-tree (FP-growth) algorithm.




Other useful tools include dimensionality-reduction techniques such as PCA and SVD.

                 PCA: Subtract the mean of each feature, calculate the covariance matrix C, then calculate the eigenvalues and eigenvectors of C. Keep the eigenvectors belonging to the N largest eigenvalues and project the data onto them to map it into the new, lower-dimensional space.
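
The PCA steps listed above translate almost line by line into NumPy. This is a sketch that assumes NumPy is available and that rows are samples and columns are features; the function name `pca` and the toy data are made up for the example.

```python
import numpy as np

def pca(data, n_components):
    """Project data onto its top principal components; returns (projected, eigenvalues)."""
    # Step 1: subtract the mean of each feature
    centered = data - data.mean(axis=0)
    # Step 2: covariance matrix of the features (columns)
    covariance = np.cov(centered, rowvar=False)
    # Step 3: eigen-decomposition; eigh is suited to symmetric matrices
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    # Step 4: indices of the largest eigenvalues, in descending order
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order]
    # Step 5: map the data into the new space
    return centered @ components, eigenvalues[order]

# Hypothetical 2-D data lying on the line y = x, so one component captures everything
data = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
projected, top_eigenvalues = pca(data, n_components=1)
```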

                SVD: SVD is often used in information retrieval for latent semantic indexing (LSI) and latent semantic analysis (LSA). Each singular value corresponds to a latent concept or topic in the documents, and its magnitude indicates how important that concept is.
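
As a rough illustration of the LSI idea, the sketch below factorises a tiny term-document matrix with NumPy and keeps the two strongest concepts. The matrix is invented for the example (rows as terms, columns as documents), and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical term-document counts: terms 0-1 appear only in documents 0-1,
# terms 2-3 appear only in documents 2-3, so there are two obvious topics
term_document = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
])

# Thin SVD: term_document == U @ diag(singular_values) @ Vt
U, singular_values, Vt = np.linalg.svd(term_document, full_matrices=False)

# Keep the k largest singular values (NumPy returns them in descending order)
k = 2
# Each document described by k concept coordinates instead of 4 term counts
document_concepts = (np.diag(singular_values[:k]) @ Vt[:k, :]).T
# Rank-k approximation of the original matrix
rank_k = U[:, :k] @ np.diag(singular_values[:k]) @ Vt[:k, :]
```

The two largest singular values belong to the two topic blocks, and the rank-2 approximation smooths each block so that documents on the same topic look alike even where their exact term counts differ.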


Origin blog.csdn.net/u013995172/article/details/46522217