Introduction to Machine Learning + Algorithms

Machine learning algorithms try to uncover hidden patterns in large amounts of historical data and use them for regression (predicting a numerical value) or classification. In a broad sense, machine learning is a way of giving machines the ability to learn, so that they can perform tasks that cannot be accomplished by direct programming. In a practical sense, machine learning is a method that uses data to train a model and then uses that model to make predictions.

1. Classification of machine learning

According to learning theory, machine learning models can be divided into supervised learning, semi-supervised learning, unsupervised learning, transfer learning and reinforcement learning:

Supervised learning trains on labeled samples and is divided into two kinds of tasks: regression and classification;

Semi-supervised learning trains on samples that are partially labeled and partially unlabeled;

Unsupervised learning trains on samples without labels;

Transfer learning transfers the parameters of a trained model to a new model to help train the new model;

Reinforcement learning learns an optimal strategy (policy) that lets an agent act according to its current state in a specific environment so as to obtain the maximum reward. The biggest difference from supervised learning is that an individual decision in reinforcement learning is not labeled right or wrong; the goal is to maximize the cumulative reward.

2. Introduction to Machine Learning Algorithms

1. Regression algorithm

Regression algorithms include linear regression and logistic regression.

The general idea of linear regression is: how do we fit a straight line that best matches all of our data? The "least squares method" is generally used, which turns the fitting problem into the problem of finding the minimum of a function (the sum of squared errors). In mathematics we usually find an extremum by setting the derivative to 0 and solving, but on a computer this approach is not always practical: the equation may have no closed-form solution, or the computation may be too expensive. Numerical methods such as "gradient descent" and "Newton's method" are therefore used to find the extremum of the function.
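
As a rough illustration of the two approaches, the sketch below fits a line y = w*x + b to a tiny made-up data set, once by gradient descent on the mean squared error and once with the closed-form least squares solution that exists for this simple one-variable case. The function names, learning rate, step count, and data are assumptions chosen for the example.

```python
def fit_line_gd(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def fit_line_closed_form(xs, ys):
    """Least squares solution obtained by setting the derivatives to zero."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    return w, mean_y - w * mean_x


# Made-up data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_gd, b_gd = fit_line_gd(xs, ys)
w_cf, b_cf = fit_line_closed_form(xs, ys)
print(w_gd, b_gd)  # both approaches should land close to w = 2, b = 1
print(w_cf, b_cf)
```

With many features the closed-form route requires solving a linear system, which is where iterative methods such as gradient descent become attractive.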

Logistic regression is similar to linear regression, but linear regression handles numerical problems, that is, the final prediction is a number, while logistic regression is a classification algorithm, that is, its prediction is a discrete class, such as whether an email is spam or whether a user will click on an ad.
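
A minimal sketch of this difference, assuming a single input feature and made-up data: the linear score w*x + b is passed through the sigmoid function to get a probability, and the probability is thresholded at 0.5 to produce a discrete class. The names `train_logistic` and `predict_logistic` and the hyperparameters are assumptions for illustration.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, labels, lr=0.1, steps=5000):
    """Fit w and b by gradient descent on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, labels)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, labels)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def predict_logistic(w, b, x):
    """Return the discrete class: 1 if the predicted probability is at least 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Made-up one-feature data: small values belong to class 0, large values to class 1.
xs = [0.5, 1.0, 1.5, 3.0, 3.5, 4.0]
labels = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(xs, labels)
print([predict_logistic(w, b, x) for x in xs])
```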

2. SVM (Support Vector Machine)

SVM is a supervised learning algorithm.

In a sense, SVM is an enhancement of the logistic regression algorithm.

SVM can be divided into linear support vector machines and nonlinear support vector machines. Linear support vector machines solve linear classification problems, while nonlinear support vector machines solve nonlinear classification problems and require the kernel trick (kernel functions). A "kernel" is a special function whose most typical feature is that it corresponds to mapping a low-dimensional space to a high-dimensional space, and it computes inner products in that space without constructing the mapping explicitly.
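
To make the idea of mapping to a higher-dimensional space concrete, here is a small sketch using the degree-2 polynomial kernel K(x, z) = (x . z)^2 on 2-D inputs. It checks numerically that the kernel value equals the inner product of the explicitly mapped 3-D vectors. The choice of kernel, the helper names, and the sample points are assumptions for this example.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def phi(v):
    """Explicit map from 2-D to 3-D that matches the degree-2 polynomial kernel."""
    x1, x2 = v
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """K(x, z) = (x . z)^2, computed without building phi(x) or phi(z)."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = dot(phi(x), phi(z))   # inner product in the 3-D space
implicit = poly_kernel(x, z)     # same value from the 2-D inputs only
print(explicit, implicit)        # both are 121 up to floating-point rounding
```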

Advantages of SVM: not prone to overfitting, because it maximizes the margin between classes.

Disadvantages of SVM: computationally expensive, especially on large training sets.

3. Decision tree

Decision trees are supervised learning algorithms.

A decision tree is a basic classification and regression method. In a classification problem, a decision tree can be viewed as a set of if-then rules. It is composed of nodes and directed edges, and the nodes are divided into internal nodes (which represent a feature) and leaf nodes (which represent a class).

Decision tree learning consists of feature selection, decision tree generation, and decision tree pruning. Commonly used algorithms are ID3, C4.5, and CART (classification and regression tree).

  • Feature selection decides which feature is used to split the feature space; choosing features that are able to classify the training data improves the learning efficiency of the decision tree.
  • Generation starts at the root node and builds the tree recursively, choosing the split at each node by a criterion such as information gain (ID3), information gain ratio (C4.5), or the Gini index (CART).
  • Pruning addresses the overfitting of the generated tree: branches are cut back to simplify the tree.
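
The information gain used during tree generation can be sketched as follows: it is the entropy of the labels minus the weighted entropy that remains after splitting on a feature. The function names and the four-sample data set are assumptions made up for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the entropy left after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Made-up data: "outlook" separates the classes perfectly, "windy" not at all.
labels = ["yes", "yes", "no", "no"]
outlook = ["sunny", "sunny", "rainy", "rainy"]
windy = ["true", "false", "true", "false"]

gain_outlook = information_gain(outlook, labels)
gain_windy = information_gain(windy, labels)
print(gain_outlook, gain_windy)  # the feature with the larger gain is the better split
```

In this toy case a tree builder would split on `outlook` first, since it removes all uncertainty about the label while `windy` removes none.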

Advantages of decision trees: highly interpretable and easy to visualize.

Disadvantages of decision trees: prone to overfitting (mitigated by pruning), difficult to tune, and a single tree is often less accurate than other methods.

4. Naive Bayes classification

Naive Bayes classification is a supervised learning algorithm.

"Naive" means that the features are assumed to be conditionally independent of each other given the class.

Naive Bayes classifiers are a simple class of probabilistic classifiers. The Naive Bayes classifier can be used in application scenarios such as inferring the category (Y) of a product through the description (feature X) of the product.

The Naive Bayes method is a typical generative learning method: it learns the joint probability distribution P(X,Y) from the training data and then obtains the posterior probability P(Y|X). Specifically, it uses the training data to estimate P(X|Y) and P(Y), which gives the joint probability distribution P(X,Y)=P(Y)P(X|Y).

Process: during training, the prior probability P(Y) and the conditional probabilities P(X|Y) are estimated; during inference, the posterior probabilities of the classes are compared and the class with the largest one is chosen.
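
A minimal sketch of this train-then-compare process for categorical features is shown below. It counts occurrences to estimate P(Y) and P(X|Y), multiplies them under the independence assumption, and picks the class with the largest score. The add-one (Laplace) smoothing, the function names, and the toy product data are assumptions added for the example and are not taken from the text above.

```python
from collections import Counter, defaultdict


def train_nb(samples, labels):
    """Count what is needed to estimate P(Y) and each P(X_j | Y)."""
    class_counts = Counter(labels)
    # feature_counts[(j, y)][v]: number of class-y samples whose feature j equals v
    feature_counts = defaultdict(Counter)
    values = defaultdict(set)  # distinct values seen for each feature index
    for x, y in zip(samples, labels):
        for j, v in enumerate(x):
            feature_counts[(j, y)][v] += 1
            values[j].add(v)
    return class_counts, feature_counts, values


def predict_nb(model, x):
    """Return the class y that maximizes P(y) * product over j of P(x_j | y)."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    best_class, best_score = None, -1.0
    for y, count in class_counts.items():
        score = count / total  # prior P(Y = y)
        for j, v in enumerate(x):
            # Add-one smoothing (an assumption here) avoids zero probabilities.
            score *= (feature_counts[(j, y)][v] + 1) / (count + len(values[j]))
        if score > best_score:
            best_class, best_score = y, score
    return best_class


# Made-up product descriptions (color, shape) and their categories.
samples = [("red", "round"), ("red", "round"), ("yellow", "long"),
           ("yellow", "long"), ("yellow", "round")]
labels = ["apple", "apple", "banana", "banana", "apple"]

model = train_nb(samples, labels)
print(predict_nb(model, ("yellow", "long")))
print(predict_nb(model, ("red", "round")))
```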

5. KNN (K nearest neighbor) algorithm

KNN is a supervised learning algorithm.

KNN: K nearest neighbors is a basic classification and regression algorithm. The basic idea is: your class is inferred from your neighbors. The basic approach is: given a training set and an input instance, first find the K training instances nearest to the input instance, and then predict the class of the input instance by the majority class among those K instances.

The three elements of KNN: the distance measure (Euclidean distance is common), the choice of K (which reflects the trade-off between approximation error and estimation error), and the classification decision rule (majority vote is common).

A kd-tree can be used to reduce the number of distance calculations and speed up the nearest neighbor search in KNN.
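
The three elements can be seen in a short brute-force sketch: Euclidean distance as the measure, a parameter `k`, and a majority vote as the decision rule. It computes the distance to every training point rather than using a kd-tree, and the function name and data are assumptions for illustration.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Brute force: compute the Euclidean distance to every training point.
    distances = sorted(
        (math.dist(p, query), label) for p, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up 2-D data with two well-separated groups.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(train_points, train_labels, (1.5, 1.5)))
print(knn_predict(train_points, train_labels, (6.5, 6.5)))
```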

6. Adaboost algorithm

Adaboost is a supervised learning algorithm.

The boosting method learns multiple classifiers by changing the weights of the training samples and linearly combines them to improve classification performance. The basic idea is: for a complex task, a judgment obtained by properly combining the judgments of multiple experts is better than the judgment of any single expert.

Adaboost starts from a weak classification algorithm, learns repeatedly to obtain a series of weak classifiers, and combines these weak classifiers into a strong classifier.
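
The reweighting and linear combination can be sketched with one-dimensional threshold classifiers (decision stumps) as the weak learners. In each round the stump with the lowest weighted error is chosen, its coefficient is computed from that error, and the weights of misclassified samples are increased. The stump form, the function names, the number of rounds, and the ten-point data set are assumptions for this example, and labels are taken to be +1 or -1.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak classifier: `polarity` if x <= threshold, otherwise -polarity."""
    return polarity if x <= threshold else -polarity


def best_stump(xs, ys, weights):
    """Pick the threshold and polarity with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            error = sum(
                w for w, x, y in zip(weights, xs, ys)
                if stump_predict(x, threshold, polarity) != y
            )
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def adaboost(xs, ys, rounds=3):
    """Return a list of (alpha, threshold, polarity) weak classifiers."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, polarity))
        # Increase the weights of misclassified samples, decrease the others.
        weights = [
            w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
            for w, x, y in zip(weights, xs, ys)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def adaboost_predict(ensemble, x):
    """Sign of the alpha-weighted sum of the weak classifiers' votes."""
    score = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if score >= 0 else -1


# Made-up data that no single stump can classify correctly.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost(xs, ys, rounds=3)
print([adaboost_predict(ensemble, x) for x in xs])
```

On this data the best single stump misclassifies three of the ten points, while the combination of three stumps classifies all of them correctly.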

7. Clustering algorithm

Clustering algorithms belong to unsupervised learning; the most commonly used one is K-means (k-means clustering).

The basic idea of the K-means algorithm: first select the centers of K classes; compute the Euclidean distance between each sample and each center and assign the sample to the class whose center is closest, which gives a clustering result; then take the mean of the samples in each class as the new center of that class; repeat these steps until convergence.

Disadvantages of the K-means algorithm: it is an iterative heuristic and cannot guarantee a globally optimal result; the outcome depends on the choice of initial centers.
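
The assign-then-recompute loop can be sketched as follows. The initial centers are passed in by hand here, which is an assumption made to keep the example deterministic; in practice they are often chosen at random, which is one reason different runs can give different results. The function name and the data are also made up for illustration.

```python
import math


def kmeans(points, centers, max_iters=100):
    """Assign each point to its nearest center, then move centers to the cluster means."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iters):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # New center = mean of the points in the cluster (kept as-is if the cluster is empty).
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
        if new_centers == centers:  # centers stopped moving: converged
            break
        centers = new_centers
    return centers, clusters


# Made-up 2-D data with two obvious groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centers, clusters = kmeans(points, centers=[(1.0, 1.0), (8.0, 8.0)])
print(centers)
print(clusters)
```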

3. General application of machine learning

Big data analysis

Prediction

.....

4. The extension of machine learning: deep learning

In recent years, the development of machine learning has produced a new direction: "deep learning". The concept of deep learning is simple: it is the traditional neural network extended to multiple hidden layers. Neural networks went quiet for a while after the 1990s, but Geoffrey Hinton, one of the researchers who popularized the BP (backpropagation) algorithm, did not give up research on them. Once a neural network was extended beyond two hidden layers, training became very slow, so neural networks were long considered less practical than SVM. In 2006, Geoffrey Hinton published an article in the journal "Science" showing that neural networks with multiple hidden layers could be trained effectively. A neural network with multiple hidden layers is called a deep neural network, and learning research based on deep neural networks is called deep learning. Since then, research on deep learning has expanded rapidly.

References:

Introduction to the Ten Classic Algorithms of Machine Learning

Introduction to Basic Kmeans Algorithm and Its Implementation, Liam Q's Column, CSDN Blog

Li Hang. "Statistical Learning Methods"

Note: This article is mainly for my own study. Most of the content comes from many excellent bloggers and from Mr. Li Hang's "Statistical Learning Methods".

Origin blog.csdn.net/weixin_44570845/article/details/122365860