Five Schools of Machine Learning and Nine Common Algorithms

Pattern recognition, machine learning, and deep learning represent three distinct schools of thought.

Pattern recognition is the oldest (and arguably outdated as a term).

Machine learning is fundamental (one of the hottest areas in startups and research labs today).

Deep learning is so new and so influential a frontier that we have not yet even begun to think about a post-deep-learning era.

1) Machine learning continues to hold its head high like a real champion;

2) At the beginning, pattern recognition was mainly used as a synonym for machine learning;

3) Pattern recognition is slowly declining and dying;

4) Deep learning is a new and rapidly rising field.

Pattern Recognition: The Birth of Smart Programs

Machine Learning: Intelligent Programs That Learn from Examples

Deep Learning: The Architecture That Unified the Field

 

1. How Machine Learning Works

① Select data: divide your data into three groups: training data, validation data, and test data

② Model the data: use the training data to build a model with the relevant features

③ Validate the model: use your validation data to assess the model

④ Test the model: use your test data to check the performance of the validated model

⑤ Use the model: use the fully trained model to make predictions on new data

⑥ Tune the model: use more data, different features, or adjusted parameters to improve performance (the full workflow is sketched below)
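
Taken together, the six steps map almost directly onto a few lines of code. Below is a minimal sketch, assuming scikit-learn; the synthetic dataset, the logistic-regression model, and the 60/20/20 split are illustrative choices, not prescriptions.

```python
# Minimal sketch of the six-step workflow, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

# Step 1: select data -- split into training, validation, and test sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.4, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)

# Step 2: model the training data.
model = LogisticRegression().fit(X_train, y_train)

# Step 3: validate -- assess the model on the validation data.
print("validation accuracy:", model.score(X_val, y_val))

# Step 4: test the validated model once on held-out test data.
print("test accuracy:", model.score(X_test, y_test))

# Step 5: use the trained model on new data.
prediction = model.predict(X_test[:1])

# Step 6: tune -- refit with more data, different features, or adjusted
# parameters, judging each variant by its validation score.
```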


2. The Five Schools

① Symbolists: represent knowledge with symbols, rules, and logic, and reason over them logically. Favorite algorithms: rules and decision trees.

② Bayesians: assess the probability of occurrence and reason probabilistically. Favorite algorithms: naive Bayes and Markov models.

③ Connectionists: recognize and generalize patterns dynamically using probability matrices and weighted neurons. Favorite algorithm: neural networks.

④ Evolutionaries: generate variations and then select the best for a specific goal. Favorite algorithm: genetic algorithms.

⑤ Analogizers: optimize a function subject to constraints (climb as high as possible, but don't leave the road). Favorite algorithm: support vector machines.


3. Nine Common Methods

1. Decision tree: a typical decision tree works through hierarchical variables, or decision nodes, answering one question per step, for example to classify a given user as trustworthy or unreliable.

Strengths: good at evaluating a range of different features, qualities, and characteristics of people, places, and things

Scenario examples: rule-based credit assessment, horse-racing result prediction
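
As a concrete illustration of the credit-assessment example, here is a minimal sketch assuming scikit-learn; the two features (income, late payments) and the labels are invented:

```python
# Hypothetical credit-assessment sketch, assuming scikit-learn.
# Features: [annual_income, late_payments]; label: 1 = trustworthy.
from sklearn.tree import DecisionTreeClassifier

X = [[60_000, 0], [25_000, 4], [90_000, 1], [30_000, 6], [75_000, 0]]
y = [1, 0, 1, 0, 1]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)  # shallow tree = readable rules
print(tree.predict([[50_000, 2]]))                    # classify a new applicant
```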

 

2. Support vector machine: a support vector machine uses a hyperplane to separate data into classes.

Strengths: SVMs are good at binary classification between a variable X and the other variables, whether or not the relationship is linear

Scenario examples: news classification, handwriting recognition
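
A minimal sketch, assuming scikit-learn; the RBF kernel is one common choice for non-linear boundaries:

```python
# Minimal binary-classification sketch with an SVM, assuming scikit-learn.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=2, n_redundant=0, random_state=0)
clf = SVC(kernel="rbf").fit(X, y)   # fits a separating hyperplane in kernel space
print(clf.predict(X[:5]))           # use kernel="linear" for linear relationships
```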

 

3. Regression: regression describes the relationship between a dependent variable and one or more independent variables. In this example, it distinguishes spam from non-spam.

Strengths: regression can identify continuous relationships between variables, even when the relationship is not very obvious

Scenario examples: road traffic flow analysis, mail filtering
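
For the spam example, the relevant variant is logistic regression, which maps a weighted sum of features to a probability. A minimal sketch, assuming scikit-learn; the two features (link count, exclamation marks) are invented:

```python
# Hypothetical spam-filter sketch using logistic regression, assuming scikit-learn.
# Features: [number_of_links, exclamation_marks]; label: 1 = spam.
from sklearn.linear_model import LogisticRegression

X = [[0, 0], [1, 1], [7, 9], [0, 1], [9, 4], [8, 8]]
y = [0, 0, 1, 0, 1, 1]

reg = LogisticRegression().fit(X, y)
print(reg.predict_proba([[5, 3]]))  # [P(not spam), P(spam)] for a new mail
```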

 

4. Naive Bayes classification: a naive Bayes classifier computes the probability of each branch in a tree of possible conditions. Each feature is "naive", that is, conditionally independent, so no feature affects the others. For example: in a jar holding five balls, two yellow and three red, what is the probability of drawing two yellow balls in a row? Following the top branch of the probability tree, the probability of drawing yellow twice is 2/5 × 1/4 = 1/10. A naive Bayes classifier computes joint conditional probabilities over multiple features in the same way.

Strengths: naive Bayes can quickly classify related objects on small datasets with prominent features

Scenario examples: sentiment analysis, consumer classification
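
Both halves of the example follow the same product rule: the jar multiplies probabilities along a branch, and the classifier multiplies per-feature probabilities under the independence assumption. A minimal sketch, assuming scikit-learn for the classifier; the toy word counts are invented:

```python
from fractions import Fraction
from sklearn.naive_bayes import MultinomialNB

# Jar example: 2 yellow and 3 red balls, drawn without replacement.
p_two_yellow = Fraction(2, 5) * Fraction(1, 4)
print(p_two_yellow)                                 # 1/10

# Classifier example: per-document counts of 3 toy words, assuming scikit-learn.
X = [[3, 0, 1], [0, 2, 0], [2, 1, 0], [0, 3, 1]]
y = ["positive", "negative", "positive", "negative"]
nb = MultinomialNB().fit(X, y)
print(nb.predict([[1, 0, 2]]))                      # multiplies per-word probabilities
```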

 

5. Hidden Markov model: an explicit Markov process is completely deterministic: a given state is always followed by another given state. Traffic lights are an example. A hidden Markov model, by contrast, infers hidden states by analyzing observable data, and can then use that hidden-state analysis to estimate likely future observation patterns. In this example, the probability of high or low air pressure (the hidden state) can be used to predict the probability of sunny, rainy, or cloudy days.

Strengths: tolerates data variability; suited to recognition and prediction tasks

Scenario examples: facial expression analysis, weather forecasting
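
A minimal sketch of the pressure/weather example using the forward algorithm; every probability below is an invented placeholder, not a value from the text:

```python
import numpy as np

# Hidden states: 0 = high pressure, 1 = low pressure (not directly observed).
# Observations: 0 = sunny, 1 = cloudy, 2 = rainy. All probabilities are invented.
start = np.array([0.6, 0.4])            # P(initial hidden state)
trans = np.array([[0.7, 0.3],           # P(next state | current state)
                  [0.4, 0.6]])
emit  = np.array([[0.6, 0.3, 0.1],      # P(observation | hidden state)
                  [0.1, 0.3, 0.6]])

def forward(obs):
    """Forward algorithm: belief over the hidden state given observations so far."""
    alpha = start * emit[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]
    return alpha / alpha.sum()

print(forward([0, 0, 2]))  # belief over pressure after sunny, sunny, rainy
```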

 

6. Random forest: the random forest algorithm improves on the accuracy of a single decision tree by using multiple trees, each trained on a randomly selected subset of the data. This example examines, at the gene-expression level, a large number of genes associated with breast cancer recurrence and computes the risk of recurrence.

Strengths: random forests are useful for large datasets and for items with many, sometimes uncorrelated, features

Scenario examples: user churn analysis, risk assessment
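
A minimal sketch, assuming scikit-learn; the synthetic dataset stands in for the gene-expression data:

```python
# Minimal sketch, assuming scikit-learn: many trees, each fit on a random
# subset of rows and features, vote on the final class.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=30, random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict(X[:3]))  # majority vote across the 100 trees
```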

 

7. Recurrent neural network: in any neural network, each neuron converts many inputs into a single output through one or more hidden layers. A recurrent neural network (RNN) additionally passes values forward from one time step to the next, which makes learning across a sequence possible. In other words, an RNN has a form of memory that lets earlier outputs influence later inputs.

Strengths: RNNs have predictive power when large amounts of sequential information are available

Scenario examples: image classification and captioning, political sentiment analysis
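
A minimal numpy sketch of the recurrence, with untrained random weights and arbitrary sizes:

```python
import numpy as np

def rnn_step(x, h_prev, W_xh, W_hh, b):
    """One RNN step: the new hidden state mixes the current input with
    the previous hidden state -- this recurrence is the network's memory."""
    return np.tanh(W_xh @ x + W_hh @ h_prev + b)

rng = np.random.default_rng(0)
W_xh, W_hh, b = rng.normal(size=(16, 8)), rng.normal(size=(16, 16)), np.zeros(16)

h = np.zeros(16)                       # initial (empty) memory
for x in rng.normal(size=(5, 8)):      # a sequence of 5 input vectors
    h = rnn_step(x, h, W_xh, W_hh, b)  # each step sees all earlier steps via h
```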

 

8. Long short-term memory (LSTM) and gated recurrent unit (GRU) networks: early forms of RNN are lossy, retaining only a small amount of earlier information. More recent long short-term memory (LSTM) and gated recurrent unit (GRU) networks have both long-term and short-term memory. In other words, these newer RNNs have better control over memory: they can retain or reset earlier values across many processing steps as needed, which avoids "gradient decay" (the vanishing-gradient problem), in which values passed from layer to layer gradually degrade away. LSTM and GRU networks implement this control with memory blocks and structures called "gates", which pass or reset values as appropriate.

Strengths: LSTM and GRU networks have the same strengths as other recurrent neural networks, but are used more often because of their better memory

Scenario examples: natural language processing, translation
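
A minimal numpy sketch of one LSTM step, showing the three gates explicitly; the weights are untrained and the sizes arbitrary:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W, U, b hold the stacked parameters of the four gates."""
    z = W @ x + U @ h_prev + b
    H = h_prev.size
    f = sigmoid(z[0:H])         # forget gate: reset old memory
    i = sigmoid(z[H:2*H])       # input gate: pass new values
    o = sigmoid(z[2*H:3*H])     # output gate: expose memory
    g = np.tanh(z[3*H:4*H])     # candidate cell values
    c = f * c_prev + i * g      # gated long-term memory
    h = o * np.tanh(c)          # short-term (working) output
    return h, c

H, D = 16, 8
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4*H, D)), rng.normal(size=(4*H, H)), np.zeros(4*H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):   # a sequence of 5 input vectors
    h, c = lstm_step(x, h, c, W, U, b)
```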

 

9. Convolutional neural network: "convolutional" refers to the blending of weights across successive layers, the result of which can be used to label the output. In practice, each convolutional layer slides a small filter of shared weights over its input to build feature maps.

Strengths: convolutional neural networks are very useful with very large datasets, large numbers of features, and complex classification tasks

Scenario examples: image recognition, text-to-speech, drug discovery
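
A minimal numpy sketch of a single convolution: one small filter of shared weights slides across the image, and its responses form a feature map. The vertical-edge filter below is a classic illustrative choice:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution (cross-correlation, as deep-learning
    libraries implement it): slide one shared-weight kernel over the image."""
    kh, kw = kernel.shape
    out = np.empty((image.shape[0] - kh + 1, image.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.zeros((6, 6)); image[:, 3:] = 1.0  # left half dark, right half bright
edge = np.array([[1., 0., -1.]] * 3)          # vertical-edge filter
print(conv2d(image, edge))                    # strong response along the edge
```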
