Machine Learning Algorithms/Models-General Chapter

1. Introduction to Machine Learning

1.1 The concept of machine learning

Introduction to machine learning
Introduction to machine learning (advanced)

1.2 The framework of machine learning

  • Model function: the mathematical form that maps inputs to predictions
  • Objective function: measures the model's error
    Objective function = error sum + penalty term
    (structural risk)
  • Optimization algorithm: solves for the parameters that minimize the objective function
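
The three parts of the framework can be sketched on a toy problem. The example below is only an illustration under stated assumptions: the straight-line model, the L2 penalty, the learning rate, and the names `model`, `objective`, and `fit` are all hypothetical choices, not taken from the linked articles.

```python
# Model function: a straight line y = w * x + b (hypothetical toy model).
def model(x, w, b):
    return w * x + b

# Objective function = error sum + penalty term (here an assumed L2 penalty on w).
def objective(xs, ys, w, b, lam):
    error_sum = sum((model(x, w, b) - y) ** 2 for x, y in zip(xs, ys))
    penalty = lam * w ** 2
    return error_sum + penalty

# Optimization algorithm: plain gradient descent on the objective.
def fit(xs, ys, lam=0.1, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    for _ in range(steps):
        residuals = [model(x, w, b) - y for x, y in zip(xs, ys)]
        grad_w = sum(2 * r * x for r, x in zip(residuals, xs)) + 2 * lam * w
        grad_b = sum(2 * r for r in residuals)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # generated from y = 2x + 1
w, b = fit(xs, ys)
print(w, b, objective(xs, ys, w, b, lam=0.1))
```

The penalty pulls `w` slightly below the slope that generated the data, which is the trade-off the penalty term is meant to introduce.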

2. Classification and regression: supervised learning

Machine learning models fall into two categories, supervised learning and unsupervised learning, depending on how the model is trained. Supervised learning is further divided into classification and regression methods, depending on the learning goal.

2.1 Linear regression

Machine learning algorithm/model-linear regression

2.2 Logistic regression

Machine learning algorithm/model-logistic regression

2.3 Support Vector Machine

Machine learning algorithm/model-support vector machine

2.4 Decision tree

Machine learning algorithm/model-decision tree

2.5 Naive Bayes

Machine learning algorithm/model-Naive Bayes classification

3. Clustering: unsupervised learning

Clustering methods fall into four categories:
partition-based, hierarchical, density-based, and model-based.
The main ones are partition-based and model-based (the other two are very slow). Typical examples of these two are K-Means and GMM.

3.1 K-means

Machine learning algorithms/models-supervised to unsupervised (clustering): from KNN to K-means
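
As a rough illustration of partition-based clustering, here is a minimal K-means sketch in plain Python. The function name `kmeans`, the fixed iteration count, and the hand-picked initial centers are assumptions made for this example; real implementations usually choose initial centers more carefully and stop on convergence.

```python
def kmeans(points, centers, iterations=10):
    """Alternate between assigning points and moving centers."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            distances = [
                sum((a - c) ** 2 for a, c in zip(p, center)) for center in centers
            ]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for center, cluster in zip(centers, clusters):
            if cluster:
                new_centers.append(
                    tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
                )
            else:
                new_centers.append(center)  # leave an empty cluster's center alone
        centers = new_centers
    return centers, clusters

points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centers, clusters = kmeans(points, centers=[(1, 1), (8, 8)])
print(centers)
```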

3.2 Gaussian Mixture Model (GMM)

The Gaussian mixture model is a clustering method based on the high-dimensional Gaussian density function. Suppose the points to be clustered are drawn from a mixture of Gaussian distributions. We need to find the set of parameters that maximizes the probability (the likelihood) of generating these data points.
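
One common way to search for such parameters is the EM algorithm. The sketch below is a simplified one-dimensional, two-component version. The data, the starting parameters, the variance floor, and the names `gaussian_pdf`, `log_likelihood`, and `em_step` are all assumptions for illustration, and the post itself does not say which fitting procedure it has in mind.

```python
import math

def gaussian_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def log_likelihood(data, weights, means, variances):
    # Log-probability of the data under the current mixture parameters.
    return sum(
        math.log(sum(w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )

def em_step(data, weights, means, variances):
    # E-step: responsibility of each component for each point.
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate parameters from responsibility-weighted points.
    new_w, new_m, new_v = [], [], []
    for k in range(len(weights)):
        nk = sum(r[k] for r in resp)
        mean = sum(r[k] * x for r, x in zip(resp, data)) / nk
        var = sum(r[k] * (x - mean) ** 2 for r, x in zip(resp, data)) / nk
        new_w.append(nk / len(data))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # assumed floor to avoid a collapsing component
    return new_w, new_m, new_v

data = [0.8, 1.0, 1.2, 0.9, 1.1, 4.9, 5.0, 5.2, 5.1, 4.8]
params = ([0.5, 0.5], [0.0, 6.0], [1.0, 1.0])  # hypothetical starting guess
history = [log_likelihood(data, *params)]
for _ in range(20):
    params = em_step(data, *params)
    history.append(log_likelihood(data, *params))
weights, means, variances = params
print(means, history[0], history[-1])
```

Each point's responsibilities act as a soft cluster assignment, which is the main difference from the hard assignment used by K-means.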

4. Dimensionality reduction: unsupervised learning

After covering the basics of regression, classification, and clustering, it is time to look at the data fed to these algorithms. In the earlier introductions, the data had usually already been preprocessed, for example normalized or reduced in dimensionality. The quality of data preprocessing is crucial to solving the final problem, so it is also an important part of machine learning.

In the field of machine learning, dimensionality reduction refers to the use of a certain mapping method to map data samples in the original high-dimensional space to the low-dimensional space.

The essence of dimensionality reduction is to learn a mapping function y=f(x), where x represents the original high-dimensional data and y represents the mapped low-dimensional data.
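
To make the mapping y=f(x) concrete, the sketch below takes f to be a linear projection of 2-D points onto their direction of largest variance, which is the idea behind PCA. It is a simplified illustration: the function name `pca_2d_to_1d` and the sample points are assumptions, and the closed-form angle only applies to the 2-D case.

```python
import math

def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal direction."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 covariance matrix.
    sxx = sum(x * x for x, _ in centered) / n
    syy = sum(y * y for _, y in centered) / n
    sxy = sum(x * y for x, y in centered) / n
    # Closed-form angle of the leading eigenvector of a 2x2 symmetric matrix.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))
    # y = f(x): each 2-D sample becomes one coordinate along that direction.
    projected = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, projected

points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)]
direction, projected = pca_2d_to_1d(points)
print(direction, projected)
```

Because these sample points lie exactly on a line, the one-dimensional output keeps all of their variance; with noisy data some information is lost in the projection.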

4.1 Principal component analysis (PCA)

Machine learning algorithm/model-supervised to unsupervised (dimensionality reduction): principal component analysis (PCA)


Matrix Factorization PCA

Singular Value Decomposition (SVD) PCA

5. Ensemble learning

Ensemble learning in machine learning

6. Stage summary

Summary of the learning phase:
Machine learning algorithm/model-phase summary (1): model framework
Machine learning algorithm/model-phase summary (2): key concepts/techniques
Machine learning algorithm/model-phase summary (3): interview test points
Machine learning algorithm/model-phase summary (4): higher level

Deeper understanding:
Hypothesis function and model generalization: bias, variance, noise
"Distance", "norm", and norm regularization
Linear model summary: can't distinguish between linear regression and linear classification models?
Generative model or discriminative model?
Parameter learning: the difference between LR and SVM?
Linear separability: linear (binary classification) models
Classification loss functions (margin loss functions), taking binary classification as an example
Objective function: what do the empirical loss (loss function) and the structural loss (regularization term) do?

New knowledge point record:
2020 machine learning knowledge point record

7. Code

7.1 Accumulation

Machine learning code practice-data-how to quickly obtain the required experimental data

7.2 clearance

Machine Learning-Sklearn Study Notes-General Chapter

Origin blog.csdn.net/Robin_Pi/article/details/104381410