Theoretical concepts related to machine learning

Introduction:

  • Machine learning studies how computers simulate or implement human learning behavior so that they can continuously improve their performance. It covers the algorithms and applications that discover the value in data, and it is one of the most exciting fields in computer science.
  • Purpose of machine learning: to obtain knowledge from data through self-learning algorithms, and then use that knowledge to predict the future.
  • Applications of machine learning: speech recognition, driverless cars, spam filtering

1. Rule-based learning and model-based learning

  • Rule-based learning
    Before machine learning, spam filtering relied on hand-written rules,
    but rule-based learning has certain disadvantages:

    1. The rules change over time, so they have to be updated by hand
    2. The influence of human factors on the results cannot be avoided
  • Model-based learning

  • What does machine learning learn? Machine learning is model-based learning. It learns the parameters of a model: taking y = kx + b as an example, once k and b are determined, the result y is known for any input x.

Machine learning model = data + machine learning algorithm
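
As a minimal sketch of "model = data + algorithm", the snippet below learns k and b for y = kx + b from a handful of points. The data values, the function names `fit_line` and `predict`, and the choice of ordinary least squares as the algorithm are assumptions made for illustration; the original post does not say which algorithm is used.

```python
def fit_line(xs, ys):
    """Return (k, b) that minimize the squared error of y = k*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form ordinary least squares for a single feature.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    k = cov_xy / var_x
    b = mean_y - k * mean_x
    return k, b


# Hypothetical historical data, generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

# "Learning" means determining the parameters k and b from the data.
k, b = fit_line(xs, ys)


def predict(x):
    """Use the learned parameters to predict y for a new x."""
    return k * x + b


print(k, b)           # 2.0 1.0
print(predict(10.0))  # 21.0
```

Once k and b are fixed, the model can answer for inputs it has never seen, which is the sense in which the result is "known" after the parameters are determined.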

  • Problems that are not machine learning problems:
    • 1- Deterministic problems
    • 2- Statistical problems
  • Machine learning problems: historical data is combined with an algorithm to produce a predictive model or rule, which is then used for predictive analysis
    • 1- Recommendation scenarios
    • 2- Facebook people tagging
    • 3- Prediction scenarios
2. Basic concepts of machine learning data sets:

  • The Iris data set is a classic data set in the field of machine learning

The Iris data set contains 150 samples and 4 features, so it is written as a 150×4 matrix.
Generally, lowercase letters represent vectors, and uppercase letters represent matrices.
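
The matrix notation can be sketched in a few lines. The six rows below are illustrative values in the style of Iris measurements (the full data set has 150 rows), and the variable names are assumptions chosen to follow the uppercase-matrix, lowercase-vector convention.

```python
# Feature matrix X: one row per sample, one column per feature.
# Columns: sepal length, sepal width, petal length, petal width (cm).
X = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
]

# Label vector y: one class label per sample.
y = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

n_samples = len(X)
n_features = len(X[0])
print((n_samples, n_features))  # (6, 4); the full data set would give (150, 4)

# A single sample is a row vector of the matrix.
x_0 = X[0]

# A single feature is a column of the matrix.
petal_length = [row[2] for row in X]
```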

3. Classification of machine learning

  • Supervised learning: learning with a label column. The machine is given a large amount of labeled data and learns from it on its own.
  1. Classification problems predict discrete values
  2. Regression problems predict continuous values
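
The difference between the two supervised tasks can be sketched with one technique applied both ways. The example below uses k-nearest neighbors, which is an assumption made for illustration (the post does not name an algorithm), and the data values are hypothetical. Classification returns a discrete label by majority vote; regression returns a continuous value by averaging.

```python
from collections import Counter


def k_nearest(train_x, query, k):
    """Indices of the k training values closest to query (one feature)."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    return order[:k]


def knn_classify(train_x, labels, query, k=3):
    """Classification: majority label among the k nearest neighbors."""
    votes = Counter(labels[i] for i in k_nearest(train_x, query, k))
    return votes.most_common(1)[0][0]


def knn_regress(train_x, targets, query, k=3):
    """Regression: mean target value of the k nearest neighbors."""
    idx = k_nearest(train_x, query, k)
    return sum(targets[i] for i in idx) / k


# Hypothetical labeled data: petal length (cm) is the only feature.
petal_length = [1.3, 1.4, 1.5, 4.4, 4.5, 4.7]
species = ["setosa", "setosa", "setosa",
           "versicolor", "versicolor", "versicolor"]  # discrete label column
petal_width = [0.2, 0.2, 0.3, 1.3, 1.5, 1.4]          # continuous label column

print(knn_classify(petal_length, species, 4.6))               # versicolor
print(round(knn_regress(petal_length, petal_width, 4.6), 2))  # 1.4
```

The same features are used in both calls; only the type of the label column changes, and that is what decides whether the problem is classification or regression.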

  • Unsupervised learning: learning from data that has no label column.
  1. Clustering: objects that are identical or highly similar are grouped together
    Clustering is an exploratory data analysis technique. In the absence of any prior information (that is, when little is known about the data), it helps us divide the data into meaningful small groups, also called clusters. Members of the same cluster share a certain degree of similarity, while members of different clusters differ considerably. Because no labels are required, clustering is treated as unsupervised learning.
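
A minimal clustering sketch follows. It uses k-means, one common clustering algorithm chosen here as an assumption (the post does not name one). The points and the starting centroids are hypothetical, and the starting centroids are fixed so the run is deterministic.

```python
def kmeans(points, centroids, n_iter=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            (sum(p[0] for p in cl) / len(cl), sum(p[1] for p in cl) / len(cl))
            if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical unlabeled data with two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
for cluster in clusters:
    print(cluster)
```

No labels are passed in: the grouping comes only from the distances between points.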

  2. Dimensionality reduction for data compression
    Data is often high-dimensional, which poses a challenge both to limited storage space and to the performance of machine learning algorithms. Dimensionality reduction compresses the data into fewer features while keeping most of the relevant information.
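
As a rough sketch of dimensionality reduction, the snippet below compresses 2-D data to 1-D by projecting it onto the direction of largest variance. This is the idea behind principal component analysis (PCA), chosen here as an assumption because the post does not name a technique. The data and function names are hypothetical, and power iteration is used only to avoid external libraries.

```python
def first_principal_component(rows, n_iter=100):
    """Mean and unit direction of largest variance for 2-D data."""
    n = len(rows)
    mean = [sum(r[j] for r in rows) / n for j in range(2)]
    centered = [[r[j] - mean[j] for j in range(2)] for r in rows]
    # 2x2 sample covariance matrix.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(2)]
           for i in range(2)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    v = [1.0, 0.0]
    for _ in range(n_iter):
        w = [cov[0][0] * v[0] + cov[0][1] * v[1],
             cov[1][0] * v[0] + cov[1][1] * v[1]]
        norm = (w[0] ** 2 + w[1] ** 2) ** 0.5
        v = [w[0] / norm, w[1] / norm]
    return mean, v


def project(rows, mean, v):
    """Replace each 2-D row with its single coordinate along v."""
    return [(r[0] - mean[0]) * v[0] + (r[1] - mean[1]) * v[1] for r in rows]


# Hypothetical data lying close to the line y = x.
rows = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

mean, v = first_principal_component(rows)
compressed = project(rows, mean, v)
print(v)           # roughly equal components, since the data follows y = x
print(compressed)  # one number per sample instead of two
```

Each sample is now stored as one number instead of two, at the cost of discarding the small variation perpendicular to the main direction.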

  • Semi-supervised learning: part of the data is labeled and part of the data is unlabeled

  • Disadvantage: labeling introduces expert knowledge, so the influence of the experts' judgment on the results needs to be avoided

  • Approach based on the clustering assumption:

    • First, the data set is partly labeled and partly unlabeled. The labeled and unlabeled data are clustered together, so that highly similar samples fall into the same group and dissimilar samples fall into different groups.
    • After clustering, if a group contains both labeled and unlabeled samples, the labels of the labeled samples are counted and, by majority vote, the most frequent label is assigned to the unlabeled samples in that group. In this way the unlabeled samples can be labeled, which turns the task into supervised learning.
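
The majority-vote step described above can be sketched as follows. The cluster assignments are assumed to come from an earlier clustering run, and the function name `propagate_labels`, the labels, and the use of `None` for "unlabeled" are all hypothetical choices for illustration.

```python
from collections import Counter


def propagate_labels(cluster_ids, labels):
    """Fill unlabeled samples (None) with the majority label of their cluster."""
    # Count the known labels inside each cluster.
    votes = {}
    for cid, lab in zip(cluster_ids, labels):
        if lab is not None:
            votes.setdefault(cid, Counter())[lab] += 1
    # Assign the winning label to every unlabeled sample in that cluster.
    filled = []
    for cid, lab in zip(cluster_ids, labels):
        if lab is None and cid in votes:
            lab = votes[cid].most_common(1)[0][0]
        filled.append(lab)
    return filled


# Hypothetical result of clustering seven samples into two groups.
cluster_ids = [0, 0, 0, 0, 1, 1, 1]
labels = ["spam", "spam", "ham", None, "ham", None, None]

print(propagate_labels(cluster_ids, labels))
# ['spam', 'spam', 'ham', 'spam', 'ham', 'ham', 'ham']
```

In this sketch a cluster that contains no labeled sample is left unlabeled, and a tie goes to the label seen first in that cluster; the post does not say how either case should be handled.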
  • Reinforcement learning: an important branch of machine learning, mainly used to solve sequential decision-making problems
  1. Autonomous driving
  2. AlphaGo (the game of Go)

Extension:
In addition to the learning methods above, there are also methods such as deep learning and transfer learning. Generally, deep learning extracts features, reinforcement learning solves sequential decision-making problems, and transfer learning addresses model adaptability.

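Summary of machine learning classification:

  • Supervised learning: classification (discrete values) and regression (continuous values)
  • Unsupervised learning: clustering and dimensionality reduction
  • Semi-supervised learning: part of the data is labeled, part is unlabeled
  • Reinforcement learning: sequential decision-making problems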

Origin blog.csdn.net/m0_49834705/article/details/112849050