Semi-supervised learning: pseudo-labeling

Semi-supervised learning is a method that learns patterns from both labeled data (as in supervised learning) and unlabeled data (as in unsupervised learning).

To train a supervised machine learning model, we need labeled data. Does this mean that unlabeled data is useless for supervised tasks such as classification and regression? Of course not! Besides using the extra data for data analysis, you can also combine unlabeled data with labeled data to train a semi-supervised model.

The main idea of this method is simple.
First, train a model on the labeled data, then use the trained model to predict labels for the unlabeled data. These predictions are the pseudo labels. The labeled data and the newly pseudo-labeled data are then combined into a new training set.
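
The idea can be sketched in a few lines of Python. This is only a toy illustration, not a reference implementation: the nearest-centroid classifier, the helper names (`fit_centroids`, `predict`) and the 1-D data are all assumptions made up for the example, and in practice you would plug in whatever model you actually use.

```python
# Minimal pseudo-labeling sketch with a toy nearest-centroid classifier.
# The classifier, helper names and data are illustrative assumptions.

def fit_centroids(xs, ys):
    """Return {label: mean of the 1-D points carrying that label}."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(pts) / len(pts) for label, pts in groups.items()}

def predict(centroids, xs):
    """Give each point the label of its nearest centroid."""
    return [min(centroids, key=lambda label: abs(x - centroids[label]))
            for x in xs]

# A few labeled points and a pool of unlabeled ones.
labeled_x = [0.0, 1.0, 9.0, 10.0]
labeled_y = ["a", "a", "b", "b"]
unlabeled_x = [0.5, 2.0, 3.0, 7.0, 8.0, 9.5]

# Train on the labeled data only.
model = fit_centroids(labeled_x, labeled_y)

# Predict labels for the unlabeled data: these are the pseudo labels.
pseudo_y = predict(model, unlabeled_x)

# Combine labeled and pseudo-labeled data into a new training set and retrain.
combined_x = labeled_x + unlabeled_x
combined_y = labeled_y + pseudo_y
retrained = fit_centroids(combined_x, combined_y)

print(pseudo_y)
print(retrained)
```

Note that the retrained model is only as good as the pseudo labels it was given: wrong predictions from the first model are fed back in as if they were true labels.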

Process:

  1. Split the labeled data into two parts, train_set and validation_set, and train the best model, model1.
  2. Use model1 to predict labels for the unlabeled data (test_set). These predictions are the pseudo labels (pseudo-labeled).
  3. Hold out part of train_set as a new validation_set, merge the rest with the pseudo-labeled data to form a new train_set, and train the best model, model2.
  4. Use model2 to predict the unlabeled data (test_set) again to obtain the final labels.
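
The four steps above can be sketched roughly as follows. Treat this as a hedged outline under stated assumptions: the toy nearest-centroid model, the function names (`pseudo_label_process`, `split`, `accuracy`), the 25% validation fraction and the data are all invented for illustration. "Training the best model" is reduced here to a single fit plus a validation score, so a real pipeline would add model selection or hyperparameter tuning at steps 1 and 3.

```python
import random

def fit_centroids(rows):
    """Toy stand-in model: per-label mean of 1-D features. rows = [(x, y), ...]."""
    groups = {}
    for x, y in rows:
        groups.setdefault(y, []).append(x)
    return {label: sum(pts) / len(pts) for label, pts in groups.items()}

def predict(model, xs):
    """Give each point the label of its nearest centroid."""
    return [min(model, key=lambda label: abs(x - model[label])) for x in xs]

def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the given label."""
    preds = predict(model, [x for x, _ in rows])
    return sum(p == y for p, (_, y) in zip(preds, rows)) / len(rows)

def split(rows, val_fraction, rng):
    """Shuffle a copy of rows and return (train part, validation part)."""
    rows = rows[:]
    rng.shuffle(rows)
    n_val = max(1, int(len(rows) * val_fraction))
    return rows[n_val:], rows[:n_val]

def pseudo_label_process(labeled, test_x, val_fraction=0.25, seed=0):
    rng = random.Random(seed)

    # Step 1: split the labeled data, train model1, score it on validation_set.
    train_set, validation_set = split(labeled, val_fraction, rng)
    model1 = fit_centroids(train_set)
    score1 = accuracy(model1, validation_set)

    # Step 2: predict test_set with model1 to get the pseudo-labeled data.
    pseudo_labeled = list(zip(test_x, predict(model1, test_x)))

    # Step 3: hold out a new validation_set from train_set, merge the rest
    # with the pseudo-labeled data, and train model2 on the merged set.
    rest, new_validation_set = split(train_set, val_fraction, rng)
    model2 = fit_centroids(rest + pseudo_labeled)
    score2 = accuracy(model2, new_validation_set)

    # Step 4: predict test_set again with model2 to get the final labels.
    final_labels = predict(model2, test_x)
    return final_labels, score1, score2

# Toy data: two well-separated classes and a handful of unlabeled points.
labeled = ([(0.5 * i, "a") for i in range(8)]
           + [(10 + 0.5 * i, "b") for i in range(8)])
test_x = [1.0, 2.0, 11.0, 12.0, 3.0, 10.5]

final_labels, score1, score2 = pseudo_label_process(labeled, test_x)
print(final_labels, score1, score2)
```

The validation sets contain only truly labeled rows, never pseudo-labeled ones, so score1 and score2 are both measured against real labels.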

https://blog.csdn.net/leolotus/article/details/78163006?utm_medium=distribute.pc_relevant.none-task-blog-OPENSEARCH-1.control&depth_1-utm_source=distribute.pc_relevant.none-task-blog-OPENSEARCH-1.control


Origin blog.csdn.net/weixin_42764932/article/details/112910467