Transfer Learning
- Earlier layers learn surface features (generic, transferable); later layers learn deep features (task-specific, harder to transfer)
- Through a discriminator, the real and fake distributions are forced to become similar
- Developed from Generative Adversarial Networks
Transfer: learn in the source domain, then apply in the target domain
- Why transfer?
There is an intersection between the source and target domains: common features
Adaptation: traditional loss function, feature adaptation, model adaptation
- Discrepancy-based (loss-function based)
- Adversarial-based
- Reconstruction-based (CycleGAN)
- BYOL self-supervised model
Fine-tuning
- The target domain usually has little data, so the network is prone to overfitting
Therefore, fine-tuning takes the model trained on the source domain and only lightly adjusts it on the target domain.
Note: the model must still be trained on the target domain!
To prevent overfitting: freeze the first few layers and the last few layers, and adjust only the middle layers
Which layers should be fine-tuned?
- What tuning means: some layers' weights are fixed (no backpropagation updates for them), while the remaining layers are updated by backpropagation
Speech: fine-tune the last few layers
NLP: fine-tune the first few layers
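The freezing scheme above can be sketched in plain Python; the layer names and the SGD step here are hypothetical, just to illustrate skipping updates for frozen layers:

```python
# Minimal sketch of fine-tuning with frozen layers (hypothetical layer names).
FROZEN = {"conv1", "fc"}  # assumption: freeze the first and the last layer

def sgd_step(params, grads, lr=0.1):
    """Apply one SGD step, skipping layers listed in FROZEN."""
    return {name: (w if name in FROZEN else w - lr * grads[name])
            for name, w in params.items()}
```

In a real framework (e.g. PyTorch) the same effect is achieved by setting `requires_grad = False` on the frozen parameters.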
- What if the source domain has labels but the target domain does not? (The target domain has an unlabeled training set and a labeled test set)
DDC (Deep Domain Confusion)
- Compute the MMD (Maximum Mean Discrepancy) between the m source-domain samples and the n target-domain samples, to measure how similar the two feature distributions are
- The effect is not good
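A minimal sketch of a squared MMD with a linear kernel, which reduces to the distance between the two domains' feature means; DDC's actual loss is computed on learned network features, so this is only illustrative:

```python
import numpy as np

def linear_mmd2(source, target):
    """Squared MMD with a linear kernel: squared distance between
    the mean feature vectors of the source and target samples."""
    delta = source.mean(axis=0) - target.mean(axis=0)
    return float(delta @ delta)
```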
DAN (Deep Adaptation Network)
RTN: Residual Transfer Network
- On target-domain inputs, the first few layers are kept relatively close to the source network
- Different tasks in the target domain get different branches; the classifier branch adds a residual connection
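The residual branch idea can be written in one line: the target classifier equals the source classifier plus a learned residual correction (the function names here are hypothetical):

```python
def target_classifier(x, source_clf, residual):
    """RTN-style target classifier: f_T(x) = f_S(x) + delta_f(x)."""
    return source_clf(x) + residual(x)
```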
RevGrad: gradient reversal
- input -> CNN -> features -> classifier (a fully connected network)
- Applied directly to the target domain, the effect is poor
- Solution: add a domain discriminator. Source-domain features map to domain label 1, target-domain features to domain label 0. One network gains a domain-discrimination branch, and the two branches play an adversarial game
- gradient reversal layer
- The ultimate goal is that the discriminator cannot tell whether a feature comes from the source domain or the target domain
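A gradient reversal layer is the identity in the forward pass and multiplies the gradient by -lambda in the backward pass. A framework-free sketch (the class and its parameter are illustrative, not a real library API):

```python
class GradientReversal:
    """Identity in the forward pass; the backward pass flips the
    gradient sign, scaled by lam, before it reaches the feature extractor."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x  # pass features through unchanged

    def backward(self, grad):
        return -self.lam * grad  # reversed gradient
```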
MADA (Multi-Adversarial Domain Adaptation)
- Problem: e.g. 3 classes in the source domain and 2 in the target domain; a single MMD is too coarse and works poorly
- Add discriminators: number of domain discriminators = number of classes in the target domain
- But the target-domain labels are unknown, so how are samples routed to class-specific discriminators?
- Use the source-trained classifier's predictions to weight the discriminators, aligning not only the domains but also the classes
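The per-class weighting can be sketched as follows: the classifier's softmax probabilities weight the feature before it enters each class-specific domain discriminator (a simplification of MADA's loss; the function names are illustrative):

```python
import numpy as np

def softmax(logits):
    e = np.exp(logits - logits.max())
    return e / e.sum()

def per_class_discriminator_inputs(feature, class_logits):
    """Weight the feature by each predicted class probability:
    discriminator k receives p_k * feature (MADA-style routing)."""
    p = softmax(class_logits)
    return [pk * feature for pk in p]
```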