Transfer Learning - Getting Started

Transfer Learning

  • Early layers learn surface features (general, transferable); later layers learn deep features (task-specific, harder to transfer)
  • A discriminator forces the "real" (source) and "fake" (target) features to become similar (sketched below)
[Figure: two CNN feature extractors feed a discriminator; the source features are the "real" Tensor and the target features the "fake" Tensor]
  • Developed from Generative Adversarial Networks
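
To make this adversarial alignment concrete, here is a minimal PyTorch sketch. The module sizes and the random stand-in batches are illustrative assumptions, not the original post's code: the discriminator learns to separate source ("real") from target ("fake") features, while the encoder learns to fool it.

```python
import torch
import torch.nn as nn

feat_dim = 128
# Illustrative stand-ins: a shared feature encoder and a domain discriminator.
encoder = nn.Sequential(nn.Linear(784, feat_dim), nn.ReLU())
discriminator = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(),
                              nn.Linear(64, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-4)
opt_e = torch.optim.Adam(encoder.parameters(), lr=1e-4)

source_x = torch.randn(32, 784)  # stand-in source-domain batch
target_x = torch.randn(32, 784)  # stand-in target-domain batch

# 1) Discriminator step: source features are "real" (1), target "fake" (0).
d_loss = bce(discriminator(encoder(source_x).detach()), torch.ones(32, 1)) + \
         bce(discriminator(encoder(target_x).detach()), torch.zeros(32, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# 2) Encoder step: make target features look "real" to fool the discriminator.
e_loss = bce(discriminator(encoder(target_x)), torch.ones(32, 1))
opt_e.zero_grad()
e_loss.backward()
opt_e.step()
```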

Transfer: learn from source domain – put to use in target domain

  • Why transfer?

The source and target domains intersect: they share common features!

Adaptation: traditional loss function, feature adaptation, model adaptation

  1. Error-based
  2. Adversarial-based
  3. Reconstruction-based (CycleGAN)
  • BYOL, a self-supervised model

Fine-tuning

[Figure: the Source Model is trained on source_train_data and then fine-tuned on target_train_data to become the Target Model]
  • The target domain generally has little data, so the network is prone to overfitting

Therefore, fine-tuning (ft) means taking the model trained on the source domain, moving it to the target domain, and adjusting it slightly there.
Note: the model must still be trained on the target domain!

To prevent overfitting: freeze the first few layers and the last few layers, and fine-tune only the middle layers (see the sketch below).
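
A minimal PyTorch sketch of this freezing scheme, assuming a torchvision ResNet-18 (torchvision ≥ 0.13 for the weights API) stands in for the source model; which blocks to freeze, and the 10-class target head, are illustrative assumptions:

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained network as a stand-in for the model trained on the source domain.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the first few layers ...
for p in list(model.conv1.parameters()) + list(model.layer1.parameters()):
    p.requires_grad = False
# ... and the last block, so only the middle layers get fine-tuned.
for p in model.layer4.parameters():
    p.requires_grad = False

# New task head for the target domain (10 classes is an assumption).
model.fc = nn.Linear(model.fc.in_features, 10)

# Only parameters with requires_grad=True are updated during fine-tuning.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=1e-3, momentum=0.9)
```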

Which layer should be fine-tuned?

  • What fixing a layer means: its weights are set and then frozen (no backpropagation updates), while the other layers keep updating through backpropagation

Speech: fine-tune the last few layers
NLP: fine-tune the first few layers

  • What if the source domain has labels but the target domain does not? (The target domain has an unlabeled training set and a labeled test set)
  1. DDC (Deep Domain Confusion)

    • Compute the MMD (maximum mean discrepancy) between the m source-domain samples and the n target-domain samples, to measure how alike the two feature distributions are (see the MMD sketch after this list)
    • In practice the effect is not good
  2. DAN

  3. RTN: Residual Transfer Network

    • On the target domain, the first few layers stay relatively close to the source network
    • Different tasks in the target domain get different branches; the classification branch is adapted by adding a residual function (the source classifier equals the target classifier plus a residual)
  4. RevGrad: gradient reversal

    • Pipeline: input → CNN → features → classifier (a fully connected network)
    • Applying this source-trained network directly to the target domain works poorly
    • Solution: add a domain discriminator; source-domain features are labeled domain 1, target-domain features domain 0
    • A single network grows a domain-discrimination branch, and the two branches are trained against each other
    • They are joined by a gradient reversal layer (see the sketch after this list)
    • The end goal is that the domain discriminator cannot tell whether a feature comes from the source or the target domain
  5. MADA

    • Problem: e.g. 3 classes in the source domain but 2 in the target domain; a single MMD is too coarse and works poorly
    • Add discriminators: one domain discriminator per class (number of discriminators = number of classes)
    • But the target domain is unlabeled, so how do we know a sample's class?
      • Use the predictions of the source-trained classifier to weight the discriminators (see the weighting sketch after this list), aligning not only the domains but also the classes
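
For item 1 (DDC), here is a minimal sketch of the MMD computation between source and target feature batches. The Gaussian kernel, the bandwidth sigma, and the toy data are assumptions; this is the (biased) two-sample estimator, not DDC's full training loop:

```python
import torch

def gaussian_kernel(x, y, sigma=1.0):
    # Pairwise RBF values k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 sigma^2)).
    dist2 = torch.cdist(x, y).pow(2)
    return torch.exp(-dist2 / (2 * sigma ** 2))

def mmd2(source, target, sigma=1.0):
    # Squared MMD: E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)].
    k_ss = gaussian_kernel(source, source, sigma).mean()
    k_tt = gaussian_kernel(target, target, sigma).mean()
    k_st = gaussian_kernel(source, target, sigma).mean()
    return k_ss + k_tt - 2 * k_st

# Toy usage: m source features vs n target features (128-d).
src = torch.randn(64, 128)        # m = 64 source samples
tgt = torch.randn(32, 128) + 0.5  # n = 32 target samples, shifted distribution
print(mmd2(src, tgt).item())      # larger value = distributions differ more
```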
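
For item 4 (RevGrad), a sketch of the gradient reversal layer in PyTorch: identity in the forward pass, gradient multiplied by -λ in the backward pass. The usage names in the comments (feature_extractor, etc.) are illustrative:

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity on the way forward; scales gradients by -lambda on the way back."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient flows to x; lambd gets no gradient.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Usage inside a DANN-style forward pass (names are illustrative):
# features = feature_extractor(x)
# class_logits = label_classifier(features)                   # normal gradients
# domain_logits = domain_classifier(grad_reverse(features))   # reversed gradients
```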
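
For item 5 (MADA), a sketch of the per-class discriminator weighting, assuming 3 classes and illustrative module sizes: discriminator k sees each feature scaled by the classifier's predicted probability of class k, so the domain loss is split class by class:

```python
import torch
import torch.nn as nn

num_classes, feat_dim = 3, 128

# One domain discriminator per class (the core MADA idea).
discriminators = nn.ModuleList(
    nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    for _ in range(num_classes)
)

features = torch.randn(32, feat_dim)         # stand-in feature batch
class_logits = torch.randn(32, num_classes)  # label-classifier output
probs = class_logits.softmax(dim=1)          # soft class assignments

bce = nn.BCEWithLogitsLoss()
domain_label = torch.ones(32, 1)             # 1 = source batch (0 for target)

# Discriminator k judges features weighted by the predicted probability of
# class k, so both the domains and the class structure get aligned.
loss = sum(
    bce(d_k(probs[:, k:k + 1] * features), domain_label)
    for k, d_k in enumerate(discriminators)
)
```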

zero-shot learning


Origin blog.csdn.net/RandyHan/article/details/130347083