Transfer Learning: all you need for the interview! (With code)

1. What is transfer learning?

Transfer learning is a machine learning method in which a model developed for task A is used as the starting point and reused in developing a model for task B. Transfer learning improves learning on a new task by transferring knowledge from a related task that has already been learned. Although most machine learning algorithms are designed to solve a single task, developing algorithms that facilitate transfer has been a topic of sustained attention in the machine learning community. Transfer is common in human learning: for example, learning to recognize apples may help us recognize pears, and learning to play the electronic keyboard may help us learn the piano.

Viewed from the angle of similarity, transfer learning means finding the similarity between the target problem and an old problem, and applying a model learned in the old domain to the new domain.

2. Why do we need transfer learning?

  1. The contradiction between big data and scarce labels : there is a great deal of data, but most of it is unlabeled and cannot be used to train machine learning models, and labeling data by hand is too time-consuming.
  2. The contradiction between big data and weak computation : ordinary users cannot possess massive data and computing resources, so they need to reuse existing models via transfer.
  3. The contradiction between universal models and personalized needs : even on the same task, one model can rarely satisfy every user's personalized needs, such as specific privacy settings; the model has to be adapted between different users.
  4. The needs of specific applications, such as cold start (e.g., a recommender system that has no data yet for a new user).

3. What are the basic problems of transfer learning?

There are three basic questions:

  • How to transfer : how should the transfer be carried out? (designing the transfer method)
  • What to transfer : given a target domain, how do we find a suitable source domain and then transfer from it? (source domain selection)
  • When to transfer : when should we transfer, and when should we not? (avoiding negative transfer)

4. What are the common concepts in transfer learning?

  • Basic definitions

    • Domain (Domain) : the subject of learning, composed of the data's features and their distribution
      • Source domain (Source Domain) : the domain where knowledge already exists
      • Target domain (Target Domain) : the domain to be learned
    • Task (Task) : the goal of learning, composed of the objective function and the learning result
  • Classification by feature space

    • Homogeneous transfer learning (Homogeneous TL) : the feature spaces of the source and target domains are the same, \(D_s = D_t\)
    • Heterogeneous transfer learning (Heterogeneous TL) : the feature spaces of the source and target domains are different, \(D_s \ne D_t\)
  • Classification by transfer scenario

    • Inductive transfer learning (Inductive TL) : the learning tasks of the source and target domains are different
    • Transductive transfer learning (Transductive TL) : the source and target domains are different, but the learning task is the same
    • Unsupervised transfer learning (Unsupervised TL) : neither the source domain nor the target domain has labels
  • Classification by transfer method

    • Instance-based transfer learning (Instance based TL) : transfer by reweighting the samples of the source and target domains

      Instance-based transfer learning reuses data samples for transfer according to certain weight-generation rules. Intuitively, suppose the source domain contains different categories of animals such as dogs, birds, and cats, while the target domain has only the dog category. To maximize similarity with the target domain during transfer, we can deliberately increase the weights of the source-domain samples belonging to the dog category. (A minimal sketch of this idea follows this list.)

    • Feature-based transfer learning (Feature based TL) : transform the features of the source and target domains into the same space

      Feature-based transfer learning reduces the gap between the source and target domains by transforming features, or maps the source-domain and target-domain data into a unified feature space where conventional machine learning methods can be used for classification. Depending on whether the feature spaces are homogeneous or heterogeneous, it can be further divided into homogeneous and heterogeneous transfer learning. (A sketch of one such method also follows this list.)

    • Model-based transfer learning (Parameter based TL) : share model parameters between the source and target domains

      Model-based (parameter-based) transfer learning finds the parameter information that the source and target domains can share, and transfers through these shared parameters. The assumption underlying this approach is that the source-domain and target-domain data can share some model parameters.

    • Relation-based transfer learning (Relation based TL) : transfer using the logical relationship network of the source domain

      Relation-based transfer learning takes a very different approach from the three methods above. It focuses on the relationships between the samples of the source and target domains, exploiting the similarity of relationship patterns across domains.
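As a minimal illustration of instance-based transfer (a sketch under my own assumptions, not code from the original article): a common recipe trains a "domain classifier" to tell source samples from target samples, converts its output into an estimate of the density ratio p_target(x)/p_source(x), and uses that ratio as per-sample weights when fitting the task model on source data. All data below is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_src = rng.normal(0.0, 1.0, size=(500, 5))      # labeled source samples
y_src = (X_src[:, 0] > 0).astype(int)            # source labels
X_tgt = rng.normal(0.8, 1.0, size=(500, 5))      # shifted, unlabeled target samples

# Domain classifier: does a sample come from the source (0) or the target (1)?
X_dom = np.vstack([X_src, X_tgt])
d_dom = np.r_[np.zeros(len(X_src)), np.ones(len(X_tgt))]
domain_clf = LogisticRegression(max_iter=1000).fit(X_dom, d_dom)

# With equal sample sizes, p(target|x) / p(source|x) estimates p_t(x) / p_s(x).
proba = domain_clf.predict_proba(X_src)
weights = proba[:, 1] / proba[:, 0]

# Train the task model on source data, upweighting the target-like samples.
task_clf = LogisticRegression(max_iter=1000).fit(X_src, y_src, sample_weight=weights)
```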
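For the feature-based family, one simple and well-known method is CORAL (CORrelation ALignment), which matches the second-order statistics of the two domains. The NumPy/SciPy sketch below is my own minimal version under stated assumptions (synthetic data, a small regularizer eps), not code from the article.

```python
import numpy as np
from scipy import linalg

def coral(Xs, Xt, eps=1e-5):
    """Whiten the source features, then re-color them with the target covariance."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    # Xs_aligned = Xs @ Cs^(-1/2) @ Ct^(1/2)
    whiten = linalg.fractional_matrix_power(Cs, -0.5)
    recolor = linalg.fractional_matrix_power(Ct, 0.5)
    return np.real(Xs @ whiten @ recolor)

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(300, 4))   # source features
Xt = rng.normal(0.0, 2.0, size=(300, 4))   # target features with a different scale
Xs_aligned = coral(Xs, Xt)
# A classifier trained on (Xs_aligned, y_src) usually transfers better to Xt.
```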

5. What is the difference between transfer learning and traditional machine learning?

  • Data distribution : transfer learning does not require the training and test data to be identically distributed; traditional machine learning assumes the training and test data follow the same distribution.
  • Data labels : transfer learning does not need a sufficient amount of labeled data; traditional machine learning requires enough labeled data.
  • Modeling : transfer learning can reuse previously trained models; traditional machine learning builds a model for each task separately.

6. What is the core of transfer learning, and what are its measurement criteria?

The general idea of transfer learning can be summarized as : develop algorithms that make maximum use of the knowledge in labeled domains to assist knowledge acquisition and learning in the target domain.

The core of transfer learning is to find the similarity between the source domain and the target domain and exploit it sensibly. Such similarity is very common. For example, different people's bodies are built similarly; bicycles and motorcycles are ridden in similar ways; international chess and Chinese chess are similar; badminton and tennis are played in similar ways. This similarity can also be understood as an invariant: only by holding on to what does not change can we cope with endless change.

Once such similarity is found, the next step is to measure and exploit it. Measurement serves two goals: first, to quantify the similarity between the two domains well, telling us not only qualitatively whether they are similar but also quantitatively how similar they are; second, to use the measure as a criterion and, through the learning method we adopt, increase the similarity between the two domains, thereby accomplishing transfer learning.

In one sentence: similarity is the core, and the measurement criterion is the key tool.
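One widely used criterion for quantifying the similarity between two domains is the Maximum Mean Discrepancy (MMD). As a minimal illustration (my own sketch, not from the original article), the NumPy code below estimates the squared MMD between two sample sets with an RBF kernel; the bandwidth gamma is a hypothetical choice.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.1):
    """Kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Y**2, axis=1)[None, :]
                - 2.0 * X @ Y.T)
    return np.exp(-gamma * sq_dists)

def mmd2(Xs, Xt, gamma=0.1):
    """Biased estimate of the squared MMD between sample sets Xs and Xt."""
    return (rbf_kernel(Xs, Xs, gamma).mean()
            + rbf_kernel(Xt, Xt, gamma).mean()
            - 2.0 * rbf_kernel(Xs, Xt, gamma).mean())

rng = np.random.default_rng(0)
Xs = rng.normal(0.0, 1.0, size=(200, 10))  # source samples
Xt = rng.normal(0.5, 1.0, size=(200, 10))  # shifted target samples
print(mmd2(Xs[:100], Xs[100:]))  # near 0: same distribution
print(mmd2(Xs, Xt))              # clearly larger: the domains differ
```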

7. How does transfer learning differ from related concepts?

  1. Transfer learning vs. multi-task learning:
    • Multi-task learning: several related tasks are learned jointly;
    • Transfer learning: emphasizes reusing information, transferring it from one domain (domain) to another.
  2. Transfer learning vs. domain adaptation: domain adaptation makes two domains whose feature distributions disagree become consistent.
  3. Transfer learning vs. covariate shift: under covariate shift, the marginal distribution of the inputs changes while the conditional distribution p(y|x) stays the same.

8. When can transfer learning be used?

Transfer learning is most useful when you are trying to optimize performance on task B, which usually has relatively little data. For example, in radiology it is hard to collect enough X-ray scans to build a well-performing diagnostic system, so you might look for a related but different task such as general image recognition, where you may have trained on a million images and learned many low-level features. Those features may help the network do better on the radiology task even though that task does not have much data.

If the two domains differ too much, transfer learning cannot be applied directly, because the results will be poor. In that case, the recommended approach is to transfer step by step between the two weakly similar domains through intermediate domains (crossing the river by feeling for the stones).

9. What is finetune?

Finetuning a deep network is perhaps the simplest deep transfer learning method. Finetune, also written fine-tuning, is an important concept in deep learning. In short, finetuning means taking a network that someone else has already trained and adjusting it for your own task. In this sense, it is easy to see that finetuning is part of transfer learning.

Why do we need an already-trained network?

In practice, we usually do not train a neural network from scratch for a new task, because doing so is extremely time-consuming. In particular, our training data can hardly ever be as large as ImageNet, large enough to train a deep neural network with sufficiently strong generalization. And even when that much training data is available, the cost of training from scratch is unaffordable.

Why do we need finetune?

Because a model trained by someone else may not fit our own task perfectly: their training data and ours may not follow the same distribution; their network may be able to do more than our task needs; or their network may be complex while our task is simple.
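A minimal PyTorch sketch of finetuning (my illustration under stated assumptions, not the article's code): load an ImageNet-pretrained ResNet-18 from torchvision, replace its head for a hypothetical 2-class task, and give the pretrained layers a much smaller learning rate than the freshly initialized head.

```python
import torch
import torch.nn as nn
from torchvision import models

# Start from a network someone else already trained (ResNet-18 on ImageNet).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Replace the 1000-class ImageNet head with one for our own task (2 classes, hypothetical).
model.fc = nn.Linear(model.fc.in_features, 2)

# Finetune everything, but nudge the pretrained layers gently and the new head faster.
backbone_params = [p for name, p in model.named_parameters() if not name.startswith("fc")]
optimizer = torch.optim.SGD([
    {"params": backbone_params, "lr": 1e-4},        # small steps: keep learned features
    {"params": model.fc.parameters(), "lr": 1e-2},  # larger steps: the head is random
], momentum=0.9)
criterion = nn.CrossEntropyLoss()

# One training step on a dummy batch (a real dataloader goes here).
images, labels = torch.randn(8, 3, 224, 224), torch.randint(0, 2, (8,))
loss = criterion(model(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```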

10. What is deep network adaptation?

Finetuning a deep network saves training time and improves accuracy, but it has an inherent shortcoming: it cannot handle the case where the training data and the test data come from different distributions, a situation that is everywhere in practice. The basic assumption of finetuning is still that the training and test data obey the same distribution, and in transfer learning this does not hold. We therefore need to go a step further and develop better methods that let deep networks complete transfer learning tasks.

Taking the data distribution adaptation methods introduced earlier as a reference, many deep learning methods add an adaptation layer (Adaptation Layer) to align the source-domain and target-domain data. Adaptation brings the data distributions of the two domains closer together and thereby improves the network's performance.
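As a hedged sketch of this idea in the spirit of DDC/DAN-style methods (my own illustration, not code from the article): the training loss combines the usual classification loss on labeled source data with an MMD penalty between the source and target features taken at the adaptation layer. The layer sizes and the trade-off weight lambda_mmd are hypothetical.

```python
import torch
import torch.nn as nn

def linear_mmd2(f_s, f_t):
    """Squared MMD with a linear kernel: ||mean(f_s) - mean(f_t)||^2."""
    delta = f_s.mean(dim=0) - f_t.mean(dim=0)
    return (delta * delta).sum()

feature_extractor = nn.Sequential(nn.Linear(100, 64), nn.ReLU())  # shared layers
adaptation_layer = nn.Linear(64, 32)   # bottleneck whose outputs get aligned
classifier = nn.Linear(32, 5)          # 5 classes, hypothetical

criterion = nn.CrossEntropyLoss()
params = (list(feature_extractor.parameters())
          + list(adaptation_layer.parameters())
          + list(classifier.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)
lambda_mmd = 0.5  # hypothetical trade-off between the two loss terms

# Dummy batches standing in for labeled source data and unlabeled target data.
x_s, y_s = torch.randn(32, 100), torch.randint(0, 5, (32,))
x_t = torch.randn(32, 100)

f_s = adaptation_layer(feature_extractor(x_s))
f_t = adaptation_layer(feature_extractor(x_t))

# Classification loss on the source + distribution alignment at the adaptation layer.
loss = criterion(classifier(f_s), y_s) + lambda_mmd * linear_mmd2(f_s, f_t)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```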

11. Applications of GANs in transfer learning

Generative Adversarial Nets (GAN) were inspired by the idea of the two-player zero-sum game (two-player game) in game theory. A GAN consists of two parts:

  • A generative network (Generative Network), responsible for producing samples that look as real as possible; this part is called the generator (Generator);
  • A discriminative network (Discriminative Network), responsible for judging whether a sample is real or was produced by the generator; this part is called the discriminator (Discriminator). The game between the generator and the discriminator constitutes adversarial training.

The goal of a GAN is clear: generate training samples. This seems somewhat at odds with the overall goal of transfer learning. However, since transfer learning naturally comes with a source domain and a target domain, we can skip the sample-generation step and directly treat the data of one of the domains (usually the target domain) as the generated samples. The role of the generator then changes: instead of producing new samples, it serves as a feature extractor, continually learning features of the domain data so that the discriminator cannot tell the two domains apart. The original generator can therefore be called a feature extractor (Feature Extractor).
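A representative instance of this idea is domain-adversarial training (DANN-style). The PyTorch sketch below is my own illustration, not the article's code: a gradient-reversal layer lets the domain discriminator learn to separate source from target features, while the reversed gradient trains the feature extractor to make the two domains indistinguishable. The layer sizes and alpha are hypothetical.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies gradients by -alpha in the backward pass."""
    @staticmethod
    def forward(ctx, x, alpha):
        ctx.alpha = alpha
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.alpha * grad_output, None

feature_extractor = nn.Sequential(nn.Linear(100, 64), nn.ReLU())
label_classifier = nn.Linear(64, 5)      # predicts class labels (source data only)
domain_discriminator = nn.Linear(64, 2)  # predicts source (0) vs. target (1)

criterion = nn.CrossEntropyLoss()
params = (list(feature_extractor.parameters())
          + list(label_classifier.parameters())
          + list(domain_discriminator.parameters()))
optimizer = torch.optim.Adam(params, lr=1e-3)
alpha = 1.0  # hypothetical strength of the gradient reversal

# Dummy batches: labeled source data, unlabeled target data.
x_s, y_s = torch.randn(32, 100), torch.randint(0, 5, (32,))
x_t = torch.randn(32, 100)

f_s, f_t = feature_extractor(x_s), feature_extractor(x_t)
features = torch.cat([f_s, f_t])
domains = torch.cat([torch.zeros(32, dtype=torch.long),
                     torch.ones(32, dtype=torch.long)])

# The discriminator learns to tell the domains apart; the reversed gradient
# simultaneously pushes the feature extractor to confuse it.
loss = (criterion(label_classifier(f_s), y_s)
        + criterion(domain_discriminator(GradReverse.apply(features, alpha)), domains))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```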

12. Code implementation

Transfer learning example:

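A minimal self-contained sketch of such an example, assuming a hypothetical image dataset laid out in class folders (my illustration, not the article's original code): finetune an ImageNet-pretrained ResNet-18, training only the new classification head. The directory path data/train and the epoch count are placeholders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Standard ImageNet preprocessing so inputs match the pretrained backbone.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical dataset layout: data/train/<class_name>/*.jpg
train_set = datasets.ImageFolder("data/train", transform=preprocess)
loader = DataLoader(train_set, batch_size=32, shuffle=True)

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False  # freeze the pretrained backbone
model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

model.train()
for epoch in range(3):  # a few epochs often suffice when only the head is trained
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```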

13. References

https://github.com/scutan90/DeepLearning-500-questions/tree/master/ch11_迁移学习


Author: @mantchs

GitHub:https://github.com/NLP-LOVE/ML-NLP

Everyone is welcome to join the discussion and help improve this project! QQ group: 541954936 (NLP interview study group)
