[Learning] Deep Transfer Learning

  We are usually given a task, such as image classification or recognition, and once good data has been collected we start training a model directly. In reality, however, time pressure and device limitations often mean we cannot train from scratch, since a model may need one to two million iterations to converge. This is where transfer learning comes in handy.


 

What is transfer learning?

  In popular terms, transfer learning means using existing knowledge to learn new knowledge. Its core is finding the similarity between the existing knowledge and the new knowledge, that is, learning by analogy. Because learning the target domain directly from scratch is too costly, we turn to existing knowledge to help us learn the new knowledge as quickly as possible. For example, if you can play Chinese chess, you can learn chess by analogy; if you can write Java programs, you can learn C# by analogy; if you have learned English, you can learn French by analogy; and so on. Everything in the world has something in common. How to reasonably find the similarity between the old and the new, and then use that bridge to help learn the new knowledge, is the core problem of transfer learning.

 

Why do we need transfer learning?

  • The amount of training data is too small to meet the needs of deep learning. For a machine learning task such as classification, if the amount of data is very small, do we have to use deep learning? Not necessarily. In actual production, if rules give good results, use rules; if a simple model is enough, use a simple model. This is what "Occam's razor" means: reducing model complexity also helps avoid overfitting to some extent. So for a small dataset there is no need to insist on a deep learning method. Likewise, if you want to train a classifier or recognizer for a new task but cannot collect a large number of positive and negative samples, how do you train it?

  • The new dataset is small and its content is very different from the original dataset. Because the data is scarce, training just a linear classifier may work better. And because the datasets differ, training a classifier on top of the network may not be the best choice, since the topmost layers contain features specific to the original dataset; training a classifier on activations from earlier in the network may work a little better (a minimal sketch follows this list).

  • The new dataset is large, but its content is very different from the original dataset. Since the dataset is large, we might expect to train a DCNN from scratch. In practice, however, initializing the weights from a pre-trained model is still a useful method. In this case we have enough data, and enough confidence, to fine-tune the entire network.
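  For the second case above (a small, very different dataset), the following is a minimal sketch, assuming TensorFlow/Keras with an ImageNet pre-trained Inception v3 and scikit-learn for the linear classifier; the tapped layer name ("mixed7") and the random placeholder data are illustrative assumptions, not from the original post.

```python
# Minimal sketch: use activations from an earlier layer of an ImageNet
# pre-trained Inception v3 as fixed features and train a linear classifier
# on them. The framework, the tapped layer ("mixed7"), and the placeholder
# data are assumptions for illustration.
import numpy as np
import tensorflow as tf
from sklearn.linear_model import LogisticRegression

base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False)
# Tap an intermediate block instead of the top of the conv stack, since the
# topmost features are more specific to the original (ImageNet) data.
feature_model = tf.keras.Model(inputs=base.input,
                               outputs=base.get_layer("mixed7").output)

def extract_features(images):
    """Globally average-pooled activations of the tapped layer."""
    x = tf.keras.applications.inception_v3.preprocess_input(images)
    feats = feature_model(x)                            # (N, H, W, C)
    return tf.reduce_mean(feats, axis=[1, 2]).numpy()   # (N, C)

# Placeholder stand-in for a small, very different dataset.
x_small = np.random.rand(32, 299, 299, 3).astype("float32") * 255.0
y_small = np.random.randint(0, 2, size=32)

clf = LogisticRegression(max_iter=1000)
clf.fit(extract_features(x_small), y_small)
```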


How do we do transfer learning?

   In practice, we usually do not train a DCNN from scratch with random initialization, because datasets large enough to meet the requirements of a deep network are quite rare. Instead, it is common to pre-train a DCNN on a large dataset and then use its weights either as the initialization for training on the related task or as a fixed feature extractor. For example, ImageNet is the largest image recognition database, and there are already many network models trained on ImageNet data, such as Inception v3 and v4. If you are now given the task of building a car recognizer, you have two options:

  First, collect a large amount of car data and train a model on it;

  Second, take a network model already trained on ImageNet, then collect car data and continue training (fine-tuning) on top of that pre-trained model.

  The traditional approach is the first, but it runs into a problem: are there enough car pictures, is the data volume large enough? If the amount of data is not enough, will the final training result be poor? In fact, we can take the features a network has learned on ImageNet or another large dataset and use them for image classification or other image-based tasks; this is the idea of transfer learning. It can also be understood this way: when training from scratch, the weights are generally initialized to zero or at random, whereas loading a model trained on a large dataset amounts to using its parameters as the initialization of the current model's weights. How well this generalizes to the specific task still depends on the specific scenario.
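  As a rough illustration of the second option, the following minimal sketch loads an ImageNet pre-trained Inception v3 and fine-tunes it for a hypothetical car-recognition task. TensorFlow/Keras, the number of car classes, and the car_train_ds / car_val_ds dataset names are assumptions made for illustration, not part of the original post.

```python
# Minimal sketch: fine-tuning an ImageNet pre-trained Inception v3 for a
# hypothetical car-recognition task. TensorFlow/Keras, the number of car
# classes, and the car_train_ds / car_val_ds datasets are assumptions.
import tensorflow as tf

NUM_CAR_CLASSES = 10  # hypothetical number of car categories

# 1. Load the convolutional base pre-trained on ImageNet, without its
#    original 1000-class classifier head.
base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=(299, 299, 3))
base.trainable = False  # stage 1: use the base as a fixed feature extractor

# 2. Add a new classification head for the car task.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CAR_CLASSES, activation="softmax"),
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(car_train_ds, validation_data=car_val_ds, epochs=5)

# 3. Stage 2 (optional): unfreeze the base and fine-tune the whole network
#    with a much smaller learning rate, so the pre-trained weights are only
#    gently adjusted toward the car data.
base.trainable = True
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(car_train_ds, validation_data=car_val_ds, epochs=5)
```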


Limitations of transfer learning

  When transfer learning uses a pre-trained network, the model is constrained in its architecture. For example, you cannot arbitrarily remove convolutional layers from the pre-trained network. However, thanks to parameter sharing, we can easily run a pre-trained network on images of different spatial dimensions. This is obvious for convolutional and pooling layers, since their forward functions are independent of the spatial size of the input. It also holds for fully connected (FC) layers, because a fully connected layer can be converted into a convolutional layer. So when we load a pre-trained model, the network structure must match the pre-trained network's structure; we then train it for the specific scenario and task.
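  A minimal sketch of the point about spatial dimensions, again assuming TensorFlow/Keras and Inception v3 (the original post names no framework): once the fully connected top is dropped, the same pre-trained convolutional base runs on inputs of different spatial sizes.

```python
# Minimal sketch (TensorFlow/Keras and Inception v3 assumed): a pre-trained
# convolutional base without its fully connected top accepts inputs of
# different spatial sizes, because conv and pooling layers do not depend on
# the spatial size of their input.
import tensorflow as tf

base = tf.keras.applications.InceptionV3(weights="imagenet", include_top=False)

small = tf.random.uniform((1, 299, 299, 3))  # original training resolution
large = tf.random.uniform((1, 512, 512, 3))  # a different spatial size

print(base(small).shape)  # (1, 8, 8, 2048)
print(base(large).shape)  # larger spatial grid, still 2048 channels
```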


 

Transfer learning resources

 

  Readers interested in transfer learning can follow the GitHub repo transferlearning, as well as the series of articles written by Wang Jindong:

"Xiao Wang Loves Transfer" Series 0: Leading scholars and research institutions in transfer learning
"Xiao Wang Loves Transfer" Series 1: Introduction to the Transfer Component Analysis (TCA) method
"Xiao Wang Loves Transfer" Series 2: Introduction to the Joint Distribution Adaptation (JDA) method
"Xiao Wang Loves Transfer" Series 3: The transferability of deep neural networks
"Xiao Wang Loves Transfer" Series 4: Deep Adaptation Network (DAN)
"Xiao Wang Loves Transfer" Series 5: The Geodesic Flow Kernel (GFK) method
"Xiao Wang Loves Transfer" Series 6: Learning to Transfer
"Xiao Wang Loves Transfer" Series 7: Negative Transfer
"Xiao Wang Loves Transfer" Series 8: Interpretations of deep transfer learning papers
"Xiao Wang Loves Transfer" Series 9: Open Set Domain Adaptation
"Xiao Wang Loves Transfer" Series 10: Tensor-based transfer learning (tensor unsupervised domain adaptation)
"Xiao Wang Loves Transfer" Series 11: Selective Adversarial Networks for transfer learning
"Xiao Wang Loves Transfer" Series 12: New Year, new look: reorganizing the transfer learning resources repository

 

Origin www.cnblogs.com/zhangchao162/p/11417465.html