Transfer Learning: Domain Adaptation

Domain Adaptation

       In classical machine learning we habitually assume that the training set and the test set follow the same probability distribution. In real life, this assumption is difficult to satisfy. When the training set and the test set differ substantially, the model is prone to overfitting, so that a model trained well on the training data performs unsatisfactorily on the test set.

        As a simple example, suppose we have a large supervised training set of yellow-skinned faces but want the trained model to distinguish black-skinned faces; compared with recognising yellow-skinned faces, the model's performance degrades. When the training set and the test set are distributed inconsistently, a model trained by minimising empirical error on the training data performs poorly on the test data. This is why we introduce transfer learning.

       Domain adaptation (Domain Adaptation) is a popular branch of transfer learning, and also the focus of my recent reading. Informally, it uses knowledge about the two domains to adapt a model trained on the training set so that it performs better on the test set.

Domain adaptation involves two basic concepts: the source domain (Source Domain) and the target domain (Target Domain). The source domain is rich in supervised information; the target domain is the domain in which the test set lives, and is usually unlabelled or contains only a small number of labels. The source and target domains usually share the same kind of task but have different distributions.

        Depending on the stage at which the adaptation is performed, researchers have proposed several kinds of domain adaptation:

1. Sample adaptation: resample the source-domain samples so that their distribution approaches the target-domain distribution;

2. Feature-level adaptation: transform the features, typically by projecting the source and target domains into a common subspace, so that knowledge learned on the source domain can be applied directly to the target domain;

3. Model-level adaptation: modify the source-domain error function so that it also takes the target-domain error into account.


 

Sample adaptation:

The basic idea is to resample the source-domain samples so that the distribution of the resampled source samples is consistent with the target-domain distribution, and then relearn the classifier on the resampled sample set.

Instance transfer (Instance-based TL)

Find data in the source domain that is similar to the target-domain data, and adjust the weight of each such sample so that the reweighted data matches the target-domain data; the weights of these samples are increased so that they carry a larger proportion when predicting in the target domain. The advantage is that this is simple and easy to implement. The disadvantage is that the weights depend on an empirically chosen similarity measure, and the source- and target-domain data distributions often remain different.
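The reweighting idea can be sketched in a few lines. Below is a minimal toy example on synthetic 1-D data (all distributions and numbers are made up for illustration): fit a simple density to each domain and weight every source sample by the density ratio p_target(x) / p_source(x), so that weighted statistics computed on the source approximate the target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D domains with shifted distributions.
source = rng.normal(0.0, 1.0, 5000)   # p_source = N(0, 1)
target = rng.normal(1.0, 1.0, 5000)   # p_target = N(1, 1)

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Fit a Gaussian to each domain and weight every source sample by the
# density ratio w(x) = p_target(x) / p_source(x); weighted statistics of
# the source then approximate the target distribution.
w = gaussian_pdf(source, target.mean(), target.std()) \
    / gaussian_pdf(source, source.mean(), source.std())

reweighted_mean = np.average(source, weights=w)
print(source.mean(), reweighted_mean, target.mean())
```

Note how the reweighted source mean lands near the target mean even though no target labels were used; in practice the density ratio is usually estimated nonparametrically (e.g. kernel mean matching) rather than with fitted Gaussians.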

Feature adaptation:

The basic idea is to learn a common feature representation such that, in the shared feature space, the distributions of the source and target domains are as similar as possible.

Feature transfer (Feature-based TL)

Assume that the source and target domains share some overlapping features. Through a feature transformation, the source- and target-domain features are mapped into the same space, where the source-domain data and target-domain data are identically distributed; traditional machine learning can then be applied. The advantage is that this applies to most methods and tends to work well. The disadvantage is that it is difficult to solve and prone to over-adaptation.
Links: https://www.zhihu.com/question/41979241/answer/247421889
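One simple feature-level method in this spirit is CORAL-style second-order alignment (a stand-in chosen here for illustration, not necessarily the method discussed in the linked answer): whiten the source features and re-colour them with the target covariance, so the two domains match in second-order statistics. A sketch on synthetic 2-D data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 2-D features: the two domains differ in covariance.
Xs = rng.normal(size=(1000, 2)) @ np.array([[1.0, 0.0], [0.0, 0.5]])
Xt = rng.normal(size=(1000, 2)) @ np.array([[2.0, 0.3], [0.3, 1.0]])

def sym_sqrt(C, inv=False):
    """Symmetric (inverse) square root of a symmetric PSD matrix."""
    vals, vecs = np.linalg.eigh(C)
    vals = 1.0 / np.sqrt(vals) if inv else np.sqrt(vals)
    return vecs @ np.diag(vals) @ vecs.T

def coral(Xs, Xt, eps=1e-5):
    """Whiten the source features, then re-colour with the target covariance."""
    Cs = np.cov(Xs, rowvar=False) + eps * np.eye(Xs.shape[1])
    Ct = np.cov(Xt, rowvar=False) + eps * np.eye(Xt.shape[1])
    return Xs @ sym_sqrt(Cs, inv=True) @ sym_sqrt(Ct)

Xs_aligned = coral(Xs - Xs.mean(0), Xt - Xt.mean(0))
# The aligned source now matches the target's second-order statistics,
# so a classifier trained on (Xs_aligned, ys) transfers better to Xt.
```

After alignment, the empirical covariance of the transformed source matches that of the target; a source-trained classifier is then applied to the target unchanged.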

Model adaptation:

The basic idea is to adapt directly at the model level. There are two approaches. The first builds the model directly but adds a "keep the inter-domain distance small" constraint to it. The second uses an iterative approach: incrementally classify target-domain samples, add the high-confidence samples to the training set, and update the model.
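The second, iterative idea can be sketched with a nearest-centroid classifier on synthetic 1-D data (a toy stand-in, not a specific published algorithm): initialise the model on the labelled source, then repeatedly pseudo-label the target, keep only the high-confidence half, and refit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic 1-D data: two labelled source classes; the unlabelled target
# classes are the same concepts shifted by +1.
Xs = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])
ys = np.array([0] * 200 + [1] * 200)
Xt = np.concatenate([rng.normal(-1, 0.5, 200), rng.normal(3, 0.5, 200)])

# Nearest-centroid "model", initialised on the labelled source domain.
centroids = np.array([Xs[ys == 0].mean(), Xs[ys == 1].mean()])

for _ in range(5):                                   # iterative self-training
    d = np.abs(Xt[:, None] - centroids[None, :])     # distance to each centroid
    pred = d.argmin(axis=1)                          # pseudo-labels
    margin = np.abs(d[:, 0] - d[:, 1])               # confidence proxy
    for k in (0, 1):
        idx = np.where(pred == k)[0]
        if idx.size:
            # Keep only the most confident half of the pseudo-labels
            # before updating the model with them.
            top = idx[np.argsort(margin[idx])[idx.size // 2 :]]
            centroids[k] = Xt[top].mean()

print(centroids)  # drifts from (-2, 2) toward the target means (-1, 3)
```

The confidence filter matters: feeding all pseudo-labels back at once risks reinforcing early mistakes, which is exactly why the text stresses "high-confidence samples".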

Parameter transfer (Parameter-based TL)

Assume that the source and target domains share model parameters. This means applying a model previously trained on a large amount of source-domain data to make predictions on the target domain. For example, having trained a good image-recognition system on millions of images, when we face a new image problem domain we do not need to train on millions of images again: we simply transfer the originally trained model to the new domain, where tens of thousands of images are often enough to reach equally high accuracy. The advantage is that it exploits the similarity between models. The disadvantage is that the model parameters can be difficult to converge.
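The warm-start idea can be illustrated with a toy linear model on synthetic data (all tasks and numbers here are made up): pre-train on a large source task, then fine-tune the same parameters with a handful of target samples, and compare against training from scratch on the small target set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Source task: plenty of data for y = 2x + 1 (plus noise).
Xs = rng.uniform(-1, 1, (1000, 1))
ys = 2 * Xs[:, 0] + 1 + rng.normal(0, 0.1, 1000)
# Related target task with very little data: y = 2.2x + 0.9.
Xt = rng.uniform(-1, 1, (20, 1))
yt = 2.2 * Xt[:, 0] + 0.9 + rng.normal(0, 0.1, 20)

def fit(X, y, w0, b0, steps, lr=0.1):
    """Plain gradient descent on mean squared error."""
    w, b = w0, b0
    for _ in range(steps):
        err = X[:, 0] * w + b - y
        w -= lr * (err * X[:, 0]).mean()
        b -= lr * err.mean()
    return w, b

def mse(w, b):
    return ((Xt[:, 0] * w + b - yt) ** 2).mean()

# Pre-train on the source, then fine-tune the *same parameters* on the target.
w_src, b_src = fit(Xs, ys, 0.0, 0.0, steps=500)
w_ft, b_ft = fit(Xt, yt, w_src, b_src, steps=20)   # warm start
w_cold, b_cold = fit(Xt, yt, 0.0, 0.0, steps=20)   # training from scratch

print(mse(w_ft, b_ft), mse(w_cold, b_cold))
```

With the same 20 fine-tuning steps, the warm-started model reaches a much lower target error than the cold start, because the shared parameters already encode most of the task.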


Origin www.cnblogs.com/LiYimingRoom/p/12095523.html