Dataset characteristics

1. The class proportions in the training set are uneven (imbalanced)

Methods:

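1. Oversampling (replicate minority-class samples):

  Cons: erroneous (noisy) samples get replicated too, so they are likely to have a greater impact

2. Undersampling (discard majority-class samples):

  Cons: discarding a large number of samples loses information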

3. Expanding the data set:

  a. draw a subset of the samples and average them to form new samples

  b. add random noise to existing samples
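The three methods above can be sketched roughly as follows. This is only a minimal illustration using the Python standard library, assuming each sample is a list of numeric features. The function names (group_by_label, oversample, undersample, augment) and the parameters k and noise_std are made up for this example and do not come from any particular library.

```python
import random
from collections import defaultdict


def group_by_label(samples, labels):
    """Group samples into a dict keyed by class label."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return groups


def oversample(samples, labels, rng):
    """Method 1: duplicate randomly chosen samples until every class
    is as large as the largest class."""
    groups = group_by_label(samples, labels)
    target = max(len(g) for g in groups.values())
    out = []
    for y, g in groups.items():
        extra = [rng.choice(g) for _ in range(target - len(g))]
        out.extend((x, y) for x in g + extra)
    return out


def undersample(samples, labels, rng):
    """Method 2: keep a random subset of each class, sized to the
    smallest class. The remaining samples are thrown away."""
    groups = group_by_label(samples, labels)
    target = min(len(g) for g in groups.values())
    out = []
    for y, g in groups.items():
        out.extend((x, y) for x in rng.sample(g, target))
    return out


def augment(group, n_new, rng, k=2, noise_std=0.0):
    """Method 3: build n_new synthetic samples for one class.
    (a) average k randomly drawn samples feature by feature,
    (b) add Gaussian noise with standard deviation noise_std."""
    new = []
    for _ in range(n_new):
        picked = rng.sample(group, k)
        mean = [sum(col) / k for col in zip(*picked)]
        new.append([v + rng.gauss(0.0, noise_std) for v in mean])
    return new


# Toy data: four samples of class 0, two samples of class 1.
rng = random.Random(0)
X = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [10.0, 10.0], [12.0, 12.0]]
y = [0, 0, 0, 0, 1, 1]

balanced_up = oversample(X, y, rng)      # four samples per class
balanced_down = undersample(X, y, rng)   # two samples per class
minority = [x for x, label in zip(X, y) if label == 1]
synthetic = augment(minority, n_new=2, rng=rng, noise_std=0.1)
```

Note that oversample copies samples verbatim, so a mislabeled minority sample may be counted several times, while undersample drops half of class 0 in this toy data. These are the drawbacks listed above.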

(2. If the training-set and test-set samples differ greatly from each other, training will always be painful.)

 

  

Origin www.cnblogs.com/alilliam/p/10774017.html