Common CV strategies:
Hold-out; KFold, GroupKFold, StratifiedKFold, TimeSeriesSplit
Adversarial validation
Concept: a commonly used feature screening method for finding features whose distribution shifts noticeably over time, i.e. differs between the training set and the validation (test) set.
Procedure:
- Label the already-split training set and validation (test) set with a binary indicator, e.g. add ad_target = 1 to the training set and ad_target = 0 to the validation set
- Train a simple binary classification model whose target is ad_target
- Use only one feature per training run, and record the validation AUC once the model has converged
- Repeat step 3 until every feature has been tried, then sort the features by AUC from high to low
- Focus the analysis on the features with the highest AUC; an empirical threshold of 0.7 or 0.8 can be used. Watch out for missing values, since a difference in missing rate between the two sets can by itself raise the AUC
Adversarial validation (for dealing with unstable CV scores): inspect the features whose distributions differ between the two sets
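A minimal sketch of the per-feature procedure above, using only the Python standard library. It makes one simplification that is not in the notes: instead of training a model per feature, it uses the raw feature value as the classifier score and computes a rank-based AUC on all rows, taking max(AUC, 1 - AUC) so the direction of the shift does not matter. With a real model you would record the AUC on held-out rows instead. The function names, the feature names "stable" and "drifting", and the synthetic data are all hypothetical, and the sketch assumes numeric features with no missing values.

```python
import random


def auc_score(labels, scores):
    """Rank-based AUC (Mann-Whitney U); tied scores share their average rank."""
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    rank_sum_pos = 0.0
    n_pos = 0
    i = 0
    while i < n:
        # Find the run of tied scores pairs[i:j].
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            if pairs[k][1] == 1:
                rank_sum_pos += avg_rank
                n_pos += 1
        i = j
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes are required")
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def adversarial_auc_by_feature(train_rows, test_rows):
    """Rank features by how well each one alone separates train from test.

    train_rows / test_rows: lists of dicts mapping feature name -> number.
    Assumes every row has every feature and no value is missing.
    """
    # ad_target: 1 for training rows, 0 for validation (test) rows.
    ad_target = [1] * len(train_rows) + [0] * len(test_rows)
    results = {}
    for name in train_rows[0]:
        values = [r[name] for r in train_rows] + [r[name] for r in test_rows]
        auc = auc_score(ad_target, values)
        results[name] = max(auc, 1.0 - auc)  # ignore the direction of the shift
    # Highest AUC first: these features differ most between the two sets.
    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic demo: "drifting" has a shifted mean in the test set, "stable" does not.
random.seed(0)
train = [{"stable": random.gauss(0, 1), "drifting": random.gauss(0, 1)}
         for _ in range(500)]
test = [{"stable": random.gauss(0, 1), "drifting": random.gauss(2, 1)}
        for _ in range(500)]

ranking = adversarial_auc_by_feature(train, test)
suspect = [name for name, auc in ranking if auc >= 0.7]  # empirical threshold
print(ranking)
print(suspect)
```

Features that end up in suspect are the ones to inspect first. Whether to drop, transform, or keep them is a separate judgment call.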