Understanding parameter tuning, model selection, and model evaluation

1. Selecting suitable parameters

  • 1. Model parameters: parameters that the algorithm derives from model training
  • 2. Hyperparameters: parameters that have to be specified by a human
  • 3. Cross-validation (cross validation): sklearn.model_selection.cross_val_score(model, x_train, y_train, cv=...), where model is the estimator whose hyperparameter is being validated and cv specifies the number of folds

Cross-validation is used to select the optimal value of a single hyperparameter; once the optimal hyperparameter has been found, the model is trained with that optimized hyperparameter. A minimal sketch follows.
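A minimal sketch of this workflow, assuming a KNN classifier on the iris dataset and a list of candidate k values (all illustrative choices, not from the original post):

```python
# Minimal sketch: pick one hyperparameter with cross_val_score, then retrain.
# The dataset, the estimator, and the candidate k values are illustrative.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier

x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

best_k, best_score = None, 0.0
for k in [1, 3, 5, 7, 9]:                              # candidate hyperparameter values
    scores = cross_val_score(KNeighborsClassifier(n_neighbors=k),
                             x_train, y_train, cv=5)   # cv=5 -> 5-fold cross-validation
    if scores.mean() > best_score:
        best_k, best_score = k, scores.mean()

# Retrain on the full training set with the chosen hyperparameter
model = KNeighborsClassifier(n_neighbors=best_k).fit(x_train, y_train)
```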

  • 4. Grid search (Grid Search): sklearn.model_selection.GridSearchCV(model, param_grid, cv=..., scoring=...)

param_grid: a dictionary such as {"xx": [x, x, x], "xxx": [x, x, x]}, listing the candidate values to substitute for each hyperparameter of the model.
scoring: specifies the evaluation metric, e.g. "accuracy".
Grid search cross-validates every combination of several hyperparameters and picks the combination that gives the model the best score on the specified metric; see the sketch below.
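A minimal GridSearchCV sketch; the SVC estimator and the grid of candidate values are illustrative assumptions, and x_train / y_train are reused from the previous sketch:

```python
# Minimal sketch: search over combinations of two hyperparameters.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}   # candidates per hyperparameter
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(x_train, y_train)

print(search.best_params_)          # best combination found
print(search.best_score_)           # its mean cross-validated score
model = search.best_estimator_      # refit on the whole training set
```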

2. Model persistence

  • 1. pickle: persisting a model is the same operation as pickling any other Python object
  • 2. sklearn.externals.joblib (the standalone joblib package in recent scikit-learn versions):
  • Save the model: joblib.dump(model, path)
  • Load the model: joblib.load(path)

A short sketch of both approaches follows.
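A minimal sketch, assuming the model fitted above and illustrative file names. Note that sklearn.externals.joblib has been removed from recent scikit-learn releases, so the standalone joblib package is imported here:

```python
import pickle
import joblib   # was sklearn.externals.joblib in older scikit-learn versions

# pickle: same as persisting any other Python object
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)
with open("model.pkl", "rb") as f:
    model_from_pickle = pickle.load(f)

# joblib: dump / load by path
joblib.dump(model, "model.joblib")
model_from_joblib = joblib.load("model.joblib")
```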

3. Model Evaluation

  • 1. Basic concepts:

True positive (TP): the predicted value is 1 and the true value is 1
False positive (FP): the predicted value is 1 and the true value is 0
True negative (TN): the predicted value is 0 and the true value is 0
False negative (FN): the predicted value is 0 and the true value is 1

  • 2. A few terms:
  • Recall (Recall): TP / (TP + FN), i.e. the number of positive cases correctly predicted / the actual total number of positive cases
  • Precision (Precision): TP / (TP + FP), i.e. the number of positive cases correctly predicted / the total number of cases predicted as positive
  • FPR: FP / (TN + FP), the proportion of samples whose true value is 0 that are incorrectly predicted as 1
  • F1 score: 2 * (Precision * Recall) / (Precision + Recall), i.e. the harmonic mean of precision and recall
  • Accuracy (Accuracy): (TP + TN) / (TP + FN + FP + TN), i.e. the number of correctly predicted positive and negative cases / the total number of samples
  • ROC curve: take logistic regression as an example of deciding positive versus negative. A threshold is usually set: samples scoring above the threshold are classified as positive and those below it as negative. If we lower this threshold, more samples are identified as positive, which raises the recognition rate for the positive class but also causes more negative samples to be misclassified as positive. To visualize this trade-off, the ROC curve is introduced: the classification results at each threshold are plotted as points in ROC space and connected to form the ROC curve, with the False Positive Rate (FPR) on the x-axis and the True Positive Rate (TPR) on the y-axis. Under normal circumstances, this curve should lie above the line connecting (0, 0) and (1, 1).
  • PR curve: the x-axis of the PR curve is the recall R and the y-axis is the precision P. The evaluation criteria are the same as for ROC: a smooth curve is better than a jagged one (the blue line in the original post's figure is clearly better), and on the same test set a curve that lies above another is generally better than the one below it. When P and R are close in value, F1 is at its maximum; drawing the line connecting (0, 0) and (1, 1), the point where this line meets the PR curve gives the largest F1 on the curve (in the smooth case). This F1 plays the same role for the PR curve that AUC plays for the ROC curve: a single number is more convenient than a whole curve when tuning a model.
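Both curves can be drawn with sklearn.metrics. A minimal sketch below, where the breast-cancer dataset and the logistic-regression pipeline are illustrative choices, not from the original post:

```python
# Minimal sketch: plot the ROC and PR curves for a binary classifier.
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, precision_recall_curve, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

x, y = load_breast_cancer(return_X_y=True)
x_tr, x_te, y_tr, y_te = train_test_split(x, y, random_state=0)
clf = make_pipeline(StandardScaler(), LogisticRegression()).fit(x_tr, y_tr)
scores = clf.predict_proba(x_te)[:, 1]            # probability of the positive class

fpr, tpr, _ = roc_curve(y_te, scores)             # ROC: FPR on x-axis, TPR on y-axis
p, r, _ = precision_recall_curve(y_te, scores)    # PR: recall on x-axis, precision on y-axis

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(fpr, tpr, label=f"AUC = {auc(fpr, tpr):.3f}")
ax1.plot([0, 1], [0, 1], "--")                    # the (0, 0)-(1, 1) reference line
ax1.set(xlabel="FPR", ylabel="TPR", title="ROC curve")
ax1.legend()
ax2.plot(r, p)
ax2.set(xlabel="Recall", ylabel="Precision", title="PR curve")
plt.show()
```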

3. Common evaluation functions in sklearn.metrics:

  • accuracy_score
  • precision_score
  • recall_score
  • f1_score
  • precision_recall_curve (PR curve)
  • average_precision_score (area under the PR curve; for the ROC curve use roc_curve and roc_auc_score)
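A minimal sketch of the scalar metrics, reusing clf, x_te, and y_te from the previous sketch (illustrative names):

```python
# Minimal sketch of the point metrics listed above.
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

y_pred = clf.predict(x_te)
tn, fp, fn, tp = confusion_matrix(y_te, y_pred).ravel()   # TN, FP, FN, TP counts

print("accuracy :", accuracy_score(y_te, y_pred))    # (TP + TN) / total
print("precision:", precision_score(y_te, y_pred))   # TP / (TP + FP)
print("recall   :", recall_score(y_te, y_pred))      # TP / (TP + FN)
print("f1       :", f1_score(y_te, y_pred))          # harmonic mean of P and R
```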

4. The central idea of model evaluation

Sometimes there is no simple answer to which model is better (for example, the blue line versus the green line in the second figure of the original post), so choosing a model has to take the specific usage scenario into account. Here are two scenarios:

  • Earthquake prediction: for predicting earthquakes, we want RECALL to be very high, meaning we want to catch every single earthquake. For this we are willing to sacrifice PRECISION: we would rather issue 1000 alerts and predict all 10 earthquakes correctly than issue 100 alerts that catch 8 and miss 2.
  • Convicting suspects: based on the principle of never wronging an innocent person, we want convictions of suspects to be very precise. Even if some criminals are occasionally let go (lower recall), it is worth it.

For a classifier, the essence is that it outputs a probability. We then select a CUTOFF point (threshold): samples scoring above that point are classified as positive, and those below it as negative. Which point to select must be decided in combination with your specific scenario. In turn, the scenario determines the criterion for training the model; in the first scenario, for example, we only look at PRECISION at the point where RECALL = 99.9999% (every earthquake is caught), and other indicators become meaningless. A sketch of picking such a cutoff follows below.
When the numbers of positive and negative samples are roughly balanced, the ROC and PR curves show similar trends; but when the positive and negative samples are highly imbalanced, the PR curve reflects the real situation better than ROC, because the ROC curve can still look very good while performance on the PR curve is mediocre.
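One way to pick such a cutoff (an illustrative approach, not from the original post) is to find the largest threshold that still meets a recall target and report the precision there, reusing y_te and scores from the ROC/PR sketch; the 0.999 target is illustrative:

```python
# Minimal sketch: largest threshold that still meets a recall target (scenario 1).
import numpy as np
from sklearn.metrics import precision_recall_curve

target_recall = 0.999
precision, recall, thresholds = precision_recall_curve(y_te, scores)

# thresholds[i] corresponds to precision[i] / recall[i]; recall is non-increasing
ok = np.where(recall[:-1] >= target_recall)[0]   # indices meeting the recall target
i = ok[-1]                                       # largest qualifying threshold
print("cutoff   :", thresholds[i])
print("recall   :", recall[i])
print("precision:", precision[i])                # the only metric we then compare
```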

  • 3. Evaluation indicators include the ones above but are not limited to them. In some companies and some lines of business, evaluation indicators will be designed for your task based on the needs of the business and the industry.
