【Machine Learning】Introduction and use of model evaluation methods


Foreword

Model evaluation is the process of measuring how well a machine learning model performs, and it is one of the core steps of any machine learning workflow. The model is trained on a training set and its performance is measured on a test set. The goal is to quantify the model's predictive ability with suitable evaluation metrics and to decide which model best fits a particular dataset. This article introduces the basics of model evaluation and presents several commonly used metrics with corresponding Python code examples.

1. Training set and test set

In machine learning, we use the training set to train the model and the test set to check its performance on data it has not seen during training. A common split is training set/test set = 8/2 or 7/3, that is, 80%/20% or 70%/30% of the samples.
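
As a minimal sketch of the 8/2 split described above, scikit-learn's train_test_split can be used as follows. The toy arrays are made up for illustration, and stratify=y is an optional extra (not mentioned in the text) that keeps the class ratio the same in both parts.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 samples with 2 features each, and alternating 0/1 labels.
X = np.arange(20).reshape(10, 2)
y = np.array([0, 1] * 5)

# test_size=0.2 holds out 20% of the samples for testing.
# stratify=y keeps the class ratio equal in both parts.
# random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```

For a 7/3 split, test_size=0.3 would be used instead.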

2. Evaluation metrics

There are many evaluation metrics for machine learning models. The ones below are used most often.
Before introducing them, we define four basic counts that the formulas are built from.

  • TP, FP, TN, FN
  • TP (true positive): a positive sample is predicted as positive. For example, if the model is asked to recognize males, it correctly recognizes a male as male
  • FP (false positive): a negative sample is predicted as positive. The model is asked to recognize males, and it mistakes a female for a male
  • TN (true negative): a negative sample is predicted as negative. The model is asked to recognize males, and it correctly identifies a female as female
  • FN (false negative): a positive sample is predicted as negative. The model is asked to recognize males, and it mistakes a male for a female
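
The four counts can be read off a confusion matrix. The sketch below uses made-up labels, with 1 standing for male (positive class) and 0 for female (negative class); this encoding is an assumption for illustration only.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = male (positive class), 0 = female (negative class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# For binary labels {0, 1}, confusion_matrix returns [[TN, FP], [FN, TP]],
# so ravel() yields the counts in the order TN, FP, FN, TP.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

print(f"TP={tp} FP={fp} TN={tn} FN={fn}")  # TP=3 FP=1 TN=5 FN=1
```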

1 Accuracy

Accuracy is the most commonly used evaluation metric in classification problems. It is the proportion of samples that the model classifies correctly out of the total number of samples. A higher accuracy generally indicates a better model, although on heavily imbalanced datasets accuracy alone can be misleading.

  • Mathematical formula

Accuracy=(TP+TN)/(TP+FP+TN+FN)

In Python, we can use the scikit-learn library to calculate accuracy:

from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_true, y_pred)

y_true holds the true labels
y_pred holds the labels predicted by the model
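
One caveat is worth illustrating: a high accuracy does not always mean a useful model. In the hypothetical example below, 95% of the samples are negative, and a "model" that always predicts the negative class still reaches 95% accuracy without finding a single positive sample.

```python
from sklearn.metrics import accuracy_score

# Hypothetical imbalanced data: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A trivial "model" that always predicts the negative class.
y_pred = [0] * 100

# Accuracy = (TP + TN) / total = (0 + 95) / 100
accuracy = accuracy_score(y_true, y_pred)
print(accuracy)  # 0.95
```

This is one reason precision and recall are usually reported alongside accuracy.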

2 Precision

Precision is the proportion of true positives among all the samples that the model predicts as positive. It measures how trustworthy the model's positive predictions are.

  • Mathematical formula

Precision=TP/(TP+FP)

  • Python code
from sklearn.metrics import precision_score

precision=precision_score(y_true,y_pred)
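
To connect the formula with the library call, the sketch below counts TP and FP by hand on made-up labels and compares the result with precision_score.

```python
from sklearn.metrics import precision_score

# Hypothetical labels, 1 = positive class.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# Count TP and FP directly from the (true, predicted) pairs.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

# Precision = TP / (TP + FP)
manual_precision = tp / (tp + fp)
precision = precision_score(y_true, y_pred)

print(manual_precision, float(precision))  # 0.75 0.75
```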

3 Recall

Recall is the proportion of actual positive samples that the model correctly predicts as positive. It measures how completely the model finds the positive samples.

  • Mathematical formula

Recall=TP/(TP+FN)

  • Python code
from sklearn.metrics import recall_score

recall=recall_score(y_true,y_pred)
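
Precision and recall usually pull in opposite directions. The sketch below uses made-up scores and three hypothetical decision thresholds: as the threshold rises, fewer samples are predicted positive, so recall drops while precision rises.

```python
from sklearn.metrics import precision_score, recall_score

# Hypothetical true labels and predicted scores for the positive class.
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.4, 0.45, 0.6, 0.7, 0.9]

results = {}
for threshold in (0.3, 0.5, 0.8):
    # Predict positive when the score reaches the threshold.
    y_pred = [1 if s >= threshold else 0 for s in scores]
    p = float(precision_score(y_true, y_pred))
    r = float(recall_score(y_true, y_pred))
    results[threshold] = (p, r)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```

Which threshold is appropriate depends on whether false positives or false negatives are more costly in the application.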

4 F1 (F1 score)

F1 is the harmonic mean of precision and recall. It is a composite metric that balances the two, so it is high only when both precision and recall are high.

  • Mathematical formula

F1=2*(Precision*Recall)/(Precision+Recall)

  • Python code
from sklearn.metrics import f1_score

f1 = f1_score(y_true, y_pred)
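
The formula above is the harmonic mean of precision and recall, which penalizes a large gap between the two more strongly than a plain average would. The made-up labels below give perfect precision but poor recall.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Hypothetical labels: the model finds only 1 of the 5 positives,
# but makes no false positive predictions.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

precision = float(precision_score(y_true, y_pred))  # 1 / (1 + 0) = 1.0
recall = float(recall_score(y_true, y_pred))        # 1 / (1 + 4) = 0.2

# F1 = 2 * (Precision * Recall) / (Precision + Recall)
manual_f1 = 2 * precision * recall / (precision + recall)
f1 = float(f1_score(y_true, y_pred))

# The arithmetic mean is shown only for comparison.
arithmetic_mean = (precision + recall) / 2

print(f"F1={f1:.3f}, arithmetic mean={arithmetic_mean:.3f}")  # F1=0.333, arithmetic mean=0.600
```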

5 AUC (Area Under the ROC Curve)

AUC is a metric for evaluating binary classification models. It is the area under the ROC curve, which plots the true positive rate against the false positive rate across all decision thresholds. It measures how well the model ranks positive examples above negative ones: 0.5 corresponds to random guessing and 1.0 to a perfect ranking.

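  • Python code
from sklearn.metrics import roc_auc_score
auc = roc_auc_score(y_true, y_pred_prob)

y_pred_prob holds the predicted probability (or score) of the positive class, not the hard 0/1 label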
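
One way to read AUC is as the fraction of (positive, negative) pairs in which the positive sample receives the higher score. The sketch below checks this on made-up scores; the pairwise loop is for illustration only and is not how scikit-learn computes the value internally.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical true labels and predicted probabilities of the positive class.
y_true = [0, 0, 1, 0, 1, 1]
y_pred_prob = [0.1, 0.4, 0.45, 0.6, 0.7, 0.9]

auc = float(roc_auc_score(y_true, y_pred_prob))

# Pairwise view: compare every positive score with every negative score.
pos = [s for t, s in zip(y_true, y_pred_prob) if t == 1]
neg = [s for t, s in zip(y_true, y_pred_prob) if t == 0]
pairs = [(p, n) for p in pos for n in neg]

# A correctly ordered pair counts 1, a tie counts 0.5.
correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
pairwise_auc = correct / len(pairs)

print(f"{auc:.3f} {pairwise_auc:.3f}")  # 0.889 0.889
```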

3. Overall use

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Generate a binary classification dataset.
# n_samples is the number of samples, n_classes the number of classes,
# and random_state the random seed.
X, y = make_classification(n_samples=2000, n_classes=2, random_state=42)

# Split the data into a training set and a test set (80/20).
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)

# Train a logistic regression model.
logistic = LogisticRegression(random_state=42)
logistic.fit(X_train, y_train)

# Predict class labels, plus the probability of the positive class for AUC.
y_pre = logistic.predict(X_test)
y_pre_prob = logistic.predict_proba(X_test)[:, 1]

# Evaluate the model. The results are stored under new names so that the
# imported metric functions are not overwritten.
accuracy = accuracy_score(y_test, y_pre)
print("accuracy_score:", accuracy)

precision = precision_score(y_test, y_pre)
print("precision_score:", precision)

recall = recall_score(y_test, y_pre)
print("recall_score:", recall)

f1 = f1_score(y_test, y_pre)
print("f1_score:", f1)

# AUC is computed from probabilities, not from hard 0/1 predictions.
auc = roc_auc_score(y_test, y_pre_prob)
print("roc_auc_score:", auc)

I feel that these indicators are still relatively low.
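
As a compact alternative to calling each metric separately, scikit-learn also provides classification_report, which prints precision, recall and F1 per class together with the overall accuracy. The sketch below rebuilds the same synthetic dataset so that it runs on its own; the variable name model is chosen here for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Same synthetic binary dataset and 80/20 split as in the walkthrough.
X, y = make_classification(n_samples=2000, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(random_state=42).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Human-readable table with per-class precision, recall and F1.
print(classification_report(y_test, y_pred))

# The same numbers as a nested dict, convenient for further processing.
report = classification_report(y_test, y_pred, output_dict=True)
print(report["accuracy"])
```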

Summary

This article introduced the commonly used metrics for evaluating machine learning models and showed how to compute them. In the next section, we will use the handwritten digit dataset built into sklearn as an experiment for model training and visual evaluation.
I hope you will support me. I will continue to study hard and share more interesting things.


Origin blog.csdn.net/qq_61260911/article/details/129922988