Model evaluation metrics explained (regression, classification, multi-class classification)

  • Regression Tasks:
    • MSE (Mean Squared Error): The average of the squared differences between predicted and actual values. Squaring penalizes large errors more heavily than small ones. The smaller the MSE, the better the predictive performance of the model.
    • RMSE (Root Mean Squared Error): The square root of the MSE. It is expressed in the same units as the target, which makes it easier to interpret than the MSE, and it is often used in financial forecasting. The smaller the RMSE, the better.
    • MAE (Mean Absolute Error): The average of the absolute differences between predicted and actual values. It is less sensitive to outliers than the MSE. The smaller the MAE, the better the predictive performance of the model.
    • R-Square (R-squared): The proportion of the variance in the actual values that the model explains. The closer R-Square is to 1, the better the fit. It can be negative when the model fits worse than always predicting the mean.
    • MAPE (Mean Absolute Percentage Error): The average of the absolute errors expressed relative to the actual values. The smaller the MAPE, the better the predictive performance of the model. It is undefined when an actual value is zero.
  • Binary Classification Tasks:
    • Accuracy: The proportion of all samples that the model predicts correctly. It can be misleading when the classes are imbalanced.
    • Precision: The proportion of samples predicted as positive that are actually positive.
    • Recall: The proportion of actual positives that the model predicts as positive.
    • LogLoss: The average negative log-likelihood of the true labels under the predicted probabilities. It penalizes confident wrong predictions heavily. The smaller the LogLoss, the better the predictive performance of the model.
    • F1-Score: The harmonic mean of precision and recall, which is high only when both are high. A higher F1-Score indicates better performance.
    • AUC (Area Under Curve): The area under the ROC curve. It equals the probability that the model scores a randomly chosen positive sample above a randomly chosen negative one. A value of 0.5 corresponds to random ranking, and the closer the AUC is to 1, the better the model.
    • Confusion Matrix: A table of counts that cross-tabulates actual classes against predicted classes (true positives, false positives, false negatives, true negatives), used to see which kinds of errors the model makes.
  • Multi-class Classification Tasks:
    • Accuracy: The proportion of all samples that the model predicts correctly.
    • Precision: For a given class, the proportion of samples predicted as that class that actually belong to it.
    • Recall: For a given class, the proportion of samples actually belonging to that class that the model predicts as that class.
    • Confusion Matrix: A table of counts with one row and one column per class that cross-tabulates actual classes against predicted classes, used to see which classes the model confuses with each other.
    • ROC Curve: The true positive rate plotted against the false positive rate as the decision threshold varies. For multi-class problems it is usually computed per class in a one-vs-rest fashion. The closer the ROC curve is to the upper left corner, the better the model.
    • PR Curve: Precision plotted against recall as the decision threshold varies. The closer the PR curve is to the upper right corner, the better the model.
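
To make the classification definitions above concrete, here is a minimal sketch in plain Python that counts the four confusion-matrix cells for a binary problem and derives accuracy, precision, recall, and F1 from them. The helper names `binary_counts` and `binary_metrics` and the toy labels are assumptions made for illustration, and returning 0.0 when a denominator is zero is just one possible convention.

```python
def binary_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for 0/1 labels, treating 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return tp, fp, fn, tn


def binary_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = binary_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, chosen only for illustration.
y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

counts = binary_counts(y_true, y_pred)
metrics = binary_metrics(y_true, y_pred)
print("tp, fp, fn, tn:", counts)
print(metrics)
```

With these labels there are two true positives, one false positive, one false negative, and two true negatives, so all four metrics come out to 2/3.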

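Regression

  • Mean Absolute Error (MAE): The average absolute difference between predicted and actual values. Lower values indicate a better fit.
  • Mean Squared Error (MSE): The average of the squared differences between predicted and actual values, used to measure prediction accuracy. As with MAE, lower values indicate a better fit.
  • R-Squared (R2): Measures the goodness of fit of a regression model as the proportion of the variance in the actual values that the model explains. The best possible value is 1. It usually lies between 0 and 1, but it can be negative when the model fits worse than always predicting the mean.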
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

# Actual and predicted values
y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 1.8, 8]

# MAE
mae = mean_absolute_error(y_true, y_pred)
print("MAE:", mae)

# MSE
mse = mean_squared_error(y_true, y_pred)
print("MSE:", mse)

# R2
r2 = r2_score(y_true, y_pred)
print("R2:", r2)
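
The overview also lists RMSE and MAPE, which the snippet above does not compute. As a rough sketch, all five regression metrics can be written out with NumPy directly from their definitions. The variable names are illustrative, the MAPE line assumes that no actual value is zero, and the reversed prediction `y_bad` is a made-up example showing that R2 can fall below 0.

```python
import numpy as np

# Same toy values as the regression example.
y_true = np.array([3, -0.5, 2, 7])
y_pred = np.array([2.5, 0.0, 1.8, 8])

errors = y_true - y_pred

mse = np.mean(errors ** 2)               # mean of squared errors
rmse = np.sqrt(mse)                      # same units as the target
mae = np.mean(np.abs(errors))            # mean of absolute errors
# MAPE as a fraction (multiply by 100 for a percentage).
# Assumes no entry of y_true is zero; otherwise the division is undefined.
mape = np.mean(np.abs(errors / y_true))


def r2(actual, predicted):
    """R2 = 1 - SS_res / SS_tot, where SS_tot is taken around the mean of actual."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1 - ss_res / ss_tot


r2_good = r2(y_true, y_pred)

# Hypothetical bad prediction: the actual values in reverse order.
y_bad = y_true[::-1]
r2_bad = r2(y_true, y_bad)

print("MSE:", mse)
print("RMSE:", rmse)
print("MAE:", mae)
print("MAPE:", mape)
print("R2 (good fit):", r2_good)
print("R2 (bad fit):", r2_bad)  # negative: worse than predicting the mean
```

Because the second actual value is small (-0.5), its relative error dominates the MAPE here, which is one reason MAPE is best suited to targets that stay well away from zero.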

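Classification

  1. Accuracy: The ratio of correctly predicted samples to the total number of samples, measuring the overall correctness of the classifier. It ranges over [0, 1], and higher is better.
  2. Precision: Of the samples predicted as positive, the proportion that are actually positive. It measures how reliable the model's positive predictions are. It ranges over [0, 1], and higher is better.
  3. Recall: Of the samples that are actually positive, the proportion that are correctly predicted as positive. It measures the model's ability to find the true positives. It ranges over [0, 1], and higher is better.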
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Actual and predicted labels
y_true = [0, 1, 1, 0, 0, 1]
y_pred = [0, 1, 0, 0, 1, 1]

# Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# Precision
precision = precision_score(y_true, y_pred)
print("Precision:", precision)

# Recall
recall = recall_score(y_true, y_pred)
print("Recall:", recall)
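
The binary overview also mentions LogLoss, F1-Score, AUC, and the confusion matrix. The sketch below shows how these might be computed with scikit-learn. LogLoss and AUC need predicted probabilities rather than hard labels, so the `y_prob` values are invented for illustration, and the 0.5 threshold used to turn them into labels is an assumption, not something the source specifies.

```python
from sklearn.metrics import confusion_matrix, f1_score, log_loss, roc_auc_score

# Actual labels and hypothetical predicted probabilities of the positive class.
y_true = [0, 1, 1, 0, 0, 1]
y_prob = [0.1, 0.9, 0.4, 0.2, 0.6, 0.8]

# Assumed decision threshold of 0.5 to obtain hard labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

# LogLoss and AUC are computed from the probabilities.
ll = log_loss(y_true, y_prob)
auc = roc_auc_score(y_true, y_prob)

# F1 and the confusion matrix are computed from the hard labels.
f1 = f1_score(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred)

print("LogLoss:", ll)
print("AUC:", auc)
print("F1-Score:", f1)
# Rows are actual classes, columns are predicted classes:
# [[tn, fp],
#  [fn, tp]]
print("Confusion matrix:")
print(cm)
```

Changing the threshold changes F1 and the confusion matrix but leaves LogLoss and AUC untouched, since those two depend only on the probabilities.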

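Multi-class classification

  1. Evaluation metrics for multi-class tasks:
    • Categorical Accuracy: Accuracy for multi-class problems, that is, the ratio of correctly predicted samples to the total number of samples. It ranges over [0, 1], and higher is better.
    • F1-Score: The F1 score combines precision and recall as their harmonic mean, so it reflects both how reliable the predictions are and how many true members of a class are found. For multi-class problems it is computed per class and then averaged, for example with macro averaging. It ranges over [0, 1], and higher is better.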
from sklearn.metrics import accuracy_score, f1_score

# Actual and predicted labels
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

# Categorical Accuracy
accuracy = accuracy_score(y_true, y_pred)
print("Accuracy:", accuracy)

# F1-Score
f1 = f1_score(y_true, y_pred, average='macro')
print("F1-Score:", f1)
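
Overall accuracy and macro F1 hide which classes the model gets wrong. As a sketch under the same toy labels, per-class precision and recall and the multi-class confusion matrix could be obtained with scikit-learn by passing `average=None`. The variable names are illustrative.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Same toy labels as the multi-class example.
y_true = [0, 1, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 0, 1]

# Rows are actual classes, columns are predicted classes (labels 0, 1, 2).
cm = confusion_matrix(y_true, y_pred)
print("Confusion matrix:")
print(cm)

# average=None returns one score per class, ordered by sorted label.
per_class_precision = precision_score(y_true, y_pred, average=None)
per_class_recall = recall_score(y_true, y_pred, average=None)

# The labels here are 0, 1, 2, so the array index equals the class label.
for label, (p, r) in enumerate(zip(per_class_precision, per_class_recall)):
    print(f"class {label}: precision={p:.3f}, recall={r:.3f}")

# Macro averaging is the unweighted mean of the per-class scores.
macro_precision = precision_score(y_true, y_pred, average="macro")
macro_recall = recall_score(y_true, y_pred, average="macro")
print("Macro precision:", macro_precision)
print("Macro recall:", macro_recall)
```

In this example class 0 is predicted perfectly while class 2 is never predicted correctly, which the single accuracy figure of 0.5 does not reveal.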

Reposted from blog.csdn.net/weixin_38233104/article/details/133281389