What is Error Decomposition in Deep Learning

Error decomposition means splitting the prediction error of a deep learning model into its components in order to better understand model performance. In deep learning, we usually decompose the prediction error into three parts: bias (Bias), variance (Variance), and irreducible error (Irreducible Error). The dataset used for training and the model you choose together determine how the model performs, and the resulting prediction error is mainly composed of these components.
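
For squared-error loss, this decomposition can be written as Error = Bias² + Variance + Irreducible Error (the noise variance σ²). Below is a minimal NumPy sketch of the idea (not from the original post; the sine target, cubic polynomial model, and noise level are illustrative assumptions) that estimates each component by refitting a model on many resampled training sets:

```python
# Minimal sketch: Monte Carlo estimate of bias^2, variance, and noise.
# Assumed setup (illustrative only): y = sin(2*pi*x) + Gaussian noise,
# fitted by a degree-3 polynomial.
import numpy as np

rng = np.random.default_rng(0)

def f(x):                         # the true underlying function
    return np.sin(2 * np.pi * x)

noise_std = 0.2                   # sigma: source of the irreducible error
x_test = 0.3                      # evaluate the decomposition at one point
n_repeats, n_train, degree = 500, 30, 3

preds = np.empty(n_repeats)
for i in range(n_repeats):
    x = rng.uniform(0, 1, n_train)
    y = f(x) + rng.normal(0, noise_std, n_train)
    coefs = np.polyfit(x, y, degree)      # refit on a fresh training set
    preds[i] = np.polyval(coefs, x_test)

bias_sq = (preds.mean() - f(x_test)) ** 2   # (E[y_hat] - f(x))^2
variance = preds.var()                      # spread of y_hat across datasets
print(f"bias^2 ~ {bias_sq:.4f}, variance ~ {variance:.4f}, "
      f"irreducible ~ {noise_std ** 2:.4f}")
```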

Bias:

Bias measures how far the model's predictions deviate, on average, from the true values. High bias indicates that the predictions differ substantially from the truth, which usually means the model has not fit the training data well (**underfitting**). High bias can stem from a model that is too simple, poor feature selection, or insufficient training time. In deep learning, an overly simple network structure (too few layers or neurons) can lead to high bias.
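
To make this concrete, here is a small sketch (the sine data and linear model are assumptions for illustration, not from the original post): an underfit model stays far from the data even on the training set:

```python
# Sketch of high bias: a degree-1 (linear) model is too simple for
# sine-shaped data, so even its training error stays large.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, 200)

coefs = np.polyfit(x, y, 1)       # deliberately too simple a model
train_mse = np.mean((np.polyval(coefs, x) - y) ** 2)
print(f"training MSE of the underfit model: {train_mse:.3f}")  # >> 0.1**2
```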

Variance:

Variance measures how sensitive the model is to small changes in the training data. High variance indicates that the model's predictions vary widely across different training datasets, which usually means the model is **overfitting** the training data. High variance can be caused by an overly complex model, noisy training data, or too few samples. In deep learning, increasing the complexity of the network structure (more layers and more neurons) can lead to high variance.

In short, high variance means that your model may have memorized the training data by rote, without really learning the task.
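
Here is a corresponding sketch (same illustrative sine setup as above, an assumption rather than the post's own example): an overly complex model refit on fresh training samples gives wildly different predictions at the same test point:

```python
# Sketch of high variance: a degree-9 polynomial refit on small, noisy
# training sets predicts very different values at the same x.
import numpy as np

rng = np.random.default_rng(2)
preds = []
for _ in range(100):
    x = rng.uniform(0, 1, 20)     # small training set, refreshed each time
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 20)
    coefs = np.polyfit(x, y, 9)   # deliberately too complex a model
    preds.append(np.polyval(coefs, 0.5))
print(f"std of predictions at x=0.5: {np.std(preds):.3f}")  # large spread
```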

Irreducible Error:

Irreducible error is caused by noise in the data itself and cannot be reduced by improving the model. This part of the error is unrelated to the model's capability; it comes from how the data was collected and its quality.

For example, in the widely used ImageNet image dataset, some of the annotations are themselves problematic or imperfect.

The model learns from these imperfectly labeled data as its Ground Truth, so this noise is inevitably learned as well.
For example, in an object detection task, suppose we have a dataset for recognizing cats and the annotation we get is the red bounding box on the left. Note that this box could actually be tighter, like the blue bounding box in the picture on the right. The differing boxes produced by different annotators are noise in the dataset, and this noise leads to the irreducible error.
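
As a simple regression analogy (an assumption of this sketch, not the detection example above): even a perfect model cannot score below the noise in the labels, which is exactly the irreducible part of the error:

```python
# Sketch: the *perfect* predictor f(x) still incurs MSE ~ sigma^2,
# because the labels themselves carry noise.
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 1, 100_000)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, 100_000)  # sigma = 0.2

perfect_pred = np.sin(2 * np.pi * x)   # predict with the true function
mse = np.mean((perfect_pred - y) ** 2)
print(f"MSE of the perfect predictor: {mse:.4f}  (~ sigma^2 = 0.04)")
```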

In deep learning, we usually want to find a balance between low bias (good fit) and low variance (good generalization). This is called the Bias-Variance Tradeoff.

To achieve this tradeoff, you can generally try the following:

  • Use more data for training to reduce variance.
  • Use data augmentation techniques to improve the robustness of the model to changes in the data.
  • Apply regularization (such as L1, L2, or Dropout) to the model to mitigate overfitting.
  • Use techniques such as cross-validation and early stopping to choose an appropriate model complexity and hyperparameters (a combined sketch of regularization and early stopping follows this list).
  • Clean the data (noise reduction) and perform feature selection to reduce the noise that gives rise to the irreducible error.
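
As a hedged example of combining several of these ideas, here is a minimal PyTorch sketch with made-up toy data; the layer sizes, Dropout rate, weight decay, and patience are illustrative assumptions, not prescriptions:

```python
# Minimal sketch: Dropout, L2 regularization (via weight_decay), and
# early stopping on a validation set, with toy stand-in data.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 64), nn.ReLU(),
    nn.Dropout(p=0.5),                # regularization: randomly zero units
    nn.Linear(64, 1),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)  # L2

# toy data standing in for real train/validation splits
x_tr, y_tr = torch.randn(256, 10), torch.randn(256, 1)
x_va, y_va = torch.randn(64, 10), torch.randn(64, 1)

best_val, patience, bad_epochs = float("inf"), 10, 0
for epoch in range(200):
    model.train()
    loss = nn.functional.mse_loss(model(x_tr), y_tr)
    opt.zero_grad()
    loss.backward()
    opt.step()

    model.eval()
    with torch.no_grad():
        val = nn.functional.mse_loss(model(x_va), y_va).item()
    if val < best_val - 1e-4:
        best_val, bad_epochs = val, 0
    else:
        bad_epochs += 1
        if bad_epochs >= patience:    # early stopping: quit when val stalls
            break
print(f"stopped at epoch {epoch}, best val MSE {best_val:.4f}")
```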

Further reading

This article explains these ideas very well; interested readers can learn more there.
http://scott.fortmann-roe.com/docs/BiasVariance.html

Source: blog.csdn.net/crazyjinks/article/details/131678761