[Reprint] In data analysis, under what circumstances does the data need to be standardized?

It mainly depends on whether the model is scale-invariant.


For some models, such as SVM, scaling each dimension by a different factor gives an optimal solution that is not equivalent to the original one. For such models, unless the ranges of the data in each dimension are already fairly close, the data must be standardized; otherwise the model parameters will be dominated by the dimensions with a larger or smaller range.
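As a rough illustration of that dominance, the sketch below measures how much each feature contributes to the pairwise squared Euclidean distances, which is the quantity a distance-based or kernel-based model such as an RBF-kernel SVM works with. The data values and the helper names `standardize` and `feature_share` are made up for this example; they are not from the original answer.

```python
import numpy as np

def standardize(X):
    """Z-score each column: subtract the mean, divide by the standard deviation."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by zero
    return (X - mean) / std

def feature_share(X):
    """Fraction of the total pairwise squared distance contributed by each column."""
    diff = X[:, None, :] - X[None, :, :]          # all pairwise differences
    per_feature = (diff ** 2).sum(axis=(0, 1))    # summed over all pairs, per column
    return per_feature / per_feature.sum()

# Hypothetical data: column 0 ranges over about 1-2, column 1 over tens of thousands.
X = np.array([[1.5, 52000.0],
              [1.8, 51000.0],
              [1.6, 90000.0],
              [1.9, 88000.0]])

share_raw = feature_share(X)                   # column 1 accounts for almost everything
share_std = feature_share(standardize(X))      # both columns contribute equally

print("raw share:         ", share_raw)
print("standardized share:", share_std)
```

On the raw data the second column decides nearly all of the distance, so the first column is effectively ignored. After standardization every column has unit variance and therefore the same share.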


For other models, such as logistic regression (without a regularization term), scaling each dimension by a different factor gives an optimal solution that is equivalent to the original one. For such models, standardization does not, in theory, change the optimal solution. In practice, however, the solution is usually found with an iterative algorithm, and if the shape of the objective function is too "flat" (its contours are strongly elongated), the iteration may converge very slowly or not at all. So even for scale-invariant models, it is best to standardize the data.
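One way to put a number on how "flat" the objective is: take the condition number of its Hessian, since a large condition number generally means slow convergence for gradient-based iteration. The sketch below does this for the mean logistic loss evaluated at zero weights, where the Hessian reduces to X^T X / (4n). The synthetic data, the random seed, and the helper name `hessian_at_zero` are assumptions made for this example only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Hypothetical features: one on a scale of about 1, one on a scale of thousands.
raw = rng.normal(size=(n, 2)) * np.array([1.0, 1000.0]) + np.array([0.0, 5000.0])

# Standardize each column to zero mean and unit variance.
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)

def hessian_at_zero(features):
    """Hessian of the mean logistic loss at w = 0, with an intercept column added.

    At w = 0 every predicted probability is 0.5, so p * (1 - p) = 0.25
    and the Hessian is X^T X / (4 n).
    """
    X = np.column_stack([np.ones(len(features)), features])
    return X.T @ X / (4.0 * len(features))

cond_raw = np.linalg.cond(hessian_at_zero(raw))
cond_scaled = np.linalg.cond(hessian_at_zero(scaled))

print(f"condition number, raw features:          {cond_raw:.3g}")
print(f"condition number, standardized features: {cond_scaled:.3g}")
```

The optimum is equivalent in both cases, but the raw problem is far more ill-conditioned, so a fixed-step gradient method would need a much smaller step size and many more iterations to reach it.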


Author: Wang Yun Maigo

Link: https://www.zhihu.com/question/30038463/answer/50491149

Source: Zhihu

The copyright belongs to the author. For commercial reprints, please contact the author for authorization, and for non-commercial reprints, please indicate the source.
 


Origin blog.csdn.net/authorized_keys/article/details/113887983