Machine Learning (a) feature normalization

Scene Description

In order to eliminate the influence dimension between the data features, characteristics we need to be normalized, making comparable between different indicators.
For example, analysis of the influence of a person's height and weight on health, if using m (m) and kilograms (kg) as a unit, characterized in that height will be within the value range of 1.6-1.8m, wherein the weight of 50-100kg will within the range, the results of the analysis will obviously tend to come out of the numerical difference is relatively large weight characteristics. Want to get more accurate results, it is necessary to characterize normalized (Normalization) process, the value of each index in the same order of magnitude for analysis.

  • Question: Why the need for numeric types of features do normalization?

Knowledge Point

Type of feature values ​​for normalization may make all features are unified into a value substantially equal intervals. The most common methods are mainly the following two.

  1. Linear function of normalized (Min-Max Scaling)

    Linear transformation of its original data, the result is mapped to the [0,1] range, to achieve geometric scaling of the original data. Normalized using the following formula: Here Insert Picture Descriptionwherein X is the original data, Xmax, Xmin are the maximum and minimum data.

  2. Zero-mean normalization (Z-Score Normalization)

    It copies the original mappings to the mean of 0 and standard deviation 1 on the distribution. Specifically, assuming that the original characteristics of [mu] is the mean, standard deviation [alpha], then the normalization equation is:
    Here Insert Picture Description

Examples :( normalized linear function)
Here Insert Picture Description
as shown,

  1. Same as the relationship between the relationship between the normalized value obtained with the original data values.
  2. Value obtained normalized between [0, 1], that the stochastic gradient descent algorithm convergence speed accelerated.

to sum up

  • Of course, data normalization is not a panacea of.
  • In practice, by gradient descent model solution typically requires normalized, including linear regression, logistic regression, support vector machines, neural network models.
  • But for the decision tree model is not applicable to C4.5 for example, when making the decision tree node splitting, mainly based on the data set D feature x information gain ratio , and information gain than with whether the feature after normalization is irrelevant, because the normalized gain does not change the information on the characteristics of a sample of x .
Published 14 original articles · won praise 6 · views 594

Guess you like

Origin blog.csdn.net/qq_38883844/article/details/104174798