-
Feature scaling is a method used to normalize the range of independent variables or features of data.
In some machine learning algorithms, objective functions do not work properly without normalization, because features measured on larger scales dominate the objective.
Gradient descent converges much faster with feature scaling than without it.
-
Rescaling (Min-Max Normalization)
This method rescales the range of features to [0,1] or [-1,1].
The general formula for min-max scaling to [0,1] is:
$x' = \frac{x - \min(x)}{\max(x) - \min(x)}$
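As a quick illustration, here is a minimal NumPy sketch of this formula applied column-wise; the toy array and the constant-column guard are assumptions added for the example, not part of the original:

```python
import numpy as np

def min_max_scale(X):
    """Rescale each column of X to the [0, 1] range."""
    x_min = X.min(axis=0)
    x_max = X.max(axis=0)
    # Guard against constant columns, where max(x) == min(x) would divide by zero.
    return (X - x_min) / np.where(x_max > x_min, x_max - x_min, 1.0)

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])
print(min_max_scale(X))  # each column now spans [0, 1]
```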
-
Mean normalization
$x' = \frac{x - \text{average}(x)}{\max(x) - \min(x)}$
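Mean normalization centers each feature on zero while keeping it roughly within [-1, 1]. A hand-rolled NumPy sketch, assuming the same column-wise convention as above:

```python
import numpy as np

def mean_normalize(X):
    """Center each column on its mean, scaled by its range; values fall roughly in [-1, 1]."""
    return (X - X.mean(axis=0)) / (X.max(axis=0) - X.min(axis=0))
```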
-
Standardization (Z-score Normalization)
$x' = \frac{x - \bar{x}}{\sigma}$, where $\bar{x}$ is the mean and $\sigma$ the standard deviation of the feature.
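The same transform in NumPy; this sketch assumes NumPy's default population standard deviation (ddof=0), matching the formula above:

```python
import numpy as np

def standardize(X):
    """Transform each column to zero mean and unit standard deviation (z-scores)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)
```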
-
Scaling to unit length
$x' = \frac{x}{\|x\|}$
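Unlike the previous transforms, scaling to unit length operates on each sample (row) rather than each feature (column). A sketch using the Euclidean (L2) norm, which is one common choice for $\|x\|$:

```python
import numpy as np

def scale_to_unit_length(X):
    """Divide each row by its L2 norm so that every sample has ||x|| = 1."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.where(norms > 0, norms, 1.0)  # avoid dividing all-zero rows by zero
```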
-
Why Should We Use Feature Scaling?
Gradient descent based algorithms such as linear regression, logistic regression, and neural networks, which use gradient descent as an optimization technique, require the data to be scaled.
Distance-based algorithms such as KNN, K-means, and SVM are the most affected by the range of features, as the sketch below illustrates.
Tree-based algorithms are fairly insensitive to the scale of the features: a decision tree splits a node on a single feature, and that split is not influenced by the other features.
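A small numeric sketch of why distance-based algorithms need scaling; the age/income features and all values are invented for illustration:

```python
import numpy as np

# Two samples with features on very different scales: age (years), income (dollars).
a = np.array([25.0, 50_000.0])
b = np.array([60.0, 51_000.0])
print(np.linalg.norm(a - b))  # ~1000.6: the distance is driven almost entirely by income

# After min-max scaling both features to [0, 1], the two samples become:
a_scaled = np.array([0.0, 0.0])
b_scaled = np.array([1.0, 1.0])
print(np.linalg.norm(a_scaled - b_scaled))  # ~1.414: both features now contribute equally
```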
-
Normalization vs. Standardization
These are two of the most commonly used feature scaling techniques in machine learning, but some ambiguity exists about when to use which.
Standardization can be helpful in cases where the data follows a Gaussian distribution; and unlike normalization, standardization does not have a bounding range, which means that even if you have outliers in your data, they will not be clipped or compressed into a fixed interval by standardization (see the sketch below).
Normalization is a good choice when you know that the distribution of your data does not follow a Gaussian distribution.
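A one-feature sketch of the difference described above, with an invented outlier:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # one extreme outlier

# Min-max normalization bounds everything to [0, 1], squeezing the
# inliers into a narrow band near 0.
x_norm = (x - x.min()) / (x.max() - x.min())
print(x_norm)  # [0.     0.0101 0.0202 0.0303 1.    ]

# Standardization has no bounding range: the outlier simply lands far
# from the rest (z ~ 2) instead of forcing everything else toward 0.
x_std = (x - x.mean()) / x.std()
print(x_std)   # [-0.538 -0.513 -0.487 -0.461  1.999]
```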
-
Normalization
Normalization is a scaling technique in which values are shifted and rescaled so that they end up ranging between 0 and 1. It is also known as Min-Max scaling.
$X' = \frac{X - X_{min}}{X_{max} - X_{min}}$
-
Standardization
Standardization is another scaling technique where the values are centered around the mean with a unit standard deviation.
$X' = \frac{X - \mu}{\sigma}$
-
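In practice both transforms are usually applied with scikit-learn. A minimal sketch using MinMaxScaler and StandardScaler; the toy matrix is an assumption for the example:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# Normalization: each column rescaled to [0, 1].
print(MinMaxScaler().fit_transform(X))

# Standardization: each column centered to mean 0 with unit variance.
print(StandardScaler().fit_transform(X))
```

In a real pipeline the scaler should be fit on the training split only and then used to transform the test split, so that test-set statistics do not leak into training.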