131.006 Unsupervised Learning - Feature Scaling

@(131 - Machine Learning)

1 Feature Scaling

Feature scaling transforms each feature so that its values fall in the range [0, 1],
according to the formula

$x' = \frac{x-x_{min}}{x_{max}-x_{min}} $
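As a quick check against the sklearn output below, rescaling the middle weight with $x = 140$, $x_{min} = 115$ and $x_{max} = 175$ gives

$x' = \frac{140 - 115}{175 - 115} = \frac{25}{60} \approx 0.4167$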

1.1 Sklearn - MinMaxScaler


from sklearn.preprocessing import MinMaxScaler
import numpy

# MinMaxScaler expects a 2-D array of floating-point values
weights = numpy.array([[115.], [140.], [175.]])

scaler = MinMaxScaler()
rescaled_weight = scaler.fit_transform(weights)  # learn min/max, then rescale to [0, 1]
print(rescaled_weight)

[[0.        ]
 [0.41666667]
 [1.        ]]
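
As a small follow-up (not part of the original notes), the fitted scaler can also rescale new values with its transform method, reusing the minimum and maximum learned from the training data:

new_weights = numpy.array([[130.], [200.]])
print(scaler.transform(new_weights))  # uses x_min = 115 and x_max = 175 from the fit
# 130 maps to (130 - 115) / (175 - 115) = 0.25; 200 falls above 1.0
# because MinMaxScaler does not clip values outside the fitted range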

1.2 Which algorithms are affected by feature rescaling?

□ Decision trees
□ SVM with an RBF kernel √
□ Linear regression
□ K-means clustering √

Decision trees split on one feature at a time, producing vertical and horizontal decision boundaries, so there is no trade-off between dimensions.

An SVM with an RBF kernel computes distances that trade one dimension off against another, so it is affected by rescaling.

In linear regression, each coefficient is paired with its own feature: rescaling a feature simply rescales its coefficient, so the predictions are unchanged.

K-means clustering also computes distances that trade one dimension off against another, so it is affected by rescaling.

In general, algorithms in which two dimensions are traded off against each other (typically through a distance calculation) are affected by rescaling, as the sketch below illustrates.
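
To make the distinction concrete, here is a minimal sketch (not part of the original notes) using made-up heights and weights, chosen so that the weight feature dominates the raw Euclidean distances:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Made-up data: height in metres (small range) and weight in kg (large range)
X = np.array([[1.5, 50.], [1.5, 80.], [2.0, 60.], [2.0, 100.]])
y = np.array([0, 0, 1, 1])  # labels are only needed for the (supervised) tree
X_scaled = MinMaxScaler().fit_transform(X)

# K-means trades dimensions off through Euclidean distance, so the cluster
# assignments can change once both features share the same [0, 1] range
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X))
print(KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_scaled))

# A decision tree thresholds one feature at a time, and min-max scaling is
# monotonic, so its predictions are identical on raw and rescaled data
print(DecisionTreeClassifier(random_state=0).fit(X, y).predict(X))
print(DecisionTreeClassifier(random_state=0).fit(X_scaled, y).predict(X_scaled))

On this data, K-means typically groups the points by weight before scaling and by height after it, while the tree's predictions stay the same in both cases.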


Reposted from www.cnblogs.com/Neo007/p/9207144.html