Standardizing data with Python

1. Meaning of data standardization: Convert the numerical features of the data to a standard scale so that they fall within a range comparable to the other attributes. This process is also called normalization.

2. There are two commonly used standardization techniques:

  1. Min-Max Normalization: This technique rescales each feature into the range [0,1]. First calculate the minimum and maximum values of each numerical feature, then apply the following transformation to each value of the feature:

    X_{new}=\frac{X_{i}-min(X)}{max(X)-min(X)}

    The corresponding Python program:
    from sklearn import preprocessing
    import pandas

    # Store the data in a dictionary
    data = {'price': [492, 286, 487, 519, 541, 429]}
    # Convert the dictionary to a DataFrame
    price_frame = pandas.DataFrame(data)
    # feature_range sets the target minimum and maximum; the default is (0, 1)
    min_max_normalizer = preprocessing.MinMaxScaler(feature_range=(0, 1))
    # Scale (map) the data into the configured range
    scaled_data = min_max_normalizer.fit_transform(price_frame)
    # Convert the transformed data back to a DataFrame, keeping the column name
    price_frame_normalized = pandas.DataFrame(scaled_data, columns=['price'])
    print(price_frame_normalized)
  2. Z-Score Normalization: When the data contains outliers, min-max normalization is not the best choice. An outlier stretches the range, so the scaled values of the remaining points are squeezed into a narrow part of [0,1] (toward zero when the outlier is a large value), and the effect gets stronger as the range grows. Z-score normalization is the commonly used alternative. It rescales the data so that the mean is 0 and the standard deviation is 1.

    Z=\frac{X-mean(X)}{StdDev(X)}

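    The corresponding Python program:
    from sklearn import preprocessing
    import pandas

    # Store the data in a dictionary
    data = {'price': [492, 286, 487, 519, 541, 429]}
    # Convert the dictionary to a DataFrame
    price_frame = pandas.DataFrame(data)
    # Standardize the dataset along an axis: center on the mean and
    # scale to unit variance (population standard deviation, ddof=0)
    scaled_data = preprocessing.scale(price_frame)
    # Convert the standardized data to a DataFrame with the column name 'price'
    price_frame_normalized = pandas.DataFrame(scaled_data, columns=['price'])
    print(price_frame_normalized)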
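
To make the two formulas above concrete, here is a minimal NumPy sketch that applies them directly to the same price data, without scikit-learn. The helper names `min_max_scale` and `z_score`, and the outlier value 5000, are made up for illustration and are not part of the original programs. The last part appends that hypothetical outlier to show how min-max scaling squeezes the remaining values toward zero.

```python
import numpy as np

# Same price data as in the programs above
prices = np.array([492, 286, 487, 519, 541, 429], dtype=float)


def min_max_scale(x):
    """Map x linearly onto [0, 1] using its own minimum and maximum."""
    return (x - x.min()) / (x.max() - x.min())


def z_score(x):
    """Center x on its mean and divide by the population standard deviation."""
    # ndarray.std() uses ddof=0 by default (population standard deviation)
    return (x - x.mean()) / x.std()


scaled = min_max_scale(prices)
standardized = z_score(prices)

# Hypothetical outlier (an assumption for illustration only)
with_outlier = np.append(prices, 5000.0)
scaled_outlier = min_max_scale(with_outlier)

print("min-max:             ", scaled)
print("z-score:             ", standardized)
print("min-max with outlier:", scaled_outlier)
```

With the outlier present, the six original prices all land in a small interval near zero, while the outlier itself maps to 1. The z-score helper divides by the population standard deviation, which is also what `preprocessing.scale` does, so the two should agree on this data.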

 

 

Origin blog.csdn.net/weixin_46031067/article/details/118767432