[Data Analysis & Data Mining] Three ways to standardize data: min-max normalization, standard deviation standardization, and decimal scaling standardization

import pandas as pd
import numpy as np


# Standardization removes the effect of differing magnitudes (units and scales) between features.

# Three methods
# (1) Min-max normalization (range standardization)
# Applies a linear transformation that maps the data to the range [0, 1]:
#     X' = (X - min) / (max - min)
# Because min and max define the scale, a single very large or very small
# value distorts the result, so this method is sensitive to outliers.
def max_min_sca(data):
    """
    Normalize data with min-max (range) normalization.
    :param data: original data
    :return: normalized data
    """
    data = (data - data.min()) / (data.max() - data.min())

    return data


# (2) Standard deviation standardization (z-score)
# Transforms the data using its mean and standard deviation:
#     X' = (X - mean) / std
def stand_sca(data):
    """
    Standardize data with standard deviation (z-score) standardization.
    :param data: original data
    :return: standardized data
    """
    data = (data - data.mean()) / data.std()

    return data

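# Example: values in [10, 20] with an outlier of 10000 among 10000 samples: the mean is barely affected, and the standard deviation is affected less than the range (max - min).
# This method is therefore less susceptible to outliers than min-max normalization.

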
# (3) Decimal scaling standardization
# Moves the decimal point so that the data fall within [-1, 1]; the shape of the distribution is unchanged:
#     X' = X / 10^k
#     k = ceil(log10(max(|X|)))
def desc_sca(data):
    """
    Normalize data with decimal scaling.
    :param data: original data
    :return: normalized data
    """
    data = data / (10 ** int(np.ceil(np.log10(data.abs().max()))))
    return data


# Verification:
detail = pd.read_excel("./meal_order_detail.xlsx")

print("Column index of detail:\n", detail.columns)
# print("Shape of detail:\n", detail.shape)
print("Before normalization:\n", detail.loc[:, "amounts"])
print("Maximum and minimum values:\n", detail.loc[:, "amounts"].max(), detail.loc[:, "amounts"].min())
print("After min-max normalization:\n", max_min_sca(detail.loc[:, "amounts"]))
print("After standard deviation standardization:\n", stand_sca(detail.loc[:, "amounts"]))
print("After decimal scaling:\n", desc_sca(detail.loc[:, "amounts"]))
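
If meal_order_detail.xlsx is not at hand, the three functions can be checked on a small hand-made Series. The following is only a minimal sketch: the values [10, 20, 30, 40, 50] are made up for illustration and do not come from the Excel file, and the three functions are restated in compact form so the snippet runs on its own.

```python
import numpy as np
import pandas as pd


def max_min_sca(data):
    # Min-max normalization: maps the data linearly onto [0, 1]
    return (data - data.min()) / (data.max() - data.min())


def stand_sca(data):
    # Z-score standardization; pandas Series.std() defaults to the
    # sample standard deviation (ddof=1)
    return (data - data.mean()) / data.std()


def desc_sca(data):
    # Decimal scaling: divide by 10^k with k = ceil(log10(max(|x|)))
    k = int(np.ceil(np.log10(data.abs().max())))
    return data / (10 ** k)


# Hypothetical sample values, NOT taken from meal_order_detail.xlsx
amounts = pd.Series([10, 20, 30, 40, 50], name="amounts")

min_max = max_min_sca(amounts)
z_score = stand_sca(amounts)
decimal = desc_sca(amounts)

print(min_max.tolist())   # [0.0, 0.25, 0.5, 0.75, 1.0]
print(z_score.tolist())   # symmetric around 0, since the mean is 30
print(decimal.tolist())   # [0.1, 0.2, 0.3, 0.4, 0.5]
```

After standardization with stand_sca, the result has mean 0 and sample standard deviation 1. Because `Series.std()` uses ddof=1 by default, the values differ slightly from those obtained with the population standard deviation (the default of `np.std`).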

Origin www.cnblogs.com/Tree0108/p/12116093.html