Sklearn User Guide Study Notes -- Transformers for data preprocessing

Transformers for data preprocessing:

  1. DictVectorizer  -- converts lists of feature dicts into vectors; string-valued (categorical) features are one-hot encoded, numeric features pass through
  2. StandardScaler  -- scaled data has zero mean and unit variance.  Note: the fitted scaler object can be saved, so the same transformation can later be applied to the test data
  3. MinMaxScaler, MaxAbsScaler: #parameters: feature_range=(min, max) (MinMaxScaler only)
  4. RobustScaler: #parameters: with_centering=True, with_scaling=True, quantile_range=(25.0, 75.0), copy=True
  5. QuantileTransformer and quantile_transform provide a non-parametric transformation based on the quantile function to map the data to a uniform distribution with values between 0 and 1:
  6. PowerTransformer:   Power transforms are a family of parametric, monotonic transformations that aim to map data from any distribution to as close to a Gaussian distribution as possible in order to stabilize variance and minimize skewness. PowerTransformer currently provides two such power transformations, the Yeo-Johnson transform and the Box-Cox transform.
  7. Encoders: OrdinalEncoder, LabelEncoder, OneHotEncoder
  8. Discretization: KBinsDiscretizer, Binarizer
  9. NaN filling: SimpleImputer, MissingIndicator
  10. Generating polynomial  features: PolynomialFeatures
  11. Dimensional reduction:
    1. sklearn.decomposition.PCA
    2. sklearn.random_projection: GaussianRandomProjection, SparseRandomProjection  -- reduce dimensions
    3. FeatureAgglomeration
  12. Kernel approximation (explicit feature maps that approximate kernel functions)
    1. RBFSampler
    2. AdditiveChi2Sampler
    3. SkewedChi2Sampler
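A minimal sketch for item 1 (DictVectorizer), using toy data; note how the string feature becomes three one-hot columns while the numeric one passes through:

```python
from sklearn.feature_extraction import DictVectorizer

# Lists of feature dicts; "city" is categorical, "temperature" is numeric.
measurements = [
    {"city": "Dubai", "temperature": 33.0},
    {"city": "London", "temperature": 12.0},
    {"city": "San Francisco", "temperature": 18.0},
]

vec = DictVectorizer(sparse=False)
X = vec.fit_transform(measurements)
# Columns: city=Dubai, city=London, city=San Francisco, temperature
print(vec.get_feature_names_out())
print(X)
```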
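For item 2 (StandardScaler), a sketch of fitting on training data and reusing the same fitted transformation on test data, as the note above suggests (persisting with e.g. joblib is left out):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1.0, -1.0], [2.0, 0.0], [0.0, 1.0]])

scaler = StandardScaler().fit(X_train)  # learns per-column mean_ and scale_
X_scaled = scaler.transform(X_train)    # zero mean, unit variance per column
print(X_scaled.mean(axis=0))            # ~[0, 0]
print(X_scaled.std(axis=0))             # ~[1, 1]

# The fitted object can be saved and reapplied to test data,
# so test samples get the exact same shift and scale.
X_test = np.array([[-1.0, 1.0]])
print(scaler.transform(X_test))
```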
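Items 3 and 4 side by side, as a sketch on the same toy matrix: MinMaxScaler squashes each column into feature_range, MaxAbsScaler divides by the per-column max absolute value, and RobustScaler uses median and IQR so outliers matter less:

```python
import numpy as np
from sklearn.preprocessing import MaxAbsScaler, MinMaxScaler, RobustScaler

X = np.array([[1.0, -1.0], [2.0, 0.0], [0.0, 1.0]])

# Each column mapped into feature_range (default (0, 1)).
X_minmax = MinMaxScaler(feature_range=(0, 1)).fit_transform(X)

# Each column divided by its max absolute value -> data in [-1, 1].
X_maxabs = MaxAbsScaler().fit_transform(X)

# Centered on the median, scaled by the interquartile range.
X_robust = RobustScaler(
    with_centering=True, with_scaling=True, quantile_range=(25.0, 75.0)
).fit_transform(X)

print(X_minmax)
print(X_maxabs)
print(X_robust)
```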
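Item 5 sketched on synthetic skewed data: the quantile function maps the empirical distribution onto uniform [0, 1]:

```python
import numpy as np
from sklearn.preprocessing import QuantileTransformer

rng = np.random.RandomState(0)
X_skewed = rng.exponential(size=(1000, 1))  # heavily right-skewed input

# Non-parametric map through the empirical quantile function -> uniform [0, 1].
qt = QuantileTransformer(
    n_quantiles=100, output_distribution="uniform", random_state=0
)
X_uniform = qt.fit_transform(X_skewed)
print(X_uniform.min(), X_uniform.max())  # close to 0 and 1
```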
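Item 6 sketched on strictly positive skewed data; Box-Cox requires positive input, whereas Yeo-Johnson works for any real values:

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.RandomState(0)
X_pos = rng.lognormal(size=(1000, 1))  # skewed, strictly positive

# Fits one power parameter (lambda) per feature to make the data more Gaussian;
# standardize=True additionally rescales to zero mean, unit variance.
pt = PowerTransformer(method="box-cox", standardize=True)
X_gauss = pt.fit_transform(X_pos)
print(pt.lambdas_)                      # fitted power parameter per feature
print(X_gauss.mean(), X_gauss.std())    # ~0 and ~1 because standardize=True
```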
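The three encoders of item 7 on toy data; note the division of labor: OrdinalEncoder and OneHotEncoder are for 2-D feature matrices, LabelEncoder for a 1-D target array:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, OrdinalEncoder

X = np.array([["red", "S"], ["blue", "M"], ["red", "L"]])

# Integer codes per column; categories are sorted alphabetically.
X_ord = OrdinalEncoder().fit_transform(X)

# For targets (y), not features.
y_enc = LabelEncoder().fit_transform(["cat", "dog", "cat"])

# Every category becomes its own binary column (sparse by default).
X_onehot = OneHotEncoder().fit_transform(X).toarray()

print(X_ord)
print(y_enc)
print(X_onehot)
```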
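Item 8 sketched: Binarizer thresholds to 0/1, KBinsDiscretizer cuts a continuous column into bins (ordinal bin indices with equal-width bins here):

```python
import numpy as np
from sklearn.preprocessing import Binarizer, KBinsDiscretizer

X = np.array([[-3.0], [0.0], [6.0]])

# Values strictly greater than threshold -> 1, else 0.
X_bin = Binarizer(threshold=0.0).fit_transform(X)

# Three equal-width bins over [-3, 6]: [-3, 0), [0, 3), [3, 6].
kb = KBinsDiscretizer(n_bins=3, encode="ordinal", strategy="uniform")
X_binned = kb.fit_transform(X)

print(X_bin)
print(X_binned)
```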
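Item 9 sketched: SimpleImputer fills NaNs with a per-column statistic, MissingIndicator records where values were missing (often concatenated as extra features):

```python
import numpy as np
from sklearn.impute import MissingIndicator, SimpleImputer

X = np.array([[1.0, 2.0], [np.nan, 3.0], [7.0, 6.0]])

# Replace NaNs with the column mean: the NaN becomes (1 + 7) / 2 = 4.
X_imp = SimpleImputer(strategy="mean").fit_transform(X)

# Boolean mask of where values were missing (columns with misses only, by default).
mask = MissingIndicator().fit_transform(X)

print(X_imp)
print(mask)
```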
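Item 10 sketched: a degree-2 expansion of two features yields the bias term, the originals, and all degree-2 products:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

X = np.array([[2.0, 3.0]])

# Output columns: [1, x1, x2, x1^2, x1*x2, x2^2]
X_poly = PolynomialFeatures(degree=2).fit_transform(X)
print(X_poly)  # [[1. 2. 3. 4. 6. 9.]]
```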
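The three dimensionality-reduction routes of item 11 side by side on random data, shown only at the shape level as a sketch:

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.decomposition import PCA
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(0)
X = rng.rand(50, 10)

# PCA: project onto the directions of maximal variance.
X_pca = PCA(n_components=3).fit_transform(X)

# Random projection: cheap reduction with distance-preservation guarantees.
X_rp = GaussianRandomProjection(n_components=3, random_state=0).fit_transform(X)

# FeatureAgglomeration: hierarchically cluster similar features and pool them.
X_fa = FeatureAgglomeration(n_clusters=3).fit_transform(X)

print(X_pca.shape, X_rp.shape, X_fa.shape)  # all (50, 3)
```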
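Item 12 sketched with RBFSampler: it builds an explicit random Fourier feature map approximating an RBF kernel, so a plain linear model on the mapped features can mimic a kernelized one; the chi-squared samplers follow the same fit/transform pattern:

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler

rng = np.random.RandomState(0)
X = rng.rand(20, 4)

# Monte Carlo approximation of an RBF kernel's feature map.
rbf = RBFSampler(gamma=1.0, n_components=100, random_state=0)
X_features = rbf.fit_transform(X)
print(X_features.shape)  # (20, 100)
```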


Reposted from blog.csdn.net/Emma_Love/article/details/84862577