Introduction to sklearn

scikit-learn (imported as sklearn) is a library widely used in data mining and machine learning

It contains most of the traditional machine learning methods

It started in 2007 as a Google Summer of Code project

It is written in Python

It is built on the NumPy, SciPy, and matplotlib toolkits

It provides six main groups of functionality:

Classification

Includes support vector classification (SVC), nearest neighbors, decision trees, random forests, etc.
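
As a rough illustration of the classifiers listed above, the sketch below fits each one on the Iris dataset bundled with scikit-learn and reports test accuracy. The dataset, the 75/25 split, and the hyperparameters are assumptions chosen for the example, not recommendations from the original post.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Illustrative settings only; real projects should tune these.
models = {
    "SVC": SVC(),
    "nearest neighbors": KNeighborsClassifier(n_neighbors=5),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                 # learn from the training split
    scores[name] = model.score(X_test, y_test)  # mean accuracy on held-out data
    print(f"{name}: {scores[name]:.3f}")
```

All four estimators share the same fit / score interface, which is why they can be swapped inside one loop.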

Regression

Includes linear regression, polynomial regression, support vector regression (SVR), ridge regression, lasso regression, etc.
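
The regressors listed above can be tried side by side in a few lines. The sketch below uses synthetic data generated from y = 3x + 2 plus noise; the data, the regularization strengths, and the polynomial degree are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR

# Synthetic data (assumption): y = 3x + 2 with Gaussian noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(scale=0.5, size=200)

models = {
    "linear": LinearRegression(),
    # Polynomial regression = polynomial feature expansion + linear model.
    "polynomial (degree 2)": make_pipeline(PolynomialFeatures(degree=2), LinearRegression()),
    "SVR": SVR(),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
}

r2 = {}
for name, model in models.items():
    model.fit(X, y)
    r2[name] = model.score(X, y)  # R^2 on the training data
    print(f"{name}: R^2 = {r2[name]:.3f}")

# The plain linear model should recover a slope close to 3.
print("estimated slope:", models["linear"].coef_[0])
```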

Clustering

Includes k-means, spectral clustering, mean-shift, and other methods
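
A minimal sketch of the three clustering methods named above, run on synthetic blobs. The blob centers, their spread, and the choice of three clusters are assumptions made for the example; the adjusted Rand index is used only to compare the k-means result against the known generating labels.

```python
import numpy as np
from sklearn.cluster import KMeans, MeanShift, SpectralClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score

# Three well-separated 2-D blobs (assumed toy data).
X, true_labels = make_blobs(
    n_samples=150,
    centers=[[0, 0], [4, 4], [-4, 4]],
    cluster_std=0.6,
    random_state=0,
)

# k-means and spectral clustering need the number of clusters up front.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
spectral_labels = SpectralClustering(n_clusters=3, random_state=0).fit_predict(X)

# Mean-shift estimates the number of clusters from the data density.
meanshift_labels = MeanShift().fit_predict(X)

kmeans_ari = adjusted_rand_score(true_labels, kmeans_labels)
print("k-means adjusted Rand index:", round(kmeans_ari, 3))
print("clusters found by mean-shift:", len(np.unique(meanshift_labels)))
```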

Dimensionality reduction

Its purpose is to reduce the dimensionality of the sample vectors

For example, from 200 dimensions to 15 dimensions

Main algorithms: principal component analysis (PCA), independent component analysis (ICA), and other methods
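
The 200-to-15 reduction mentioned above can be sketched with PCA and FastICA (scikit-learn's ICA implementation). The data here is synthetic: 15 hidden non-Gaussian sources mixed into 200 observed features, which is an assumption made so that 15 components are enough to describe it.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.default_rng(0)

# Assumed toy data: 500 samples, 200 features driven by 15 hidden sources.
sources = rng.laplace(size=(500, 15))
mixing = rng.normal(size=(15, 200))
X = sources @ mixing + 0.01 * rng.normal(size=(500, 200))

# PCA keeps the 15 directions of largest variance.
pca = PCA(n_components=15)
X_pca = pca.fit_transform(X)
explained = pca.explained_variance_ratio_.sum()

# ICA looks for 15 statistically independent components instead.
ica = FastICA(n_components=15, random_state=0, max_iter=500)
X_ica = ica.fit_transform(X)

print("original shape:", X.shape)
print("after PCA:", X_pca.shape, "variance kept:", round(explained, 4))
print("after ICA:", X_ica.shape)
```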

Model selection

Role: model evaluation, model selection, cross-validation, and parameter tuning (for example, grid search)
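
Cross-validation and grid search might look like the sketch below. The Iris dataset, the SVC estimator, and the parameter grid are assumptions chosen to keep the example small.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Model evaluation: 5-fold cross-validated accuracy of one fixed model.
cv_scores = cross_val_score(SVC(), X, y, cv=5)
print("fold accuracies:", cv_scores)

# Parameter tuning: try every combination in the grid, each scored by 5-fold CV.
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1, 1.0]}  # illustrative grid
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV accuracy:", round(search.best_score_, 3))
```

After fitting, search.best_estimator_ holds a model refit on all the data with the winning parameters.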

Preprocessing

Used for data normalization, standardization, mean removal, whitening, and binarization

In short, it prepares the raw data before it is passed to a learning algorithm
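
The preprocessing steps named above can be sketched as follows. The small matrix is made-up data, "normalization" is interpreted here as rescaling each feature to [0, 1] (an assumption, since the term is also used for scaling rows to unit norm), and whitening is shown through PCA(whiten=True) from sklearn.decomposition, since that is one common way to do it.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import Binarizer, MinMaxScaler, StandardScaler

# Made-up data: 3 samples, 3 features.
X = np.array([[1.0, -1.0, 2.0],
              [2.0, 0.0, 0.0],
              [0.0, 1.0, -1.0]])

# Normalization (assumed meaning): rescale each feature to the range [0, 1].
normalized = MinMaxScaler().fit_transform(X)

# Standardization: zero mean and unit variance per feature.
standardized = StandardScaler().fit_transform(X)

# Mean removal only: subtract the column means, keep the original scale.
centered = StandardScaler(with_std=False).fit_transform(X)

# Binarization: values above the threshold become 1, the rest 0.
binary = Binarizer(threshold=0.0).fit_transform(X)

# Whitening: decorrelate features and give each unit variance.
rng = np.random.default_rng(0)
correlated = rng.normal(size=(200, 3)) @ rng.normal(size=(3, 3))
whitened = PCA(whiten=True).fit_transform(correlated)
whitened_cov = np.cov(whitened, rowvar=False)

print(binary)
print(np.round(whitened_cov, 3))
```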

Origin blog.csdn.net/xtingjie/article/details/72331640