Basics of Machine Learning: "Classification Algorithm (1)—sklearn Converter and Estimator"

1. Converter

1. What is a converter?
Recall the feature engineering steps from earlier:
(1) Instantiate a converter class (sklearn itself calls these transformers).
(2) Call fit_transform() to convert the data.

2. The feature engineering interface is what we call a converter. A converter can be called in three ways:
fit_transform()
fit()
transform()

3. Example
Take standardization as an example: x' = (x - mean) / std
Each feature value x has its column's mean subtracted from it and is then divided by that column's standard deviation.
First, fit() computes the mean and standard deviation of each column.
Second, transform() plugs the values computed in the first step into the formula to perform the conversion.
fit_transform() runs both steps in a single call.
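
The two steps can be sketched with sklearn's StandardScaler as follows. The toy array is made up for illustration, and the manual check at the end assumes the population standard deviation (ddof=0), which is what StandardScaler uses.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (made up for illustration): 3 samples, 2 feature columns.
x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

scaler = StandardScaler()

# Step 1: fit() computes and stores the per-column mean and standard deviation.
scaler.fit(x)
print(scaler.mean_)   # per-column means
print(scaler.scale_)  # per-column standard deviations

# Step 2: transform() applies (x - mean) / std using the stored values.
x_scaled = scaler.transform(x)

# fit_transform() performs both steps in one call.
x_scaled_once = StandardScaler().fit_transform(x)

# Manual check against the formula (population std, ddof=0).
x_manual = (x - x.mean(axis=0)) / x.std(axis=0)
print(np.allclose(x_scaled, x_manual))
```

Keeping fit() and transform() separate matters in practice: the statistics are usually fitted on the training set only and then reused to transform the test set.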

2. Estimator

1. What is an estimator?
Estimators play a central role in sklearn: they are the API through which algorithms are implemented.
Every machine learning algorithm in sklearn is encapsulated as an estimator.

2. Estimators for classification
(1) sklearn.neighbors: k-nearest neighbors (KNN)
(2) sklearn.naive_bayes: Naive Bayes
(3) sklearn.linear_model.LogisticRegression: logistic regression
(4) sklearn.tree: decision trees (random forests are in sklearn.ensemble)

3. Estimators used for regression
(1) sklearn.linear_model.LinearRegression: Linear regression
(2) sklearn.linear_model.Ridge: Ridge regression

4. Estimators for unsupervised learning
(1) sklearn.cluster.KMeans: clustering
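
As a rough illustration of the shared interface, the sketch below instantiates one estimator from each module listed above and checks that each exposes fit() and predict(). The specific classes chosen for the modules that are listed without a class name (KNeighborsClassifier, MultinomialNB, DecisionTreeClassifier) and all hyperparameter values are assumptions for the example, not choices made by the article.

```python
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier  # random forests live here
from sklearn.linear_model import LinearRegression, LogisticRegression, Ridge
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# One instance per algorithm family; hyperparameters are illustrative only.
estimators = [
    KNeighborsClassifier(n_neighbors=5),   # classification
    MultinomialNB(),                       # classification
    LogisticRegression(),                  # classification
    DecisionTreeClassifier(),              # classification
    RandomForestClassifier(),              # classification
    LinearRegression(),                    # regression
    Ridge(alpha=1.0),                      # regression
    KMeans(n_clusters=3),                  # unsupervised clustering
]

# Every estimator exposes the same basic methods.
for est in estimators:
    print(type(est).__name__, hasattr(est, "fit"), hasattr(est, "predict"))
```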

3. Estimator workflow

1. Instantiate an estimator

2. Call estimator.fit(x_train, y_train) to train the model.
Pass the feature values and target values of the training set to fit().
When the call returns, the model has been generated.
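
A minimal sketch of steps 1 and 2, assuming the iris dataset bundled with sklearn and a k-nearest neighbors classifier. The dataset, the split ratio, random_state and n_neighbors are all illustrative choices, not values given in the article.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset and split it into training and test sets.
iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=22
)

# Step 1: instantiate an estimator.
estimator = KNeighborsClassifier(n_neighbors=3)

# Step 2: fit() takes the training features and the training targets.
estimator.fit(x_train, y_train)

# After fit() returns, the trained model is held by the estimator object.
print(estimator.classes_)
```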

3. Model evaluation
(1) Directly compare the true values with the predicted values.
Note: x_test is the test set features, y_predict is the predicted result, and y_test is the test set target values.
y_predict = estimator.predict(x_test)
Compare: y_test == y_predict
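
The direct comparison might look like the sketch below. It again assumes the iris dataset and a KNeighborsClassifier purely for illustration; the comparison yields a boolean array with one entry per test sample.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative setup: dataset, split and hyperparameters are assumptions.
iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=22
)
estimator = KNeighborsClassifier(n_neighbors=3)
estimator.fit(x_train, y_train)

# Predict labels for the test set features.
y_predict = estimator.predict(x_test)

# Element-wise comparison: True where the prediction matches the true label.
matches = y_test == y_predict
print(matches)
print(matches.sum(), "of", len(y_test), "predictions are correct")
```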

(2) Calculate accuracy
accuracy = estimator.score(x_test, y_test)
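
For a classifier, score() returns the mean accuracy, so it should agree with the fraction of matching predictions from method (1). The sketch below checks this under the same illustrative assumptions (iris data, KNeighborsClassifier, arbitrary split). Note that for regressors such as LinearRegression, score() returns the R² coefficient instead of accuracy.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative setup: dataset, split and hyperparameters are assumptions.
iris = load_iris()
x_train, x_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=22
)
estimator = KNeighborsClassifier(n_neighbors=3)
estimator.fit(x_train, y_train)

# Method (2): let the estimator compute accuracy on the test set.
accuracy = estimator.score(x_test, y_test)

# Same quantity computed by hand from the direct comparison in method (1).
manual_accuracy = (estimator.predict(x_test) == y_test).mean()

print(accuracy, manual_accuracy)
```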

Origin blog.csdn.net/csj50/article/details/132302678