Python data normalization, standardization, regularization (machine learning)


✌ Data normalization, standardization, and regularization

1. ✌ Normalization

Normalization scales the data to the interval [0, 1] using the formula (x - min) / (max - min), where min and max are the minimum and maximum of each feature (column).
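
To make the formula concrete, here is a minimal NumPy sketch of column-wise min-max scaling. The function name `min_max_scale` and the sample array are made up for illustration, and mapping constant columns to 0 is an assumption of this sketch, not something the formula itself specifies.

```python
import numpy as np

def min_max_scale(values):
    """Scale each column of a 2-D array to the [0, 1] interval."""
    values = np.asarray(values, dtype=float)
    col_min = values.min(axis=0)
    col_max = values.max(axis=0)
    span = col_max - col_min
    # Assumption: a constant column has span 0; use 1 instead so it maps to 0
    span[span == 0] = 1.0
    return (values - col_min) / span

# Hypothetical sample data: 3 rows, 2 features
data = np.array([[1.0, 200.0],
                 [5.0, 400.0],
                 [9.0, 1000.0]])
scaled = min_max_scale(data)
print(scaled)
```

After scaling, the smallest value in each column becomes 0 and the largest becomes 1.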

2. ✌ Standardization

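Standardization rescales each feature to have a mean of 0 and a variance of 1, using the formula (x - mean) / std. It does not change the shape of the distribution: the result is a standard normal distribution only if the original data were already normally distributed.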
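
As a rough sketch of what this means in practice, the snippet below standardizes each column by hand. The helper name `standardize` and the sample values are illustrative only, and the use of the population standard deviation (ddof=0) is an assumption chosen for this example.

```python
import numpy as np

def standardize(values):
    """Rescale each column to mean 0 and variance 1 (population variance)."""
    values = np.asarray(values, dtype=float)
    mean = values.mean(axis=0)
    std = values.std(axis=0)  # ddof=0: population standard deviation
    # Assumption: constant columns are left at 0 instead of dividing by zero
    std[std == 0] = 1.0
    return (values - mean) / std

# Hypothetical sample data: 4 rows, 2 features on different scales
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0],
                 [4.0, 40.0]])
z = standardize(data)
print(z.mean(axis=0))  # close to 0 for every column
print(z.var(axis=0))   # close to 1 for every column
```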

3. ✌ Regularization

The main purpose of regularization is to prevent overfitting. Adding a regularization term to the model limits its complexity, trading off model complexity against how closely the model fits the training data.

The most commonly used methods are L1 regularization and L2 regularization. Both can be regarded as penalty terms added to the loss function. The "penalty" restricts the size of the model's parameters: L1 penalizes the sum of their absolute values, and L2 penalizes the sum of their squares.
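
The idea of a penalty term can be sketched in a few lines. This is only an illustration, assuming a mean-squared-error loss and a penalty strength called `alpha`; the function `penalized_loss` and the sample numbers are hypothetical and not tied to any particular library.

```python
import numpy as np

def penalized_loss(y_true, y_pred, weights, alpha=0.1, kind="l2"):
    """Mean squared error plus an L1 or L2 penalty on the weights."""
    mse = np.mean((y_true - y_pred) ** 2)
    if kind == "l1":
        penalty = np.sum(np.abs(weights))  # sum of absolute values
    elif kind == "l2":
        penalty = np.sum(weights ** 2)     # sum of squares
    else:
        raise ValueError("kind must be 'l1' or 'l2'")
    return mse + alpha * penalty

# Hypothetical values: perfect predictions, so only the penalty contributes
weights = np.array([3.0, -4.0])
y_true = np.array([1.0, 2.0])
y_pred = np.array([1.0, 2.0])

l1_loss = penalized_loss(y_true, y_pred, weights, kind="l1")
l2_loss = penalized_loss(y_true, y_pred, weights, kind="l2")
print(l1_loss, l2_loss)
```

Because the predictions here match the targets exactly, any remaining loss comes from the penalty alone, which makes it easy to see that larger weights cost more.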

4. ✌ Code test

4.1 ✌ Import libraries

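import pandas as pd
import numpy as np
# display() is built in to Jupyter; importing it lets the code run as a script too
from IPython.display import display

from sklearn.preprocessing import MinMaxScaler
from sklearn.preprocessing import StandardScaler
from sklearn.preprocessing import Normalizer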

4.2 ✌ Create data

# 10,000 rows x 5 columns of random integers from 1 to 999 (the upper bound is exclusive)
x = np.random.randint(1, 1000, (10000, 5))
x = pd.DataFrame(x)

4.3 ✌ View the mean and variance of the original data

display(x.mean())
display(x.var())

4.4 ✌ Normalization

from sklearn.preprocessing import MinMaxScaler
# Scale each column to the [0, 1] interval
x_min = MinMaxScaler().fit_transform(x)
x_min = pd.DataFrame(x_min)
display(x_min.mean())
display(x_min.var())

4.5 ✌ Standardization

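from sklearn.preprocessing import StandardScaler
# Rescale each column to mean 0 and variance 1
x_std = StandardScaler().fit_transform(x)
x_std = pd.DataFrame(x_std)
display(x_std.mean())
# pandas var() uses ddof=1 by default, so the values are very slightly above 1
display(x_std.var())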
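
One detail worth checking is why the printed variance is not exactly 1. A likely explanation, sketched below, is that StandardScaler divides by the population standard deviation (ddof=0), while pandas `var()` defaults to the sample variance (ddof=1). The names `x_demo`, `pop_var`, and `sample_var` and the seed are made up for this example.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical smaller dataset with a fixed seed for reproducibility
rng = np.random.default_rng(0)
x_demo = pd.DataFrame(rng.integers(1, 1000, size=(1000, 5)))

x_demo_std = pd.DataFrame(StandardScaler().fit_transform(x_demo))

pop_var = x_demo_std.var(ddof=0)  # population variance: matches the scaler
sample_var = x_demo_std.var()     # pandas default ddof=1: n / (n - 1) times larger
print(pop_var)
print(sample_var)
```

With n rows, the two results differ by a factor of n / (n - 1), which is negligible for large datasets.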

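4.6 ✌ Regularization

Note: Normalizer does not add an L1 or L2 penalty to a loss function as described in section 3. It rescales each sample (row) so that its norm equals 1, using the L2 norm by default.

from sklearn.preprocessing import Normalizer
# Scale each row (not each column) to unit L2 norm
x_nor = Normalizer().fit_transform(x)
x_nor = pd.DataFrame(x_nor)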
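
To see what Normalizer actually does to the data, here is a small sketch on a hand-made array, so the result can be checked by eye. The arrays `rows` and `unit_rows` are illustrative only, and `norm="l2"` is passed explicitly even though it is the default.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

# Hypothetical sample data: each row is one sample
rows = np.array([[3.0, 4.0],
                 [6.0, 8.0],
                 [1.0, 0.0]])

# Each row is divided by its own L2 norm
unit_rows = Normalizer(norm="l2").fit_transform(rows)
row_norms = np.linalg.norm(unit_rows, axis=1)
print(unit_rows)
print(row_norms)
```

Rows that point in the same direction, such as the first two here, end up identical, because only the direction of each sample is kept and its length is discarded.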

Origin blog.csdn.net/m0_47256162/article/details/113791082