Experimental verification of BatchNorm2d in PyTorch

BatchNorm2d

BatchNorm2d applies batch normalization to a mini-batch of 2D feature maps, where the mean is the mean of the current batch and the std is the standard deviation of the current batch. Batch normalization maps data with different value ranges onto a standard normal distribution, reducing the spread between samples and helping the model converge quickly. It essentially reduces the absolute differences between samples without changing their relative relationships: when normalizing [1,2,3,4], for example, the magnitudes of the numbers change but the ordering between them does not. It is generally recommended to place a batch normalization layer directly after a convolution layer, as in the sketch below.
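A minimal sketch of that conv-then-BatchNorm pattern (the channel counts and input size here are illustrative assumptions, not taken from this post):

import torch
import torch.nn as nn

# A small convolutional block: convolution, then batch normalization, then activation.
# num_features of BatchNorm2d must equal the out_channels of the preceding convolution.
block = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1),
    nn.BatchNorm2d(num_features=16),
    nn.ReLU(),
)

x = torch.randn(8, 3, 32, 32)  # a batch of 8 three-channel 32x32 inputs
print(block(x).shape)          # torch.Size([8, 16, 32, 32])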

Official definition

  • Normalization formula
    y = \frac{x - E[x]}{\sqrt{Var[x] + \epsilon}} \times \gamma + \beta
    where E[x] and Var[x] are the mean and (biased) variance of the current batch, \epsilon is a small constant for numerical stability (1e-5 by default), and \gamma and \beta are learnable affine parameters.

  • Global mean estimate: running_mean; global variance estimate: running_var
    x_{new} = (1 - momentum) \times x_{old} + momentum \times x_{t}
    Here x_{new} is the updated running_mean/running_var, x_{old} is the running_mean/running_var before the update, x_{t} is the mean/variance of the current batch (for the variance, PyTorch stores the unbiased estimate), and momentum is the weighting factor, 0.1 by default. A worked example follows this list.

  • BatchNorm2d
    batchnorm = torch.nn.BatchNorm2d(num_features=number_of_channels)
    num_features must equal the number of channels of the input; it is generally not recommended to change the other parameters from their PyTorch defaults.
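A worked example of the update formula above (assuming the default momentum of 0.1, the default initial values running_mean = 0 and running_var = 1, and the unbiased batch variance that PyTorch stores in the running estimate): a batch with mean 2.5 and unbiased variance 5/3, i.e. the [1,2,3,4] data used in the experiments below, gives

running_mean = 0.9 × 0 + 0.1 × 2.5 = 0.25
running_var = 0.9 × 1 + 0.1 × (5/3) ≈ 1.0667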

Experimental verification of BatchNorm2d

  • Validation of the normalization formula
import torch
import torch.nn as nn
# input with shape (N, C, H, W) = (1, 1, 2, 2)
data = torch.tensor(
    [[[[1, 2],
       [3, 4]]]], dtype=torch.float32
)
batchnorm=nn.BatchNorm2d(num_features=1,momentum=0.1)
print('------------1--------------')
print("running_mean and running_var in the initial state")
print(batchnorm.running_mean)
print(batchnorm.running_var)
print('------------2--------------')
print("running_mean and running_var after feeding data")
test=batchnorm(data)
print(batchnorm.running_mean)
print(batchnorm.running_var)
print('BatchNorm of data in training mode')
print(test)
print('Manually computed BatchNorm')
mean=torch.mean(data)
std=torch.var(data,False)  # biased variance (unbiased=False), as used for the normalization itself
print((data[0][0]-mean)/torch.sqrt(std+1e-5))  # 1e-5 is the default eps

Conclusion: in training mode, the mean and std used for normalization are those of the current batch.
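A quick sanity check of this (an added sketch, assuming the tensors from the code above are still in scope):

# compare the training-mode BatchNorm2d output with the manual computation
manual = (data - mean) / torch.sqrt(std + 1e-5)
print(torch.allclose(test, manual))  # expected: True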

  • Verification of the running_mean and running_var update formula
print('------------3--------------')
print("Manually computed running_mean, running_var")
running_mean=torch.tensor(0.)
running_var=torch.tensor(1.)
# 0.9 = 1 - momentum, 0.1 = momentum
running_mean=0.9*running_mean+0.1*mean
# the running variance is updated with the unbiased batch variance (torch.var with
# unbiased=True, the default), not the biased variance used for the normalization
running_var=0.9*running_var+0.1*torch.var(data)
print(running_mean)
print(running_var)

print('BatchNorm of data in eval mode')
batchnorm.training=False  # equivalent to batchnorm.eval()
test=batchnorm(data)
print(test)
#Conclusions:
#running_mean=(1-momentum)*running_mean+momentum*batch_mean
#running_var=(1-momentum)*running_var+momentum*batch_var   (batch_var is the unbiased batch variance)

running_mean and running_var only affect evaluation; they have no effect on training. In eval mode, the data is normalized with running_mean and running_var, as the sketch below confirms.
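A minimal sketch of that (an addition, assuming the batchnorm layer and data from the experiment above are still in scope; the .view(1, -1, 1, 1) reshape just makes the per-channel statistics broadcast correctly, and the default affine parameters gamma=1, beta=0 are omitted):

batchnorm.eval()  # use the running statistics instead of the batch statistics
manual = (data - batchnorm.running_mean.view(1, -1, 1, 1)) / torch.sqrt(
    batchnorm.running_var.view(1, -1, 1, 1) + batchnorm.eps)
print(torch.allclose(batchnorm(data), manual))  # expected: True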

  • The effect of setting track_running_stats=False
print('------------4--------------')
print('running_mean and running_var before feeding data, with track_running_stats=False')
batchnorm=nn.BatchNorm2d(num_features=1,momentum=0.1,track_running_stats=False)
print(batchnorm.running_mean)  # None: no running statistics are kept
print(batchnorm.running_var)   # None
print('------------5--------------')
print('running_mean and running_var after feeding data, with track_running_stats=False')
test=batchnorm(data)
print(batchnorm.running_mean)  # still None
print(batchnorm.running_var)   # still None
print('------------6--------------')
print('BatchNorm of data in training mode, with track_running_stats=False')
print(test)
print('------------7--------------')
print('BatchNorm of data in eval mode, with track_running_stats=False')
batchnorm.training=False  # equivalent to batchnorm.eval()
test=batchnorm(data)
print(test)
#Conclusions:
#running_mean and running_var are what is used to normalize data at test time; when track_running_stats is False,
#no running statistics are kept, so at test time the data is normalized with its own batch mean and std instead.
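One more quick check (an added sketch, not part of the original experiments): with track_running_stats=False, the training-mode and eval-mode outputs are identical, because both use the statistics of the current batch.

# with no running statistics, eval mode falls back to the batch statistics
bn = nn.BatchNorm2d(num_features=1, track_running_stats=False)
out_train = bn(data)  # training mode: normalized with the batch statistics
bn.eval()
out_eval = bn(data)   # eval mode: still normalized with the batch statistics
print(torch.allclose(out_train, out_eval))  # expected: True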

It is therefore not recommended to set track_running_stats to False: without the running statistics, the behaviour at test time depends on the statistics of each individual test batch.

Origin blog.csdn.net/qq_33880925/article/details/130244586