[Record] Computing the mean and variance of a batch of images

When training a neural network, the input images are usually normalized, which requires the mean and variance (or standard deviation) of the pixel values. For single-channel grayscale images, these statistics can be estimated from a sample of the training images rather than from the whole set.
Assume that the file names of the training images are stored one per line in figure_file.txt.

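import os
import numpy as np
from PIL import Image

pixel_sum = 0.0      # running sum of pixel values
pixel_sq_sum = 0.0   # running sum of squared pixel values
pixels_num = 0       # total number of sampled pixels
count = 0            # total number of images listed
count_add = 0        # number of images actually sampled
root_path = "$ROOT_PATH"      # placeholder: absolute path of the image directory
figure_file = "$FIGURE_FILE"  # placeholder: txt file that lists the image names

with open(figure_file, 'r') as ff:
    for line in ff:
        count += 1
        if count % 4 == 0:  # sample every 4th image
            filename = os.path.join(root_path, line.strip())
            # Cast to float64 first: squaring uint8 pixels directly would overflow.
            data = np.array(Image.open(filename)).astype(np.float64)
            h, w = data.shape[0], data.shape[1]
            pixels_num += h * w
            print(filename, h, w)

            pixel_sum += np.sum(data)
            pixel_sq_sum += np.sum(np.power(data, 2))
            count_add += 1

channel_mean = pixel_sum / pixels_num
# Variance = sum(x^2) / N - mean^2; the standard deviation is its square root.
channel_var = pixel_sq_sum / pixels_num - channel_mean * channel_mean
channel_std = np.sqrt(channel_var)
print(count, count_add, channel_mean, channel_var, channel_std)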
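
Once the statistics are known, normalizing an image amounts to subtracting the mean and dividing by the standard deviation. The following is only a minimal sketch of that step: the helper `normalize` and the 2x2 array `demo` are hypothetical names introduced for illustration, and the demo uses its own statistics, whereas in practice the dataset-wide values computed above would be passed in.

```python
import numpy as np

def normalize(image, mean, std):
    """Return (image - mean) / std as a float64 array (illustrative helper)."""
    return (np.asarray(image, dtype=np.float64) - mean) / std

# Synthetic 2x2 grayscale image standing in for a real training image.
demo = np.array([[0, 64], [128, 255]], dtype=np.uint8)

# For this demo only, use the image's own statistics.
demo_mean = demo.astype(np.float64).mean()
demo_std = demo.astype(np.float64).std()

normalized = normalize(demo, demo_mean, demo_std)
print(normalized.mean(), normalized.std())
```

After normalization with its own statistics, the demo array has a mean close to 0 and a standard deviation close to 1, up to floating-point error.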

The variance formula used in the code can be derived as follows, where N is the number of sampled pixels and μ is their mean, so that Σx = Nμ:

$$
\begin{aligned}
S^2 &= \frac{\Sigma(x-\mu)^2}{N} = \frac{\Sigma(x^2+\mu^2-2x\mu)}{N} \\
&= \frac{\Sigma x^2 + N\mu^2 - 2\mu\Sigma x}{N} \\
&= \frac{\Sigma x^2 + N\mu^2 - 2\mu N\mu}{N} \\
&= \frac{\Sigma x^2}{N} - \mu^2
\end{aligned}
$$
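
The identity can be checked numerically. The sketch below uses a synthetic random array (not real image data) and compares the one-pass form Σx²/N − μ² with the direct definition and with `np.var`. It also shows why 8-bit pixel data should be cast to a wider type before squaring: NumPy integer arithmetic in `uint8` wraps modulo 256.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 8-bit "image"; a stand-in for real data.
pixels = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)

x = pixels.astype(np.float64)  # widen before squaring
n = x.size
mean = x.sum() / n

var_one_pass = (x ** 2).sum() / n - mean ** 2  # sum(x^2)/N - mu^2
var_two_pass = ((x - mean) ** 2).sum() / n     # sum((x - mu)^2)/N

print(np.isclose(var_one_pass, var_two_pass))
print(np.isclose(var_one_pass, np.var(x)))

# Squaring while still in uint8 silently wraps: 200**2 = 40000, and 40000 % 256 = 64.
wrapped = np.power(np.array([200], dtype=np.uint8), 2)
print(wrapped.dtype, wrapped[0])
```

The one-pass form is convenient here because the sums can be accumulated image by image without keeping all pixels in memory.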

Origin blog.csdn.net/yaoyao_chen/article/details/128327254