Python-multidimensional matrix adding Gaussian noise

Brief introduction

There are two steps in total: ① create a multidimensional matrix to which Gaussian noise will be added; ② define the function that adds Gaussian noise, in which the noise is generated and added to the original matrix.

Step 1: Create a multidimensional matrix

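import numpy as np

if __name__ == '__main__':
    # Generate a 3-D float matrix that simulates 4 feature maps, each with 20 rows and 15 columns
    matrix = np.random.random(size=[4, 20, 15])
    # print(type(matrix)) # numpy.ndarray
    # print(type(matrix[0][0][0])) # numpy.float64

    # Convert to numpy.float32
    matrix_new = matrix.astype(np.float32)
    # print(type(matrix_new[0][0][0])) # numpy.float32

    print(">>>>>>>>>before adding gaussian noise")
    print(matrix_new[0][0])

    # Add Gaussian noise (method 1); gauss_noise is defined in Step 2.
    # It modifies its argument in place, so a copy is passed to keep matrix_new unchanged.
    gauss_noise(matrix_new.copy(), 0.2)

    # Add Gaussian noise (method 2); gauss_noise_matrix is defined in Step 2 and also works in place
    gauss_noise_matrix(matrix_new.copy(), 0.2)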

Knowledge points involved

1. Using numpy to create a multidimensional random matrix

The function np.random.random() generates a matrix whose elements are floats in the range $[0.0, 1.0)$. For usage details, see: numpy.random.random

The function np.random.randint() can also be used; it generates a matrix whose elements are integers, and the range can be customized. For usage details, see: numpy.random.randint
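As a small illustration of the two generators, the sketch below creates one float array and one integer array of the same shape. The shape and the integer range 0 to 255 are arbitrary choices made for this example.

```python
import numpy as np

# Float array: values drawn uniformly from [0.0, 1.0)
float_matrix = np.random.random(size=[4, 20, 15])

# Integer array: values drawn from [0, 256), i.e. 0..255 (range chosen for illustration)
int_matrix = np.random.randint(0, 256, size=[4, 20, 15])

print(float_matrix.shape, float_matrix.dtype)
print(int_matrix.shape, int_matrix.dtype)
```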

Create a matrix of floats or integers?

Gaussian noise follows the Gaussian distribution, whose density curve has a maximum value of $\frac{1}{\sqrt{2\pi} \sigma}$. When $\sigma = 1$, the maximum is about 0.3989; when $\sigma = 0.2$, it is about 1.9947. The maximum exceeds 2 only when $\sigma$ is very small (below about 0.2). Note that this is the height of the density curve, not the size of the noise: the smaller $\sigma$ is, the taller and narrower the curve, and the closer the samples stay to $\mu$.

The commonly used $\sigma$ values are relatively small (not greater than 1), and about 99.7% of the samples fall within $3\sigma$ of $\mu$, so the noise is on the same scale as values in $[0.0, 1.0)$ and negligible next to large integers. A matrix generated by np.random.random() is therefore more sensitive to the applied Gaussian noise, that is, the added noise interferes more with the original matrix, so the following examples use a matrix generated by np.random.random().
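The peak values quoted above can be checked with a few lines. This sketch evaluates the maximum of the density curve for two values of σ and then compares the noise scale with the typical element size of a float matrix and an integer matrix. The helper name peak_density and the integer range are assumptions made for the example.

```python
import math
import numpy as np

def peak_density(sigma):
    """Maximum of the Gaussian density curve, reached at x = mu."""
    return 1.0 / (math.sqrt(2.0 * math.pi) * sigma)

print(round(peak_density(1.0), 4))  # 0.3989
print(round(peak_density(0.2), 4))  # 1.9947

# Noise with sigma = 0.2 relative to the typical element size
sigma = 0.2
float_matrix = np.random.random(size=[4, 20, 15])           # mean near 0.5
int_matrix = np.random.randint(100, 256, size=[4, 20, 15])  # hypothetical integer range

print(sigma / float_matrix.mean())  # noise is a large fraction of the values
print(sigma / int_matrix.mean())    # noise is a tiny fraction of the values
```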

2. View the data type of the variable

Use the type() function to view the data type of a variable. For example:

matrix = np.random.random(size=[4, 20, 15])
print(type(matrix)) # Output: <class 'numpy.ndarray'>, meaning matrix is an array created with numpy
print(type(matrix[0][0][0])) # Output: <class 'numpy.float64'>, meaning the elements of the array are float64
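For NumPy arrays, the dtype attribute reports the element type without indexing into the array. A brief sketch, offered as an alternative to calling type() on a single element:

```python
import numpy as np

matrix = np.random.random(size=[4, 20, 15])

print(type(matrix))   # <class 'numpy.ndarray'>
print(matrix.dtype)   # float64
print(matrix.shape)   # (4, 20, 15)
```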

3. Convert the data type of the variable from float64 to float32

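Use the astype() function to convert the data type of an array. For example:

matrix = np.random.random(size=[4, 20, 15])
matrix_new = matrix.astype(np.float32) # convert the elements of matrix from float64 to float32
# Note: ① pass np.float32 or the string 'float32'; a bare float32 (without the np prefix) raises a NameError; ② astype() returns a new array and leaves matrix unchanged, so assign the result to a variable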
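The behaviour of astype() can be confirmed with a short check. This sketch shows that both np.float32 and the string 'float32' are accepted, and that the original array keeps its type. The variable names from_type and from_string are chosen only for this example.

```python
import numpy as np

matrix = np.random.random(size=[4, 20, 15])

from_type = matrix.astype(np.float32)      # dtype given as a NumPy type
from_string = matrix.astype('float32')     # dtype given as a string

print(from_type.dtype, from_string.dtype)  # float32 float32
print(matrix.dtype)                        # float64: the original is not modified
```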

Why should the element data type be converted from float64 to float32?

In terms of memory, float32 takes less space: a float64 occupies 64 bits (8 bytes), while a float32 occupies 32 bits (4 bytes), so the same data stored as float32 takes half the space. When the amount of data is large, this saving is significant. When high precision is not required, using float32 can also improve computing efficiency during network training.
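The size difference can be read directly from the nbytes attribute of an array. A minimal sketch using the same matrix shape as in Step 1:

```python
import numpy as np

matrix = np.random.random(size=[4, 20, 15])   # float64 by default
matrix_new = matrix.astype(np.float32)

# 4 * 20 * 15 = 1200 elements
print(matrix.nbytes)      # 9600 bytes (8 bytes per element)
print(matrix_new.nbytes)  # 4800 bytes (4 bytes per element)

# The conversion rounds each value, so a small difference remains
print(np.abs(matrix - matrix_new).max())
```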

For the difference between float64 and float32, see: The essential difference between float32 and float64 (the impact of type on deep learning and the use of python)

Step 2: Define the function to add Gaussian noise

From the Gaussian probability density function $\frac{1}{\sqrt{2\pi}\sigma} e^{- \frac{(x - \mu)^2}{2\sigma^2}}$, Gaussian noise is controlled by two parameters: the mean $\mu$ and the standard deviation $\sigma$. The functions below use both: $\mu$ is fixed at 0 and $\sigma$ is passed in by the caller.

Method 1: Add Gaussian noise to the elements in the multidimensional matrix one by one

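import random

def gauss_noise(matrix, sigma):
    mu = 0 # first parameter of the Gaussian noise: the mean mu
    # The second parameter, sigma (standard deviation), is passed in by the caller

    # 1. Noise is added element by element, so get the size of the matrix first.
    #    The input is known in advance to be 3-D, so no special cases are checked:
    #    read the channel count, the number of rows (height) and the number of columns (width) directly.
    channel_size = len(matrix)
    height = len(matrix[0])
    width = len(matrix[0][0])
    # print("matrix_shape: channel_size: {}, height: {}, width: {}".format(channel_size, height, width))

    # 2. Traverse the elements of the matrix and add Gaussian noise to each one (modifies matrix in place)
    for channel in range(0, channel_size):
        for i in range(0, height):
            for j in range(0, width):
                matrix[channel][i][j] += random.gauss(mu, sigma)

    # 3. Print the matrix after adding noise
    print(">>>>>>>>>added gaussian noise with method 1")
    print(matrix[0][0]) # only the first row of the first channel is printed, for easier inspection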

Knowledge points involved

1. Get the size of the variable

Use the len() function, which returns the size of the first dimension. For example:

channel_size = len(matrix)
# The result is 4, meaning the first dimension of matrix has size 4
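For a NumPy array, the shape attribute gives the size of every dimension at once. A short sketch of the equivalent lookup, using the matrix shape from Step 1:

```python
import numpy as np

matrix = np.random.random(size=[4, 20, 15])

# len() on each nesting level, as in the function above
print(len(matrix), len(matrix[0]), len(matrix[0][0]))  # 4 20 15

# The same three sizes from the shape attribute
channel_size, height, width = matrix.shape
print(channel_size, height, width)  # 4 20 15
```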

2. Generate random floating point numbers with a Gaussian distribution

Use the random.gauss() function, which returns a random floating point number with a Gaussian distribution. For usage details, see: random.gauss() function in Python

Example:

add_gauss = random.gauss(mu, sigma)
# print("add_gauss: {}, type: {}".format(add_gauss, type(add_gauss))) # type: 'float'
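To see that the returned values really follow the requested distribution, one can draw many samples and compare their mean and standard deviation with mu and sigma. This is a rough sanity check rather than a statistical test; the sample count and the fixed seed are arbitrary choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable (an arbitrary choice)

mu, sigma = 0, 0.2
samples = [random.gauss(mu, sigma) for _ in range(10000)]

print(type(samples[0]))           # <class 'float'>
print(statistics.mean(samples))   # close to mu
print(statistics.stdev(samples))  # close to sigma
```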

Method 2: Define a Gaussian noise matrix that is as large as the multidimensional matrix, and add Gaussian noise directly to the multidimensional matrix

def gauss_noise_matrix(matrix, sigma):
    # 1. Define a Gaussian noise matrix with the same size as the multidimensional matrix
    mu = 0
    channel_size = len(matrix)
    height = len(matrix[0])
    width = len(matrix[0][0])
    noise_matrix = np.random.normal(mu, sigma, size=[channel_size, height, width]).astype(np.float32) # the element type is converted to float32 as the noise matrix is generated
    # print("noise_matrix_element_type: {}".format(type(noise_matrix[0][0][0]))) # numpy.float32
    print(noise_matrix[0][0]) # only the first row of the first channel is printed, for easier inspection

    # 2. Add it to the original matrix to apply the Gaussian noise (modifies matrix in place)
    matrix += noise_matrix

    # 3. Print the matrix after adding noise
    print(">>>>>>>>>added gaussian noise with method 2")
    print(matrix[0][0]) # only the first row of the first channel is printed, for easier inspection

Knowledge points involved

1. Generate a random matrix with Gaussian distribution characteristics

Use the np.random.normal() function, which returns an array of samples drawn from a Gaussian distribution; the size argument sets the shape of the array. For usage details, see: numpy.random.normal
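Because the size argument accepts any shape, the noise array can be built directly from matrix.shape without reading each dimension separately. The following is a minimal sketch of that idea, not the function defined above: the name add_gauss_noise, the seed parameter, and the choice to return a new array instead of modifying the input are all assumptions made for illustration. It uses np.random.default_rng, which is available in NumPy 1.17 and later.

```python
import numpy as np

def add_gauss_noise(matrix, sigma, mu=0.0, seed=None):
    """Return a noisy copy of `matrix`; works for any number of dimensions."""
    rng = np.random.default_rng(seed)
    # Noise array with the same shape and element type as the input
    noise = rng.normal(mu, sigma, size=matrix.shape).astype(matrix.dtype)
    return matrix + noise

matrix = np.random.random(size=[4, 20, 15]).astype(np.float32)
noisy = add_gauss_noise(matrix, 0.2, seed=0)

print(noisy.shape, noisy.dtype)  # (4, 20, 15) float32
print((noisy - matrix).std())    # close to 0.2
```

Returning a new array keeps the input available for comparison, at the cost of one extra allocation.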

Overall code

import numpy as np
import random

# Method 1: add noise element by element (modifies matrix in place)
def gauss_noise(matrix, sigma):
    mu = 0

    channel_size = len(matrix)
    height = len(matrix[0])
    width = len(matrix[0][0])
    # print("matrix_shape: channel_size: {}, height: {}, width: {}".format(channel_size, height, width))

    for channel in range(0, channel_size):
        for i in range(0, height):
            for j in range(0, width):
                matrix[channel][i][j] += random.gauss(mu, sigma)
    print(">>>>>>>>>added gaussian noise with method 1")
    print(matrix[0][0])
    # print(type(matrix[0][0][0])) # numpy.float32

# Method 2: add a noise matrix of the same size (modifies matrix in place)
def gauss_noise_matrix(matrix, sigma):
    mu = 0
    channel_size = len(matrix)
    height = len(matrix[0])
    width = len(matrix[0][0])
    noise_matrix = np.random.normal(mu, sigma, size=[channel_size, height, width]).astype(np.float32)
    # print("noise_matrix_element_type: {}".format(type(noise_matrix[0][0][0]))) # numpy.float32
    print(noise_matrix[0][0])

    matrix += noise_matrix
    print(">>>>>>>>>added gaussian noise with method 2")
    print(matrix[0][0])

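if __name__ == '__main__':
    # Generate a 3-D float matrix: 4 channels, each with 21 rows and 25 columns
    matrix = np.random.random(size=[4, 21, 25])
    # print(type(matrix)) # numpy.ndarray
    # print(type(matrix[0][0][0])) # numpy.float64

    # Convert to numpy.float32
    matrix_new = matrix.astype(np.float32)
    # print(type(matrix_new[0][0][0])) # numpy.float32

    print(">>>>>>>>>before adding gaussian noise")
    # Print the same slice that the two functions print, so the output is easy to compare
    print(matrix_new[0][0])

    # Add Gaussian noise (method 1); a copy is passed because the function works in place
    gauss_noise(matrix_new.copy(), 0.2)

    # Add Gaussian noise (method 2); a copy is passed for the same reason
    gauss_noise_matrix(matrix_new.copy(), 0.2)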
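Both functions in the listing above add the noise to the array they are given, so two successive calls on the same array stack two rounds of noise. The sketch below demonstrates the effect with a stand-in function and shows how passing a copy avoids it. The name add_noise_in_place is hypothetical and only mimics the in-place behaviour of the functions above.

```python
import numpy as np

def add_noise_in_place(matrix, sigma):
    """Stand-in for the functions above: adds noise to `matrix` itself."""
    matrix += np.random.normal(0, sigma, size=matrix.shape).astype(matrix.dtype)

original = np.random.random(size=[4, 20, 15]).astype(np.float32)
backup = original.copy()

add_noise_in_place(original.copy(), 0.2)   # only the temporary copy is changed
unchanged_after_copy = np.array_equal(original, backup)
print(unchanged_after_copy)                # True

add_noise_in_place(original, 0.2)          # now `original` itself is changed
print(np.array_equal(original, backup))    # False
```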