Baidu's Paddle + AI Studio is a good deep learning platform, but while using it I found that fixing the random seed seems to have no effect.
In PyTorch, the random seed can be fixed. With every other setting unchanged, the training process is then the same no matter how many times it is run: for example, the loss and accuracy of each epoch match those of the previous runs.
Here are some links on the subject:
https://aistudio.baidu.com/paddle/forum/topic/show/987738
https://aistudio.baidu.com/paddle/forum/topic/show/990814
Experiment 1: CPU can be aligned
Paddle provides paddle.seed(seed) to set the random seed of PaddlePaddle. In my experiments, training on the CPU was reproducible.
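A minimal sketch of fixing the seeds in one place is shown here. The helper name set_seed is hypothetical, and seeding Python's random module and NumPy alongside paddle.seed is my assumption about what a typical training script needs, not something the experiments above depend on. The imports are guarded so the snippet also runs where NumPy or Paddle is not installed.

```python
import random


def set_seed(seed):
    """Seed every random number generator that is available (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is optional for this sketch
    try:
        import paddle
        paddle.seed(seed)  # the call discussed in the text
    except ImportError:
        pass  # Paddle is not installed; only the other generators are seeded


# Re-seeding with the same value should reproduce the same random draw.
set_seed(42)
first = random.random()
set_seed(42)
second = random.random()
print(first == second)
```

Calling set_seed once at the very top of the main file, before the model and data loaders are created, keeps the seeding in a single place.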
This at least shows that my code is not at fault and that nothing else, such as a difference in the input, is causing the problem.
But when the device is set to GPU, the runs no longer align.
Experiment 2: Test on non-aistudio platform
This time I tested on a 3090, with the random seed fixed via paddle.seed(seed), and at first the runs appeared to align. That conclusion was wrong! The number of epochs had been set too small; with a slightly larger number the runs still failed to align.
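Since the mismatch only showed up after more epochs, it helps to compare the per-epoch values of two runs and report where they first differ. The following is only a sketch: first_divergence is a hypothetical helper and the loss values are made up for illustration, not measurements from these experiments.

```python
def first_divergence(run_a, run_b):
    """Return the index of the first epoch whose values differ, or None if all match."""
    for epoch, (a, b) in enumerate(zip(run_a, run_b)):
        if a != b:  # exact comparison: aligned runs should match exactly
            return epoch
    return None


# Made-up per-epoch losses from two runs with the same seed.
losses_a = [0.91, 0.52, 0.33, 0.21]
losses_b = [0.91, 0.52, 0.34, 0.20]

print(first_divergence(losses_a, losses_b))  # 2
```

A result of None over a short run is not proof of alignment, which is exactly the trap described above, so the comparison is worth repeating with more epochs.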
Ultimate solution
As things stand, Paddle runs align under a fixed random seed only on the CPU, not on the GPU.
According to the analysis, cuDNN on the GPU makes the convolution operator nondeterministic, so the deterministic convolution operator can be enabled:
export FLAGS_cudnn_deterministic=True
In nondeterministic mode, cuDNN searches for the fastest algorithm for each computation. Once the algorithm is forced to be deterministic, training may become slower, and how much slower is unclear.
First, set the environment variable in the terminal:
aistudio@jupyter-368487-6009734:~$ export FLAGS_cudnn_deterministic=True
Add a check in Python in case the environment variable gets lost
Add the following code at the beginning of the main file. It checks whether FLAGS_cudnn_deterministic is present in the environment variables. If it is not, an exception is raised; in that case, just set the environment variable in the terminal again.
import os
# Fail fast if the flag is missing; the assertion message says how to set it.
assert os.environ.get('FLAGS_cudnn_deterministic'), "Please run: export FLAGS_cudnn_deterministic=True"
print("Environment variable FLAGS_cudnn_deterministic is set")
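If exporting the variable again in every new terminal is inconvenient, one possible alternative is to set it from Python before Paddle is imported. This is only a sketch, and it rests on an assumption I have not verified here: that Paddle reads FLAGS_* environment variables when it is first imported. The function name enable_cudnn_determinism is hypothetical.

```python
import os


def enable_cudnn_determinism():
    """Set the flag in this process's environment (hypothetical helper)."""
    # Assumption: Paddle picks up FLAGS_* variables at import time,
    # so this must run before the first `import paddle`.
    os.environ["FLAGS_cudnn_deterministic"] = "True"
    return os.environ["FLAGS_cudnn_deterministic"]


enable_cudnn_determinism()
# import paddle  # import Paddle only after the flag has been set
```

The value set this way applies only to the current Python process and its children, so the terminal export is still needed for scripts that do not call the helper.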