[Error] Errors encountered in YOLO training

Table of contents

 1.RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

 1.RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

RuntimeError: Unable to find a valid cuDNN algorithm to run convolution

The usual cause of this error is that GPU memory usage is too high, so cuDNN cannot allocate the workspace it needs for the convolution.

The fix is to reduce the batch size. A multiple of 8 is generally recommended; if memory is still insufficient, lower it further.

parser.add_argument('--batch-size', type=int, default=16, help='total batch size for all GPUs, -1 for autobatch')

After lowering the batch size, training runs normally.
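The manual "lower it and retry" procedure can be sketched in Python as follows. This is only an illustration, not part of YOLOv5: find_working_batch, next_smaller_batch, and the run_step callback are hypothetical names, and run_step is assumed to raise RuntimeError when the batch does not fit on the GPU.

```python
def next_smaller_batch(batch_size, multiple=8):
    """Halve batch_size and round down to a multiple of `multiple` (never below it)."""
    halved = batch_size // 2
    return max(multiple, (halved // multiple) * multiple)


def find_working_batch(run_step, start=64, multiple=8):
    """Try run_step(batch_size), shrinking the batch size after memory-related failures.

    run_step is a hypothetical callback that runs one training step
    and raises RuntimeError if the GPU cannot handle the batch.
    """
    batch_size = start
    while True:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            message = str(err)
            # Only retry on errors that look memory-related; re-raise anything else.
            if "cuDNN" not in message and "out of memory" not in message:
                raise
            # Give up once the smallest allowed batch size has also failed.
            if batch_size <= multiple:
                raise
            batch_size = next_smaller_batch(batch_size, multiple)
```

The value found this way can then be passed as --batch-size. As the help text above notes, passing -1 asks YOLOv5 to choose the batch size automatically.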

With YOLOv5, the s and n models have relatively small depth and channel counts, so the GPU can handle a somewhat larger batch size, but the same batch size may not run with the l and x models. Use the nvidia-smi command to check GPU memory usage and adjust the batch size accordingly, so that each training run takes as little time as possible.
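To read the same numbers from a script instead of watching nvidia-smi by hand, a minimal sketch is shown below. It assumes nvidia-smi is on the PATH and uses its CSV query mode; the helper names parse_gpu_memory and read_gpu_memory are made up for this example.

```python
import subprocess

# Ask nvidia-smi for used and total memory per GPU, in MiB, as plain CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(text):
    """Parse lines of the form 'used, total' into a list of (used, total) integer pairs."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        rows.append((used, total))
    return rows


def read_gpu_memory():
    """Run nvidia-smi and return one (used_mib, total_mib) pair per GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)


# Example use on a machine with an NVIDIA GPU:
#   for used, total in read_gpu_memory():
#       print(f"{used} / {total} MiB ({used / total:.0%})")
```

If usage stays well below the total during training, the batch size can probably be raised; if it sits near the total, lower it before switching to a larger model.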


Origin blog.csdn.net/WakingStone/article/details/129466995