Table of contents
1. Locating detailed error information
2. Possible error 1: num_class setting is incorrect
3. Possible error 2: the model output size is wrong
Error:
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
Solution steps:
1. Locating detailed error information
This error message is vague: because CUDA kernels run asynchronously, the stack trace it comes with may not point to the line that actually failed. To get an accurate location, force kernel launches to run synchronously by adding the following two lines at the very beginning of the code, before any CUDA work is done:
import os
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'
With this setting, the stack trace points to the line that actually triggered the error, so you can fix the real cause instead of guessing.
2. Possible error 1: num_class setting is incorrect
Check whether any label value in your data set falls outside the range 0 to num_class - 1. For example, for a data set with num_class=3, the only valid labels are 0, 1 and 2; a label of 3 or higher is out of range and can trigger this error.
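A quick way to run this check is to scan the labels before training. The sketch below is only an illustration: the helper name find_invalid_labels and the sample labels are made up, and it assumes the labels can be iterated as plain integers (for a tensor, something like labels.flatten().tolist() would be needed first).

```python
def find_invalid_labels(labels, num_class):
    """Return the sorted distinct label values outside 0..num_class-1."""
    return sorted({int(v) for v in labels if not 0 <= int(v) < num_class})


# Made-up labels for illustration; 3 is out of range when num_class=3.
labels = [0, 2, 1, 3, 0, 3]
print(find_invalid_labels(labels, num_class=3))  # prints [3]
```

An empty list means every label is in range. If your loss function uses a dedicated ignore value (for example ignore_index in PyTorch's CrossEntropyLoss), filter that value out before running the check.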
3. Possible error 2: the model output size is wrong
The number of input channels of the model is usually 3 and does not change, but the number of output channels must equal num_class. This is the easiest setting to forget, and it is exactly the mistake I made.
Remember to change the out_channel of the model to num_class!
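One way to catch this early is to compare the channel dimension of the model output with num_class before computing the loss. The sketch below is an assumption-laden illustration: check_output_channels is a hypothetical helper, the shape is invented, and it assumes the output is laid out as (batch, channels, ...), so the channel count sits at index 1. With a PyTorch tensor, the shape could be passed as tuple(output.shape).

```python
def check_output_channels(output_shape, num_class, channel_dim=1):
    """Raise ValueError if the channel dimension of the output is not num_class."""
    channels = output_shape[channel_dim]
    if channels != num_class:
        raise ValueError(
            f"model outputs {channels} channels but num_class is {num_class}"
        )
    return True


# Hypothetical output shape: (batch, channels, height, width).
check_output_channels((4, 3, 256, 256), num_class=3)
```

Running this once on a single batch gives a readable error message on the CPU side instead of a device-side assert.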