When training a YOLO model on a server, the run is often interrupted for various reasons, such as a dropped connection or a shutdown. YOLO provides a resume parameter for breakpoint training, which means it can continue the previous training run from where it stopped.
Specific method
Modify the resume parameter:
parser.add_argument('--resume', nargs='?', const=True, default=True, help='resume most recent training')
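To make the effect of this argument definition concrete, here is a minimal standalone sketch using only the standard argparse module. The build_parser helper is hypothetical and exists only for illustration; it is not part of train.py. With nargs='?' and const=True, a bare --resume yields True, --resume followed by a path yields that path as a string, and omitting the flag yields the default. Setting default=True, as in the line above, therefore turns resuming on even when the flag is not passed.

```python
import argparse


def build_parser(default=False):
    """Build a parser with a --resume flag defined like the one in train.py.

    `default` is a parameter here (an assumption for illustration) so that
    both default=False and default=True can be compared.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('--resume', nargs='?', const=True, default=default,
                        help='resume most recent training')
    return parser


parser = build_parser(default=False)

# Flag omitted: the default value is used.
no_flag = parser.parse_args([]).resume

# Bare flag: the const value (True) is used.
bare_flag = parser.parse_args(['--resume']).resume

# Flag with a value: the value is kept as a string.
with_path = parser.parse_args(
    ['--resume', 'runs/train/exp8/weights/last.pt']).resume

# With default=True, resuming is requested even without the flag.
default_on = build_parser(default=True).parse_args([]).resume

print(no_flag, bare_flag, with_path, default_on)
```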
Modify the weights parameter. The original weight file was the pre-trained weights; now point it to the weights saved by the last training run (last.pt), as follows:
parser.add_argument('--weights', type=str, default='/home/ubuntu/conda/yolov7/runs/train/exp8/weights/last.pt', help='initial weights path')
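If it is not obvious which experiment folder holds the most recent checkpoint, a small helper can look it up. The following is only a sketch: find_latest_checkpoint and the runs/train default directory are assumptions for illustration, not functions or guarantees from the YOLO repository. It picks the last*.pt file with the newest modification time.

```python
from pathlib import Path


def find_latest_checkpoint(search_dir='runs/train'):
    """Return the most recently modified last*.pt under search_dir, or None.

    Hypothetical helper; the directory layout (expN/weights/last.pt) is
    assumed from the path shown above.
    """
    root = Path(search_dir)
    if not root.is_dir():
        return None
    candidates = list(root.glob('**/last*.pt'))
    if not candidates:
        return None
    # The newest file by modification time is taken as the latest run.
    return max(candidates, key=lambda p: p.stat().st_mtime)


latest = find_latest_checkpoint()
print(latest if latest is not None else 'no checkpoint found')
```

The returned path can then be used as the value of the weights parameter.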
At this point, running the training script reports an error:
File "/home/ubuntu/anaconda3/envs/python/lib/python3.8/site-packages/torch/serialization.py", line 1033, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: STACK_GLOBAL requires str
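This error is caused by the dataset cache, and the original cache file needs to be deleted (in YOLOv5 and YOLOv7 this is typically a .cache file stored next to the dataset's label folders).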
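Deleting the cache can be done by hand, or with a short script such as the sketch below. It assumes the cache files end in .cache and live somewhere under the dataset directory; remove_label_caches and the dataset path passed to it are hypothetical names used only for illustration. By default it only lists what it would delete, so the result can be checked before anything is removed.

```python
from pathlib import Path


def remove_label_caches(dataset_root, dry_run=True):
    """Find *.cache files under dataset_root and optionally delete them.

    Returns the list of matching paths. With dry_run=True (the default)
    nothing is deleted. The .cache suffix is an assumption about how the
    training code names its dataset cache files.
    """
    root = Path(dataset_root)
    if not root.is_dir():
        return []
    matches = sorted(root.rglob('*.cache'))
    if not dry_run:
        for cache_file in matches:
            cache_file.unlink()
    return matches


# Example: list cache files under a hypothetical dataset directory.
for path in remove_label_caches('datasets/my_dataset', dry_run=True):
    print('would delete', path)
```

Call it again with dry_run=False once the listed files look right. The cache is rebuilt automatically the next time the dataset is loaded.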
Then run train.py again, and training resumes normally.
The same approach works for YOLOv5.