PyTorch: the problem of low GPU utilization

The problem: GPU utilization is too low when training PyTorch inside Docker.

While training a model with PyTorch inside a Docker container, I recently hit this error: RuntimeError: DataLoader worker (pid 493) is killed by signal: Bus error. Details are lost due to multiprocessing. Rerunning with num_workers=0 may give better error trace.

It started when I noticed that GPU utilization during training was very low: the card was sitting idle about half the time, and at that rate training would have dragged on for months. My guess was that CPU-side data preprocessing was eating too much time, so I set the DataLoader's num_workers parameter to 8, and that is when the error above appeared.

A bit of googling turned up the cause: when num_workers is not 0, the DataLoader worker processes pass batches through shared memory, and the shared memory Docker gives a container by default is too small, hence the bus error. The fix is to add --shm-size=16G when creating the container (for example, docker run --shm-size=16G ...). With that in place, GPU utilization basically stays at 99% (it still fluctuates a little) and training is a lot faster.
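For reference, here is a minimal sketch of the DataLoader change; the dataset (torchvision's FakeData), batch size, and pin_memory setting are placeholders I've chosen for illustration, not values from the original training script.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder dataset; substitute the real training dataset here.
train_set = datasets.FakeData(transform=transforms.ToTensor())

# num_workers > 0 moves preprocessing into background processes so the GPU is
# not starved for batches. These workers exchange tensors through shared
# memory (/dev/shm), which is why the container needs a larger --shm-size.
train_loader = DataLoader(
    train_set,
    batch_size=64,       # assumed value for illustration
    shuffle=True,
    num_workers=8,       # the setting that removed the CPU preprocessing bottleneck
    pin_memory=True,     # optional: speeds up host-to-GPU copies
)

for images, labels in train_loader:
    images = images.cuda(non_blocking=True)
    labels = labels.cuda(non_blocking=True)
    # ... forward / backward pass ...
```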



Origin blog.csdn.net/ogzhen/article/details/103977490