Under normal circumstances, GPU memory is released when the process that allocated it exits.
But if the process terminates abnormally, the memory may not be released. When that happens, nvidia-smi shows memory in use even though no processes are listed:
Mon Oct 19 16:00:00 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.104      Driver Version: 410.104      CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:0D.0 Off |                    0 |
| N/A   38C    P0    35W / 250W |  16239MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+
The solution, of course, is to kill the processes that are still holding the GPU memory.
To kill them, you first have to find them. Since they no longer show up in nvidia-smi, ask fuser which processes have the NVIDIA device files open:
fuser -v /dev/nvidia*
                     USER        PID ACCESS COMMAND
/dev/nvidia0:        root      26031 F...m  python
                     root      26035 F...m  python
                     root      26041 F...m  python
                     root      26050 F...m  python
                     root      32512 F...m  ZMQbg/1
/dev/nvidiactl:      root      26031 F...m  python
                     root      26035 F...m  python
                     root      26041 F...m  python
                     root      26050 F...m  python
                     root      32512 F....  ZMQbg/1
/dev/nvidia-uvm:     root      26031 F....  python
                     root      26035 F....  python
                     root      26041 F....  python
                     root      26050 F....  python
                     root      32512 F....  ZMQbg/1
Then kill each process, e.g. kill -9 26031, and the kernel reclaims its GPU memory as it dies. Every PID reported above has to be killed, one at a time.
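If there are many leftover PIDs, killing them one by one is tedious. A minimal shell shortcut (a sketch, assuming you run it as root; be careful, it kills every process that currently has the NVIDIA device files open, not just the dead job's workers):

# Kill every PID that fuser reports for the NVIDIA device files.
# fuser prints PIDs to stdout and the file names to stderr,
# so discarding stderr leaves just the PID list.
for pid in $(fuser /dev/nvidia* 2>/dev/null); do
    kill -9 "$pid"
done

# Equivalent shortcut: with -k, fuser sends the signal itself
# (SIGKILL by default).
fuser -k /dev/nvidia*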
Once they are gone, run nvidia-smi again; barring accidents, everything is back to normal: