The meaning of each field in the nvidia-smi output

nvidia-smi is used to view GPU usage. I often use this command to determine which GPUs are idle, but the output has confused me recently, so here I explain the meaning of each field in the table that nvidia-smi displays.

Here is the information for the Tesla K80 on our server. In the table:

- Fan: the fan speed, ranging from 0 to 100%. This is the speed the driver expects; if the fan is physically obstructed, the real rotation speed may differ. Some devices report N/A because they have no fan and are cooled externally (for example, our lab's servers sit in an air-conditioned room all year round).
- Temp: the GPU temperature in degrees Celsius.
- Perf: the performance state, from P0 to P12, where P0 is maximum performance and P12 is minimum performance.
- Persistence-M / Pwr:Usage/Cap: Persistence-M is the persistence-mode state. Persistence mode draws more power, but new GPU applications start faster; here it is shown as Off. Pwr below it shows the current power draw and the power cap.
- Bus-Id: the GPU's PCI bus address, in the form domain:bus:device.function.
- Disp.A: Display Active, whether a display is initialized on this GPU.
- Memory-Usage (below the fifth and sixth columns): video-memory usage, used versus total.
- Volatile GPU-Util (seventh column): the instantaneous GPU utilization.
- Uncorr. ECC (top of the eighth column): ECC error information.
- Compute M. (bottom of the eighth column): the compute mode.

The table below that lists the video memory occupied by each process.
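Beyond the default table, nvidia-smi can emit these same fields as CSV via `--query-gpu ... --format=csv`, which is easier to process in scripts. Below is a minimal sketch that parses such output; the sample string is hard-coded (with made-up numbers) so it runs without a GPU, and the exact field list is an assumption you would adapt to your needs.

```python
import csv
import io

# Hypothetical sample of what this command might print (numbers invented):
#   nvidia-smi --query-gpu=index,name,temperature.gpu,utilization.gpu,memory.used,memory.total \
#              --format=csv,noheader,nounits
sample = """\
0, Tesla K80, 41, 0, 11, 11441
1, Tesla K80, 62, 97, 10876, 11441
"""

def parse_gpus(text):
    """Parse the CSV rows into a list of dicts with typed fields."""
    gpus = []
    for row in csv.reader(io.StringIO(text)):
        idx, name, temp, util, mem_used, mem_total = [f.strip() for f in row]
        gpus.append({
            "index": int(idx),
            "name": name,
            "temp_c": int(temp),          # Temp column, degrees Celsius
            "util_pct": int(util),        # Volatile GPU-Util column
            "mem_used_mib": int(mem_used),   # Memory-Usage, used MiB
            "mem_total_mib": int(mem_total), # Memory-Usage, total MiB
        })
    return gpus

gpus = parse_gpus(sample)
for g in gpus:
    print(g["index"], g["util_pct"], g["mem_used_mib"])
```

In a real script you would replace `sample` with the captured stdout of the nvidia-smi invocation shown in the comment.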
 
 
 
 

Video memory usage and GPU utilization are two different things. A graphics card consists of a GPU and video memory, and the relationship between video memory and the GPU is somewhat like the relationship between main memory and the CPU. When I run Caffe code, video memory usage is low but GPU utilization is high; when I run TensorFlow code, video memory usage is high but GPU utilization is low.
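Because the two measures can diverge like this, a check for "idle" GPUs should look at both. The sketch below is a minimal illustration of that idea; the thresholds and sample numbers are invented assumptions, not values from the article.

```python
# Decide whether a GPU is "idle" by checking BOTH video-memory usage and
# GPU utilization, since either one alone can mislead: a Caffe-style job
# may show high GPU-util with little memory, while a TensorFlow-style job
# may grab most of the memory while GPU-util stays low.

def is_idle(mem_used_mib, mem_total_mib, util_pct,
            mem_frac_max=0.05, util_max=5):
    """Idle = little video memory in use AND low GPU utilization.
    The thresholds are illustrative defaults, not official values."""
    return (mem_used_mib / mem_total_mib) <= mem_frac_max and util_pct <= util_max

# (index, memory.used MiB, memory.total MiB, utilization.gpu %) -- made-up data
gpus = [
    (0, 11, 11441, 0),     # truly idle
    (1, 10876, 11441, 3),  # memory nearly full, util low -> not idle
    (2, 200, 11441, 95),   # memory low, util high -> not idle
]

idle = [idx for idx, used, total, util in gpus if is_idle(used, total, util)]
print(idle)  # -> [0]
```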

