This section explains the nvidia-smi command in detail, which displays information about the NVIDIA graphics cards in the machine:
cmd: nvidia-smi
GPU: the index of the GPU in this machine (numbering starts from 0 when there are multiple cards). In the screenshot the index is: 0
Fan: fan speed (0%-100%); N/A means the card has no fan
Name: GPU model; in the screenshot: Tesla T4
Temp: GPU temperature (if the GPU gets too hot, it throttles its clock frequency)
Perf: the performance state of the GPU, from P0 (maximum performance) to P12 (minimum performance); in the screenshot: P0
Persistence-M: the state of persistence mode. Persistence mode draws more power while idle, but new GPU applications start faster. In the screenshot: Off
Pwr: Usage/Cap: power draw; Usage is the current draw, Cap is the board's power limit
Bus-Id: the GPU's PCI bus address, in the form domain:bus:device.function
Disp.A: Display Active, whether a display output is initialized on this GPU
Memory-Usage: video memory usage
Volatile GPU-Util: GPU utilization
Uncorr. ECC: whether ECC (error checking and correction) is enabled: 0/disabled, 1/enabled
Compute M.: compute mode: 0/DEFAULT, 1/EXCLUSIVE_PROCESS, 2/PROHIBITED
Processes: the video memory usage, process ID, and GPU occupied by each process
Refresh the status every few seconds: nvidia-smi -l seconds
For example, refresh the GPU status every two seconds: nvidia-smi -l 2
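nvidia-smi can also print just the fields discussed above in machine-readable form. A quick example (the fields below are a selection; the full field list is printed by nvidia-smi --help-query-gpu):

```shell
# One CSV row per GPU: index, model, temperature, utilization, memory
nvidia-smi --query-gpu=index,name,temperature.gpu,utilization.gpu,memory.used,memory.total --format=csv
```

This form is convenient for logging or parsing GPU status from scripts.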
TensorFlow GPU memory usage
1. Direct use
By default, TensorFlow occupies essentially all of the remaining video memory on every graphics card in the machine — note, on all graphics cards, not just the one it computes on. So even if the program needs only one GPU, it still greedily reserves the memory of the other cards it cannot use.
import tensorflow as tf
from tensorflow import keras

with tf.compat.v1.Session() as sess:
    # Input images are 224x224x3, with 20 classes
    shape, classes = (224, 224, 3), 20
    # Build Keras's ResNet50 model
    model = keras.applications.resnet50.ResNet50(input_shape=shape, weights=None, classes=classes)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # Train the model; train_x/train_y/test_x/test_y are prepared elsewhere
    # (loss alternatives: categorical_crossentropy / sparse_categorical_crossentropy)
    # training = model.fit(train_x, train_y, epochs=50, batch_size=10)
    model.fit(train_x, train_y, validation_data=(test_x, test_y), epochs=20, batch_size=6, verbose=2)
    # Save the trained model to a file
    model.save('resnet_model_dog_n_face.h5')
2. Allocating a fixed fraction
The difference from direct use is that this method does not occupy all of the video memory. For example, the configuration below occupies 60% of the memory on each graphics card.
from tensorflow.compat.v1 import ConfigProto  # TF 2.x spelling

config = ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.6
with tf.compat.v1.Session(config=config) as sess:
    model = keras.applications.resnet50.ResNet50(input_shape=shape, weights=None, classes=classes)
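For reference, TensorFlow 2.x expresses the same idea natively with an absolute cap instead of a fraction. A minimal sketch, assuming a cap of 4096 MB on the first GPU (the number is an arbitrary example):

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    # Cap this process at ~4 GB on the first GPU (memory_limit is in megabytes)
    tf.config.set_logical_device_configuration(
        gpus[0],
        [tf.config.LogicalDeviceConfiguration(memory_limit=4096)],
    )
```

Like the fraction setting, this must run before the GPU is first initialized by the program.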
3. Dynamic allocation
This method requests video memory on demand: memory is allocated only as it is needed, but once allocated it is never released back. And if other programs have already occupied all of the remaining video memory, the allocation fails with an error.
Choose among the three methods according to the scenario.
The first occupies all of the video memory up front, so as long as the model fits into video memory there is no memory fragmentation to hurt computing performance; this configuration suits deployed applications.
The second and third suit several people sharing one server. The second wastes whatever fraction a program does not actually use; the third avoids that waste within a program, but makes it easy for the program to crash when it cannot allocate more memory.
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
with tf.compat.v1.Session(config=config) as sess:
    # build and train the model as in the examples above
    model = keras.applications.resnet50.ResNet50(input_shape=shape, weights=None, classes=classes)
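In TensorFlow 2.x the same allow_growth behaviour is exposed through tf.config; a minimal sketch:

```python
import tensorflow as tf

# Enable on-demand memory growth for every visible GPU; this must run
# before any GPU has been initialized, otherwise TensorFlow raises RuntimeError.
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```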
4. Specifying the GPU
When running TensorFlow on a server with multiple GPUs, a specific GPU can be selected from Python code as follows (set the variable before TensorFlow initializes the GPU):
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "2"
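In TensorFlow 2.x the selection can also be made through tf.config instead of the environment variable; a sketch, assuming the machine actually has a GPU with index 2:

```python
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if len(gpus) > 2:
    # Make only the third physical GPU visible to this process;
    # must run before the GPUs have been initialized
    tf.config.set_visible_devices(gpus[2], "GPU")
```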
A complete example, ResNet50 image classification:
import tensorflow as tf
from tensorflow import keras

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
with tf.compat.v1.Session(config=config) as sess:
    # Input images are 224x224x3, with 20 classes
    shape, classes = (224, 224, 3), 20
    # Build Keras's ResNet50 model
    model = keras.applications.resnet50.ResNet50(input_shape=shape, weights=None, classes=classes)
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    # Train the model; train_x/train_y/test_x/test_y are prepared elsewhere
    # (loss alternatives: categorical_crossentropy / sparse_categorical_crossentropy)
    # training = model.fit(train_x, train_y, epochs=50, batch_size=10)
    model.fit(train_x, train_y, validation_data=(test_x, test_y), epochs=20, batch_size=6, verbose=2)
    # Save the trained model to a file
    model.save('resnet_model_dog_n_face.h5')