The relationship between torch device IDs and physical GPU IDs

A few points to know:

  • The id in cuda:{id} is not necessarily the ID of the physical GPU; it is the index of a GPU visible at runtime (counting from 0)
  • torch.cuda.device_count() returns the number of GPUs visible at runtime
  • torch.cuda.get_device_name(i) returns the name of the i-th visible device

Test code (server.py):

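import torch

# Number of GPUs visible to this process (respects CUDA_VISIBLE_DEVICES)
device_count = torch.cuda.device_count()

# Print the name of each visible device by its runtime index
for i in range(device_count):
    print(f"Device {i}: {torch.cuda.get_device_name(i)}")

# Use the second visible GPU (runtime index 1) if CUDA is available, else the CPU
device = torch.device("cuda:1" if torch.cuda.is_available() else "cpu")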
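Note that the hard-coded cuda:1 only works when at least two GPUs are visible; with a single visible GPU, the index is out of range and the failure typically shows up later, when a tensor or model is moved to the device. A minimal sketch of a guarded choice follows. The helper name pick_device_name is hypothetical (not part of torch), and falling back to cuda:0 is just one possible policy:

```python
def pick_device_name(requested_index: int, device_count: int) -> str:
    """Hypothetical helper: build a torch-style device string for a runtime index.

    device_count is expected to come from torch.cuda.device_count().
    """
    if device_count <= 0:
        # No GPU is visible to this process, so use the CPU
        return "cpu"
    if 0 <= requested_index < device_count:
        # The requested runtime index exists among the visible GPUs
        return f"cuda:{requested_index}"
    # Assumed policy: fall back to the first visible GPU
    return "cuda:0"


# Intended use inside server.py (requires torch, so shown as a comment):
#   device = torch.device(pick_device_name(1, torch.cuda.device_count()))
```

The helper takes the device count as a plain integer, so it can be checked without a GPU.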

For example, execute the command:

CUDA_VISIBLE_DEVICES=4,5 python server.py

Output result:

Device 0: NVIDIA GeForce RTX 3090
Device 1: NVIDIA GeForce RTX 3090

Here the code treats the first visible GPU (i.e. cuda:0) as physical GPU 4 and the second visible GPU (i.e. cuda:1) as physical GPU 5. Because server.py sets device to cuda:1, physical GPU 5 is used for subsequent inference.
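The mapping from a runtime index back to the entry in CUDA_VISIBLE_DEVICES can be sketched in plain Python. This is an illustration under stated assumptions, not something torch provides: the function name physical_gpu_id is hypothetical, the variable is assumed to hold a well-formed comma-separated list, and the "physical" numbering is whatever order CUDA itself uses, which may differ from the order shown by nvidia-smi unless CUDA_DEVICE_ORDER=PCI_BUS_ID is set.

```python
import os
from typing import Optional


def physical_gpu_id(logical_index: int, visible: Optional[str] = None) -> str:
    """Hypothetical helper: map a runtime index (the N in cuda:N) to its
    entry in CUDA_VISIBLE_DEVICES.

    If visible is None, the value is read from the environment.
    """
    if visible is None:
        visible = os.environ.get("CUDA_VISIBLE_DEVICES")
    if visible is None:
        # Variable not set: every GPU is visible, so the indices coincide
        return str(logical_index)
    # Assumption: entries are comma-separated and well formed
    entries = [e.strip() for e in visible.split(",") if e.strip()]
    if not 0 <= logical_index < len(entries):
        raise IndexError(f"cuda:{logical_index} is not among the visible GPUs")
    return entries[logical_index]


# With CUDA_VISIBLE_DEVICES=4,5 as in the command above:
print(physical_gpu_id(0, "4,5"))  # 4
print(physical_gpu_id(1, "4,5"))  # 5
```

Entries are returned as strings because CUDA_VISIBLE_DEVICES may also contain GPU UUIDs rather than integer indices.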

Origin blog.csdn.net/muyao987/article/details/128305565