[NVIDIA] A detailed introduction to NVIDIA's advanced feature MIG (1)

The blogger has not authorized any person or organization to reprint his original articles. Thank you for supporting original content!

I work for an internationally renowned terminal manufacturer, where I am responsible for the research and development of modem chips.
In the early days of 5G, I was responsible for the development of the terminal data service layer and the core network. Currently, I am leading research on technical standards for 6G computing power networks.


The content of the blog mainly revolves around:
  • 5G/6G protocol explanations
  • computing power network explanations (cloud computing, edge computing, end computing)
  • advanced C language explanations
  • Rust language explanations





MIG: Multi-Instance GPU

       The Multi-Instance GPU (MIG) feature allows a GPU (starting with the NVIDIA Ampere architecture) to be securely partitioned into up to 7 independent GPU instances for CUDA applications, providing multiple users with separate GPU resources for optimal GPU utilization. This feature is especially beneficial for workloads that do not fully saturate the GPU, so users may wish to run different workloads in parallel to maximize utilization.

       With MIG, each instance's processors have independent and isolated paths through the entire memory system: the on-chip crossbar ports, L2 cache banks, memory controllers, and DRAM address buses are all assigned uniquely to an individual instance. This ensures that an individual user's workload can run with predictable throughput and latency, with the same L2 cache allocation and DRAM bandwidth, even if other tasks are thrashing their own caches or saturating their DRAM interfaces. MIG can partition the available GPU compute resources (including streaming multiprocessors, or SMs, and GPU engines such as copy engines or decoders), providing well-defined quality of service (QoS) and fault isolation for different clients such as virtual machines, containers, or processes. MIG allows multiple GPU instances to run in parallel on a single physical NVIDIA Ampere GPU.


       Using MIG, users can view and schedule jobs on the new virtual GPU instances just as they would on physical GPUs. MIG supports the Linux operating system, containers using the Docker Engine, Kubernetes, and virtual machines using hypervisors such as Red Hat Virtualization and VMware vSphere.

1. Introduction to basic concepts of MIG

  • Streaming Multiprocessor (SM): the streaming multiprocessor (SM) executes compute instructions on the GPU;

  • GPU context: a GPU context is analogous to a CPU process. It encapsulates all the resources needed to perform operations on the GPU, including a distinct address space, memory allocations, etc. A GPU context has the following properties:

    • fault isolation;
    • independent scheduling;
    • a distinct address space;
  • GPU engine: a GPU engine performs work on the GPU. The most commonly used engine is the compute/graphics engine, which executes compute instructions. Other engines include the copy engine (CE), which performs DMA copies, NVDEC for video decoding, NVENC for encoding, etc. Each engine can be scheduled independently and can perform work for different GPU contexts;

  • GPU memory slice: a GPU memory slice is the smallest fraction of GPU memory, including the corresponding memory controllers and cache. A GPU memory slice is roughly one-eighth of the total GPU memory resources, in both capacity and bandwidth;

  • GPU SM slice: a GPU SM slice is the smallest fraction of the SMs on the GPU. When configured in MIG mode, a GPU SM slice is roughly one-seventh of the total number of available SMs on the GPU;

  • GPU slice: a GPU slice is the smallest fraction of the GPU that combines a single GPU memory slice and a single GPU SM slice;

  • GPU instance: a GPU instance (GI) is a combination of one or more GPU slices and other GPU engines (DMAs, NVDECs, etc.). Anything within a GPU instance always shares all of its GPU memory slices and GPU engines, but its SM slices can be further subdivided into compute instances (CIs). A GPU instance provides memory QoS: each GPU slice includes dedicated GPU memory resources, which limit both the available capacity and bandwidth. Each GPU memory slice gets 1/8 of the total GPU memory resources, and each GPU SM slice gets 1/7 of the total number of SMs;

  • Compute instance: a GPU instance can be subdivided into multiple compute instances. A compute instance (CI) is a subset of the SM slices of its parent GPU instance, together with other GPU engines (DMAs, NVDECs, etc.). CIs share memory and engines.
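The slice and instance concepts above can be made concrete with a small illustrative Python model (a sketch of the bookkeeping only, not an NVIDIA API; the class and method names are invented for illustration):

```python
# Illustrative model of MIG slices (not an NVIDIA API; names invented
# for illustration). An A100-40GB exposes 8 memory slices (~5 GB each)
# and 7 SM slices under MIG.
from dataclasses import dataclass, field
from typing import List

@dataclass
class GPUInstance:
    """A GI combines memory slices and SM slices; CIs inside it share memory."""
    mem_slices: int                      # each is ~1/8 of total GPU memory
    sm_slices: int                       # each is ~1/7 of the GPU's SMs
    compute_instances: List[int] = field(default_factory=list)  # SM slices per CI

    def add_compute_instance(self, sm_slices: int) -> None:
        # A CI is a subset of the parent GI's SM slices.
        if sum(self.compute_instances) + sm_slices > self.sm_slices:
            raise ValueError("CI exceeds the parent GI's SM slices")
        self.compute_instances.append(sm_slices)

    def profile_name(self, mem_per_slice_gb: int = 5) -> str:
        # Naming convention: <SM slices>g.<total memory>gb, e.g. 4g.20gb
        return f"{self.sm_slices}g.{self.mem_slices * mem_per_slice_gb}gb"

gi = GPUInstance(mem_slices=4, sm_slices=4)   # a 4g.20gb GI
gi.add_compute_instance(1)                    # a 1c CI
gi.add_compute_instance(3)                    # a 3c CI
```

Here the GI owns its memory slices outright, while its SM slices can be handed out to CIs, mirroring the sharing rules described above.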

(Figure: GPU resource logic diagram)

(Figure: compute instance diagram)


2. An example

The following takes an A100 (40GB) GPU as an example. It can be regarded as 8 x 5GB memory slices and 7 SM slices, as shown in the figure below:
(Figure: an A100 40GB viewed as 8 memory slices and 7 SM slices)

2.1 Examples of GPU instances

       As mentioned above, creating a GPU instance (GI) requires combining a certain number of memory slices with a certain number of compute slices. In the diagram below, a 5GB memory slice is combined with 1 compute slice to create a 1g.5gb GI profile:

(Figure: creating a 1g.5gb GPU instance)
       Similarly, 4 x 5GB memory slices can be combined with 4 compute slices to create the 4g.20gb GI profile:

(Figure: creating a 4g.20gb GPU instance)

2.2 Compute instance (CI) examples

       The compute slices of a GPU instance can be further subdivided into multiple compute instances (CIs). CIs share the engines and memory of the parent GI, but each CI has dedicated SM resources. Using the 4g.20gb example above, a CI can be created on just the first compute slice, as follows:

(Figure: a CI on the first compute slice of a 4g.20gb GI)

       In this example, 4 different CIs can be created by selecting any of the compute slices. Two compute slices can also be combined to create a single 2c.4g.20gb profile:

(Figure: combining two compute slices into a 2c.4g.20gb CI)
       In this example, 3 compute slices can also be combined to create a 3c.4g.20gb profile, or all 4 can be combined to create a single 4c.4g.20gb profile. When all 4 compute slices are combined, the profile is simply called 4g.20gb.
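The CI naming rule just described can be sketched as a small Python helper (a hypothetical function for illustration, not part of any NVIDIA tooling):

```python
# Illustrative sketch of the CI naming rule (not an NVIDIA API).
def ci_profile_name(ci_slices: int, gi_slices: int, gi_mem_gb: int) -> str:
    """Name a compute instance inside a GI, e.g. 2c.4g.20gb.

    When a CI spans all of its parent GI's compute slices, the <N>c
    prefix is dropped and the name collapses to the GI profile name.
    """
    gi_name = f"{gi_slices}g.{gi_mem_gb}gb"
    if ci_slices == gi_slices:
        return gi_name                    # e.g. 4c.4g.20gb is simply 4g.20gb
    return f"{ci_slices}c.{gi_name}"

print(ci_profile_name(2, 4, 20))   # 2c.4g.20gb
print(ci_profile_name(4, 4, 20))   # 4g.20gb
```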


3. Naming rules for MIG devices

       By default, a MIG device consists of a single GPU instance and a single compute instance. The table below highlights a naming convention that refers to a MIG device by its GPU instance's compute slice count and its total memory in GB (rather than just its memory slice count). When a CI consumes all of the compute slices of its GI, the CI size is implied by the device name.


Note: The table below shows the profile names for an A100-SXM4-40GB device. For A100-SXM4-80GB, the profile names vary according to the memory ratio - e.g., 1g.10gb, 2g.20gb, 3g.40gb, 4g.40gb, 7g.80gb.

Memory            20gb          10gb      5gb
GPU Instance      3g            2g        1g
Compute Instance  3c            2c        1c
MIG Device        3g.20gb       2g.10gb   1g.5gb
GPCs              GPC GPC GPC   GPC GPC   GPC

       Each GI can be further subdivided into multiple CIs based on the user's workload. The table below highlights the MIG device names in this case. The example shown subdivides a 3g.20gb device into sub-devices with different numbers of compute instance slices.

(Figure: MIG device names for a 3g.20gb GI subdivided into compute instances)

       With MIG, GPU Instances (GIs) and Compute Instances (CIs) are enumerated in the /proc file system, and the corresponding device nodes (mig-minors) are created under /dev/nvidia-caps. MIG supports running CUDA applications by specifying the CUDA device to the application at runtime. CUDA 11/R450 and CUDA 12/R525 only support enumeration of a single MIG instance. In other words, no matter how many MIG devices are created (or made available to a container), a single CUDA process can only enumerate one MIG device.

       A CUDA application sees a CI and its parent GI as a single CUDA device. CUDA can only use a single CI; if multiple CIs are visible, the first available one is chosen. To summarize, there are two constraints:

  1. CUDA can only enumerate a single compute instance;
  2. CUDA will not enumerate non-MIG GPUs if any compute instance is enumerated on any other GPU;

Note that these constraints may change in future releases of the NVIDIA driver for MIG.

       CUDA_VISIBLE_DEVICES has been extended to add support for MIG. Depending on the driver version used, two formats are supported:

  1. With driver versions >= R470 (470.42.01+), each MIG device is assigned a GPU UUID starting with MIG-<UUID>;
  2. With driver versions < R470, each MIG device is enumerated by specifying the CI and its parent GI. The format is MIG-<GPU-UUID>/<GPU instance ID>/<compute instance ID>.
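A minimal sketch of how the two formats differ (the helper function and all UUID/ID values below are hypothetical placeholders; only the string layouts follow the convention described above):

```python
# Sketch of the two CUDA_VISIBLE_DEVICES formats for MIG devices.
# The helper and the UUIDs/IDs are placeholders, not real devices.
def mig_visible_device(gpu_uuid: str, gi_id: int, ci_id: int,
                       driver_version: tuple, mig_uuid: str = "") -> str:
    # R470 (470.42.01) and newer: each MIG device has its own MIG-<UUID>.
    if driver_version >= (470, 42, 1):
        return mig_uuid
    # Older drivers: MIG-<GPU-UUID>/<GPU instance ID>/<compute instance ID>
    return f"MIG-{gpu_uuid}/{gi_id}/{ci_id}"

# Hypothetical values for illustration only:
old = mig_visible_device("GPU-5d5ba0d6", 1, 0, (460, 32, 3))
new = mig_visible_device("GPU-5d5ba0d6", 1, 0, (470, 42, 1),
                         mig_uuid="MIG-c6d4f1ef-42e4-5de3-91c7-45d71c87eb3f")
print(old)   # MIG-GPU-5d5ba0d6/1/0
print(new)   # prints the MIG-<UUID> directly
```

The resulting string would then be passed to the application, e.g. `CUDA_VISIBLE_DEVICES=<value> ./app`.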

Using the R470 NVIDIA Data Center Driver (470.42.01+), the example below shows how GPU UUIDs are assigned to MIG devices in an 8-GPU system where each GPU is configured differently.

$ nvidia-smi -L        

GPU 0: A100-SXM4-40GB (UUID: GPU-5d5ba0d6-d33d-2b2c-524d-9e3d8d2b8a77)
  MIG 1g.5gb      Device  0: (UUID: MIG-c6d4f1ef-42e4-5de3-91c7-45d71c87eb3f)
  MIG 1g.5gb      Device  1: (UUID: MIG-cba663e8-9bed-5b25-b243-5985ef7c9beb)
  MIG 1g.5gb      Device  2: (UUID: MIG-1e099852-3624-56c0-8064-c5db1211e44f)
  MIG 1g.5gb      Device  3: (UUID: MIG-8243111b-d4c4-587a-a96d-da04583b36e2)
  MIG 1g.5gb      Device  4: (UUID: MIG-169f1837-b996-59aa-9ed5-b0a3f99e88a6)
  MIG 1g.5gb      Device  5: (UUID: MIG-d5d0152c-e3f0-552c-abee-ebc0195e9f1d)
  MIG 1g.5gb      Device  6: (UUID: MIG-7df6b45c-a92d-5e09-8540-a6b389968c31)
GPU 1: A100-SXM4-40GB (UUID: GPU-0aa11ebd-627f-af3f-1a0d-4e1fd92fd7b0)
  MIG 2g.10gb     Device  0: (UUID: MIG-0c757cd7-e942-5726-a0b8-0e8fb7067135)
  MIG 2g.10gb     Device  1: (UUID: MIG-703fb6ed-3fa0-5e48-8e65-1c5bdcfe2202)
  MIG 2g.10gb     Device  2: (UUID: MIG-532453fc-0faa-5c3c-9709-a3fc2e76083d)
GPU 2: A100-SXM4-40GB (UUID: GPU-08279800-1cbe-a71d-f3e6-8f67e15ae54a)
  MIG 3g.20gb     Device  0: (UUID: MIG-aa232436-d5a6-5e39-b527-16f9b223cc46)
  MIG 3g.20gb     Device  1: (UUID: MIG-3b12da37-7fa2-596c-8655-62dab88f0b64)
GPU 3: A100-SXM4-40GB (UUID: GPU-71086aca-c858-d1e0-aae1-275bed1008b9)
  MIG 7g.40gb     Device  0: (UUID: MIG-3e209540-03e2-5edb-8798-51d4967218c9)
GPU 4: A100-SXM4-40GB (UUID: GPU-74fa9fb7-ccf6-8234-e597-7af8ace9a8f5)
  MIG 1c.3g.20gb  Device  0: (UUID: MIG-79c62632-04cc-574b-af7b-cb2e307120d8)
  MIG 1c.3g.20gb  Device  1: (UUID: MIG-4b3cc0fd-6876-50d7-a8ba-184a86e2b958)
  MIG 1c.3g.20gb  Device  2: (UUID: MIG-194837c7-0476-5b56-9c45-16bddc82e1cf)
  MIG 1c.3g.20gb  Device  3: (UUID: MIG-291820db-96a4-5463-8e7b-444c2d2e3dfa)
  MIG 1c.3g.20gb  Device  4: (UUID: MIG-5a97e28a-7809-5e93-abae-c3818c5ea801)
  MIG 1c.3g.20gb  Device  5: (UUID: MIG-3dfd5705-b18a-5a7c-bcee-d03a0ccb7a96)
GPU 5: A100-SXM4-40GB (UUID: GPU-3301e6dd-d38f-0eb5-4665-6c9659f320ff)
  MIG 4g.20gb     Device  0: (UUID: MIG-6d96b9f9-960e-5057-b5da-b8a35dc63aa8)
GPU 6: A100-SXM4-40GB (UUID: GPU-bb40ed7d-cbbb-d92c-50ac-24803cda52c5)
  MIG 1c.7g.40gb  Device  0: (UUID: MIG-66dd01d7-8cdb-5a13-a45d-c6eb0ee11810)
  MIG 2c.7g.40gb  Device  1: (UUID: MIG-03c649cb-e6ae-5284-8e94-4b1cf767e06c)
  MIG 3c.7g.40gb  Device  2: (UUID: MIG-8abf68e0-2808-525e-9133-ba81701ed6d3)
GPU 7: A100-SXM4-40GB (UUID: GPU-95fac899-e21a-0e44-b0fc-e4e3bf106feb)
  MIG 4g.20gb     Device  0: (UUID: MIG-219c765c-e07f-5b85-9c04-4afe174d83dd)
  MIG 2g.10gb     Device  1: (UUID: MIG-25884364-137e-52cc-a7e4-ecf3061c3ae1)
  MIG 1g.5gb      Device  2: (UUID: MIG-83e71a6c-f0c3-5dfc-8577-6e8b17885e1f)
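As a sketch, output in the `nvidia-smi -L` format shown above can be parsed to map each GPU UUID to its MIG devices (this assumes the line layout shown; real output formatting may vary across driver versions):

```python
# Parse `nvidia-smi -L`-style output to map each GPU UUID to its MIG
# devices. Sketch only; the sample text is taken from the listing above.
import re

sample = """\
GPU 3: A100-SXM4-40GB (UUID: GPU-71086aca-c858-d1e0-aae1-275bed1008b9)
  MIG 7g.40gb     Device  0: (UUID: MIG-3e209540-03e2-5edb-8798-51d4967218c9)
GPU 7: A100-SXM4-40GB (UUID: GPU-95fac899-e21a-0e44-b0fc-e4e3bf106feb)
  MIG 4g.20gb     Device  0: (UUID: MIG-219c765c-e07f-5b85-9c04-4afe174d83dd)
  MIG 2g.10gb     Device  1: (UUID: MIG-25884364-137e-52cc-a7e4-ecf3061c3ae1)
  MIG 1g.5gb      Device  2: (UUID: MIG-83e71a6c-f0c3-5dfc-8577-6e8b17885e1f)
"""

def parse_mig_devices(text: str) -> dict:
    """Return {GPU UUID: [(MIG profile, MIG UUID), ...]}."""
    devices, current_gpu = {}, None
    for line in text.splitlines():
        gpu = re.match(r"GPU (\d+):.*\(UUID: (GPU-[0-9a-f-]+)\)", line)
        mig = re.match(r"\s+MIG (\S+)\s+Device\s+(\d+): \(UUID: (MIG-[0-9a-f-]+)\)", line)
        if gpu:
            current_gpu = gpu.group(2)
            devices[current_gpu] = []
        elif mig and current_gpu:
            devices[current_gpu].append((mig.group(1), mig.group(3)))
    return devices

result = parse_mig_devices(sample)
```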


4. Predefined profiles

       The slices that make up a GI cannot be combined arbitrarily. The NVIDIA driver APIs provide a number of "GPU instance profiles", and users create GIs by specifying one of these profiles. Multiple GIs can be created from these profiles as long as enough slices are available on a given GPU to satisfy the request.

       For example, the table below shows the profile names for an A100-SXM4-40GB device. For A100-SXM4-80GB, the profile names vary according to the memory ratio - e.g., 1g.10gb, 2g.20gb, 3g.40gb, 4g.40gb, 7g.80gb.

Profile Name    Fraction of Memory  Fraction of SMs  Hardware Units            L2 Cache Size  Copy Engines  Number of Instances Available
MIG 1g.5gb      1/8                 1/7              0 NVDECs /0 JPEG /0 OFA   1/8            1             7
MIG 1g.5gb+me   1/8                 1/7              1 NVDEC /1 JPEG /1 OFA    1/8            1             1 (a single 1g profile can include media extensions)
MIG 1g.10gb     2/8                 1/7              1 NVDEC /0 JPEG /0 OFA    2/8            1             4
MIG 2g.10gb     2/8                 2/7              1 NVDEC /0 JPEG /0 OFA    2/8            2             3
MIG 3g.20gb     4/8                 3/7              2 NVDECs /0 JPEG /0 OFA   4/8            3             2
MIG 4g.20gb     4/8                 4/7              2 NVDECs /0 JPEG /0 OFA   4/8            4             1
MIG 7g.40gb     Full                7/7              5 NVDECs /1 JPEG /1 OFA   Full           7             1
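The "Number of Instances Available" column follows from slice arithmetic: a profile can be instantiated only as long as enough memory slices (8 total) and SM slices (7 total) remain. A small sketch, with per-profile slice counts taken from the table above:

```python
# Sanity-check the profile table: available instance counts are limited
# by the 8 memory slices and 7 SM slices of an A100-40GB (sketch only).
PROFILES = {  # name: (memory_slices, sm_slices)
    "1g.5gb":  (1, 1),
    "1g.10gb": (2, 1),
    "2g.10gb": (2, 2),
    "3g.20gb": (4, 3),
    "4g.20gb": (4, 4),
    "7g.40gb": (8, 7),
}

def max_instances(name: str, total_mem_slices: int = 8, total_sm_slices: int = 7) -> int:
    # The count is capped by whichever slice type runs out first.
    mem, sm = PROFILES[name]
    return min(total_mem_slices // mem, total_sm_slices // sm)

for name in PROFILES:
    print(name, max_instances(name))
```

Note that even though there are 8 memory slices, only 7 x 1g.5gb instances fit, because the 7 SM slices run out first.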

The figure below shows a graphical representation of how all valid combinations of GPU instances are built.

(Figure: building valid combinations of GPU instance profiles)
       In this diagram, a valid combination is built by starting with an instance profile on the left and combining it with other instance profiles while moving to the right, such that no two profiles overlap vertically. The only exception to this rule is the combination of a (4 memory, 4 compute) profile with a (4 memory, 3 compute) profile, which is not currently supported. However, the combination of (4 memory, 3 compute) and (4 memory, 3 compute) is supported. See the Supported Profiles section for a list of all supported profile combinations and placements on the A100 and A30.

(Figure: physical layout of GPU instance placements)
       Note that the diagram represents the physical layout of a GPU instance as it exists after it has been instantiated on the GPU. Since GPU instances are created and destroyed in different locations, fragmentation can occur, and the physical location of a GPU instance will play a role in which other GPU instances can be instantiated next to it.
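The placement constraint can be sketched as a simple interval-overlap check (an illustrative model only; the slice indices below are hypothetical placements, not values reported by any NVIDIA tool):

```python
# Sketch of the placement rule: each GI occupies a contiguous range of
# slices, and two GIs can coexist only if their ranges do not overlap.
def overlaps(a: tuple, b: tuple) -> bool:
    """Each placement is (start_slice, size)."""
    a_start, a_size = a
    b_start, b_size = b
    return a_start < b_start + b_size and b_start < a_start + a_size

# Two 3g.20gb GIs (4 memory slices each) placed side by side are valid:
print(overlaps((0, 4), (4, 4)))   # False: no overlap, combination fits
# Overlapping placements collide and cannot coexist:
print(overlaps((0, 4), (2, 4)))   # True: ranges collide
```

Fragmentation arises from the same rule: after GIs are created and destroyed in different positions, the remaining free ranges may be too scattered to host a profile even when enough total slices are free.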



Thank you for reading, here is Congshanruoshui's blog!




Origin blog.csdn.net/qq_31985307/article/details/128893798