[2023 CANN Training Camp Season 1] Application Development (Elementary) - Chapter 1 Overview of AscendCL

ACL basic concepts

  • Host:
    The Host is the x86 or ARM server connected to the Device. It uses the NN (neural network) computing power that the Device provides to run its workloads.
  • Device:
    The Device is the hardware fitted with the chip. It connects to the Host through the PCIe interface and provides NN computing capability to the Host.
  • Synchronous/Asynchronous:
    Synchronous and asynchronous are defined from the point of view of the caller and the executor. In this context, if the Host calls an interface and the call returns without waiting for the Device to finish executing, the Host's scheduling is asynchronous; if the call returns only after the Device has finished executing, the Host's scheduling is synchronous.
  • Process/Thread:
    The processes and threads mentioned in this course, unless otherwise specified, refer to the processes and threads on the Host.
  • Context:
    A Context is a container that manages the life cycle of all objects created in it (including Streams, Events, and device memory). Streams in different Contexts, and Events in different Contexts, are completely isolated from each other, and no synchronization or wait relationship can be established between them.

There are two types of Context:
Default Context: When the aclrtSetDevice interface is called to specify the Device used for computation, the system automatically and implicitly creates a default Context. Each Device has one default Context, and it cannot be released through the aclrtDestroyContext interface.
Explicitly created Context: Created by calling the aclrtCreateContext interface in a process or thread.
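
The two kinds of Context can be pictured with a toy model. The sketch below is plain Python and is not the AscendCL API: `ToyRuntime`, `ToyContext`, and their methods are hypothetical names that only mirror the rules stated above (one implicitly created default Context per Device that cannot be destroyed explicitly, plus explicit Contexts that the caller creates and destroys).

```python
class ToyContext:
    """Container that owns the objects (streams, events, ...) created in it."""

    def __init__(self, device_id, is_default):
        self.device_id = device_id
        self.is_default = is_default
        self.objects = []  # stands in for Streams, Events, device memory


class ToyRuntime:
    """Hypothetical stand-in for the runtime; not the real AscendCL API."""

    def __init__(self):
        self.default_contexts = {}  # device_id -> default ToyContext

    def set_device(self, device_id):
        # Mirrors the described effect of aclrtSetDevice: the default
        # Context is created implicitly, once per Device.
        if device_id not in self.default_contexts:
            self.default_contexts[device_id] = ToyContext(device_id, True)
        return self.default_contexts[device_id]

    def create_context(self, device_id):
        # Mirrors aclrtCreateContext: an explicit, caller-owned Context.
        return ToyContext(device_id, False)

    def destroy_context(self, ctx):
        # Mirrors the rule that the default Context cannot be released
        # through aclrtDestroyContext.
        if ctx.is_default:
            raise ValueError("default Context cannot be destroyed explicitly")
        ctx.objects.clear()


runtime = ToyRuntime()
default_ctx = runtime.set_device(0)       # implicit default Context
explicit_ctx = runtime.create_context(0)  # explicit Context on the same Device
```

Calling `runtime.set_device(0)` a second time returns the same default Context, while `runtime.destroy_context(default_ctx)` raises an error in this model.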

  • Stream:
    A Stream maintains the execution order of a group of asynchronous operations, ensuring that they run on the Device in the order in which the application code issued them. Stream-based kernel execution and data transfer make the following kinds of parallelism possible:
    – Host computation in parallel with Device computation;
    – Host computation in parallel with Host-to-Device data transfer;
    – Host-to-Device data transfer in parallel with Device computation;
    – Parallel computation within the Device.

There are two types of Stream:
Default Stream: When the aclrtSetDevice interface is called to specify the Device used for computation, the system automatically and implicitly creates a default Stream. Each Device has one default Stream, and it cannot be released through the aclrtDestroyStream interface.
Explicitly created Stream: Created by calling the aclrtCreateStream interface in a process or thread.
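
A Stream's ordering guarantee, and the synchronous/asynchronous distinction described earlier, can be illustrated with a small analogy. The sketch below is plain Python and does not call AscendCL: `ToyStream` is a hypothetical class in which a single worker thread plays the role of the Device and a FIFO queue plays the role of the Stream. Launching a task returns immediately (asynchronous), while `synchronize` blocks the caller until everything queued has finished (synchronous).

```python
import queue
import threading
import time


class ToyStream:
    """FIFO task queue drained by one worker thread (stands in for the Device)."""

    def __init__(self):
        self._tasks = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                task()
            finally:
                self._tasks.task_done()

    def launch_async(self, task):
        # Asynchronous call: enqueue the task and return without waiting.
        self._tasks.put(task)

    def synchronize(self):
        # Synchronous point: block until every queued task has finished.
        self._tasks.join()


log = []


def make_task(name):
    def task():
        time.sleep(0.01)  # pretend the Device is busy
        log.append(name)
    return task


stream = ToyStream()
for name in ("copy_in", "compute", "copy_out"):
    stream.launch_async(make_task(name))  # returns immediately

host_result = sum(range(1000))  # Host work overlaps with the queued tasks

stream.synchronize()  # the Host now waits for the "Device"
print(log)  # ['copy_in', 'compute', 'copy_out']
```

Because one worker drains the queue in FIFO order, the tasks always finish in the order in which they were launched, regardless of how long the Host-side work takes.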

  • Event:
    Events let you synchronize tasks between Streams through ACL interfaces, including tasks between Host and Device and tasks between Devices. For example, if the tasks in stream2 depend on the tasks in stream1 and the tasks in stream1 must complete first, create an Event, insert it into stream1, and wait for the Event to complete before the tasks in stream2 are executed.
  • Dynamic AIPP:
    AIPP (AI Preprocessing) performs image preprocessing on the AI Core, including color space conversion (converting the image format), image normalization (subtracting a mean value and multiplying by a coefficient), and cropping (specifying the crop start point and cutting out an image of the size the neural network requires). AIPP comes in two forms, static and dynamic. You must choose one or the other to process images; static AIPP and dynamic AIPP cannot be configured at the same time.
    – Static AIPP: Set the AIPP mode to static during model conversion and set the AIPP parameters at the same time. Once the model is generated, the AIPP parameter values are saved in the offline model (*.om), and every inference with that model uses these fixed preprocessing parameters (they cannot be modified). With static AIPP, all batches share the same AIPP parameters.
    – Dynamic AIPP: Set only the AIPP mode (to dynamic) during model conversion. Before each inference, set the dynamic AIPP parameter values as required, so that different AIPP parameters can be used each time the model is executed. For the interface that sets these values, see "Setting Dynamic AIPP Parameters". With dynamic AIPP, different batches can use different AIPP parameters.
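
The difference between static and dynamic AIPP comes down to when the preprocessing parameters are fixed. The sketch below is a simplified plain-Python illustration, not the AIPP implementation or its configuration format: the function `aipp`, the parameter names, and the single-channel image are assumptions made for the example, and color space conversion is left out. It applies the two operations named above, cropping and then normalization as `(pixel - mean) * coeff`.

```python
def aipp(image, crop, mean, coeff):
    """Crop a 2-D single-channel image, then normalize each pixel.

    crop is (top, left, height, width); normalization is (pixel - mean) * coeff.
    """
    top, left, height, width = crop
    rows = image[top:top + height]
    return [[(p - mean) * coeff for p in row[left:left + width]] for row in rows]


# "Static" style: parameters are fixed once and reused for every inference.
STATIC_PARAMS = {"crop": (0, 0, 2, 2), "mean": 100, "coeff": 0.5}


def run_static(image):
    return aipp(image, **STATIC_PARAMS)


# "Dynamic" style: parameters are supplied with each inference call.
def run_dynamic(image, params):
    return aipp(image, **params)


image = [
    [100, 120, 140],
    [160, 180, 200],
    [220, 240, 250],
]

static_out = run_static(image)
dynamic_out = run_dynamic(image, {"crop": (1, 1, 2, 2), "mean": 200, "coeff": 0.25})

print(static_out)   # [[0.0, 10.0], [30.0, 40.0]]
print(dynamic_out)  # [[-5.0, 0.0], [10.0, 12.5]]
```

In the static case every call sees the same crop window and normalization constants, whereas in the dynamic case the caller can change them from one inference to the next.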

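Going back to the Event item in the list above, the stream1/stream2 dependency can also be sketched as an analogy. The code below is plain Python rather than AscendCL: `ToyStream`, `record_event`, and `wait_event` are hypothetical names, each stream is a FIFO queue with its own worker thread, and a `threading.Event` stands in for the Event object.

```python
import queue
import threading
import time


class ToyStream:
    """FIFO task queue drained by one worker thread."""

    def __init__(self):
        self._tasks = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            task = self._tasks.get()
            try:
                task()
            finally:
                self._tasks.task_done()

    def launch_async(self, task):
        self._tasks.put(task)

    def record_event(self, event):
        # The event fires once every task queued before it has run.
        self._tasks.put(event.set)

    def wait_event(self, event):
        # Tasks queued after this point run only after the event fires.
        self._tasks.put(event.wait)

    def synchronize(self):
        self._tasks.join()


order = []
stream1, stream2 = ToyStream(), ToyStream()
done_in_stream1 = threading.Event()


def slow_producer():
    time.sleep(0.05)  # stream1's task takes a while
    order.append("stream1: produce")


def consumer():
    order.append("stream2: consume")


# stream2 is set up first on purpose; the event still enforces the order.
stream2.wait_event(done_in_stream1)
stream2.launch_async(consumer)

stream1.launch_async(slow_producer)
stream1.record_event(done_in_stream1)

stream1.synchronize()
stream2.synchronize()
print(order)  # ['stream1: produce', 'stream2: consume']
```

Without the event, nothing would stop the consumer in stream2 from running before the producer in stream1 has finished.
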
Main interface call process

[Figure: main interface call process]

Reposted from: blog.csdn.net/qq_45257495/article/details/130874724