The new version of MegCC is here! New tools such as Benchmark have been added and performance has been greatly improved, with a prize-winning essay contest launching at the same time.

The latest version of MegCC has just been released, bringing new tools and a new experience. This version comprehensively improves usability and model inference performance. The main improvements include:

  1. Added a Benchmark tool for quickly benchmarking the inference performance of commonly used models and visualizing the results;
  2. Added a Kernel C code export tool, making it easy for users to obtain customized operator Kernels for porting and reuse;
  3. Optimized the performance of the NN Kernels, keeping the inference SDK's performance state of the art;
  4. Added support for third-party NPU loaders, facilitating the migration of NPU-related applications.

The new functions and features of the latest version of MegCC are introduced below:

1. MegCC Benchmark

The new version of MegCC includes a basic Benchmark module for testing the inference performance of various models, collecting the performance data of each Kernel during inference, and analyzing model performance bottlenecks.

  1. The models currently supported by Benchmark are: efficientnetb0, resnet18, resnet50, vgg11, vgg16, shufflenetv2 and mobilenetv2. The models are mainly in ONNX format and are converted to MegEngine format by MgeConvert, which serves as MegCC's indirect support for ONNX (see the sketch after this list). For more about MgeConvert, please refer to: https://github.com/MegEngine/mgeconvert ;
  2. Benchmark can collect the performance data of each Kernel in a model, and adds a Kernel performance visualization feature for analyzing model performance bottlenecks;
  3. Benchmark supports visualizing model inference data, giving an overview of a model's inference performance on different devices.
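
As a rough illustration of the model preparation step in item 1, the following is a minimal sketch that converts an ONNX model to MegEngine (.mge) format with MgeConvert before benchmarking. The onnx-to-mge subcommand name and its flags are assumptions here, so check convert -h in your MgeConvert installation for the exact usage:

# Install MgeConvert
python3 -m pip install mgeconvert --user

# Convert an ONNX model to MegEngine format for MegCC's Benchmark
# (subcommand and flag names are assumptions; see `convert -h`)
convert onnx-to-mge -i resnet18.onnx -o resnet18.mge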

2. MegCC Kernel export tool - Kernel_exporter

MegCC adds Kernel_exporter, a tool for exporting Kernel C code. By setting the attributes of the required Kernel, users can export its C code for further porting and reuse.

  1. There are currently two ways to use the Kernel export tool: exporting a Kernel with default parameters, or exporting a customized Kernel by supplying key parameters interactively.
  • Default parameter usage:
./kernel_exporter --arch <arch_type> --kernel <kernel_type> --use_default_attr
  • Interactive usage:
./kernel_exporter --arch <arch_type> --kernel <kernel_type>

The specific values of arch_type and kernel_type can be viewed through --help. The currently supported Kernels include:

ArgSortKernel           ArgmaxKernel                BatchMatmulKernel       CVTransposeKernel
ConcatKernel            ConvBackDataKernel          ConvKernel              CvtColorKernel
ElemwiseKernel          ElemwiseMultiKernel         FlipKernel              IndexingMultiAxisKernel
IndexingOneHotKernel    MatrixInvKernel             MatrixMulKernel         PoolingKernel
PowCKernel              ReduceKernel                RelayoutKernel          ResizeKernel
RoiCopyKernel           RotateKernel                TopK                    TypeCvtKernel
WarpAffineKernel        WarpPerspectiveKernel
  2. The exported Kernel C code is placed in the directory from which the tool is run.
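
For example, here is a minimal sketch of exporting a convolution Kernel for ARM64 and compiling the generated C source. The arch value should be confirmed via --help, and the generated file name below is hypothetical:

# Export a conv Kernel for ARM64 with default attributes
./kernel_exporter --arch arm64 --kernel ConvKernel --use_default_attr

# Compile the generated C source for reuse in another project,
# e.g. with a cross toolchain (the file name is hypothetical)
aarch64-linux-gnu-gcc -O2 -c ConvKernel.c -o ConvKernel.o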

3. Performance optimization

a. Building on the first version, MegCC has made a series of optimizations to the existing Kernels, mainly including:

  1. Supported fusing multiple elemwise operations at compile time; the fused elemwise Kernels perform better;
  2. Added General Intrinsic MAX and MIN Kernel implementations, effectively improving the inference performance of models with many MAX/MIN operators;
  3. Optimized the arm64 sigmoid Kernel with assembly, reducing sigmoid's impact on model inference performance;
  4. Added conv3x3 winograd optimization, greatly improving the computation performance of 3x3 convolutions during inference;
  5. Added heuristic Kernel selection, ensuring that when multiple Kernels are available the appropriate one is chosen.

b. Optimized model performance: for some models MegCC is still slightly slower than MegEngine, because MegEngine has a mature algorithm-search mechanism and the algorithms it selects are better in some scenarios; subsequent versions of MegCC will make up for this gap.

The new version of MegCC mainly improves and optimizes the basic inference functionality, and provides peripheral tools such as Benchmark and Kernel_exporter that make it easy for users to measure inference performance, obtain the Kernel code used in an inference model, and continuously optimize Kernel performance. Interested friends, come and try it out!

Attached:

To get more information about MegEngine, you can view the documentation and GitHub projects, or join the MegEngine user QQ group: 1029741705. You are welcome to contribute to the MegEngine community, become an Awesome MegEngineer, and enjoy endless certificates of honor and customized gifts.
