Improvements in New OpenCV Versions

 

Table 1 Improvements to OpenCV3.4.x

 

OpenCV3.4.x

GSoC (Google Summer of Code)

New background subtraction algorithms have been integrated.

DNN

Added Faster R-CNN support and a corresponding example.

JavaScript bindings have been extended to cover the DNN module.

DNN has been further accelerated for iGPU using OpenCL. In particular, MobileNet-SSD networks now run about 7 times faster than in OpenCV 3.3.1.

Added support for quantized TensorFlow networks.

OpenCV can now use the Intel DL Inference Engine as a DNN acceleration backend (see Table 2).

Added AVX-512 acceleration to performance-critical kernels, such as convolution and fully-connected layers.

SSD-based models trained and retrained in the TensorFlow Object Detection API can now be imported more easily: a single invocation of a Python script generates a text graph representation.

The performance of the pthreads backend of cv::parallel_for_() (used by default on Linux/Android unless you installed TBB or chose OpenMP) has been greatly improved on many-core machines, in particular the 10-core Core i9. This let us increase DNN inference performance significantly (up to 6x) on such machines.

The OpenCL backend has been expanded to cover more layers, and layer fusion has been improved to increase speed even further. As a reminder, to enable the OpenCL backend (if it is available on the host machine), call my_dnn_net.setPreferableTarget(cv::dnn::DNN_TARGET_OPENCL) before inference, where my_dnn_net is the network loaded using cv::dnn::readNetFromCaffe(), cv::dnn::readNetFromTensorFlow(), etc.

Several bugs in various layers have been fixed; in particular, SSD priors are now computed slightly differently so that we get more accurate bounding boxes when running SSD on variable-size images.

Added a new computational target, DNN_TARGET_OPENCL_FP16, for half-precision floating-point arithmetic in deep learning networks using OpenCL. Just call net.setPreferableTarget(DNN_TARGET_OPENCL_FP16).

Extended support of Intel's Inference Engine backend to run models on GPU (OpenCL FP32/FP16) and VPU (Myriad 2, FP16) devices. See the installation guide for details.

Enabled import of Intel's OpenVINO pre-trained networks from intermediate representation (IR).

Introduced custom layer support, which lets you define unimplemented layers or override existing ones. Learn more in the corresponding tutorial.

Implemented a new deep learning sample inspired by "EAST: An Efficient and Accurate Scene Text Detector".

Added support for YOLOv3 and image classification models from the Darknet framework.

Reduced the DNN module's peak memory consumption and improved support for networks from TensorFlow and Keras.

OpenCL

On-disk caching of precompiled OpenCL kernels has finally been implemented. It noticeably reduces the initialization time of applications that use many kernels.

It is now possible to load and run precompiled OpenCL kernels via T-API. This can be useful on embedded platforms where no OpenCL JIT compiler is available.

Bit-exact 8-bit and 16-bit resize has been implemented (currently only bilinear interpolation is supported). Use the INTER_LINEAR_EXACT interpolation mode. Many places in the library have switched to this new resize. Bit-exact means that on any platform, with any compiler, you get exactly the same results for the same scale factors; pixel values in the output image do not differ even by +/-1. The function complements a few other bit-exact algorithms added in OpenCV 3.3.1: cvtColor(RGB<=>Lab, RGB<=>Luv).

On-disk caching of precompiled OpenCL kernels has been fixed to comply with the OpenCL standard. As a result, it now works well with the new Intel OpenCL (NEO) drivers.

Certain deadlocks that occurred when copying UMats in different threads have been fixed.

RTFM

/

Android

Added support for Android NDK16.

Added build.gradle to the OpenCV 4 Android SDK.

Added initial support of the Camera2 API via the JavaCamera2View interface.

C++

C++11: added support for multi-dimensional cv::Mat creation via C++ initializer lists:

auto K = Mat_<double>({3, 3}, {0, -1, 0, -1, 5, -1, 0, -1, 0});

C++17: OpenCV source code and tests comply with the C++17 standard

github

opencv_contrib: added GMS matching

opencv_contrib: added CSR-DCF tracker

opencv_contrib: several improvements in OVIS module (OGRE 3D based visualizer)

win10

Video I/O: improved support of Microsoft Media Foundation (MSMF)

 

Table 2. Noticeable performance boost on many models when using the Intel DL Inference Engine as the DNN acceleration backend

| Model | CPU, default backend | CPU, Inference Engine backend, MKL-DNN plugin | Model Optimizer + Inference Engine, MKL-DNN plugin (a standalone application) |
|---|---|---|---|
| AlexNet | 14.44ms | 12.09ms (x1.19) | 12.05ms |
| GoogLeNet | 15.26ms | 8.92ms (x1.71) | 8.75ms |
| ResNet-50 | 35.78ms | 19.53ms (x1.83) | 19.4ms |
| SqueezeNet v1.1 | 4.01ms | 2.60ms (x1.54) | 2.5ms |
| MobileNet-SSD from Caffe | 21.62ms | 8.89ms (x2.43) | |
| DenseNet-121 | 61.71ms | 28.21ms (x2.18) | |
| OpenPose (COCO) @ 368x368 | 885.57ms | 544.05ms (x1.62) | |
| OpenPose (MPI) @ 368x368 | 879.13ms | 533.96ms (x1.64) | |
| OpenPose (MPI, 4 stages) @ 368x368 | 605.63ms | 378.49ms (x1.60) | |
| OpenFace | 3.84ms | 2.59ms (x1.48) | |
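The factors in parentheses in Table 2 are consistent with dividing the default-backend latency by the Inference Engine latency. A minimal sketch of that arithmetic follows; the function name speedup is hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: speedup factor as baseline latency / accelerated latency.
// Both arguments are in the same unit (milliseconds in Table 2).
inline double speedup(double baselineMs, double acceleratedMs)
{
    assert(acceleratedMs > 0.0);
    return baselineMs / acceleratedMs;
}
```

For the AlexNet row, speedup(14.44, 12.09) gives roughly 1.19, matching the table.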

 

 

Table 3 Improvements to OpenCV4.0.0 alpha

 

OpenCV4.0.0 alpha

DNN

An ONNX parser has been added to the OpenCV DNN module. It supports various classification networks, such as AlexNet, Inception v2, ResNet, and VGG. The tiny YOLO v2 object detection network is also partially supported.

A few other notable DNN improvements:

  • Mask R-CNN support and an example
  • Faster object detection when using the Intel Inference Engine (a part of Intel OpenVINO)
  • Several stability improvements in the OpenCL backend

Fast QR code detector (~80 FPS at 640x480 resolution on a Core i5 desktop). By 4.0 gold we plan to add a QR code decoder as well, so that we have a complete solution.

A constantly expanding set of SSE4-, AVX2-, and NEON-optimized kernels via so-called "wide universal intrinsics".

exclusive features

OpenCV is now a C++11 library and requires a C++11-compliant compiler. As a result, some nice features such as parallel_for with lambda functions, convenient iteration over cv::Mat, and initialization of cv::Mat by listing its elements are available by default.

The standard std::string and std::shared_ptr replaced the hand-crafted cv::String and cv::Ptr. Our parallel_for can now use a pool of std::threads as the backend.

The legacy C API from OpenCV 1.x (using CvMat, IplImage, etc.) is partially excluded; the cleanup should mostly be finished by OpenCV 4.0 gold.

Added basic FP16 support (the new CV_16F type has been added).

CPU- and GPU- accelerated

The CPU- and GPU-accelerated KinFu live 3D dense reconstruction algorithm has been included in opencv_contrib.

  • HPX parallel backend
  • The new chessboard detector

Performance improvements

A few hundred basic kernels in OpenCV have been rewritten using so-called "wide universal intrinsics".

 


Reposted from blog.csdn.net/Fan0920/article/details/82971252