Some pitfalls of using the cv::cuda::GpuMat class with Linux + OpenCV + cuvid

1. I eventually got GPU video decoding working in OpenCV through cuvid.
The core code is:

    cv::cuda::GpuMat d_frame;
    cv::Ptr<cv::cudacodec::VideoReader> d_reader = cv::cudacodec::createVideoReader(mp4_file_name);
    for (;;)
    {
        if (!d_reader->nextFrame(d_frame))  // frames arrive in BGRA format
            break;
        gpu_frame_count++;
        cv::Mat frame2;
        d_frame.download(frame2);  // copy the frame from device memory to host memory
        cv::imwrite("xxx.png", frame2);
    }

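2. The reference page for the GpuMat class is:
  https://docs.opencv.org/master/d0/d60/classcv_1_1cuda_1_1GpuMat.html

  The source is in: opencv-master/modules/core/include/opencv2/core/cuda.hpp

  All member variables of GpuMat are public, so it does not matter that no accessor methods are provided.

  Some important member variables and member functions are: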

    class CV_EXPORTS_W GpuMat
    {
    public:

        /** @brief Performs data download from GpuMat (Blocking call)

        This function copies data from device memory to host memory. As being a blocking call, it is
        guaranteed that the copy operation is finished when this function returns.
        */
        CV_WRAP void download(OutputArray dst) const;

        /** @brief Performs data download from GpuMat (Non-Blocking call)

        This function copies data from device memory to host memory. As being a non-blocking call, this
        function may return even if the copy operation is not finished.

        The copy operation may be overlapped with operations in other non-default streams if \p stream is
        not the default stream and \p dst is HostMem allocated with HostMem::PAGE_LOCKED option.
        */
        CV_WRAP void download(OutputArray dst, Stream& stream) const;

        //! the number of rows and columns
        int rows, cols;

        //! a distance between successive rows in bytes; includes the gap if any
        CV_PROP size_t step;

        //! pointer to the data
        uchar* data;

        //! helper fields used in locateROI and adjustROI
        uchar* datastart;
        const uchar* dataend;

    };

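data is the pointer to the image data in GPU memory.

datastart holds the same address as data.

dataend points to the end of the image storage. (Unfortunately, this assumption turned out to be wrong.)

rows is the image height.

cols is the image width.

channels() returns 4, which means each pixel takes four bytes, in BGRA format.

step is the number of bytes per row. Note that this value is padded for alignment; in my test it came out as a power of two. The image I tested with is 480 pixels wide, so at four bytes per pixel a row should take 1920 bytes, yet step is 2048. Each row therefore carries 128 extra bytes, that is 32 extra pixels, and the alpha channel of those pixels is 0.

So although dataend - datastart looks like the size of the GPU memory in use, the space actually occupied is step * rows.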
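The arithmetic above can be checked with a small host-only sketch. It does not touch OpenCV or CUDA; describeLayout and PitchedLayout are hypothetical names made up for illustration, and the arguments stand in for the values that GpuMat exposes as rows, cols, elemSize() and step.

```cpp
#include <cassert>
#include <cstddef>

// Layout of a pitched (row-padded) image, derived from rows/cols/step.
struct PitchedLayout {
    std::size_t rowBytes;       // real pixel data per row: cols * elemSize
    std::size_t paddingBytes;   // unused bytes at the end of each row: step - rowBytes
    std::size_t paddingPixels;  // the same padding counted in whole pixels
    std::size_t totalBytes;     // bytes spanned by the whole image: step * rows
};

// Hypothetical helper (not an OpenCV API).
inline PitchedLayout describeLayout(std::size_t rows, std::size_t cols,
                                    std::size_t elemSize, std::size_t step) {
    // A row can never be shorter than the pixel data it holds.
    assert(elemSize > 0 && step >= cols * elemSize);
    PitchedLayout layout;
    layout.rowBytes = cols * elemSize;
    layout.paddingBytes = step - layout.rowBytes;
    layout.paddingPixels = layout.paddingBytes / elemSize;
    layout.totalBytes = step * rows;
    return layout;
}
```

For the 480-pixel-wide BGRA frame described above, describeLayout(rows, 480, 4, 2048) gives rowBytes = 1920, paddingBytes = 128 and paddingPixels = 32, for any frame height.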

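3. After calling GpuMat's download() method, the resulting Mat no longer contains the extra alignment pixels
   How is that done? After a long search I finally found the source in: opencv-master/modules/core/src/cuda/gpu_mat.cu

   The source of the download method is: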

    void cv::cuda::GpuMat::download(OutputArray _dst) const
    {
        CV_DbgAssert( !empty() );

        _dst.create(size(), type());
        Mat dst = _dst.getMat();

        CV_CUDEV_SAFE_CALL( cudaMemcpy2D(dst.data, dst.step, data, step, cols * elemSize(), rows, cudaMemcpyDeviceToHost) );
    }
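To make the effect of that cudaMemcpy2D call concrete, here is a host-only sketch of the same row-by-row copy. It is an emulation with std::memcpy under my own assumptions, not the CUDA implementation; copyPitched and stripRowPadding are hypothetical helper names, and the parameters mirror the dst pitch, src pitch, width in bytes and height arguments of the call in download().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Copies `height` rows of `widthBytes` bytes each. Source rows start
// `srcPitch` bytes apart, destination rows start `dstPitch` bytes apart.
inline void copyPitched(unsigned char* dst, std::size_t dstPitch,
                        const unsigned char* src, std::size_t srcPitch,
                        std::size_t widthBytes, std::size_t height) {
    assert(widthBytes <= dstPitch && widthBytes <= srcPitch);
    for (std::size_t y = 0; y < height; ++y)
        std::memcpy(dst + y * dstPitch, src + y * srcPitch, widthBytes);
}

// Repacks a padded buffer (rows * step bytes) into a tightly packed one
// (rows * rowBytes bytes), dropping the padding at the end of every row.
inline std::vector<unsigned char> stripRowPadding(
        const std::vector<unsigned char>& padded,
        std::size_t rows, std::size_t rowBytes, std::size_t step) {
    assert(padded.size() >= rows * step);
    std::vector<unsigned char> packed(rows * rowBytes);
    copyPitched(packed.data(), rowBytes, padded.data(), step, rowBytes, rows);
    return packed;
}
```

With step = 2048 and rowBytes = 1920, each destination row receives only the 1920 bytes of real pixel data, which matches how the downloaded Mat ends up without the alignment pixels. The same repacking could be applied on the host to a buffer that was copied as one flat rows * step block.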

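cudaMemcpy2D copies only cols * elemSize() bytes of each row while stepping through the device memory by step, so the padding bytes never reach the Mat.

Copying directly like this also works:
cudaMemcpy(host_data, d_frame.data, d_frame.rows * d_frame.step, cudaMemcpyDeviceToHost);
In that case host_data must hold at least rows * step bytes, and the copied rows keep their padding.
But note:
#include <cuda_runtime.h>
cudaGetDeviceCount(&num_devices);
cudaSetDevice(cuda_device);
// Call these functions first to initialize the CUDA runtime; otherwise the program crashes as soon as it runs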


    


Reposted from www.cnblogs.com/ahfuzhang/p/10852659.html