iOS audio and video capture and format conversion (RGB ↔ YUV)

I recently worked on an iOS audio/video feature in a project that used the libyuv library to convert NV12 frames to BGRA. Since this kind of knowledge is rarely exercised in day-to-day work, I am writing this article down while it is fresh so I don't forget it.

1. Audio and video capture (using AVFoundation)

The basic process:
1. Initialize the input devices
2. Initialize the output devices
3. Create an AVCaptureSession to manage the capture of video and data
4. Create a preview layer

// Initialize the input devices
- (void)initInputDevice{
    // Get the capture devices
    AVCaptureDevice *backCaptureDevice = [self getCameraDeviceWithPosition:AVCaptureDevicePositionBack];   // back camera
    AVCaptureDevice *frontCaptureDevice = [self getCameraDeviceWithPosition:AVCaptureDevicePositionFront]; // front camera
    
    // Create device-input objects from the devices; they supply the input data
    _backCamera = [[AVCaptureDeviceInput alloc] initWithDevice:backCaptureDevice error:nil];
    _frontCamera = [[AVCaptureDeviceInput alloc] initWithDevice:frontCaptureDevice error:nil];
    
    AVCaptureDevice *audioDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeAudio];
    self.audioInputDevice = [AVCaptureDeviceInput deviceInputWithDevice:audioDevice error:nil];
}
// Initialize the output devices
- (void)initOutputDevice{
    // Queue on which sample buffers are delivered
    dispatch_queue_t captureQueue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);
    
    // Video data output
    self.videoDataOutput = [[AVCaptureVideoDataOutput alloc] init];
    // Set the delegate; the current class must implement AVCaptureVideoDataOutputSampleBufferDelegate
    [self.videoDataOutput setSampleBufferDelegate:self queue:captureQueue];
    // Discard late frames to keep capture real-time
    [self.videoDataOutput setAlwaysDiscardsLateVideoFrames:YES];
    // Set the output pixel format to YUV 420 (NV12)
    [self.videoDataOutput setVideoSettings:@{
                                             (__bridge NSString *)kCVPixelBufferPixelFormatTypeKey:@(kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange)
                                             }];
    
    // Audio data output
    self.audioDataOutput = [[AVCaptureAudioDataOutput alloc] init];
    // Set the delegate; the current class must implement AVCaptureAudioDataOutputSampleBufferDelegate
    [self.audioDataOutput setSampleBufferDelegate:self queue:captureQueue];
}

// Create the AVCaptureSession
- (void)createAVCaptureSession{
    
    self.captureSession = [[AVCaptureSession alloc] init];
    
    // Always call beginConfiguration before changing the session configuration,
    // and commitConfiguration once the changes are done
    [self.captureSession beginConfiguration];
    // Set the capture resolution
    [self setVideoPreset];

    // Add the device inputs to the session
    if ([self.captureSession canAddInput:self.backCamera]) {
        [self.captureSession addInput:self.backCamera];
    }
    
    if ([self.captureSession canAddInput:self.audioInputDevice]) {
        [self.captureSession addInput:self.audioInputDevice];
    }
    
    // Add the data outputs to the session
    if ([self.captureSession canAddOutput:self.videoDataOutput]) {
        [self.captureSession addOutput:self.videoDataOutput];
    }
    
    if ([self.captureSession canAddOutput:self.audioDataOutput]) {
        [self.captureSession addOutput:self.audioDataOutput];
    }
    
    [self createPreviewLayer];
    
    // Commit the configuration changes
    [self.captureSession commitConfiguration];
    
    [self startRunning];
}

// Create the preview view
- (void)createPreviewLayer{
    
    [self.view addSubview:self.preView];
    
    // Create a video preview layer to show the live camera feed
    _captureVideoPreviewLayer = [[AVCaptureVideoPreviewLayer alloc] initWithSession:self.captureSession];
    
    _captureVideoPreviewLayer.frame = self.view.bounds;
    _captureVideoPreviewLayer.videoGravity = AVLayerVideoGravityResizeAspectFill; // fill mode
    // Add the preview layer to the view hierarchy
    [self.view.layer addSublayer:_captureVideoPreviewLayer];
}


#pragma mark - Control start/stop capture or change camera
- (void)startRunning{
    if (!self.captureSession.isRunning) {
        [self.captureSession startRunning];
    }
}
- (void)stop{
    if (self.captureSession.isRunning) {
        [self.captureSession stopRunning];
    }
    
}

/** Set the capture resolution **/
- (void)setVideoPreset{
    if ([self.captureSession canSetSessionPreset:AVCaptureSessionPreset1920x1080])  {
        self.captureSession.sessionPreset = AVCaptureSessionPreset1920x1080;
    }else if ([self.captureSession canSetSessionPreset:AVCaptureSessionPreset1280x720]) {
        self.captureSession.sessionPreset = AVCaptureSessionPreset1280x720;
    }else{
        self.captureSession.sessionPreset = AVCaptureSessionPreset640x480;
    }
    
}

/**
 *  Get the camera at the given position
 *
 *  @param position camera position
 *
 *  @return camera device
 */
-(AVCaptureDevice *)getCameraDeviceWithPosition:(AVCaptureDevicePosition )position{
    NSArray *cameras= [AVCaptureDevice devicesWithMediaType:AVMediaTypeVideo];
    for (AVCaptureDevice *camera in cameras) {
        if ([camera position]==position) {
            return camera;
        }
    }
    return nil;
}

2. CMSampleBufferRef

The previous section showed how to capture real-time audio and video data; now we need to know what the captured data actually looks like. The system capture APIs deliver both audio and video wrapped in a CMSampleBufferRef. In iOS this structure represents one frame of audio or video data: it contains both the frame's contents and its format description, so we can extract the data and convert it into whatever we need.
Because we configured the video output with kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, the video frames stored in the CMSampleBufferRef are in YUV 420 (NV12) format.
The CMSampleBufferRef is delivered in the following delegate callback:

-(void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection{
            
}

3. YUV and NV12

A video is a sequence of frames, and each frame is essentially a picture.
YUV, like RGB, is an image storage format. RGB is easy to understand: most pictures on a computer are stored as RGB.
In YUV, Y represents luminance; Y data alone forms a valid picture, but a grayscale one. U and V represent chrominance (U is also known as Cb, the blue-difference component, and V as Cr, the red-difference component).
Why use YUV at all?
Partly for historical reasons: early color television signals adopted YUV so they stayed compatible with black-and-white sets. Take a YUV image, discard the U and V data, keep only Y, and the picture is simply black and white.
YUV also allows bandwidth savings by discarding part of the color information. For example, a YUV420 image takes half the bytes of the equivalent RGB image, and because only the chrominance of adjacent pixels is dropped, the human eye can hardly tell the difference.
A YUV420 image occupies width * height + (width * height) / 4 + (width * height) / 4 = (width * height) * 3 / 2 bytes.
A 24-bit RGB image occupies (width * height) * 3 bytes.
YUV is also more flexible for transmission: the three YUV planes can be transmitted separately.
Finally, many video encoders initially did not support RGB input, but virtually all video encoders support YUV.
The video captured here uses the YUV420 format.
YUV420 itself comes in several memory layouts: I420, NV12, and NV21.
The layouts are as follows (for a frame with n luma samples):
I420: Y, U, and V stored as three separate planes: Y0, Y1, ..., Yn, then U0, U1, ..., Un/4, then V0, V1, ..., Vn/4.
NV12: Y stored as one plane, with U and V interleaved in a second plane: Y0, Y1, ..., Yn, then U0, V0, U1, V1, ...
NV21: the same as NV12, but with U and V in the reverse order.
In short, apart from the storage order, there is no difference as far as display is concerned.
Which layout you get depends on the video output format set when initializing the camera:
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange outputs NV12;
kCVPixelFormatType_420YpCbCr8Planar outputs I420;
kCVPixelFormatType_32BGRA outputs BGRA.

GPUImage also uses NV12 as its camera output format.
For consistency, we choose NV12 as the output video format here as well.

4.libyuv

libyuv is an open-source library from Google that implements conversion, rotation, and scaling between the various RGB and YUV formats. It is cross-platform: it compiles and runs on Windows, Linux, macOS, Android, and other operating systems, on x86, x64, and ARM architectures, and supports SIMD acceleration with SSE, AVX, and NEON instructions.

5. Using libyuv to convert NV12 to BGRA

Import the libyuv library and add its include directory to Header Search Paths in the build settings, otherwise the build will fail because the headers cannot be found.
-(void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection{
    CVPixelBufferRef initialPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    if (initialPixelBuffer == NULL) {
        return;
    }
    // Convert the NV12 frame into a BGRA pixel buffer
    CVPixelBufferRef newPixelBuffer = [self convertVideoSmapleBufferToBGRAData:sampleBuffer];
    if (newPixelBuffer == NULL) {
        return;
    }
    
    // Wrap the CVPixelBufferRef back into a CMSampleBufferRef.
    // Note: pixelBufferToSampleBuffer: releases newPixelBuffer internally, so we must
    // not release it again here; instead we release the returned sample buffer,
    // otherwise memory will leak.
    CMSampleBufferRef newSampleBuffer = [self pixelBufferToSampleBuffer:newPixelBuffer];
    NSLog(@"initialPixelBuffer: %@, newPixelBuffer: %@", initialPixelBuffer, newPixelBuffer);
    if (newSampleBuffer) {
        CFRelease(newSampleBuffer);
    }
}


// Conversion
-(CVPixelBufferRef)convertVideoSmapleBufferToBGRAData:(CMSampleBufferRef)videoSample{
    
    // CVPixelBufferRef is a typedef of CVImageBufferRef; the two are used almost interchangeably.
    // Get the image buffer from the CMSampleBuffer
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(videoSample);
    // The pixel data cannot be accessed by the CPU directly; the base address must be
    // locked with CVPixelBufferLockBaseAddress() first, otherwise functions such as
    // CVPixelBufferGetBaseAddressOfPlane() return NULL or invalid values. The lock call
    // itself is cheap; the expensive part is usually copying memory out of the
    // CVPixelBuffer afterwards, e.g. computing memory offsets.
    CVPixelBufferLockBaseAddress(pixelBuffer, 0);
    // Image width and height in pixels
    int pixelWidth = (int)CVPixelBufferGetWidth(pixelBuffer);
    int pixelHeight = (int)CVPixelBufferGetHeight(pixelBuffer);
    // Plane strides (bytes per row); these may be larger than the width due to row padding
    int yStride = (int)CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0);
    int uvStride = (int)CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1);
    // Y plane of the NV12 buffer
    uint8_t *y_frame = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0);
    // Interleaved UV plane of the NV12 buffer
    uint8_t *uv_frame = (uint8_t *)CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 1);
    
    // Create an empty 32BGRA CVPixelBufferRef
    NSDictionary *pixelAttributes = @{(id)kCVPixelBufferIOSurfacePropertiesKey : @{}};
    CVPixelBufferRef pixelBuffer1 = NULL;
    CVReturn result = CVPixelBufferCreate(kCFAllocatorDefault,
                                          pixelWidth, pixelHeight, kCVPixelFormatType_32BGRA,
                                          (__bridge CFDictionaryRef)pixelAttributes, &pixelBuffer1);
    if (result != kCVReturnSuccess) {
        CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
        NSLog(@"Unable to create cvpixelbuffer %d", result);
        return NULL;
    }
    
    result = CVPixelBufferLockBaseAddress(pixelBuffer1, 0);
    if (result != kCVReturnSuccess) {
        CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
        CFRelease(pixelBuffer1);
        NSLog(@"Failed to lock base address: %d", result);
        return NULL;
    }
    
    // Base address of the destination BGRA data in the newly created CVPixelBufferRef
    uint8_t *rgb_data = (uint8_t *)CVPixelBufferGetBaseAddress(pixelBuffer1);
    int rgbStride = (int)CVPixelBufferGetBytesPerRow(pixelBuffer1);
    
    // Use libyuv to write into rgb_data, converting NV12 to BGRA. Note that libyuv's
    // "ARGB" means the byte order B, G, R, A in memory (little-endian word order),
    // which is exactly kCVPixelFormatType_32BGRA.
    int ret = NV12ToARGB(y_frame, yStride, uv_frame, uvStride, rgb_data, rgbStride, pixelWidth, pixelHeight);
    // Only unlock the source buffer after the conversion has read from it
    CVPixelBufferUnlockBaseAddress(pixelBuffer, 0);
    CVPixelBufferUnlockBaseAddress(pixelBuffer1, 0);
    if (ret) {
        NSLog(@"Error converting NV12 VideoFrame to BGRA: %d", ret);
        CFRelease(pixelBuffer1);
        return NULL;
    }
    
    return pixelBuffer1;
}

// Convert a CVPixelBufferRef into a CMSampleBufferRef.
// Note: this method releases the pixelBuffer passed in; the caller owns
// (and must release) the returned CMSampleBufferRef.
-(CMSampleBufferRef)pixelBufferToSampleBuffer:(CVPixelBufferRef)pixelBuffer
{
    CMSampleBufferRef sampleBuffer = NULL;
    CMTime frameTime = CMTimeMakeWithSeconds([[NSDate date] timeIntervalSince1970], 1000000000);
    CMSampleTimingInfo timing = {frameTime, frameTime, kCMTimeInvalid};
    CMVideoFormatDescriptionRef videoInfo = NULL;
    CMVideoFormatDescriptionCreateForImageBuffer(NULL, pixelBuffer, &videoInfo);
    
    OSStatus status = CMSampleBufferCreateForImageBuffer(kCFAllocatorDefault, pixelBuffer, true, NULL, NULL, videoInfo, &timing, &sampleBuffer);
    if (status != noErr) {
        NSLog(@"Failed to create sample buffer with error %d.", (int)status);
    }
    CVPixelBufferRelease(pixelBuffer);
    if (videoInfo)
        CFRelease(videoInfo);
    
    return sampleBuffer;
}

6. Summary

This article only covered video capture with AVFoundation and format conversion with libyuv. There is much more to audio and video than this; I won't go into further detail here.

Download demo

Origin blog.csdn.net/weixin_34405332/article/details/91019203