Playing video and audio in the Android native layer with SoftwareRenderer and AudioTrack

Copyright notice: This is an original article by the author, licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/myvest/article/details/98472281

In my earlier posts, video was played by converting YUV to RGB and displaying it through SurfaceView + ANativeWindow, while audio was played with the Java-layer AudioTrack:
https://blog.csdn.net/myvest/article/details/90717333
https://blog.csdn.net/myvest/article/details/90731805
That approach is fairly inefficient, so we can move everything into the native layer: video can be displayed directly with the SoftwareRenderer used by AwesomePlayer, and audio can be played with the C++ AudioTrack.
Using SoftwareRenderer removes the YUV-to-RGB conversion from our own code, and the audio path no longer needs to pass data across JNI.

Audio: using AudioTrack in the native layer

Using AudioTrack in the native layer is not much different from using the Java API.
Constructor:
The default arguments usually do not need to be changed; we only care about the first four parameters:
streamType: the audio stream type. As with the Java API, for media output we choose AUDIO_STREAM_MUSIC.
sampleRate: the sampling rate in Hz.
format: the audio sample format.
channelMask: the channel mask corresponding to the audio channel layout.


    /* Creates an AudioTrack object and registers it with AudioFlinger.
     * Once created, the track needs to be started before it can be used.
     * Unspecified values are set to appropriate default values.
     * With this constructor, the track is configured for streaming mode.
     * Data to be rendered is supplied by write() or by the callback EVENT_MORE_DATA.
     * Intermixing a combination of write() and non-ignored EVENT_MORE_DATA is not allowed.
     *
     * Parameters:
     *
     * streamType:         Select the type of audio stream this track is attached to
     *                     (e.g. AUDIO_STREAM_MUSIC).
     * sampleRate:         Data source sampling rate in Hz.
     * format:             Audio format (e.g AUDIO_FORMAT_PCM_16_BIT for signed
     *                     16 bits per sample).
     * channelMask:        Channel mask.
     * frameCount:         Minimum size of track PCM buffer in frames. This defines the
     *                     application's contribution to the
     *                     latency of the track. The actual size selected by the AudioTrack could be
     *                     larger if the requested size is not compatible with current audio HAL
     *                     configuration.  Zero means to use a default value.
     * flags:              See comments on audio_output_flags_t in <system/audio.h>.
     * cbf:                Callback function. If not null, this function is called periodically
     *                     to provide new data and inform of marker, position updates, etc.
     * user:               Context for use by the callback receiver.
     * notificationFrames: The callback function is called each time notificationFrames PCM
     *                     frames have been consumed from track input buffer.
     *                     This is expressed in units of frames at the initial source sample rate.
     * sessionId:          Specific session ID, or zero to use default.
     * transferType:       How data is transferred to AudioTrack.
     * threadCanCallJava:  Not present in parameter list, and so is fixed at false.
     */

                        AudioTrack( audio_stream_type_t streamType,
                                    uint32_t sampleRate,
                                    audio_format_t format,
                                    audio_channel_mask_t channelMask,
                                    int frameCount       = 0,
                                    audio_output_flags_t flags = AUDIO_OUTPUT_FLAG_NONE,
                                    callback_t cbf       = NULL,
                                    void* user           = NULL,
                                    int notificationFrames = 0,
                                    int sessionId        = 0,
                                    transfer_type transferType = TRANSFER_DEFAULT,
                                    const audio_offload_info_t *offloadInfo = NULL,
                                    int uid = -1);
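
As a minimal sketch (assuming a stereo, 44.1 kHz, signed 16-bit PCM stream; audio_channel_out_mask_from_count() comes from <system/audio.h> and is only used here for illustration — the sample code below passes an FFmpeg channel layout instead), the first four parameters map to a construction like this:

sp<AudioTrack> track = new AudioTrack(
        AUDIO_STREAM_MUSIC,                        // streamType: media/music output
        44100,                                     // sampleRate in Hz
        AUDIO_FORMAT_PCM_16_BIT,                   // format: signed 16-bit PCM
        audio_channel_out_mask_from_count(2));     // channelMask: stereo layout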

The other functions are straightforward and there is not much to say about them:
1. Construct -> check with initCheck() -> start() -> setVolume(xxx)
2. Keep writing PCM frames with ssize_t write(const void* buffer, size_t size);
3. When playback ends, call stop() -> flush()
Let's go straight to the code.

Sample code:

1. Initialization:

bool AudioState::audio_play()
{
    if(audio_track == NULL) {                
        audio_track = new AudioTrack(AUDIO_STREAM_MUSIC, audio_ctx->sample_rate, \
                                     AUDIO_FORMAT_PCM_16_BIT, av_get_default_channel_layout(2));
        CHECK_EQ(audio_track->initCheck(), (status_t)OK);
        audio_track->start(); 
        audio_track->setVolume(1.0);
    }
    if(audio_track == NULL)
    {
        LOGE("AudioTrack fail\n");
        return false;
    }
    wait_video = false;
    pthread_create(&audio_tid, NULL, audio_thread, this);
    return true;
}

2. Decode/playback thread that writes the PCM data:

void* audio_thread(void *userdata)
{
    AudioState *audio_state = (AudioState *)userdata;
    LOGE("[%s]:[%d]\n",__FUNCTION__,__LINE__);
    while (!quit)
    {
        int audio_size = 0;
        memset(audio_state->audio_buff,0x00,sizeof(audio_state->audio_buff));
        audio_size = audio_decode_frame(audio_state, audio_state->audio_buff);
        if (audio_size < 0) // no data decoded or an error occurred: exit the playback loop
        {
            break;
        }
        else
        {
            audio_state->audio_track->write(audio_state->audio_buff, audio_size);
        }        
        usleep(5000);
    }
    return NULL;
}

3. Stopping:

AudioState::~AudioState()
{
    if(audio_buff)
        delete[] audio_buff;
    if(audio_ctx)
        avcodec_close(audio_ctx);
    if(audio_track.get() != NULL)
    {
        audio_track->stop();
        audio_track->flush();
        audio_track = NULL;
    }
}

Video: using SoftwareRenderer in the native layer

SoftwareRenderer is the class AwesomePlayer uses for rendering and display; internally it also calls the ANativeWindow API. We can feed YUV data into it directly, and the lower layers convert it to RGB for display. I will study its internals further in a later post; here I only cover how to use it.

First we need to create a Surface. In the native layer, a Surface can be obtained in two ways:

Constructing it in the native layer:

  1. Use SurfaceComposerClient to create a SurfaceControl; its getSurface() function then returns the Surface object.
    createSurface() mainly needs the width, height, and pixel format:
sp<SurfaceControl> SurfaceComposerClient::createSurface(
        const String8& name,
        uint32_t w,
        uint32_t h,
        PixelFormat format,
        uint32_t flags)
  2. Set the Surface's position and z-order, and show it:
	SurfaceComposerClient::openGlobalTransaction();
    CHECK_EQ(surfaceControl->setLayer(INT_MAX), (status_t)OK);
    CHECK_EQ(surfaceControl->show(), (status_t)OK);
    CHECK_EQ(surfaceControl->setPosition(0, 0), (status_t)OK);
    SurfaceComposerClient::closeGlobalTransaction();

Passing it in from the Java layer:

  1. Get the Surface from a SurfaceView:
	mSurfaceView1 = (SurfaceView)findViewById(R.id.surfaceView1);
	final SurfaceHolder sh1 = mSurfaceView1.getHolder();
	sh1.addCallback(new Callback() {
		@Override
		public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
			if (mPlayer1 != null) {
				Log.d(TAG, "mPlayer1 start");
				mPlayer1.setSurface((int)mSurfaceView1.getX(), (int)mSurfaceView1.getY(),
						mSurfaceView1.getWidth(), mSurfaceView1.getHeight(),
						sh1.getSurface());
				mPlayer1.startPlay();
			}
		}

		@Override
		public void surfaceCreated(SurfaceHolder holder) {}

		@Override
		public void surfaceDestroyed(SurfaceHolder holder) {}
	});
  2. Convert it in the native layer (see the sketch just below):
sp<Surface> surface(android_view_Surface_getSurface(env, jsurface));
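
A minimal sketch of the native side (the JNI binding nativeSetSurface() here is hypothetical; android_view_Surface_getSurface() is declared in <android_runtime/android_view_Surface.h> and provided by libandroid_runtime):

#include <jni.h>
#include <android_runtime/android_view_Surface.h>  // android_view_Surface_getSurface()
#include <gui/Surface.h>

using namespace android;

// Hypothetical JNI entry point matching the Java call setSurface(..., Surface surface).
static void nativeSetSurface(JNIEnv* env, jobject /*thiz*/, jobject jsurface)
{
    sp<Surface> surface(android_view_Surface_getSurface(env, jsurface));
    // hand the Surface to the player so it can later build the SoftwareRenderer
}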

Once we have the Surface, we can construct the SoftwareRenderer. Besides the Surface it needs a few more parameters: the width, height, and pixel (color) format.
Note that these must match the video data's width, height, and format (e.g. OMX_COLOR_FormatYUV420Planar corresponds to AV_PIX_FMT_YUV420P). The renderer converts the data to the Surface's pixel format by itself and scales it to the Surface's size.

sp<MetaData> meta = new MetaData;
meta->setInt32(kKeyWidth, w);
meta->setInt32(kKeyHeight,h);
meta->setInt32(kKeyColorFormat, OMX_COLOR_FormatYUV420Planar);
softRenderer = new SoftwareRenderer(surface, meta);

Finally, a frame can be rendered with SoftwareRenderer by calling its render() function, passing in the data and its size.
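
For reference, the call boils down to this (a sketch, assuming the three-argument render() of the AOSP version used here, where data points to a contiguous YUV420P frame and size is its byte length):

// data: contiguous YUV420P pixels, size: width * height * 3 / 2 bytes
softRenderer->render(data, size, NULL /* platformPrivate */);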

Let's look at the sample code.

Sample code:

1. Surface creation.

void VideoState::video_play(MediaState *media)
{
    int w = 1280;//video_ctx->width;
    int h = 720;//video_ctx->height;

    // create surface
    sp<SurfaceComposerClient> composerClient = new SurfaceComposerClient;
    CHECK_EQ(composerClient->initCheck(), (status_t)OK);

    surfaceControl = composerClient->createSurface(
                 String8("FSPlalyer_Surface"),
                 w,
                 h,
                 PIXEL_FORMAT_RGBA_8888,
                 0);
    CHECK(surfaceControl != NULL);
    CHECK(surfaceControl->isValid());

    SurfaceComposerClient::openGlobalTransaction();
    CHECK_EQ(surfaceControl->setLayer(INT_MAX), (status_t)OK);
    CHECK_EQ(surfaceControl->show(), (status_t)OK);
    CHECK_EQ(surfaceControl->setPosition(0, 0), (status_t)OK);
    SurfaceComposerClient::closeGlobalTransaction();

    surface = surfaceControl->getSurface();
    CHECK(surface != NULL);
    
    frame = av_frame_alloc();
    displayFrame = av_frame_alloc();
    displayFrame->format = AV_PIX_FMT_YUV420P;//video_ctx->pix_fmt;
    displayFrame->width = w;
    displayFrame->height = h;

    int numBytes = avpicture_get_size((AVPixelFormat)displayFrame->format, displayFrame->width, displayFrame->height);
    uint8_t *buffer = (uint8_t *)av_malloc(numBytes * sizeof(uint8_t));

    avpicture_fill((AVPicture *)displayFrame, buffer, (AVPixelFormat)displayFrame->format, displayFrame->width, displayFrame->height);

    pthread_create(&video_tid, NULL, vdecode_thread, this);
    usleep(40*1000);
    //display thread
    pthread_create(&video_display_tid, NULL, video_refresh_thread, media);  
}

2. Create the renderer according to the video frame's format and display one frame.
Two approaches are used here:
1) Use FFmpeg's pixel-format conversion to scale the frame up to 1280x720 and convert it to YUV420P. This serves two purposes: first, the renderer was created with OMX_COLOR_FormatYUV420Planar; second, the render() call performs 16-byte alignment internally, so forcing the width to 1280 through FFmpeg's conversion avoids corrupted YUV output for widths that are not aligned (for example a video 1000 pixels wide). After conversion the data can be passed in directly as vframe->data[0], because the width is already aligned and matches the linesize, so the renderer will not misplace the data.

2) That is somewhat wasteful, so the second approach copies the YUV planes out ourselves. The decoded frame must then be YUV420P, and the alignment has to be handled manually. Also note that with this approach the first argument passed to render() must be a void*, otherwise it may crash.

The sample code below demonstrates both approaches.

static int ALIGN(int x, int y) {
    // y must be a power of 2.
    return (x + y - 1) & ~(y - 1);
}

void display(VideoState *video)
{
    if(video == NULL)
        return;

    //LOGE("[%s]:[%d]\n",__FUNCTION__,__LINE__);
#if 0  
    // video convert: scale + pixel-format conversion via swscale
    video->sws_ctx = sws_getCachedContext(video->sws_ctx ,video->video_ctx->width, video->video_ctx->height, video->video_ctx->pix_fmt,
                                         video->displayFrame->width, video->displayFrame->height, (AVPixelFormat)video->displayFrame->format, SWS_POINT, NULL, NULL, NULL);
    
    sws_scale(video->sws_ctx , (uint8_t const * const *)video->frame->data, video->frame->linesize, 0,
              video->video_ctx->height, video->displayFrame->data, video->displayFrame->linesize);
    //LOGE("size1[%d] size2[%d] \n",video->frame->linesize[0],video->displayFrame->linesize[0]);

    AVFrame *vframe = video->displayFrame;
    if(video->softRenderer ==  NULL)
    {
        sp<MetaData> meta = new MetaData;
        meta->setInt32(kKeyWidth, vframe->width);
        meta->setInt32(kKeyHeight,vframe->height);
        meta->setInt32(kKeyColorFormat, OMX_COLOR_FormatYUV420Planar);
        video->softRenderer = new SoftwareRenderer(video->surface, meta);
    }
    
    if(video->softRenderer !=  NULL) 
    {
        video->softRenderer->render( vframe->data[0], vframe->linesize[0], NULL);
    }

#else 
    //just for YUV420P
    AVFrame *vframe = video->frame;
    int height = vframe->height;
    int width = vframe->width;
    if(width%16)
        width = ALIGN(width/2,16)*2;    
         
    size_t size = (width * height) * 3 / 2;
    void *data = video->displayFrame->data[0];//(void *)malloc(size);
    uint8_t *dst = (uint8_t *)data;
        
    int y_wrap = vframe->linesize[0];
    int u_wrap = vframe->linesize[1];
    int v_wrap = vframe->linesize[2];

    uint8_t *y_buf = (uint8_t *)vframe->data[0];
    uint8_t *u_buf = (uint8_t *)vframe->data[1];
    uint8_t *v_buf = (uint8_t *)vframe->data[2];

    int i = 0;
    //save y
    for (i = 0; i < height; i++){
        memcpy(dst,y_buf + i * y_wrap,width);
        dst += width;
    }
    //save u
    for (i = 0; i < height/2; i++){
        memcpy(dst,u_buf + i * u_wrap,width/2);
        dst += width/2;
    }
    //save v
    for (i = 0; i < height/2; i++){
        memcpy(dst,v_buf + i * v_wrap,width/2);
        dst += width/2;
    }

    if(video->softRenderer ==  NULL)
    {
        sp<MetaData> meta = new MetaData;
        meta->setInt32(kKeyWidth, width);
        meta->setInt32(kKeyHeight,height);
        meta->setInt32(kKeyColorFormat, OMX_COLOR_FormatYUV420Planar);
        video->softRenderer = new SoftwareRenderer(video->surface, meta);
    }
    
    if(video->softRenderer !=  NULL) 
    {
        video->softRenderer->render(data, size, NULL);
    }
#endif
    
    //LOGE("[%s]:[%d]\n",__FUNCTION__,__LINE__);
    return;
}

A side note: when scaling up with rendering speed as the priority, SWS_POINT is a good choice for FFmpeg's pixel-format conversion algorithm. My test device is fairly weak: with the other algorithms, displaying one frame (scale + render) takes 60-80 ms, which is unacceptable for a 25 fps video (one frame every 40 ms) and makes audio/video synchronization later on very difficult. With SWS_POINT the display time drops to about 30 ms on my device.
When scaling down, all of the algorithms are within an acceptable time range.
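
As an illustration (a sketch with hypothetical srcW/srcH/srcFmt and dstW/dstH variables), switching the scaling algorithm only changes the flags argument of sws_getCachedContext():

// nearest-neighbour scaling: lowest quality, but by far the fastest on a weak device
sws_ctx = sws_getCachedContext(sws_ctx, srcW, srcH, srcFmt,
                               dstW, dstH, AV_PIX_FMT_YUV420P,
                               SWS_POINT, NULL, NULL, NULL);
// higher-quality but slower alternatives: SWS_BILINEAR, SWS_BICUBIC, SWS_LANCZOS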
