Recording video on Android: several ways to record with the hardware-encoding APIs

Several recording approaches, and their basic usage, built on the system APIs

Foreword

When it comes to recording video, I have mentioned before that there are many ways to do it: jumping to the system camera page with an Intent, software encoding with tools such as FFmpeg, or hardware encoding — either through the CameraX wrappers, through a configured MediaRecorder, or by doing the encoding yourself with MediaCodec + MediaMuxer.

Since I already covered previewing with the three Camera APIs and a simple wrapper around them, this article briefly reviews the hardware-encoding options below, all of which are Android system APIs or wrappers built on top of them.

This article does not go into deep audio/video theory; we only need to understand the basic configuration required for video recording (after all, the system APIs are already well encapsulated):

  1. Frame rate: the number of images displayed per second, usually measured in FPS. The higher the frame rate, the smoother motion appears. Common frame rates are 24, 30, and 60.
  2. Resolution: the pixel dimensions of the video, usually given as width x height, such as 1920x1080 or 1280x720. Higher resolutions give a clearer picture.
  3. Bit rate: the number of bits transmitted per second, usually measured in Mbps. The bit rate determines how much data the video carries, which affects both quality and file size. A common rule of thumb is the resolution's width times height times 3 or 5; you can also simply set a fairly large value such as 3,500,000.
  4. Key frame (I-frame): video frames are generally divided into key frames (I-frames), predicted frames (P-frames), and bidirectionally predicted frames (B-frames). For our purposes it is enough to know that an I-frame carries a complete, high-quality image on its own, which is why covers and thumbnails are usually taken from I-frames. When recording we usually have to choose the I-frame interval, typically 1 second (an I-frame every second, larger file) or 2 seconds (P-frames or B-frames fill the gap between two I-frames, smaller file). These four values map directly onto a MediaFormat, as shown in the sketch after this list.
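
To make those terms concrete, here is a minimal sketch (my own, in Kotlin; the numbers are arbitrary example values, not values taken from this article) of where each of the four settings lands when configuring a MediaFormat for an H.264 encoder:

    import android.media.MediaFormat

    // Hypothetical example values; a real recorder should pick a size the device supports.
    val width = 1280                  // resolution width
    val height = 720                  // resolution height
    val format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, width, height).apply {
        setInteger(MediaFormat.KEY_FRAME_RATE, 30)                // frame rate in FPS
        setInteger(MediaFormat.KEY_BIT_RATE, width * height * 5)  // bit rate, width x height x 5 as suggested above
        setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1)           // key-frame (I-frame) interval in seconds
    }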

With that brief background we can start recording. So which Camera API should we take as the example for hardware-encoded recording?

In fact each Camera API has its own pros and cons, and the callback data differs: Camera1 calls back with NV21, while Camera2 and CameraX call back with YUV_420_888. With some conversion utilities we can turn either format into something MediaCodec can encode. If you just want a simple recording implementation, the VideoCapture use case of CameraX gets preview and recording done quickly.
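
For reference, converting NV21 to I420 is just a byte rearrangement; a minimal CPU-only sketch (my own helper, not the yuvUtils library used later in this article) looks like this:

    // NV21 (Camera1): Y plane followed by interleaved V/U pairs. I420: Y plane, then U plane, then V plane.
    fun nv21ToI420(nv21: ByteArray, width: Int, height: Int): ByteArray {
        val ySize = width * height
        val i420 = ByteArray(ySize * 3 / 2)
        // The Y plane is identical in both layouts
        System.arraycopy(nv21, 0, i420, 0, ySize)
        var uIndex = ySize              // U plane starts right after Y
        var vIndex = ySize + ySize / 4  // V plane starts after the U plane
        var i = ySize
        while (i + 1 < nv21.size) {
            i420[vIndex++] = nv21[i]     // V comes first in NV21
            i420[uIndex++] = nv21[i + 1] // then U
            i += 2
        }
        return i420
    }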

Camera2 is more complicated to use, needs more code, and device compatibility and feature support vary, but it can handle customized requirements: manual ISO, exposure, autofocus, white balance, multi-camera support, and more.

The recording implementations in this article are therefore based on Camera2 and its wrapper.

Now let's look at how each method is implemented in detail.

1. MediaRecorder recording

Using an Intent directly is very convenient, but some features have compatibility issues and support varies by system version and device, so it is only practical when you have no special requirements.

So in early development we recorded with the MediaRecorder the system provides. Audio and video recording can be completed purely through configuration options, which is very convenient.

 public void startCameraRecord() {

        mMediaRecorder = new MediaRecorder();
        mMediaRecorder.reset();
        if (mCamera != null) {
            mMediaRecorder.setCamera(mCamera);
        }
        mMediaRecorder.setOnErrorListener(new MediaRecorder.OnErrorListener() {
            @Override
            public void onError(MediaRecorder mr, int what, int extra) {
                if (mr != null) {
                    mr.reset();
                }
            }
        });

        mMediaRecorder.setPreviewDisplay(mSurfaceHolder.getSurface());
        mMediaRecorder.setVideoSource(MediaRecorder.VideoSource.CAMERA); // video source
        mMediaRecorder.setAudioSource(MediaRecorder.AudioSource.MIC);  // audio source
        mMediaRecorder.setOutputFormat(MediaRecorder.OutputFormat.MPEG_4);  // container format
        mMediaRecorder.setAudioEncoder(MediaRecorder.AudioEncoder.AMR_NB);  // audio encoder
        if (mBestPreviewSize != null) {
//            mMediaRecorder.setVideoSize(mBestPreviewSize.width, mBestPreviewSize.height);  // set the resolution
            mMediaRecorder.setVideoSize(640, 480);  // set the resolution
        }
//        mMediaRecorder.setVideoFrameRate(16); // frame rate
        mMediaRecorder.setVideoEncodingBitRate(1024 * 512);// set the bit rate
        mMediaRecorder.setOrientationHint(90);// rotate the output 90 degrees to keep the recording in portrait
        mMediaRecorder.setVideoEncoder(MediaRecorder.VideoEncoder.MPEG_4_SP);// video encoder
        mMediaRecorder.setOutputFile(mVecordFile.getAbsolutePath());

        try {
            mMediaRecorder.prepare();
            mMediaRecorder.start();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
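
The article only shows the start path; for completeness, a hedged sketch of the matching stop/release path might look like this (written in Kotlin rather than the Java above, and assuming the same mMediaRecorder and mCamera fields):

    fun stopCameraRecord() {
        try {
            mMediaRecorder?.stop()
        } catch (e: RuntimeException) {
            // stop() throws if no valid data was recorded; the output file is unusable in that case
            e.printStackTrace()
        } finally {
            mMediaRecorder?.reset()
            mMediaRecorder?.release()
            mMediaRecorder = null
            // MediaRecorder unlocks the camera while recording; re-lock it so preview can continue
            mCamera?.lock()
        }
    }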

It displays the camera image directly, cannot record special effects, cannot take a custom encoding source (only the raw camera frames), is not very flexible about bit rate and resolution, and you have to adapt to the resolutions the device actually supports, and so on.

The most unacceptable part is that on many models MediaRecorder starts with a system 'beep'. That beep is truly maddening.

No wonder the recording in the later CameraX does not use MediaRecorder. MediaRecorder itself is implemented on top of MediaCodec anyway, just wrapped up, so let's look at how to work with the lower-level MediaCodec directly.

2. MediaCodec + Camera: asynchronous video encoding

If we only record video, we do not need to worry about audio/video synchronization or deal with timestamps, so we can use MediaCodec's asynchronous callbacks for a simpler implementation.

For example, we can encode the camera's raw frames, converted to I420, directly into an H.264 elementary-stream file.

Taking Camera2 as the example, the callbacks were already set up in the wrapper from the previous article, so the wrapper code is not repeated here; if you are interested, check the previous article. The code actually used is posted directly below.

    //A synchronized queue, used from a worker thread, to hold the raw frames
    private val originVideoDataList = LinkedBlockingQueue<ByteArray>()

    //Flag marking whether we are currently recording
    private var isRecording: Boolean = false

    private lateinit var file: File
    private lateinit var outputStream: FileOutputStream

    fun setupCamera(activity: Activity, container: ViewGroup) {

        file = File(CommUtils.getContext().externalCacheDir, "${System.currentTimeMillis()}-record.h264")
        if (!file.exists()) {
            file.createNewFile()
        }
        if (!file.isDirectory) {
            outputStream = FileOutputStream(file, true)
        }

        val textureView = AspectTextureView(activity)
        textureView.layoutParams = ViewGroup.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT, ViewGroup.LayoutParams.MATCH_PARENT)

        mCamera2Provider = Camera2ImageReaderProvider(activity)
        mCamera2Provider?.initTexture(textureView)

        mCamera2Provider?.setCameraInfoListener(object :
            BaseCommonCameraProvider.OnCameraInfoListener {
            override fun getBestSize(outputSizes: Size?) {
                mPreviewSize = outputSizes
            }

            override fun onFrameCannback(image: Image) {
                if (isRecording) {

                    // Use the C library to get I420, matching COLOR_FormatYUV420Planar
                    val yuvFrame = yuvUtils.convertToI420(image)
                    // Rotate so the width/height match the MediaFormat configuration
                    val yuvFrameRotate = yuvUtils.rotate(yuvFrame, 90)

                    // For testing: convert to an RGB bitmap for the preview callback
                    bitmap = Bitmap.createBitmap(yuvFrameRotate.width, yuvFrameRotate.height, Bitmap.Config.ARGB_8888)
                    yuvUtils.yuv420ToArgb(yuvFrameRotate, bitmap!!)
                    mBitmapCallback?.invoke(bitmap)

                    // Add the rotated I420 frame to the synchronized queue
                    val bytesFromImageAsType = yuvFrameRotate.asArray()

                    originVideoDataList.offer(bytesFromImageAsType)
                }
            }

            override fun initEncode() {
                mediaCodecEncodeToH264()
            }

            override fun onSurfaceTextureAvailable(surfaceTexture: SurfaceTexture?, width: Int, height: Int) {
                [email protected] = surfaceTexture
            }
        })

        container.addView(textureView)
    }

We initialize the encoder when the camera is ready, and in each frame callback we push the frame into the synchronized queue, because encoding and the preview callback run on different threads. We can then drive the encoder through an asynchronous callback.

    /**
     * Prepare to encode the data into an H264 file
     */
    fun mediaCodecEncodeToH264() {

        if (mPreviewSize == null) return

        //We want portrait output; a real app should check the current screen orientation, but here it is simply hard-coded to portrait
        val videoWidth: Int
        val videoHeight: Int
        if (mPreviewSize!!.width > mPreviewSize!!.height) {
            videoWidth = mPreviewSize!!.height
            videoHeight = mPreviewSize!!.width
        } else {
            videoWidth = mPreviewSize!!.width
            videoHeight = mPreviewSize!!.height
        }
        YYLogUtils.w("MediaFormat encoding size, width: ${videoWidth} height: ${videoHeight}")

        //Configure the MediaFormat (H264/AVC)
        val videoMediaFormat = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, videoWidth, videoHeight)

        //Add the color format the encoder expects
        videoMediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar)
//        videoMediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Flexible)

        //Set the frame rate
        videoMediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30)

        //Set the bit rate
        videoMediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, mPreviewSize!!.width * mPreviewSize!!.height * 5)

        //Set the key-frame (I-frame) interval in seconds
        videoMediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1)

        //Create the encoder
        val videoMediaCodec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_AVC)

        //Here we use the asynchronous callback approach, which differs from pulling data with
        //dequeueInputBuffer/queueInputBuffer: one is asynchronous, the other synchronous.
        videoMediaCodec.setCallback(callback)
        videoMediaCodec.configure(videoMediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
        videoMediaCodec.start()
    }

    /**
     * The actual video-encoding MediaCodec callback
     */
    val callback = object : MediaCodec.Callback() {

        override fun onOutputFormatChanged(codec: MediaCodec, format: MediaFormat) {
        }

        override fun onError(codec: MediaCodec, e: MediaCodec.CodecException) {
            Log.e("error", e.message ?: "Error")
        }

        /**
         * The system has an output buffer available; write it to the file
         */
        override fun onOutputBufferAvailable(codec: MediaCodec, index: Int, info: MediaCodec.BufferInfo) {
            //Get the outputBuffer
            val outputBuffer = codec.getOutputBuffer(index)
            val byteArray = ByteArray(info.size)
            outputBuffer?.get(byteArray)

            when (info.flags) {
                MediaCodec.BUFFER_FLAG_CODEC_CONFIG -> {  //codec configuration data
                    // Create an empty ByteArray of size info.size; every element defaults to 0
                    configBytes = ByteArray(info.size)
                    // System.arraycopy copies arrays and takes 5 parameters:
                    // 1. source array 2. source start position 3. destination array 4. destination start position 5. number of elements to copy
                    // Here the whole configuration data is copied into configBytes for use later during encoding
                    System.arraycopy(byteArray, 0, configBytes, 0, info.size)
                }
                MediaCodec.BUFFER_FLAG_END_OF_STREAM -> {  //processing finished
                    //Once everything has been written, invoke the callback
                    endBlock?.invoke(file.absolutePath)
                }
                MediaCodec.BUFFER_FLAG_KEY_FRAME -> {  //contains a key frame (I-frame); decoders need these to decode the sequence correctly
                    // Create a temporary array sized info.size + the configuration data
                    val keyframe = ByteArray(info.size + configBytes!!.size)
                    // First copy in all of the configuration data
                    System.arraycopy(configBytes, 0, keyframe, 0, configBytes!!.size)
                    // Then copy in the frame data itself, starting right after the configuration data
                    System.arraycopy(byteArray, 0, keyframe, configBytes!!.size, byteArray.size)

                    //Once everything is assembled, write it to the file
                    outputStream.write(keyframe, 0, keyframe.size)
                    outputStream.flush()
                }
                else -> {  //all other frames
                    // Other frames do not need the key-frame configuration data and are written straight to the file
                    outputStream.write(byteArray)
                    outputStream.flush()
                }
            }

            //Release the buffer
            codec.releaseOutputBuffer(index, false)
        }

        /**
         * The system has an input buffer available; read data from the synchronized queue
         */
        override fun onInputBufferAvailable(codec: MediaCodec, index: Int) {
            val inputBuffer = codec.getInputBuffer(index)
            val yuvData = originVideoDataList.poll()

            //If we pulled out the custom end marker, finish first
            if (yuvData != null && yuvData.size == 1 && yuvData[0] == (-333).toByte()) {
                isEndTip = true
            }

            //Normal write
            if (yuvData != null && !isEndTip) {
                inputBuffer?.put(yuvData)
                codec.queueInputBuffer(
                    index, 0, yuvData.size,
                    surfaceTexture!!.timestamp / 1000,  //note: the SurfaceTexture timestamp is used here, not the system time
                    0
                )
            }

            //If there is no data, queue an empty buffer and keep waiting...
            if (yuvData == null && !isEndTip) {
                codec.queueInputBuffer(
                    index, 0, 0,
                    surfaceTexture!!.timestamp / 1000,  //note: the SurfaceTexture timestamp is used here, not the system time
                    0
                )
            }

            if (isEndTip) {
                codec.queueInputBuffer(
                    index, 0, 0,
                    surfaceTexture!!.timestamp / 1000,  //note: the SurfaceTexture timestamp is used here, not the system time
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM
                )

            }

        }

    }


The callback object here is the asynchronous callback, and the code is commented line by line.

Controls to start and end recording:

    /**
     * Stop recording
     */
    fun stopRecord(block: ((path: String) -> Unit)? = null) {
        endBlock = block

        //Queue the custom end marker
        originVideoDataList.offer(byteArrayOf((-333).toByte()))

        isRecording = false
    }


    /**
     * Start recording
     */
    fun startRecord() {
        isRecording = true
    }

Recorded effect:

h264-1.gif

3. MediaCodec + AudioRecord: asynchronous audio encoding

After using MediaCodec to record H.264, we can encode audio on its own in the same way. Let's take the common AAC format as the example.

The only difference is that the video source before came from the Camera2 callback, while here the audio source comes from AudioRecord.
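
The snippets below reference an AudioConfig holder that the article never shows; a plausible minimal version (my assumption, the actual values may differ) would be:

    import android.media.AudioFormat

    object AudioConfig {
        const val SAMPLE_RATE = 44100                              // sampling rate in Hz
        const val CHANNEL_CONFIG = AudioFormat.CHANNEL_IN_MONO     // mono input, matching the single-channel AAC format below
        const val AUDIO_FORMAT = AudioFormat.ENCODING_PCM_16BIT    // 16-bit PCM samples
    }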

    //A synchronized queue, used from a worker thread, to hold the captured data
    private var mAudioList: LinkedBlockingDeque<ByteArray>? = LinkedBlockingDeque()

    //Flag marking whether we are currently recording
    private var isRecording: Boolean = false

    // Flag marking whether the end marker has been seen on the input side
    private var isEndTip = false

    /**
     * Initialize audio capture
     */
    private fun initAudioRecorder() {
        //Compute the minimum buffer size with the method the system provides
        minBufferSize = AudioRecord.getMinBufferSize(
            AudioConfig.SAMPLE_RATE,
            AudioConfig.CHANNEL_CONFIG,
            AudioConfig.AUDIO_FORMAT
        )

        //Create the audio recorder object
        mAudioRecorder = AudioRecord(
            MediaRecorder.AudioSource.MIC,
            AudioConfig.SAMPLE_RATE,
            AudioConfig.CHANNEL_CONFIG,
            AudioConfig.AUDIO_FORMAT,
            minBufferSize
        )


        file = File(CommUtils.getContext().externalCacheDir, "${System.currentTimeMillis()}-record.aac")
        if (!file.exists()) {
            file.createNewFile()
        }
        if (!file.isDirectory) {
            outputStream = FileOutputStream(file, true)
            bufferedOutputStream = BufferedOutputStream(outputStream, 4096)
        }

        YYLogUtils.w("Final output file: " + file.absolutePath)
    }

    /**
     * Start audio recording
     */
    fun startAudioRecord() {

        //Start capturing on a separate thread
        thread(priority = android.os.Process.THREAD_PRIORITY_URGENT_AUDIO) {
            isRecording = true  //mark that we are recording

            try {
                //Check whether the AudioRecord initialized successfully
                if (AudioRecord.STATE_INITIALIZED == mAudioRecorder.state) {

                    mAudioRecorder.startRecording()  //start the audio recorder
                    val outputArray = ByteArray(minBufferSize)

                    while (isRecording) {

                        var readCode = mAudioRecorder.read(outputArray, 0, minBufferSize)

                        //readCode can also be one of several negative values, each indicating an error
                        if (readCode > 0) {
                            val realArray = ByteArray(readCode)
                            System.arraycopy(outputArray, 0, realArray, 0, readCode)
                            //Save the captured data into the synchronized queue
                            mAudioList?.offer(realArray)

                        } else {
                            Log.d("AudioRecoderUtils", "Something went wrong while reading the raw audio data")
                        }
                    }

                    //A custom end marker so the encoder can tell that recording has finished
                    val stopArray = byteArrayOf((-666).toByte(), (-999).toByte())
                    //Put the custom marker into the synchronized queue
                    mAudioList?.offer(stopArray)

                }
            } catch (e: IOException) {
                e.printStackTrace()

            } finally {
                //Release resources
                mAudioRecorder.release()
            }

        }

        //Start the encoding (for testing)
        thread {
            mediaCodecEncodeToAAC()
        }

    }


As with video encoding, data capture and encoding run on different threads, so a synchronized queue is still used to hold the data. We start audio capture on a worker thread and, at the same time, start another worker thread that kicks off the asynchronous-callback encoder.

    private fun mediaCodecEncodeToAAC() {

        try {

            //Create the audio MediaFormat
            val encodeFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, AudioConfig.SAMPLE_RATE, 1)

            //Configure the bit rate
            encodeFormat.setInteger(MediaFormat.KEY_BIT_RATE, 96000)
            encodeFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC)

            //Configure the maximum input size
            encodeFormat.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, minBufferSize * 2)

            //Initialize the encoder
            mAudioMediaCodec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC)
            //Here we use the asynchronous callback approach, which differs from pulling data with
            // dequeueInputBuffer/queueInputBuffer: one is asynchronous, the other synchronous.
            mAudioMediaCodec?.setCallback(callback)

            mAudioMediaCodec?.configure(encodeFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
            mAudioMediaCodec?.start()

        } catch (e: IOException) {
            Log.e("mistake", e.message ?: "Error")

        } finally {

        }

    }

    /**
     * The actual audio-encoding MediaCodec callback
     */
    val callback = object : MediaCodec.Callback() {

        val currentTime = Date().time * 1000  //measured in microseconds

        override fun onOutputFormatChanged(codec: MediaCodec, format: MediaFormat) {
        }

        override fun onError(codec: MediaCodec, e: MediaCodec.CodecException) {
            Log.e("error", e.message ?: "Error")
        }

        /**
         * The system has an output buffer available; write it to the file
         */
        override fun onOutputBufferAvailable(
            codec: MediaCodec,
            index: Int,
            info: MediaCodec.BufferInfo
        ) {

            //The size from the BufferInfo; this is the size of the encoded data
            val outBitsSize = info.size

            //Add a header to the AAC data; the header takes 7 bytes.
            //AAC comes in ADIF and ADTS flavors: ADIF has a single header followed by nothing but audio data,
            //while ADTS puts a header in front of every encoded block.
            //outPacketSize is the total size of the header plus the returned data
            val outPacketSize = outBitsSize + 7  // 7 is ADTS size

            //Get the buffer by index
            val outputBuffer = codec.getOutputBuffer(index)

            // Guard against a non-zero offset so we do not start reading from 0 by mistake
            // (in my tests the offset was always 0, but it may not be on some devices)
            outputBuffer?.position(info.offset)

            //Set the buffer's limit; if this is unfamiliar, look up ByteBuffer (NIO)
            //and what limit, position, clear() and flip() do
            outputBuffer?.limit(info.offset + outBitsSize)

            //Create a byte array to hold the combined data
            val outData = ByteArray(outPacketSize)

            //Add the header to the data (the method is shown further down); it just writes 7 bytes at the start
            addADTStoPacket(AudioConfig.SAMPLE_RATE, outData, outPacketSize)

            //Copy the buffer's data into the array after the header
            outputBuffer?.get(outData, 7, outBitsSize)

            outputBuffer?.position(info.offset)

            //Write the data to the file
            bufferedOutputStream.write(outData)
            bufferedOutputStream.flush()
            outputBuffer?.clear()

            //Release the output buffer back to the codec
            codec.releaseOutputBuffer(index, false)
        }

        /**
         * The system has an input buffer available; read data from the synchronized queue
         */
        override fun onInputBufferAvailable(codec: MediaCodec, index: Int) {

            //Get the buffer by index
            val inputBuffer = codec.getInputBuffer(index)

            //Take raw, not-yet-encoded audio data from the synchronized queue
            val pop = mAudioList?.poll()

            //Check whether we have reached the end of the audio data, based on the custom end marker
            if (pop != null && pop.size >= 2 && (pop[0] == (-666).toByte() && pop[1] == (-999).toByte())) {
                //It is the end marker
                isEndTip = true
            }

            //If the data is not null and not the end marker, put it into the buffer for MediaCodec to encode
            if (pop != null && !isEndTip) {

                //Fill in the data
                inputBuffer?.clear()
                inputBuffer?.limit(pop.size)
                inputBuffer?.put(pop, 0, pop.size)

                //Hand the buffer back to MediaCodec; this must always be done.
                //The fourth parameter is the timestamp: it must be monotonically increasing,
                //because the system uses it to compute the total audio duration and the frame spacing.
                codec.queueInputBuffer(
                    index,
                    0,
                    pop.size,
                    Date().time * 1000 - currentTime,
                    0
                )
            }

            // We cannot know which of the two threads runs first, so the encoding thread may start
            // before any data is queued and poll() returns null even though it is not the end of the data.
            // In that case we still have to call queueInputBuffer to hand the buffer back, just with size 0.

            // If we skipped queueInputBuffer whenever poll() returned null, after a few callbacks there
            // would be no free input buffers left and the MediaCodec task would stall, leaving only the
            // configuration data written.
            if (pop == null && !isEndTip) {

                codec.queueInputBuffer(
                    index,
                    0,
                    0,
                    Date().time * 1000 - currentTime,
                    0
                )
            }

            //When the end marker is found, queue an empty buffer flagged with
            //MediaCodec.BUFFER_FLAG_END_OF_STREAM
            //to tell the encoder that encoding is finished
            if (isEndTip) {
                codec.queueInputBuffer(
                    index,
                    0,
                    0,
                    Date().time * 1000 - currentTime,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM
                )

                endBlock?.invoke(file.absolutePath)
            }
        }

    }

The flow is the same, only the custom end marker differs, and because this is standalone audio we need to add an ADTS header to each packet so the file plays normally. You can simply copy an implementation from the Internet:

    private fun addADTStoPacket(sampleRateType: Int, packet: ByteArray, packetLen: Int) {
        val profile = 2 // AAC LC
        val chanCfg = 1 // CPE

        packet[0] = 0xFF.toByte()
        packet[1] = 0xF9.toByte()
        packet[2] = ((profile - 1 shl 6) + (sampleRateType shl 2) + (chanCfg shr 2)).toByte()
        packet[3] = ((chanCfg and 3 shl 6) + (packetLen shr 11)).toByte()
        packet[4] = (packetLen and 0x7FF shr 3).toByte()
        packet[5] = ((packetLen and 7 shl 5) + 0x1F).toByte()
        packet[6] = 0xFC.toByte()
    }

At this point we can record an audio file. Most of the code is boilerplate and only the configuration differs, so the overall shape is much the same. Every line of code is commented in as much detail as possible.
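
The article does not show the stop call for this audio path; a hedged sketch, mirroring the video version above, is simply to flip the flag so the capture loop exits and queues the custom end marker:

    fun stopAudioRecord(block: ((path: String) -> Unit)? = null) {
        endBlock = block
        // The capture thread sees isRecording == false, leaves its read loop,
        // and then offers the (-666, -999) end marker that the encoder callback checks for.
        isRecording = false
    }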

The recording effect is as follows:

image.png

Can't hear it from a screenshot? Nothing to be done about that; run the code yourself to listen.

4. MediaCodec + MediaMuxer: synchronous audio/video encoding and muxing

Recording audio or video on its own is easy to handle with the callback approach. What about recording audio and video together into an MP4?

Easy, right? Record a video callback, record an audio callback, then just combine the two!

That is the idea, but it is not implemented that way. Audio and video encode at different speeds, so the picture and sound drift out of sync. To keep them synchronized you have to use a shared time base and presentation timestamps (PTS) that associate the encoded audio and video frames.

Here it is generally more convenient to encode synchronously: one thread encodes the audio stream, one thread encodes the video stream, and a third thread runs the muxer, MediaMuxer. Each completes its own task and the result is a single MP4 file.

The general steps are as follows:

  1. Create and configure a MediaCodec encoder each for audio and video.
  2. Feed the raw audio data to the audio MediaCodec and collect the encoded audio frames.
  3. Feed the raw video data to the video MediaCodec and collect the encoded video frames.
  4. Use the timestamps / presentation timestamps of the audio and video frames to keep them associated.
  5. Write the encoded audio and video frames into shared output buffers.
  6. Use MediaMuxer to mux the audio and video data from the shared buffers into the final container (such as MP4).
  7. When encoding and muxing are finished, release the resources.

Video encoding follows the same logic as before: the data source is obtained from Camera2 and then handed to the encoder.

    private fun initVideoFormat() {
        //We want portrait output; a real app should check the current screen orientation, but here it is simply hard-coded to portrait
        val videoWidth: Int
        val videoHeight: Int
        if (mPreviewSize!!.width > mPreviewSize!!.height) {
            videoWidth = mPreviewSize!!.height
            videoHeight = mPreviewSize!!.width
        } else {
            videoWidth = mPreviewSize!!.width
            videoHeight = mPreviewSize!!.height
        }
        YYLogUtils.w("MediaFormat encoding size, width: ${videoWidth} height: ${videoHeight}")

        //Configure the MediaFormat (H264/AVC)
        val videoMediaFormat = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, videoWidth, videoHeight)
        //Add the color format the encoder expects
        videoMediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Planar)
        //Set the frame rate
        videoMediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 30)
        //Set the bit rate
        videoMediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, mPreviewSize!!.width * mPreviewSize!!.height * 5)
        //Set the key-frame (I-frame) interval in seconds
        videoMediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 2)

        videoCodec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_AVC)
        videoCodec!!.configure(videoMediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
        videoCodec!!.start()
    }

    /**
     * The video-stream encoding thread
     */
    inner class VideoEncodeThread : Thread() {

        //Camera frames arrive on a different thread from the encoder, so a synchronized queue is still needed
        private val videoData = LinkedBlockingQueue<ByteArray>()

        // Called from the camera callback to add raw data to encode; it should be I420 (YUV420) here
        fun addVideoData(byteArray: ByteArray?) {
            videoData.offer(byteArray)
        }

        override fun run() {
            super.run()

            initVideoFormat()

            while (!videoExit) {
                // Take an I420 frame from the synchronized queue and encode it straight to H264
                val poll = videoData.poll()
                if (poll != null) {
                    encodeVideo(poll, false)
                }
            }

            //Send the end-of-stream marker
            encodeVideo(ByteArray(0), true)

            // Release the video encoder once encoding is finished
            videoCodec?.release()
        }
    }

    //Hardware-encode the video to H264 with the synchronous Codec API:
    // dequeueInputBuffer to get an index, queueInputBuffer to submit the data for encoding
    private fun encodeVideo(data: ByteArray, isFinish: Boolean) {

        val videoInputBuffers = videoCodec!!.inputBuffers
        var videoOutputBuffers = videoCodec!!.outputBuffers

        val index = videoCodec!!.dequeueInputBuffer(TIME_OUT_US)

        if (index >= 0) {

            val byteBuffer = videoInputBuffers[index]
            byteBuffer.clear()
            byteBuffer.put(data)

            if (!isFinish) {
                videoCodec!!.queueInputBuffer(index, 0, data.size, System.nanoTime() / 1000, 0)
            } else {
                videoCodec!!.queueInputBuffer(
                    index,
                    0,
                    0,
                    System.nanoTime() / 1000,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM
                )

            }
            val bufferInfo = MediaCodec.BufferInfo()
            Log.i("camera2", "编码video  $index 写入buffer ${data.size}")

            var dequeueIndex = videoCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)

            if (dequeueIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                if (MuxThread.videoMediaFormat == null)
                    MuxThread.videoMediaFormat = videoCodec!!.outputFormat
            }

            if (dequeueIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                videoOutputBuffers = videoCodec!!.outputBuffers
            }

            while (dequeueIndex >= 0) {
                val outputBuffer = videoOutputBuffers[dequeueIndex]
                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_CODEC_CONFIG != 0) {
                    bufferInfo.size = 0
                }
                if (bufferInfo.size != 0) {
                    muxerThread?.addVideoData(outputBuffer, bufferInfo)
                }

                Log.i(
                    "camera2",
                    "编码后video $dequeueIndex buffer.size ${bufferInfo.size} buff.position ${outputBuffer.position()}"
                )

                videoCodec!!.releaseOutputBuffer(dequeueIndex, false)

                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM != 0) {
                    dequeueIndex = videoCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)
                } else {
                    break
                }
            }
        }
    }

For audio we do not need a synchronized queue at all; we use synchronous encoding and handle everything directly on one thread.

    inner class AudioEncodeThread : Thread() {

        //The audio is captured and encoded synchronously on this one thread, so no extra synchronized queue is needed
//        private val audioData = LinkedBlockingQueue<ByteArray>()

        override fun run() {
            super.run()
            prepareAudioRecord()
        }
    }

    /**
     * Prepare and initialize the AudioRecord
     */
    private fun prepareAudioRecord() {
        initAudioFormat()

        // Initialize the audio recorder
        audioRecorder = AudioRecord(
            MediaRecorder.AudioSource.MIC, AudioConfig.SAMPLE_RATE,
            AudioConfig.CHANNEL_CONFIG, AudioConfig.AUDIO_FORMAT, minSize
        )

        if (audioRecorder!!.state == AudioRecord.STATE_INITIALIZED) {

            audioRecorder?.run {
                //Start the audio recorder
                startRecording()

                //Read the data captured by the recorder
                val byteArray = ByteArray(SAMPLES_PER_FRAME)
                var read = read(byteArray, 0, SAMPLES_PER_FRAME)

                //While still recording and getting valid data
                while (read > 0 && isRecording) {
                    //Hand the raw audio data to the AAC encoder
                    encodeAudio(byteArray, read, getPTSUs())
                    //Keep reading raw audio data in a loop
                    read = read(byteArray, 0, SAMPLES_PER_FRAME)
                }

                // Release the recorder once recording is finished
                audioRecorder?.release()

                //Send the EOS end-of-encoding marker
                encodeAudio(ByteArray(0), 0, getPTSUs())

                // Release the audio encoder once encoding is finished
                audioCodec?.release()
            }
        }
    }

    //Hardware-encode the audio to AAC with the synchronous Codec API:
    // dequeueInputBuffer to get an index, queueInputBuffer to submit the data for encoding
    private fun encodeAudio(audioArray: ByteArray?, read: Int, timeStamp: Long) {
        val index = audioCodec!!.dequeueInputBuffer(TIME_OUT_US)
        val audioInputBuffers = audioCodec!!.inputBuffers

        if (index >= 0) {
            val byteBuffer = audioInputBuffers[index]
            byteBuffer.clear()
            byteBuffer.put(audioArray, 0, read)
            if (read != 0) {
                audioCodec!!.queueInputBuffer(index, 0, read, timeStamp, 0)
            } else {
                audioCodec!!.queueInputBuffer(
                    index,
                    0,
                    read,
                    timeStamp,
                    MediaCodec.BUFFER_FLAG_END_OF_STREAM
                )

            }

            val bufferInfo = MediaCodec.BufferInfo()
            Log.i("camera2", "编码audio  $index 写入buffer ${audioArray?.size}")
            var dequeueIndex = audioCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)
            if (dequeueIndex == MediaCodec.INFO_OUTPUT_FORMAT_CHANGED) {
                if (MuxThread.audioMediaFormat == null) {
                    MuxThread.audioMediaFormat = audioCodec!!.outputFormat
                }
            }
            var audioOutputBuffers = audioCodec!!.outputBuffers
            if (dequeueIndex == MediaCodec.INFO_OUTPUT_BUFFERS_CHANGED) {
                audioOutputBuffers = audioCodec!!.outputBuffers
            }
            while (dequeueIndex >= 0) {
                val outputBuffer = audioOutputBuffers[dequeueIndex]
                Log.i(
                    "camera2",
                    "编码后audio $dequeueIndex buffer.size ${bufferInfo.size} buff.position ${outputBuffer.position()}"
                )
                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_CODEC_CONFIG != 0) {
                    bufferInfo.size = 0
                }
                if (bufferInfo.size != 0) {
//                    Log.i("camera2", "音频时间戳  ${bufferInfo.presentationTimeUs / 1000}")

//                    bufferInfo.presentationTimeUs = getPTSUs()

                    val byteArray = ByteArray(bufferInfo.size + 7)
                    outputBuffer.get(byteArray, 7, bufferInfo.size)
                    addADTStoPacket(0x04, byteArray, bufferInfo.size + 7)
                    outputBuffer.clear()
                    val headBuffer = ByteBuffer.allocate(byteArray.size)
                    headBuffer.put(byteArray)
                    muxerThread?.addAudioData(outputBuffer, bufferInfo)  //直接加入到封装线程了

//                    prevOutputPTSUs = bufferInfo.presentationTimeUs

                }

                audioCodec!!.releaseOutputBuffer(dequeueIndex, false)
                if (bufferInfo.flags and MediaCodec.BUFFER_FLAG_END_OF_STREAM != 0) {
                    dequeueIndex = audioCodec!!.dequeueOutputBuffer(bufferInfo, TIME_OUT_US)
                } else {
                    break
                }
            }
        }

    }

    private fun initAudioFormat() {
        audioMediaFormat = MediaFormat.createAudioFormat(MediaFormat.MIMETYPE_AUDIO_AAC, AudioConfig.SAMPLE_RATE, 1)
        audioMediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, 96000)
        audioMediaFormat.setInteger(MediaFormat.KEY_AAC_PROFILE, MediaCodecInfo.CodecProfileLevel.AACObjectLC)
        audioMediaFormat.setInteger(MediaFormat.KEY_MAX_INPUT_SIZE, minSize * 2)

        audioCodec = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_AUDIO_AAC)
        audioCodec!!.configure(audioMediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
        audioCodec!!.start()
    }

    private fun getPTSUs(): Long {

        var result = System.nanoTime() / 1000L

        if (result < prevOutputPTSUs)
            result += prevOutputPTSUs - result
        return result
    }

    /**
     * Add the ADTS header
     */
    private fun addADTStoPacket(sampleRateType: Int, packet: ByteArray, packetLen: Int) {
        val profile = 2 // AAC LC
        val chanCfg = 1 // CPE

        packet[0] = 0xFF.toByte()
        packet[1] = 0xF9.toByte()
        packet[2] = ((profile - 1 shl 6) + (sampleRateType shl 2) + (chanCfg shr 2)).toByte()
        packet[3] = ((chanCfg and 3 shl 6) + (packetLen shr 11)).toByte()
        packet[4] = (packetLen and 0x7FF shr 3).toByte()
        packet[5] = ((packetLen and 7 shl 5) + 0x1F).toByte()
        packet[6] = 0xFC.toByte()
    }

Whenever a video or audio frame finishes encoding, we add the encoded data to the MediaMuxer thread's audio or video buffer.
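
The MuxThread below stores samples as EncodeData objects, a small wrapper class the article never defines; a plausible minimal version (my assumption) is just the encoded buffer plus its BufferInfo:

    class EncodeData(val buffer: ByteBuffer, val bufferInfo: MediaCodec.BufferInfo)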

  class MuxThread(val file: File) : Thread() {

        //Audio sample buffer
        private val audioData = LinkedBlockingQueue<EncodeData>()
        //Video sample buffer
        private val videoData = LinkedBlockingQueue<EncodeData>()

        companion object {
            var muxIsReady = false
            var videoMediaFormat: MediaFormat? = null
            var audioMediaFormat: MediaFormat? = null
            var muxExit = false
        }

        private lateinit var mediaMuxer: MediaMuxer

        /**
         * The audio thread and its resources must be initialized first; then encoded samples are added to this muxer
         */
        fun addAudioData(byteBuffer: ByteBuffer, bufferInfo: MediaCodec.BufferInfo) {
            audioData.offer(EncodeData(byteBuffer, bufferInfo))
        }

        /**
         * The video thread and its resources must be initialized first; then encoded samples are added to this muxer
         */
        fun addVideoData(byteBuffer: ByteBuffer, bufferInfo: MediaCodec.BufferInfo) {
            videoData.offer(EncodeData(byteBuffer, bufferInfo))
        }


        private fun initMuxer() {

            mediaMuxer = MediaMuxer(file.path, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4)

            videoAddTrack = mediaMuxer.addTrack(videoMediaFormat!!)
            audioAddTrack = mediaMuxer.addTrack(audioMediaFormat!!)

            mediaMuxer.start()
            muxIsReady = true

        }

        private fun muxerParamtersIsReady() = audioMediaFormat != null && videoMediaFormat != null

        override fun run() {
            super.run()

            //Busy-wait until both the audio and the video MediaFormat are available
            while (!muxerParamtersIsReady()) {
            }

            initMuxer()

            while (!muxExit) {
                if (audioAddTrack != -1) {
                    if (audioData.isNotEmpty()) {
                        val poll = audioData.poll()
                        Log.i("camera2", "混合写入audio音频 ${poll.bufferInfo.size} ")
                        mediaMuxer.writeSampleData(audioAddTrack, poll.buffer, poll.bufferInfo)

                    }
                }
                if (videoAddTrack != -1) {
                    if (videoData.isNotEmpty()) {
                        val poll = videoData.poll()
                        Log.i("camera2", "混合写入video视频 ${poll.bufferInfo.size} ")
                        mediaMuxer.writeSampleData(videoAddTrack, poll.buffer, poll.bufferInfo)

                    }
                }
            }

            mediaMuxer.stop()
            mediaMuxer.release()

            Log.i("camera2", "合成器释放")
            Log.i("camera2", "未写入audio音频 ${audioData.size}")
            Log.i("camera2", "未写入video视频 ${videoData.size}")

        }
    }

When recording starts we start these threads, which then encode the audio and the video data respectively. Once each frame is encoded and tagged with a presentation timestamp, it is handed to the MuxThread, which holds the final audio and video samples internally, loops over its queues checking for data, and muxes them into the MP4 file. When recording stops, flag variables are set, encoding stops, and the file is finalized. A hedged sketch of that start/stop wiring follows.
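
This wiring is my own assumption about the surrounding class: muxerThread, videoThread, and audioThread are hypothetical fields, while the flags (isRecording, videoExit, MuxThread.muxExit) come from the snippets above.

    fun startMp4Record(outFile: File) {
        isRecording = true
        videoExit = false
        MuxThread.muxExit = false

        muxerThread = MuxThread(outFile).apply { start() }   // consumes encoded samples and writes the MP4
        videoThread = VideoEncodeThread().apply { start() }  // encodes the Camera2 I420 frames
        audioThread = AudioEncodeThread().apply { start() }  // records and encodes the microphone PCM
    }

    fun stopMp4Record() {
        isRecording = false        // ends the AudioRecord read loop
        videoExit = true           // ends the video encoding loop, which then sends EOS
        MuxThread.muxExit = true   // ends the muxer loop, after which stop()/release() run
    }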

Effect:

image.png

[Note] This is demo-level code intended only to show how the APIs are used; do not use it in a real project. It has many internal bugs and compatibility issues. For example, stopping the recording stops everything immediately, so a 10-second recording may produce only an 8-second video, because the buffered data is not drained on stop and resource release is not handled properly. If you want to write MediaCodec + MediaMuxer yourself, I recommend studying the source of the VideoCapture use case discussed below.

5. CameraX's built-in video recording

Given that the synchronous hand-written implementation above has one problem or another, what do we do if we want out-of-the-box video recording? In fact, CameraX's recording is enough: if you do not need special effects, CameraX's VideoCapture fully covers the requirement.

If we followed the earlier approach and encoded in the callback ourselves, we would define an analyzer callback, get the Image object, and then write the MediaCodec + MediaMuxer code ourselves — no different from the usage above, and recording would work the same way.

      ImageAnalysis imageAnalysis = new ImageAnalysis.Builder()
            .setTargetAspectRatio(screenAspectRatio)
            .setTargetRotation(rotation)
            .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
            .build();

        // Run the analyzer on every frame
        imageAnalysis.setAnalyzer(Executors.newSingleThreadExecutor(), new MyAnalyzer(mContext));

    private class MyAnalyzer implements ImageAnalysis.Analyzer {

        private YuvUtils yuvUtils = new YuvUtils();

        public MyAnalyzer(Context context) {

        }

        @Override
        public void analyze(@NonNull ImageProxy image) {

            // Use the C library to get I420, matching COLOR_FormatYUV420Planar
            YuvFrame yuvFrame = yuvUtils.convertToI420(image.getImage());
            // Rotate so the width/height match the MediaFormat configuration
            yuvFrame = yuvUtils.rotate(yuvFrame, 90);

            // Add the rotated I420 frame to the synchronized queue
            videoThread.addVideoData(yuvFrame.asArray());

        }
    }

    // ...then start the AudioRecord capture, encoding, and the rest of the logic as before


But for this whole series of encoding steps, CameraX already provides the VideoCapture use case for recording video. It wraps the MediaCodec + MediaMuxer logic internally, and using it is very easy:

    //The video-recording use case
    mVideoCapture = VideoCapture.Builder()
        .setTargetAspectRatio(screenAspectRatio)
        .setAudioRecordSource(MediaRecorder.AudioSource.MIC) //audio source: the microphone
        //video frame rate
        .setVideoFrameRate(30)
        //bit rate
        .setBitRate(3 * 1024 * 1024)
        .build()

    // Start recording
    fun startCameraRecord(outFile: File) {
        mVideoCapture ?: return

        val outputFileOptions: VideoCapture.OutputFileOptions = VideoCapture.OutputFileOptions.Builder(outFile).build()

        mVideoCapture!!.startRecording(outputFileOptions, mExecutorService, object : VideoCapture.OnVideoSavedCallback {
            override fun onVideoSaved(outputFileResults: VideoCapture.OutputFileResults) {
                YYLogUtils.w("视频保存成功,outputFileResults:" + outputFileResults.savedUri)
                mCameraCallback?.takeSuccess()
            }

            override fun onError(videoCaptureError: Int, message: String, cause: Throwable?) {
                YYLogUtils.e(message)
            }
        })
    }


Once configured, we can use this use case to start recording. Internally it works differently from our earlier approach: instead of encoding I420 or NV21 buffers, it encodes directly through a Surface. The key code looks like this:

 ...

 format.setInteger(MediaFormat.KEY_COLOR_FORMAT, CodecCapabilities.COLOR_FormatSurface);

 // Bind to the Surface

  Surface cameraSurface = mVideoEncoder.createInputSurface();
        mCameraSurface = cameraSurface;

        SessionConfig.Builder sessionConfigBuilder = SessionConfig.Builder.createFrom(config);

        if (mDeferrableSurface != null) {
            mDeferrableSurface.close();
        }
        mDeferrableSurface = new ImmediateSurface(mCameraSurface);
        mDeferrableSurface.getTerminationFuture().addListener(
                cameraSurface::release, CameraXExecutors.mainThreadExecutor()
        );
   sessionConfigBuilder.addSurface(mDeferrableSurface);


It creates an input Surface up front and associates it with the video encoder, then adds mDeferrableSurface to the session configuration with sessionConfigBuilder.addSurface(), so the camera feeds its frames into that Surface. In this way the camera data flows straight into the video encoder through the Surface for encoding.
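
To connect this back to our own code: a minimal sketch (mine, not the CameraX source) of the Surface-input pattern is to configure the encoder with COLOR_FormatSurface, create its input Surface, and hand that Surface to the camera as an output target instead of pushing YUV buffers yourself:

    val format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, 1280, 720).apply {
        setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface)
        setInteger(MediaFormat.KEY_FRAME_RATE, 30)
        setInteger(MediaFormat.KEY_BIT_RATE, 3 * 1024 * 1024)
        setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1)
    }
    val encoder = MediaCodec.createEncoderByType(MediaFormat.MIMETYPE_VIDEO_AVC)
    encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE)
    // createInputSurface() must be called after configure() and before start()
    val inputSurface = encoder.createInputSurface()
    encoder.start()
    // With Camera2, inputSurface can then be added as a target of the capture session/request,
    // e.g. captureRequestBuilder.addTarget(inputSurface), so frames flow straight into the encoder.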

The source is in androidx.camera.core.VideoCapture, and Google wrote it very well. I personally prefer this approach: using COLOR_FormatSurface gives better compatibility and robustness.

Summary

This article has walked through the hardware-encoding options Android itself provides. They all boil down to MediaCodec plus tools wrapped around it; both MediaRecorder and VideoCapture are implemented on top of it.

There is quite a lot of code in this article, but through it we can get a feel for encoding from I420, NV21, and Surface data sources, for what effect the different MediaCodec configurations have, for how the encoding approaches differ in use, for what separates asynchronous callbacks from synchronous processing, and for how to use the MediaMuxer container writer.

Comparing the encoding paths for the different data sources, I personally prefer the Surface approach (a personal preference): it has better compatibility and is easier to extend later, and things like previewing and recording with special effects become much more convenient.

Note that the various MediaCodec implementations in this article take different approaches, but they are all just demonstrations of the APIs; I have not had time to harden them and their robustness is poor. Use them for reference and learning rather than directly. What I really recommend is the system's VideoCapture, which is completely sufficient for simple recording needs. (If you want something usable directly, see the next article.)

Since VideoCapture in CameraX is so good, can its recording approach also be made to work with Camera1 or Camera2?

Finally

If you want to become an architect, or break past the 20-30K salary range, do not limit yourself to coding and business logic: you need to be able to select technologies, design for scale, and improve your programming thinking. A good career plan and the habit of learning also matter a great deal, but the most important thing is to persevere; any plan that is not carried out consistently is empty talk.

If you lack direction, I would like to share a set of "Advanced Notes on the Eight Major Modules of Android" written by a senior architect at Alibaba, which organizes messy, scattered, fragmented knowledge into a system so you can master the various areas of Android development systematically and efficiently.
Compared with the fragmented content we usually read, these notes are more systematic, easier to understand and remember, and strictly organized by knowledge area.

Full set of video materials:

1. Interview collection

2. Source code analysis collection

3. Open source framework collection
Everyone is welcome to like, favorite, and share. If you need the materials mentioned in the article, you can get them for free via the CSDN-verified WeChat card at the end of the original post.

Origin blog.csdn.net/Eqiqi/article/details/131987358