Wang Xuegang Video Coding: Basics of Video Coding and the MediaCodec Codec (corresponding to Lessons 1-4)

Why study audio and video?

It is a core competency: high-end talent is scarce and the technology iterates slowly.

Why is audio and video hard to learn?

There is relatively little material, the hardest part of audio and video is the encoding, and no complete learning system has formed yet.

About audio and video encoding

Part 1

1. Video files: MP4, RMVB, AVI, FLV
2. How learning audio and video now differs from before:
before: playing local files;
now: playing network streams (a video stream and an audio stream).
3. RMVB, MP4 and so on are encapsulation (container) formats: a container holds an audio stream and a video stream.
(image)
4. What we transmit over the network is not the RMVB or MP4 container, but the audio stream and the video stream themselves.
5. The essence of encoding is compression. H264 is one video encoding format; different compression methods produce the various video coding formats. Other video formats are shown below.
(image)
The audio formats are as follows:
(image)

6. The raw video data captured from the camera is called YUV; the raw audio data captured from the microphone is called PCM.
Packing the audio stream and the video stream into one file can be done in different ways, which is why there are different container formats.
(image)
7. Two standards organizations: ITU-T and ISO
(image)

Developed by ITU-T: H261, H262, H263
Developed by ISO: MPEG-1, MPEG-2
Jointly developed: H264/MPEG-4 AVC and H265/HEVC. In each pair the former name comes from ITU-T and the latter from ISO.
Android supports H265.
Developed by Google: VP8, VP9, mainly used for video calls.
The ancestor of video coding: H261 (block-based coding).
8. MediaCodec is the codec API on Android.
9. A history of video coding:
(image)

10. Why was H261 so effective? Because it uses block-based hybrid coding.
Suppose we have one frame that is 200 by 100 pixels, 200 x 100 = 20,000 pixels in total. Saved to a file uncompressed, at 4 bytes per pixel, it needs 20,000 x 4 = 80,000 bytes. Now suppose we know the image is a gradient. We can store it differently: save the width and height, 200 and 100 (two ints), the start point and the end point (two ints), and the start color and the end color (two ints). That way we no longer need to store 20,000 pixels.
If we zoom in far enough, we find that within a small enough region almost any picture can be treated as a gradient.
Video encoding is therefore definitely lossy.
Movie theater video is lossless; two hours of it takes thousands of gigabytes.
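As a toy illustration of the arithmetic above (this follows the lesson's simplified six-int gradient model, it is not the actual H261 algorithm):

public class GradientSizeDemo {
    public static void main(String[] args) {
        // Raw storage: every pixel kept as 4 bytes.
        int width = 200, height = 100;
        int rawBytes = width * height * 4;      // 20,000 pixels * 4 = 80,000 bytes
        // The lesson's simplified gradient description:
        // width, height, start point, end point, start color, end color = 6 ints.
        int gradientBytes = 6 * 4;              // 24 bytes
        System.out.println(rawBytes + " bytes raw vs " + gradientBytes + " bytes as a gradient");
    }
}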
11. H264 image compression:
(image)
(image)
12. Using ffmpeg

Extract audio
ffmpeg -i input.mp4 -acodec copy -vn output.aac

Extract video
ffmpeg -i input.mp4 -c:v copy -bsf:v h264_mp4toannexb -an out.h264

Play raw YUV video
ffplay -f rawvideo -video_size 368x384 codec.h265

Extract PCM audio
ffmpeg -i input.mp4 -codec:a pcm_f32le -ar 44100 -ac 2 -f f32le output.pcm

Play a live stream
ffplay -i rtmp://58.200.131.2:1935/livetv/cctv1

Play H265 video
ffplay -stats -f hevc codec.h265

Play H264 video
ffplay -stats -f h264 codec.h264

Extract MP3 audio
ffmpeg -i input.mp4 -f mp3 -vn apple.mp3

Play PCM audio
ffplay -ar 48000 -channels 2 -f f32le -i output.pcm

1. Reverse the video, drop the audio
ffmpeg.exe -i input.mp4 -filter_complex [0:v]reverse[v] -map [v] -preset superfast reversed.mp4

2. Reverse the video, keep the audio unchanged
ffmpeg.exe -i input.mp4 -vf reverse reversed.mp4

3. Reverse the audio, keep the video unchanged
ffmpeg.exe -i input.mp4 -map 0 -c:v copy -af "areverse" reversed_audio.mp4

4. Reverse both audio and video
ffmpeg.exe -i input.mp4 -vf reverse -af areverse -preset superfast reversed.mp4

PDF to Word
https://app.xunjiepdf.com/pdf2word/

Cut a video into segments
ffmpeg -i ./input.mp4 -vcodec copy -acodec copy -ss 00:00:00 -to 00:05:00 ./cutout1.mp4 -y
ffmpeg -i ./input.mp4 -vcodec copy -acodec copy -ss 00:05:00 -to 00:10:00 ./cutout2.mp4 -y
ffmpeg -i ./input.mp4 -vcodec copy -acodec copy -ss 00:10:00 -to 00:14:50 ./cutout3.mp4 -y

opengl+rtmp

12. What is H264?
(image)

13. Why does video coding use YUV instead of RGB?
(image)
Y alone, without U and V, is just luminance - a black-and-white picture, which is all a black-and-white TV needed.
(image)
Four Y samples share one U and one V, yet macroscopically (to the human eye) the result looks the same as the full-resolution picture on the left.
RGB always needs three channels; a grayscale picture needs only the single Y channel, and because the eye is less sensitive to chroma, U and V can be heavily subsampled.
The width and height of a YUV image are determined by the Y plane, not by U or V.
Size of one frame with this 4:1:1 Y-to-U-to-V ratio: w*h + 1/4*w*h + 1/4*w*h = 3/2*w*h bytes. A quick comparison against RGB is sketched below.
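A quick size comparison for one frame (the 1920x1080 resolution is just an example I picked, not a number from the lesson):

public class YuvSizeDemo {
    public static void main(String[] args) {
        int w = 1920, h = 1080;                       // example frame size
        int rgba = w * h * 4;                         // 4 bytes per pixel -> 8,294,400 bytes
        int yuv420 = w * h + w * h / 4 + w * h / 4;   // Y + U + V = 3/2 * w * h -> 3,110,400 bytes
        System.out.println("RGBA: " + rgba + " bytes, YUV 4:2:0: " + yuv420 + " bytes");
    }
}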

14. YUV formats
(image)
(image)

Apple and most others use NV12; Android is the odd one out and uses NV21, so audio and video development on Android needs a conversion step.

Part 2

1. The H264 encoder:
(image)
CIF/QCIF here refer to the format (resolution) of one input frame.
2. Compression means reducing redundancy, both intra-frame redundancy and inter-frame redundancy. The first frame is intra-frame coded, mainly to reduce intra-frame redundancy; the second frame mostly uses inter-frame coding to reduce inter-frame redundancy.
3. Intra-frame redundancy processing
Video source encoder: divide the frame into N macroblocks and choose a prediction direction for each macroblock.
Composite encoder: organize the residual data (what remains after predicting from the pixels to the left and above) together with the prediction direction.
Transmission buffer: check whether the "product" is up to standard.
Output encoder: package the "product" for shipping.
4. Inter-frame redundancy processing
There is little difference between the first frame and the second. For example, a car that was already encoded in the first frame does not need to be re-encoded in the second; if its position has changed relative to the first frame, we can simply record the displacement (motion) vector of the macroblock instead of re-encoding it.
So the encoder first checks whether a macroblock was already encoded earlier; if it was, using the motion vector directly is enough.
The first frame is called the I frame, and a frame that uses motion vectors is called a P frame.

5. GOP (group of pictures): an image sequence, which can be understood as one scene in which the content stays similar. The frames between two I frames can be regarded as one GOP.
GOP size affects seeking, performance optimization, and so on.
6. The I frame is the largest; the P frame (motion vectors) is smaller than the I frame; the B frame (computed from both directions) is the smallest.
7. Live streaming cares about how quickly playback starts (the "instant open" rate), so more I frames are used.
8. On output, the first frame (I) is output first; the second frame (B) is held in the transmission buffer; the third frame (B) is also held in the transmission buffer; the fourth frame is a P frame and is output, and only then are the buffered B frames output.
The output (decode) order is therefore not the same as the playback order. Playback follows the PTS. For example, suppose the frame interval is about 10 milliseconds, the I frame has PTS 0 ms, the P frame 40 ms, and the two B frames 20 ms and 30 ms. The player displays the I frame as soon as it gets it, but when the P frame arrives it is not output, because its PTS is 40 ms after the previous I frame (more than 10 ms away); it is cached instead, the B frames at 20 ms and 30 ms are displayed first, and the 40 ms P frame is displayed after them.
(image)

As you can see, with B frames the output order differs from the display order.
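Using the timestamps from the example above, the two orders line up like this:

Decode (output) order:  I(0 ms)   P(40 ms)  B(20 ms)  B(30 ms)
Display (PTS)  order:   I(0 ms)   B(20 ms)  B(30 ms)  P(40 ms)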
9. I frames are the easiest to encode; B frames take the longest to encode.
10. Decoding H264: the algorithm is similarly sophisticated and quite complicated.
11. How is the integrity of each frame guaranteed?
A start-code delimiter is used: 0x00000001 or 0x000001.
What if the encoded data itself happens to contain 0x000001? The encoder inserts an emulation prevention byte, turning 00 00 01 into 00 00 03 01 (a raw 00 00 03 is escaped the same way).
And when the decoder meets 00 00 03 01 it removes the inserted 0x03, so the original data comes back exactly and the user experience is unaffected. (A small sketch of this removal step follows point 12.)
12. The sizes of the I, P and B frames can then be worked out from these delimiters.
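A minimal decoder-side sketch of that rule (removing the emulation prevention byte 0x03 that the encoder inserted after every pair of zero bytes; the method name is my own, this is not code from the lesson):

//Strips H264 emulation prevention bytes: 00 00 03 xx becomes 00 00 xx again.
public static byte[] removeEmulationPrevention(byte[] in) {
    java.io.ByteArrayOutputStream out = new java.io.ByteArrayOutputStream(in.length);
    int zeros = 0;
    for (byte b : in) {
        if (zeros >= 2 && b == 0x03) {   //skip the inserted 0x03
            zeros = 0;
            continue;
        }
        zeros = (b == 0x00) ? zeros + 1 : 0;
        out.write(b);
    }
    return out.toByteArray();
}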
13. The first byte after the start code indicates the frame type (as shown in the figure below): 0x67 is an SPS, 0x68 a PPS, 0x65 an I frame, 0x41 a P frame, and 0x01 a B frame.
(image)
The I frame is followed by a P frame.
(image)

The blue shading in the figure below marks the B frame.
(image)
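Those values are the NAL header byte: its low five bits are the nal_unit_type (7 = SPS, 8 = PPS, 5 = IDR/I frame, 1 = ordinary slice, which covers both the 0x41 and 0x01 cases above; whether such a slice is P or B is decided inside the slice data). A tiny sketch, with a helper name of my own:

//Reads the type of the NAL unit whose first byte sits right after a start code.
public static int nalUnitType(byte[] h264, int firstByteAfterStartCode) {
    return h264[firstByteAfterStartCode] & 0x1F;   //low 5 bits of the NAL header
}
//0x67 & 0x1F == 7 (SPS), 0x68 & 0x1F == 8 (PPS), 0x65 & 0x1F == 5 (IDR/I frame)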

14. CPU decoding: software decoding. High compatibility, because every phone has a CPU.
Hardware (DSP) decoding: no stutter, low power consumption, and it can decode several video streams at once (e.g. surveillance).
(image)
The GPU does not decode; it only renders (displays) the decoded frames.
15. DSP: different phone manufacturers ship different DSPs, so there are compatibility issues. The usual strategy: try hardware decoding first, and fall back to software decoding when hardware decoding is not supported.
16. MediaPlayer: hardware decoding; it supports relatively few playback formats. (A minimal MediaPlayer sketch follows below.)
The DSP can access the disk directly (both the CPU and the DSP can access storage), read the data from the sdcard, decode the hexadecimal data into YUV, and hand the YUV to the GPU for rendering. For a two-hour video, once the code has started playback the CPU essentially no longer participates until the end of the file is reached.
(image)
The column of numbers on the right of the figure is the YUV data.
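For comparison, the MediaPlayer path from point 16 only takes a few calls (a minimal sketch; the file path is a placeholder and the Surface would come from a SurfaceView, as in the decode demo further down):

//Hardware-decoded playback via MediaPlayer: the DSP/GPU do the work, the CPU mostly idles.
private void playWithMediaPlayer(Surface surface) throws java.io.IOException {
    android.media.MediaPlayer player = new android.media.MediaPlayer();
    player.setDataSource("/sdcard/input.mp4");  //placeholder path
    player.setSurface(surface);                 //Surface obtained from a SurfaceView
    player.prepare();                           //synchronous prepare; prepareAsync() also exists
    player.start();
}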
17. Java code cannot talk to the DSP directly; you need MediaCodec.
18. MediaCodec is hardware decoding; its job is simply to drive the DSP.
19. MediaCodec is process-based (the codec runs outside your app), which is part of what makes it hard to learn.

Writing our own code to parse and play H264 (corresponding to the third lesson)

The goal is to turn the data shown in the picture below back into a playable video.
(image)
MediaCodec is used to call the DSP. Although we write Java code, MediaCodec works across processes.
The CPU reads the file and passes the data to the DSP, which is a cross-device hop (they are not the same physical component). Because of this, MediaCodec does not simply hand the decoded data back through a callback after each successful decode.
Instead, MediaCodec uses another model: the DSP provides a queue of 8 buffers (containers), and each buffer has several states. Once a buffer has data it is handed to the codec for decoding; the data flows from the CPU to the DSP, and the decoded YUV flows back to the CPU.
(image)

layout file

<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <SurfaceView
        android:id="@+id/preview"
        android:layout_width="0dp"
        android:layout_height="0dp"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

</androidx.constraintlayout.widget.ConstraintLayout>

Add permissions

 <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"></uses-permission>
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"></uses-permission>
MainActivity
package com.example.audiovideotest;

import androidx.annotation.NonNull;
import androidx.appcompat.app.AppCompatActivity;

import android.Manifest;
import android.content.pm.PackageManager;
import android.media.MediaCodec;
import android.os.Build;
import android.os.Bundle;
import android.os.Environment;
import android.view.Surface;
import android.view.SurfaceHolder;
import android.view.SurfaceView;

import java.io.File;

public class MainActivity extends AppCompatActivity {
    private H264Player h264Player;
    //Request the read/write permissions at runtime
    public boolean checkPermission() {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M && checkSelfPermission(
                Manifest.permission.WRITE_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED) {
            requestPermissions(new String[]{
                    Manifest.permission.READ_EXTERNAL_STORAGE,
                    Manifest.permission.WRITE_EXTERNAL_STORAGE
            }, 1);

        }
        return false;
    }
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
       checkPermission();
       initSurface();
    }
   //SurfaceView is the picture frame, Surface is the canvas: drawing and rendering happen on the Surface, the SurfaceView displays it
    private void initSurface() {
        SurfaceView surfaceView = findViewById(R.id.preview);
        surfaceView.getHolder().addCallback(new SurfaceHolder.Callback() {
            @Override
            public void surfaceCreated(@NonNull SurfaceHolder surfaceHolder) {
            //Get the Surface; drawing and rendering happen on the Surface
                Surface surface = surfaceHolder.getSurface();
                h264Player = new H264Player(new File(Environment.getExternalStorageDirectory(),"out.h264").getAbsolutePath(),surface);
                h264Player.play();
            }

            @Override
            public void surfaceChanged(@NonNull SurfaceHolder surfaceHolder, int i, int i1, int i2) {

            }

            @Override
            public void surfaceDestroyed(@NonNull SurfaceHolder surfaceHolder) {

            }
        });
    }
}
package com.example.audiovideotest;

import android.media.MediaCodec;
import android.media.MediaFormat;
import android.util.Log;
import android.view.Surface;

import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

//Decoding is time-consuming, so it runs on its own thread (as a Runnable)
public class H264Player implements Runnable {
    //Data source (the file path)
    private String path;
    //Decoder: decodes the source into yuv and displays it on the Surface
    private MediaCodec mediaCodec;
    //Display destination, i.e. the Surface
    private Surface surface;

    public H264Player(String path, Surface surface) {
        this.path = path;
        this.surface = surface;
        try {
            //1. Create the decoder. The same factory creates audio and video decoders:
            //a video decoder MIME type starts with "video", an audio decoder MIME type starts with "audio".
            //2. After "video/" comes the concrete coding format, e.g. h264, h265, vp8, vp9.
            //3. avc is MPEG-4 AVC, which is H264.
            //4. MediaFormat.MIMETYPE_VIDEO_AVC is simply "video/avc".
            //5. Hardware decoding only supports a few coding formats, because the DSP hardware is limited
            //   and cannot be compatible with every format (RV40, for example).
            // mediaCodec  = MediaCodec.createDecoderByType("video/avc");
            mediaCodec = MediaCodec.createDecoderByType(MediaFormat.MIMETYPE_VIDEO_AVC);
            //Set our own parameters - the information we pass to the dsp.
            //MediaFormat wraps a HashMap; unlike a HashMap, its keys are fixed.
            //Build our own MediaFormat: the video stream is "video/avc"; the width and height can be written freely here.
            MediaFormat mediaFormat = MediaFormat.createVideoFormat("video/avc", 200, 200);
            //Frame rate
            mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 15);
            //If no Surface is configured nothing is rendered; the third parameter is encryption, the fourth a flag, 0 is fine for decoding
            mediaCodec.configure(mediaFormat, surface, null, 0);
            Log.i("zhang_xin", "supported");
        } catch (IOException e) {
            e.printStackTrace();
            //If hardware decoding is not supported, an exception is thrown here
            Log.i("zhang_xin", e.getMessage());
        }
    }

    //Once play() is called, MediaCodec starts working.
    public void play() {
        //Start decoding
        mediaCodec.start();
        new Thread(this).start();
    }

    @Override
    public void run() {
        //Passing data from the cpu to the dsp crosses devices, so a callback cannot be used.
        //The dsp provides a set of 8 buffers (containers),
        try {
            decodeH264();
        } catch (Exception e) {
            Log.i("david", "run: "+e.toString());
        }
    }

    private void decodeH264() {
        byte[] bytes = null;
        try {
        //The whole file ends up in the bytes array
            bytes = getBytes(path);
        } catch (IOException e) {
            e.printStackTrace();
        }
        /** Deprecated approach
         //Get all the buffers. Not recommended - this API is deprecated.
         ByteBuffer[] byteBuffers = mediaCodec.getInputBuffers();
         //Ask which buffer is available; if inIndex is less than 0 no buffer is currently free.
         //10000 is the wait time: tell the dsp to wait 10 ms (the unit is microseconds).
         int inIndex = mediaCodec.dequeueInputBuffer(10000);
         if(inIndex>=0){
         //We got the buffer's index, so look up the corresponding buffer by that index
         ByteBuffer byteBuffer = byteBuffers[inIndex];
         }
         */
       
        int startIndex = 0;
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        while (true) {//Keep feeding one frame at a time into the buffers; stop when the file ends
            //startIndex must be advanced by a number greater than 2,
            int nextFrame = findByFrame(bytes, startIndex+4, bytes.length);
            //No further delimiter means the end of the file has been reached
            if (nextFrame < 0) {
                break;
            }
              //The approach above (the block marked deprecated) is obsolete because, in Google's view, we should not grab all the buffers at once;
             // after processing, queueInputBuffer must be called to put the ByteBuffer back into the queue so that the buffer is released correctly.
            int inIndex = mediaCodec.dequeueInputBuffer(10000);
            if (inIndex >= 0) {
                //Got an available input buffer
                ByteBuffer byteBuffer = mediaCodec.getInputBuffer(inIndex);
                //Feed one frame into the buffer each time; note that frame sizes are not fixed. The delimiter (00 00 01) separates frames. Do not feed the whole file, and do not feed fixed-size chunks.
                //startIndex starts at 0
                int length = nextFrame - startIndex;
                //Feed the byte array: contents start at startIndex, length bytes long
                byteBuffer.put(bytes, startIndex, length);
                //Hand the buffer index back to the dsp; the data then flows from the cpu to the dsp. inIndex is the buffer index.
                //The second parameter is the offset - no offset here.
                //The fourth parameter is the pts (timestamp); when decoding, the pts in the video is used, but when encoding it must not be 0.
                //The fifth parameter, flags, can be 0.
                mediaCodec.queueInputBuffer(inIndex,0,length,0,0);
                startIndex = nextFrame;
            }
            //Check whether decoding has finished. Input and output are not synchronous: feeding data in does not mean data comes out immediately, because decoding takes time.
            //If the index is greater than or equal to 0, decoding succeeded.
            //info: if I feed in 8K, the decoded output is certainly bigger than 8K; info tells us the decoded size. info is an in/out parameter object.
            //Second parameter: the timeout in microseconds, a negative timeout indicates "infinite".
            int outIndex = mediaCodec.dequeueOutputBuffer(info,10000);
             //Greater than or equal to 0 means decoding finished; then render it out
            if(outIndex>=0){
                try{
                //Work around playback running too fast:
                //the video is about 30 frames per second, so each frame lasts roughly 33 ms
                Thread.sleep(33);
                }catch(Exception e){
                }
                //Because a Surface was configured, pass true here and the frame is rendered straight to that Surface
                //MediaCodec does the rendering to the Surface for us
                mediaCodec.releaseOutputBuffer(outIndex,true);
            }
        }
    }
 //Returns the position of the next delimiter.
 //start: the start of the previous delimiter. start must be greater than the current start position, otherwise the current position is returned again; that is why we add 4 when passing start.
    private int findByFrame(byte[] bytes, int start, int totalSize) {
    //Why totalSize - 4: we look ahead from i, so this avoids going out of bounds
        for (int i = start; i <= totalSize - 4; i++) {
        //byte 0 equals 0, byte 1 equals 0, byte 2 equals 0, byte 3 equals 1; note there are two kinds of delimiter
            if (((bytes[i] == 0x00) && (bytes[i + 1] == 0x00) && (bytes[i + 2] == 0x00) && (bytes[i + 3] == 0x01))
                    || ((bytes[i] == 0x00) && (bytes[i + 1] == 0x00) && (bytes[i + 2] == 0x01))) {
                return i;
            }
        }
        return -1;
    }

    //Read the file from disk into a byte[]
    private byte[] getBytes(String path) throws IOException {
        InputStream is = new DataInputStream(new FileInputStream(new File(path)));
        int len;
        int size = 1024;
        byte[] buf;
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        buf = new byte[size];
        while ((len = is.read(buf, 0, size)) != -1)
            bos.write(buf, 0, len);
        buf = bos.toByteArray();
        return buf;
    }
}

If you don't fully understand the code, you can use the H264Player class above directly.

   //startIndex must be advanced by a number greater than 2,
            int nextFrame = findByFrame(bytes, startIndex+3, bytes.length);

To explain: if startIndex were passed in unchanged, say 0,
(image)
then bytes 0, 1, 2 and 3 would already match the start-code test and the index 0 would be returned straight away. That is why a number greater than 2 must be added to startIndex.

Also, if the file contains only a single I frame it may not play. Many players keep an internal buffer, and some of them only output pictures once P and B frames arrive, so with no P or B frames nothing may be decoded and shown.

Encoding pictures into H264 (corresponding to the first part of the fourth lesson)

Data sources include the camera, screen recording, video files, and so on; H264 can be encoded from any of these data sources.
Here we generate H264 by recording the screen. Screen recording requires a runtime permission request, and only Android 5.0 and above can record the screen. First add the permissions:


   <uses-permission android:name="android.permission.READ_EXTERNAL_STORAGE"></uses-permission>
    <uses-permission android:name="android.permission.WRITE_EXTERNAL_STORAGE"></uses-permission>
    <uses-permission android:name="android.permission.CAMERA"/>

layout file

<?xml version="1.0" encoding="utf-8"?>
<androidx.constraintlayout.widget.ConstraintLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    tools:context=".MainActivity">

    <TextView
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="开始录屏"
        android:onClick="click"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintEnd_toEndOf="parent"
        app:layout_constraintStart_toStartOf="parent"
        app:layout_constraintTop_toTopOf="parent" />

</androidx.constraintlayout.widget.ConstraintLayout>

MainActivity

package com.example.endecode;

import androidx.annotation.Nullable;
import androidx.appcompat.app.AppCompatActivity;

import android.Manifest;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.media.projection.MediaProjection;
import android.media.projection.MediaProjectionManager;
import android.os.Build;
import android.os.Bundle;
import android.view.View;

public class MainActivity extends AppCompatActivity {
    //Screen-recording utility (system service manager)
    private MediaProjectionManager mediaProjectionManager;
    private  MediaProjection mediaProjection;
    
    public boolean checkPermission() {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M && checkSelfPermission(
                Manifest.permission.WRITE_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED) {
            requestPermissions(new String[]{
                    Manifest.permission.READ_EXTERNAL_STORAGE,
                    Manifest.permission.WRITE_EXTERNAL_STORAGE,
                    Manifest.permission.CAMERA
            }, 1);

        }
        return false;
    }
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        checkPermission();
    }

    public void click(View view) {
        //Screen recording needs a runtime request; MEDIA_PROJECTION_SERVICE is already defined in Context
        mediaProjectionManager = (MediaProjectionManager) getSystemService(MEDIA_PROJECTION_SERVICE);
        //Ask the user whether they agree to the screen recording
        Intent captureIntent = mediaProjectionManager.createScreenCaptureIntent();
        startActivityForResult(captureIntent,1);
    }

    @Override
    protected void onActivityResult(int requestCode, int resultCode, @Nullable Intent data) {
        super.onActivityResult(requestCode, resultCode, data);
        if(resultCode!=RESULT_OK||requestCode!=1){return;}
        //Generate h264 via screen recording
        //Screen recording itself is done through mediaProjection
        mediaProjection = mediaProjectionManager.getMediaProjection(resultCode, data);
        H264EnCoude h264EnCoude = new H264EnCoude(mediaProjection);
        h264EnCoude.start();
    }
}

How do MediaProjection and MediaCodec work together?
For encoding, MediaCodec provides the Surface itself (unlike decoding, where the Surface came from a SurfaceView); the encoder's input Surface is created by MediaCodec, and the data source writes into that Surface.
MediaProjection supplies the input data and MediaCodec does the encoding: the data MediaProjection records is handed to MediaCodec. Since both parts are written by Google, we do not have to care about that input path; we only have to move the encoded output from the DSP back to the CPU, and that output again goes through the buffer queue.
The video bit rate is the number of bits transmitted per unit of time, usually given in kbps (thousands of bits per second). Loosely, you can think of it like a sampling rate: the more bits per unit of time, the higher the fidelity and the closer the encoded file is to the original. (With KEY_BIT_RATE set to width*height below, 720*1280 gives roughly 0.9 Mbps.)

package com.example.endecode;

import android.hardware.display.DisplayManager;
import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;
import android.media.projection.MediaProjection;
import android.provider.MediaStore;
import android.view.Surface;

import java.io.IOException;
import java.nio.ByteBuffer;

public class H264EnCoude extends Thread{
    private int width =720;
    private int height = 1280;
    //Data source. Since we can get a data source from screen recording, we could just as well get one from OpenGL or from the camera
    private MediaProjection mediaProjection;
    //Encoder
    private MediaCodec mediaCodec;
    //Output: an h264 file

    public H264EnCoude(MediaProjection mediaProjection1) {
        this.mediaProjection = mediaProjection1;
        //When decoding we did not need to pass real width/height, but for encoding these basic parameters are required,
        //because a decoder can read them from the h264 configuration data (sps/pps), while an encoder has no such configuration yet
        MediaFormat format = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC,width,height);
        try {
           //Create the MediaCodec, used here as an encoder
            mediaCodec = MediaCodec.createEncoderByType("video/avc");
            //Frame rate: tell the dsp 20 frames per second
            format.setInteger(MediaFormat.KEY_FRAME_RATE,20);
            //Tell the dsp the I-frame interval: 30 here (the lesson says one I frame every 30 frames; the unit of KEY_I_FRAME_INTERVAL is actually seconds)
            format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL,30);
            //Bit rate: the higher the bit rate, the clearer the picture; width*height is a common starting point
            format.setInteger(MediaFormat.KEY_BIT_RATE,width*height);
            //Tell the dsp where the encoder's data comes from; the configuration frames (sps/pps) are generated from this information. Our data comes from a Surface
            format.setInteger(MediaFormat.KEY_COLOR_FORMAT, MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
            //Pass null for the surface and crypto parameters
            //MediaCodec.CONFIGURE_FLAG_ENCODE means this mediaCodec is used for encoding; passing 0 would mean decoding.
            mediaCodec.configure(format,null,null,MediaCodec.CONFIGURE_FLAG_ENCODE);
            //Create the input Surface; MediaCodec is the one that provides this "venue".
            Surface inputSurface = mediaCodec.createInputSurface();
            //Bind: the recorded data could be shown on a SurfaceView, or, as here, a virtual display is created
            //name: an arbitrary label; must not be null and should be unique
            //width/height of the recording: best kept equal to the encoder's width/height (one is the output, the other the input)
            //3: the dpi parameter; the larger, the sharper
            //VIRTUAL_DISPLAY_FLAG_PUBLIC: a public display
            //inputSurface: the Surface obtained from MediaCodec, handed over to mediaProjection
            //callback for pause/resume/stop events; null is fine
            //handler used to deliver the callback messages; null here.
            mediaProjection.createVirtualDisplay("jett-davaid",width,height,3, DisplayManager.VIRTUAL_DISPLAY_FLAG_PUBLIC,inputSurface,null,null);
            //mediaProjection supplies the input data, mediaCodec encodes it: the data mediaProjection records is handed to mediaCodec
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    @Override
    public void run() {
       super.run();
       //Start the encoder
        mediaCodec.start();
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        while (true){
            //We never have to deal with the input side (mediaProjection handing its recorded data to mediaCodec) - the system does that for us, so no input buffers are created here. We only implement the output side (from the dsp/mediaCodec back to the cpu), which does use output buffers.
            int outIndex = mediaCodec.dequeueOutputBuffer(info,10000);
            //Greater than 0 means success
            if(outIndex>0){
                //The encoded data: copy it out of the container (ByteBuffer) into a newly created byte[]
                ByteBuffer byteBuffer = mediaCodec.getOutputBuffer(outIndex);
                byte[] ba = new byte[info.size];
                //Move the data from the byteBuffer container into the ba array
                byteBuffer.get(ba);
                //Write it to the file
                FileUtils.writeBytes(ba);//write the raw bytes
                //Also write the bytes as a hex string
                FileUtils.writeContent(ba);
                //Release the buffer: pass true if a surface was configured; we did not configure one, so pass false
                mediaCodec.releaseOutputBuffer(outIndex,false);
            }
        }
    }
}

package com.example.endecode;

import android.os.Environment;
import android.util.Log;

import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;

public class FileUtils {
    private static final String TAG = "David";

    public  static  void writeBytes(byte[] array) {
        FileOutputStream writer = null;
        try {
            // Open a file output stream; the second constructor argument, true, means append to the file
            writer = new FileOutputStream(Environment.getExternalStorageDirectory() + "/codec.h264", true);
            writer.write(array);
            writer.write('\n');


        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (writer != null) {
                    writer.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    public  static String writeContent(byte[] array) {
        char[] HEX_CHAR_TABLE = {
                '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'
        };
        StringBuilder sb = new StringBuilder();
        for (byte b : array) {
            sb.append(HEX_CHAR_TABLE[(b & 0xf0) >> 4]);
            sb.append(HEX_CHAR_TABLE[b & 0x0f]);
        }
        Log.i(TAG, "writeContent: " + sb.toString());
        FileWriter writer = null;
        try {
            // Open a file writer; the second constructor argument, true, means append to the file
            writer = new FileWriter(Environment.getExternalStorageDirectory() + "/codecH264.txt", true);
            writer.write(sb.toString());
            writer.write("\n");
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (writer != null) {
                    writer.close();
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return sb.toString();
    }


}

OK you're done.

sps pps

1. sps and pps appear in pairs; they are configuration frames (the sequence-level and picture-level parameters).
2. The content of the sps/pps frames depends on the MediaCodec configuration: different configuration, different content. (A small configuration sketch follows this list.)
3. The MediaCodec encoder outputs the sps/pps first, and it outputs them only once.
4. A video can still contain multiple sps/pps (think of joining a live stream that started before you tuned in): the first sps/pps comes from the MediaCodec encoder, and the later ones are re-emitted from the cache. Ideally, each time an I frame is output, an sps/pps is output with it.
5. So besides I frames, P frames and B frames, there are also configuration frames.
6. If the video width/height changes or the screen rotates, the picture goes black because the sps/pps have changed; the encoder must be reinitialized at that point.
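For reference, when you drive a decoder yourself, the sps and pps can also be handed to MediaCodec up front as codec-specific data. A sketch, assuming sps and pps are byte arrays already extracted from the stream (start codes included) and surface/mediaCodec are set up as in the decode demo; this is not code from the lesson:

//Supplying SPS/PPS to an AVC decoder explicitly via csd-0 / csd-1.
MediaFormat fmt = MediaFormat.createVideoFormat(MediaFormat.MIMETYPE_VIDEO_AVC, 720, 1280);
fmt.setByteBuffer("csd-0", java.nio.ByteBuffer.wrap(sps));  //SPS NAL unit
fmt.setByteBuffer("csd-1", java.nio.ByteBuffer.wrap(pps));  //PPS NAL unit
mediaCodec.configure(fmt, surface, null, 0);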

Encoding with the camera as the data source (corresponding to the second part of the fourth lesson)

1. Android has Camera1, Camera2 and CameraX; we use Camera1 here.
Look at the layout file first:

<?xml version="1.0" encoding="utf-8"?>
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:app="http://schemas.android.com/apk/res-auto"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:background="#fff"
    tools:context=".MainActivity">
    <com.maniu.maniumediacodec.LocalSurfaceView
        android:id="@+id/preview"
        android:layout_width="match_parent"
        android:layout_height="match_parent"
        app:layout_constraintBottom_toBottomOf="parent"
        app:layout_constraintLeft_toLeftOf="parent"
        app:layout_constraintRight_toRightOf="parent"/>
</RelativeLayout>
package com.maniu.maniumediacodec;

import android.Manifest;
import android.content.Intent;
import android.content.pm.PackageManager;
import android.media.projection.MediaProjection;
import android.media.projection.MediaProjectionManager;
import android.os.Build;
import android.os.Bundle;
import android.view.View;

import androidx.appcompat.app.AppCompatActivity;

public class MainActivity1 extends AppCompatActivity {
 
    public boolean checkPermission() {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.M && checkSelfPermission(
                Manifest.permission.WRITE_EXTERNAL_STORAGE) != PackageManager.PERMISSION_GRANTED) {
            requestPermissions(new String[]{
                    Manifest.permission.READ_EXTERNAL_STORAGE,
                    Manifest.permission.WRITE_EXTERNAL_STORAGE,
                    Manifest.permission.CAMERA
            }, 1);

        }
        return false;
    }
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main1);
        checkPermission();
    }


}
package com.maniu.maniumediacodec;

import android.content.Context;
import android.hardware.Camera;
import android.util.AttributeSet;
import android.view.SurfaceHolder;
import android.view.SurfaceView;

import androidx.annotation.NonNull;

import java.io.IOException;

/**
 * Camera1 preview output to a SurfaceView
 *
 */
public class LocalSurfaceView  extends SurfaceView implements SurfaceHolder.Callback, Camera.PreviewCallback {
    H264Encode h264Encode;
//    mCamera --> SurfaceView
    private Camera.Size size;
    private Camera mCamera;
//Knowing the preview width and height, we can work out how big one yuv frame is
    byte[] buffer;
    public LocalSurfaceView(Context context) {
        super(context);
    }

    public LocalSurfaceView(Context context, AttributeSet attrs) {
        super(context, attrs);
        //Register the callback; surfaceCreated() will then be invoked
        getHolder().addCallback(this);
    }



    public LocalSurfaceView(Context context, AttributeSet attrs, int defStyleAttr) {
        super(context, attrs, defStyleAttr);
    }
    //Start the preview
    private void startPreview() {
        //Front camera or back camera
        mCamera = Camera.open(Camera.CameraInfo.CAMERA_FACING_BACK);
        //Get the camera parameters
        Camera.Parameters parameters = mCamera.getParameters();
        //Get the camera preview size
        size = parameters.getPreviewSize();
        try {
            mCamera.setPreviewDisplay(getHolder());
            //Rotate the display by 90 degrees; only what is shown is rotated - the preview frame data is not changed by this call
            mCamera.setDisplayOrientation(90);
//            Knowing the preview width and height we can compute the yuv size: width*height + 1/4*width*height + 1/4*width*height
            buffer = new byte[size.width * size.height * 3 / 2];
            //The camera puts each preview frame into this buffer
            mCamera.addCallbackBuffer(buffer);
            //onPreviewFrame() will be called back for every frame
            mCamera.setPreviewCallbackWithBuffer(this);
            mCamera.startPreview();
        } catch (IOException e) {
            e.printStackTrace();
        }

    }

    @Override
    public void onPreviewFrame(byte[] data, Camera camera) {
        //The data array is the yuv - i.e. the picture. If the phone is held upright, the captured frame is actually sideways (because of how the camera sensor is mounted in Android phones)
        //After rotation, the width and height are swapped
        if (h264Encode == null) {
            this.h264Encode = new H264Encode(size.width, size.height);
            h264Encode.startLive();
        }
//        data is the raw frame; here we encode data directly without processing it first.
        h264Encode.encodeFrame(data);
        //Re-register the callback buffer for the next frame
        mCamera.addCallbackBuffer(data);
    }

    @Override
    public void surfaceCreated(@NonNull SurfaceHolder surfaceHolder) {
        //Start the camera preview
        startPreview();
    }

    @Override
    public void surfaceChanged(@NonNull SurfaceHolder surfaceHolder, int i, int i1, int i2) {

    }

    @Override
    public void surfaceDestroyed(@NonNull SurfaceHolder surfaceHolder) {

    }
}

package com.maniu.maniumediacodec;

import android.media.MediaCodec;
import android.media.MediaCodecInfo;
import android.media.MediaFormat;

import java.io.IOException;
import java.nio.ByteBuffer;

public class H264Encode {
    MediaCodec mediaCodec;
    int index;
    int width;
    int height;
    public H264Encode(int width, int height) {
        this.width = width;
        this.height = height;
    }
    public void startLive()  {
        try {
            mediaCodec = MediaCodec.createEncoderByType("video/avc");
            MediaFormat mediaFormat = MediaFormat.createVideoFormat("video/avc", width, height);
            mediaFormat.setInteger(MediaFormat.KEY_BIT_RATE, width * height);
            mediaFormat.setInteger(MediaFormat.KEY_FRAME_RATE, 15);
            mediaFormat.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 2); //one I frame every two seconds
            mediaFormat.setInteger(MediaFormat.KEY_COLOR_FORMAT,
                    MediaCodecInfo.CodecCapabilities.COLOR_FormatYUV420Flexible);//this time the data is passed in as yuv420 buffers, no longer through a Surface
            mediaCodec.configure(mediaFormat, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
            mediaCodec.start();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
    //Pass the data from the cpu to the dsp, then from the dsp back to the cpu; the input size is fixed, because the camera captures fixed-size frames
    //The data still needs processing: if we save it as-is, the video plays back sideways even though we filmed upright
    //The camera delivers nv21 (only Android cameras use it), but MediaCodec expects nv12, so nv21 has to be converted to nv12
    public int encodeFrame(byte[] input) {

        //Send the data from the cpu to the dsp
        int inputBufferIndex = mediaCodec.dequeueInputBuffer(10000);
        MediaCodec.BufferInfo bufferInfo = new MediaCodec.BufferInfo();
        if (inputBufferIndex >= 0) {
            ByteBuffer inputBuffer =   mediaCodec.getInputBuffer(inputBufferIndex);

            inputBuffer.clear();
            inputBuffer.put(input);
            //computPts(): the pts must be supplied when encoding (when decoding it was not needed)
            mediaCodec.queueInputBuffer(inputBufferIndex, 0, input.length, computPts(), 0);
            index++;
        }


//        Then move the result from the dsp back to the cpu
        int outputBufferIndex =   mediaCodec.dequeueOutputBuffer(bufferInfo,100000);
        if (outputBufferIndex >= 0) {
            ByteBuffer  outputBuffer= mediaCodec.getOutputBuffer(outputBufferIndex);
            byte[] data = new byte[bufferInfo.size];
            outputBuffer.get(data);
            FileUtils.writeBytes(data);
            FileUtils.writeContent(data);
            mediaCodec.releaseOutputBuffer(outputBufferIndex, false);
        }
        return -1;
    }

//  The frame rate is set to 15 fps, so the first frame's presentation time is 1 second / 15, the second frame's is 1 second / 15 * 2, and so on
    //1000000 is in microseconds (video editing works in microseconds); it equals 1 second,
    public int computPts() {
        return 1000000 / 15 * index;
    }
}

Here is the video I shot:
(image)
Here is the video as played back:
(image)

Fixing the bugs

Two things are wrong: the colors are wrong, and the orientation is wrong.
The first step for camera frames is to rotate them. In addition, you need to convert NV21 (a very old format that only Android cameras still use) into NV12 (a YUV 4:2:0 layout). A sketch of wiring these two steps into encodeFrame follows the code below.

//Note 17: the rotation algorithm
    public static void portraitData2Raw(byte[] data,byte[] output,int width,int height) {

        int y_len = width * height;

        int uvHeight = height >> 1;
        int k = 0;
        for (int j = 0; j < width; j++) {
            for (int i = height - 1; i >= 0; i--) {
                output[k++] = data[ width * i + j];
            }
        }
        for (int j = 0; j < width; j += 2) {
            for (int i = uvHeight - 1; i >= 0; i--) {
                output[k++] = data[y_len + width * i + j];
                output[k++] = data[y_len + width * i + j + 1];
            }
        }
    }


    // Note 13: convert nv21 to nv12 (also called yuv420 in the lesson). nv21 is a yuv variant that only Android cameras use
    // nv21: yyyyyyyyyyyyyyyyyyyyyyy vuvuvuvu  - V and U interleaved, V first
//    nv12: yyyyyyyyyyyyyyyyyyyy     uvuvuvuv  - U and V interleaved, U first
    static  byte[] nv12;
    public static byte[]  nv21toNV12(byte[] nv21) {
//        Note 14: allocate the destination array
        int  size = nv21.length;
        nv12 = new byte[size];
        //Note 15: Y occupies indices 0 to width*height; the Y length is size*2/3
        int len = size * 2 / 3;
//        Note 16: copy the Y plane into the array
        System.arraycopy(nv21, 0, nv12, 0, len);
        int i = len;
        while(i < size - 1){
//Note 17: swap each V/U pair - nv21 stores V then U, nv12 stores U then V
            nv12[i] = nv21[i + 1];
            nv12[i + 1] = nv21[i];
            i += 2;
        }
        return nv12;
    }
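One way to wire those two helpers into the encodeFrame() from the camera section (a sketch, not the lesson's exact code; note that after the 90-degree rotation the width and height swap, so the encoder's MediaFormat would then need to be created with (height, width)):

//Sketch: rotate the portrait NV21 frame, convert it to NV12, then hand it to the encoder.
public int encodeFrame(byte[] nv21) {
    byte[] rotated = new byte[nv21.length];
    portraitData2Raw(nv21, rotated, width, height);  //rotate 90 degrees; width/height are the camera preview size
    byte[] nv12 = nv21toNV12(rotated);               //reorder VU to UV for MediaCodec
    //...then queue nv12 into mediaCodec exactly as in the original encodeFrame()...
    return -1;
}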

Origin blog.csdn.net/qczg_wxg/article/details/125855393