Implementing video frame extraction


1. Basic concepts

A video is composed of a sequence of frames, and each frame is a single still image that the eye perceives in succession as motion.

1. Video frames

Frame types:

Frame types matter here because they determine how individual frames can be decoded during extraction.

  • I frame (Intra Picture, intra-coded frame), i.e. the key frame. It carries complete image information and can be decoded independently, without relying on preceding or following frames.
  • P frame (predictive frame, forward-predictive coded frame). A P frame stores only the difference between the current frame and the previous I or P frame, so it depends on that earlier frame for encoding and decoding. In coding terms, it compresses temporal redundancy by extracting motion characteristics.
  • B frame (bi-directional interpolated prediction frame, bidirectionally predictive coded frame). A B frame stores the difference between the current frame and both the preceding and following frames, so decoding a video that contains B frames is more complicated and costs more CPU.
  • IDR frame (Instantaneous Decoding Refresh). An IDR frame is a special I frame, a concept introduced to serve the codec: it forces an immediate refresh so that decoding errors cannot propagate past it, and a new sequence starts encoding from the IDR frame.

2. Terminology

  • Video container (encapsulation) format: roughly, the video file's suffix, such as mp4, mov, or mpeg. The container format is independent of the video's encoding format; it is an identifier that tells playback software how the streams inside the file are packaged.
  • Video encoding: the method (or the program/device implementing it) that compresses video. Common encoding standards include the H.26x series and the MPEG series.

Video files saved on a computer are already encoded and compressed. Changing a video file's suffix has no effect on the video itself: the suffix only reflects the container format, while the compressed content is determined by the encoding format.
raw video stream -> video encoding -> container (encapsulation) format

  • Frame rate: the number of frames per second. Most videos today have a frame rate between 22 and 25, i.e. 22-25 frames are displayed every second.
  • Bit rate: the amount of data the video consumes per unit time, which determines both the quality and the size of the video. In practice it is chosen according to the available processing power and bandwidth.
  • Resolution: what we usually call 1080p and the like, which determines the displayed image size.

When the resolution is fixed, a higher frame rate means more frames per second and therefore more data to carry, i.e. a higher bit rate. Conversely, at a constant bit rate, a higher frame rate forces each frame's data to be compressed harder to stay within the data budget, reducing picture clarity. The frame rate counts all three frame types, I, P, and B (key frames, forward-predicted frames, and bidirectionally predicted frames), of which I frames carry the most data. Key-frame placement can be configured on the client side (for example in Premiere Pro); the encoding and compression are performed when the video is exported and written to file.
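As a rough back-of-the-envelope illustration of the trade-off described above (the numbers are assumed examples, not from the original post): at a fixed bit rate, raising the frame rate leaves fewer bits for each frame, so each picture must be compressed harder.

```java
public class BitBudget {
    // Average bits available per frame at a given bit rate and frame rate.
    static double bitsPerFrame(double bitRateBps, double frameRate) {
        return bitRateBps / frameRate;
    }

    public static void main(String[] args) {
        double bitRate = 4_000_000; // 4 Mbps, an assumed example value
        // Doubling the frame rate at the same bit rate halves the per-frame budget.
        System.out.println(bitsPerFrame(bitRate, 25)); // 160000.0 bits per frame
        System.out.println(bitsPerFrame(bitRate, 50)); // 80000.0 bits per frame
    }
}
```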

2. FFmpeg frame extraction in practice

In other words, during actual frame extraction not every frame is an independent unit; how a frame decodes also depends on the position of the nearest key frame.
The implementation below uses Java:

1. Import the required dependency

    <dependency>
        <groupId>org.bytedeco</groupId>
        <artifactId>javacv-platform</artifactId>
        <version>1.5.7</version>
    </dependency>

2. The main idea: obtain the total number of frames in the video, loop through them, and save the frames that satisfy the sampling condition derived from the ratio of the requested frame rate to the video's frame rate.

    /**
     * Extract frames.
     * @param frameRate requested sampling frame rate
     * @param storePath directory in which to store the extracted images
     * @param filePath  path of the video to extract frames from
     */
    public static List<String> grabFrameByFilePath(double frameRate, String storePath, String filePath) throws Exception {
        File folder = new File(storePath);
        boolean success = true;
        if (!folder.exists()) {
            success = folder.mkdirs();
        }
        if (!success) {
            throw new FileNotFoundException("Failed to create directory");
        }
        List<String> sourceFiles = new ArrayList<>();
        FFmpegFrameGrabber fFmpegFrameGrabber = new FFmpegFrameGrabber(filePath);
        fFmpegFrameGrabber.start();

        grabFrames(fFmpegFrameGrabber, frameRate, storePath, sourceFiles);
        fFmpegFrameGrabber.close();
        return sourceFiles;
    }

    private static void grabFrames(FFmpegFrameGrabber fFmpegFrameGrabber, double frameRate, String storePath, List<String> sourceFiles) throws Exception {
        double videoRate = fFmpegFrameGrabber.getFrameRate();
        if (frameRate >= videoRate) {
            // grab every frame
            doGrabPerFrame(fFmpegFrameGrabber, storePath, sourceFiles);
        } else if (frameRate >= ApolloConfig.VIDEO_RATE) {
            // traverse frame by frame, keeping only the frames we need
            doGrabFramesByTraverseFrames(fFmpegFrameGrabber, frameRate, storePath, sourceFiles, videoRate);
        } else {
            // explained later
            doGrabFramesBySetTimestamp(fFmpegFrameGrabber, frameRate, storePath, sourceFiles, videoRate);
        }
    }
    /**
     * All generated files are stored locally for now.
     * @param fFmpegFrameGrabber the ffmpeg frame grabber
     * @param frameRate          requested sampling frame rate
     * @param storePath          directory holding the extracted images
     * @param sourceFiles        local paths of the images after extraction
     * @param videoRate          the video's own frame rate
     */
    private static void doGrabFramesByTraverseFrames(FFmpegFrameGrabber fFmpegFrameGrabber, double frameRate, String storePath, List<String> sourceFiles, double videoRate) throws FFmpegFrameGrabber.Exception {
        Java2DFrameConverter converter = new Java2DFrameConverter();
        int lengthInVideoFrames = fFmpegFrameGrabber.getLengthInFrames();
        int frameGapCount = (int) Math.ceil(lengthInVideoFrames * frameRate / videoRate);
        int frameGap = lengthInVideoFrames / frameGapCount;

        Frame f;
        String path;
        boolean flag;
        for (int i = 1; i <= frameGapCount; i++) {
            flag = false;
            // keep 1 frame out of every frameGap frames
            for (int j = 1; j <= frameGap; j++) {
                f = fFmpegFrameGrabber.grabImage();
                path = String.format("%s/%s", storePath, i + ".jpg");
                doExecuteFrame(f, path, converter);
                // image check; optional, depends on your use case
                if (PicDetectionUtil.checkPicAvailableBySourcePath(path) && !flag) {
                    sourceFiles.add(path);
                    flag = true;
                }
            }
        }
        // handle the remainder when the total frame count is not evenly divisible
        if (lengthInVideoFrames > frameGap * frameGapCount) {
            int diff = lengthInVideoFrames - frameGap * frameGapCount;
            while (diff > 0) {
                f = fFmpegFrameGrabber.grabImage();
                path = String.format("%s/%s", storePath, frameGapCount + 1 + ".jpg");
                doExecuteFrame(f, path, converter);
                if (PicDetectionUtil.checkPicAvailableBySourcePath(path)) {
                    sourceFiles.add(path);
                    break;
                }
                diff--;
            }
        }
        converter.close();
    }
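The gap arithmetic used above can be sketched in isolation (plain arithmetic mirroring the variables in the method, with assumed example numbers):

```java
public class FrameGapMath {
    // Number of frames to keep when sampling a video of lengthInFrames frames,
    // recorded at videoRate fps, down to the requested frameRate fps.
    static int frameGapCount(int lengthInFrames, double frameRate, double videoRate) {
        return (int) Math.ceil(lengthInFrames * frameRate / videoRate);
    }

    // Keep one frame out of every frameGap consecutive frames.
    static int frameGap(int lengthInFrames, int frameGapCount) {
        return lengthInFrames / frameGapCount;
    }

    public static void main(String[] args) {
        // Assumed example: a 10-second clip, 250 frames at 25 fps, sampled at 1 fps.
        int count = frameGapCount(250, 1.0, 25.0);
        int gap = frameGap(250, count);
        System.out.println(count + " frames kept, one every " + gap); // 10 frames, one every 25
    }
}
```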

Given this code and actual usage, the requested frame rate is usually 1, so if the video is long enough, isn't reading frame by frame a relatively slow approach? Two options:

1. Obtain the total number of frames, loop through them, and save the frames that satisfy the sampling condition derived from the ratio of the requested frame rate to the video frame rate. This is the scheme implemented above.

2. Obtain the total duration of the video, compute a time interval from the ratio of the requested frame rate to the video frame rate, and use setTimestamp to seek to each interval and grab only the frames to be saved.

Implementation of the second scheme:

    private static void doGrabFramesBySetTimestamp(FFmpegFrameGrabber fFmpegFrameGrabber, double frameRate, String storePath, List<String> sourceFiles, double videoRate) throws FFmpegFrameGrabber.Exception {
        Java2DFrameConverter converter = new Java2DFrameConverter();
        int lengthInVideoFrames = fFmpegFrameGrabber.getLengthInFrames();
        int frameGapCount = (int) Math.ceil(lengthInVideoFrames * frameRate / videoRate);
        long lengthInTime = fFmpegFrameGrabber.getLengthInTime();
        Frame f;
        String path;

        double v = lengthInTime / (1.0 * frameGapCount);
        int idx = 1;
        for (long i = 1L; i <= lengthInTime; i += Math.ceil(v)) {
            fFmpegFrameGrabber.setTimestamp(i);
            f = fFmpegFrameGrabber.grabImage();
            path = String.format("%s/%s", storePath, idx + ".jpg");
            idx++;
            doExecuteFrame(f, path, converter);
            if (PicDetectionUtil.checkPicAvailableBySourcePath(path)) {
                sourceFiles.add(path);
            }
        }
        if (idx == frameGapCount) {
            fFmpegFrameGrabber.setTimestamp(lengthInTime);
            f = fFmpegFrameGrabber.grabImage();
            path = String.format("%s/%s", storePath, idx + ".jpg");
            doExecuteFrame(f, path, converter);
            // image check; not required
            if (PicDetectionUtil.checkPicAvailableBySourcePath(path)) {
                sourceFiles.add(path);
            }
        }
        converter.close();
    }
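The interval arithmetic of the second scheme can also be sketched on its own. JavaCV's grabber timestamps are in microseconds; the numbers below are assumed examples:

```java
public class TimestampMath {
    // Microsecond step between two consecutive seeks when sampling a video of
    // totalMicros duration down to frameGapCount frames.
    static long timestampStep(long totalMicros, int frameGapCount) {
        return (long) Math.ceil(totalMicros / (1.0 * frameGapCount));
    }

    public static void main(String[] args) {
        // Assumed example: a 10-second video (10,000,000 us) sampled into 10 frames
        long step = timestampStep(10_000_000L, 10);
        System.out.println(step); // 1000000, i.e. one seek per second
    }
}
```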

The relative efficiency of the two schemes actually depends on the number of key frames in the video. Each setTimestamp seek has to trace back to the nearest preceding key frame and decode forward from there, so if the target timestamps rarely coincide with key frames, the seeking scheme can end up slower than the original frame-by-frame one.

The author ran a simple experiment comparing the time taken by the two schemes as the requested frame rate varied from 1 up to the video frame rate. The results showed that the second scheme is more efficient when the requested rate is at most half the video frame rate, and the first scheme is more efficient above that.

This experiment is not particularly rigorous, but based on it the author ultimately chose between the two schemes with a threshold:

    double videoRate = fFmpegFrameGrabber.getFrameRate();
    if (frameRate >= videoRate) {
        // grab every frame
        doGrabPerFrame(fFmpegFrameGrabber, storePath, sourceFiles);
    } else if (frameRate >= ApolloConfig.VIDEO_RATE) {
        // traverse frame by frame, keeping only the frames we need
        doGrabFramesByTraverseFrames(fFmpegFrameGrabber, frameRate, storePath, sourceFiles, videoRate);
    } else {
        // seek by timestamp
        doGrabFramesBySetTimestamp(fFmpegFrameGrabber, frameRate, storePath, sourceFiles, videoRate);
    }

Note: if readers with deeper knowledge of this area spot anything, advice and corrections are welcome.

Origin blog.csdn.net/weixin_47407737/article/details/128894801