Qt/C++ audio and video development 57 - switching audio and video tracks / switching program streams / switching audio and video tracks separately

1. Introduction

Support for various audio and video file formats is a basic capability of a player. Most files contain only a small number of streams: an audio file has a single audio stream, and a typical video file has one audio stream and one video stream. In practice, however, ts files may contain multiple program streams, because this format often encapsulates several programs into one file, and users can switch between programs as needed. For example, CCTV1 and CCTV2 may both be in one ts file, and the user can choose to switch to either one. The audio streams and video streams are indexed separately, so both must be switched to the corresponding streams for the audio and video to stay consistent. Of course, you can also switch only the audio: some files carry 3 video streams plus 6 audio streams, including bilingual Chinese and English audio tracks. Therefore, the audio and video track choices cannot be hard-coded into one setting in the program interface; they should be configurable separately, which also makes it easy for users to switch between multi-language audio tracks.

During the ffmpeg decoding process, the number of streams can be obtained from formatCtx->nb_streams. Each stream is audio or video and has a corresponding index. When the user wants a particular stream, the decoding code simply passes in the corresponding index: AVStream *videoStream = formatCtx->streams[videoIndex];. Generally, only one video stream is played at a time, but the code can also be changed to decode and play multiple video streams simultaneously, which is equivalent to watching several programs at the same time.
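As a rough, self-contained sketch of that enumeration (the file name input.ts is a placeholder, error handling is minimal, and the FFmpeg development libraries are assumed to be installed):

```cpp
extern "C" {
#include <libavformat/avformat.h>
}
#include <cstdio>

// Open a media file and print the index and type of every stream it contains.
// "input.ts" is a placeholder path; real code should also check more error codes.
int main()
{
    AVFormatContext *formatCtx = nullptr;
    if (avformat_open_input(&formatCtx, "input.ts", nullptr, nullptr) < 0) {
        return -1;
    }
    avformat_find_stream_info(formatCtx, nullptr);

    for (unsigned int i = 0; i < formatCtx->nb_streams; ++i) {
        AVMediaType type = formatCtx->streams[i]->codecpar->codec_type;
        printf("stream %u: %s\n", i, av_get_media_type_string(type));
    }

    avformat_close_input(&formatCtx);
    return 0;
}
```

For a ts file with two programs, this typically prints several audio and video entries, each of which can be used as the index passed to the decoder.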

2. Screenshot


3. Experience address

  1. Domestic site: https://gitee.com/feiyangqingyun
  2. International site: https://github.com/feiyangqingyun
  3. Personal works: https://blog.csdn.net/feiyangqingyun/article/details/97565652
  4. Experience address: https://pan.baidu.com/s/1d7TH_GEYl5nOecuNlWJJ7g (extraction code: 01jf, file name: bin_video_demo)

4. Functional features

4.1. Basic functions

  1. Supports various audio and video file formats, such as mp3, wav, mp4, asf, rm, rmvb, mkv, etc.
  2. Supports local camera devices and local desktop collection, and supports multiple devices and multiple screens.
  3. Supports various video streaming formats, such as rtp, rtsp, rtmp, http, udp, etc.
  4. Local audio and video files and network audio and video files can automatically identify file length, playback progress, volume, mute status, etc.
  5. The file can specify the playback position, adjust the volume, set the mute status, etc.
  6. Supports double-speed playback of files, with optional speeds of 0.5x, 1.0x, 2.5x, 5.0x, etc., which is equivalent to slow playback and fast playback.
  7. Supports starting, stopping, pausing, and continuing playback.
  8. Supports capturing screenshots; the file path can be specified, and you can choose whether to automatically display a preview after the capture is completed.
  9. Supports video recording: start and stop recording manually, and some kernels support pausing and resuming recording to skip the parts that are not needed.
  10. Supports mechanisms such as seamless-switch loop playback and automatic reconnection.
  11. Provides signals such as successful playback, playback completion, decoded picture received, captured picture received, video size change, recording status change, etc.
  12. Multi-threaded processing with a dedicated decoding thread, so the main interface never freezes.

4.2. Features

  1. Supports multiple decoding kernels at the same time: qmedia kernel (Qt4/Qt5/Qt6), ffmpeg kernel (ffmpeg2/ffmpeg3/ffmpeg4/ffmpeg5/ffmpeg6), vlc kernel (vlc2/vlc3), mpv kernel (mpv1/mpv2), mdk kernel, Hikvision sdk, easyplayer kernel, etc.
  2. With a very complete multiple base class design, adding a new decoding kernel only requires a very small amount of code to apply the entire mechanism, making it easy to expand.
  3. Supports multiple picture display strategies at the same time: automatic adjustment (if the original resolution is smaller than the display control, show it at the original resolution, otherwise scale it proportionally), proportional scaling (always scale proportionally), and stretch filling (always stretch to fill). All three picture display strategies are supported in every kernel and every video display mode.
  4. Supports multiple video display modes at the same time: handle mode (the control handle is passed in and the kernel draws on it directly), drawing mode (the callback receives the data, converts it to QImage, and draws it with QPainter), and GPU mode (the callback receives the data, converts it to yuv, and draws it with QOpenGLWidget).
  5. Supports multiple hardware acceleration types: ffmpeg can choose dxva2, d3d11va, etc.; vlc can choose any, dxva2, d3d11va; mpv can choose auto, dxva2, d3d11va; mdk can choose dxva2, d3d11va, cuda, mft, etc. Different system environments offer different types, for example Linux systems have vaapi and vdpau, and macOS systems have videotoolbox.
  6. The decoding thread and the display form are separated, and any decoding core can be specified to be mounted to any display form and switched dynamically.
  7. Supports a shared decoding thread, enabled by default and handled automatically. When the same video address is recognized, one decoding thread is shared, which in a network video environment greatly reduces network traffic and the streaming load on the source device. The top domestic video manufacturers all adopt this strategy: as long as one video stream is pulled, it can be shared to dozens or hundreds of display channels.
  8. Automatically identify the video rotation angle and draw it. For example, videos shot on mobile phones are generally rotated 90 degrees. They must be automatically rotated during playback, otherwise they will be upside down by default.
  9. Automatically identify changes in resolution during video stream playback and automatically adjust the size on the video controls. For example, the camera can dynamically configure the resolution during use, and the corresponding video controls must also respond synchronously when the resolution is changed.
  10. Audio and video files switch and loop automatically and seamlessly, with no visible switching traces such as a black screen.
  11. The video control also supports any decoding core, any picture display strategy, and any video display mode.
  12. The video control floating bar supports the handle, drawing, and GPU modes at the same time, and since it does not use absolute coordinates it can be moved around freely.
  13. The local camera device supports specifying the device name, resolution, and frame rate for playback.
  14. Local desktop collection supports setting the collection area, offset value, specified desktop index, frame rate, simultaneous collection of multiple desktops, etc.
  15. Recording files also support open video files, local cameras, local desktops, network video streams, etc.
  16. Opening and closing respond instantly: whether opening a nonexistent video or network stream, checking whether a device exists, or waiting out a read timeout, a close command immediately interrupts the previous operation and responds.
  17. Supports opening various picture files, and supports drag-and-drop playback of local audio and video files.
  18. The video streaming communication method can be tcp/udp. Some devices may only provide a certain protocol communication such as tcp. You need to specify the protocol method to open.
  19. You can set the connection timeout (timeout for video stream detection) and read timeout (timeout during acquisition).
  20. Supports frame-by-frame playback, provides previous frame/next frame function interface, and can view the collected images frame by frame.
  21. Audio files automatically extract album information such as title, artist, album, and album cover, and automatically display the album cover.
  22. Specially optimized for responsiveness: video display latency is extremely low at about 0.2 s, and opening a video stream is extremely fast at about 0.5 s.
  23. Supports H264/H265 encoding (more and more surveillance cameras now use H265 video stream format) to generate video files, and automatically recognizes and switches the encoding format internally.
  24. Supports playback of video streams containing special characters in user information (for example, characters such as +#@ in user information), with built-in parsing and escaping processing.
  25. Supports filters, various watermarks and graphic effects, supports multiple watermarks and images, and can write OSD tag information and various graphic information to MP4 files.
  26. Supports various audio formats in video streams, including AAC, PCM, G.726, G.711A, G.711Mu, G.711ulaw, G.711alaw, MP2L2, etc. It is recommended to choose AAC for the best cross-platform compatibility.
  27. The kernel ffmpeg uses pure qt+ffmpeg decoding and does not rely on third-party drawing and playback such as SDL. The gpu drawing uses qopenglwidget and the audio playback uses qaudiooutput.
  28. The ffmpeg and mdk kernels support Android; mdk also supports Android hardware decoding, with excellent performance.
  29. Audio and video tracks, i.e. program channels, can be switched. A ts file may contain multiple audio and video program streams, and you can choose which one to play, either before playback or dynamically during playback.
  30. The video rotation angle can be set before playback and dynamically set during playback.
  31. The video control floating bar comes with functions such as starting and stopping recording, muting the sound, taking screenshots, and closing the video.
  32. The audio component supports sound waveform value data analysis. Waveform curves and columnar sound bars can be drawn based on this value. Sound amplitude signals are provided by default.
  33. Labels and graphic information support three drawing methods: drawing to mask layer, drawing to picture, and source drawing (corresponding information can be stored in a file).
  34. By passing in a url address, the address can bring communication protocol, resolution, frame rate and other information without any other settings.
  35. Three strategies are supported for saving videos to files: automatic processing, file only, and all transcoding. The transcoding strategy supports automatic identification, conversion to 264, and conversion to 265. Encoding saving supports specified resolution scaling or equal scaling. For example, if you have requirements on the size of the saved file, you can specify scaling before saving.
  36. Supports encrypted saving files and decrypted playback files, and you can specify the secret key text.
  37. It supports electronic magnification. Switch to the electronic magnification mode on the floating bar, select the area that needs to be enlarged on the screen, and it will automatically enlarge after the selection. It can be reset by switching the magnification mode again.
  38. Extremely detailed log output in every component, especially error messages, with a unified print format. This is very useful when testing complex device environments on site, as it pinpoints exactly which channel and which step went wrong.
  39. At the same time, simple examples, video players, multi-screen video monitoring, monitoring playback, frame-by-frame playback, multi-screen rendering and other separate form examples are provided to specifically demonstrate how to use the corresponding functions.
  40. The code framework and structure are optimized to the best, the performance is powerful, and it is continuously updated and upgraded.
  41. The source code supports Qt4, Qt5, and Qt6 and is compatible with all versions.
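The shared decoding thread described in point 7 above can be sketched as a reference-counted pool keyed by URL; the class and names below are illustrative only, not the project's actual implementation:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical decoder handle; in the real player this would wrap a decoding thread.
struct SharedDecoder {
    std::string url;
    explicit SharedDecoder(std::string u) : url(std::move(u)) {}
};

// One decoder per unique URL: every display channel asking for the same address
// receives the same shared instance, so the stream is pulled only once.
class DecoderPool {
public:
    std::shared_ptr<SharedDecoder> acquire(const std::string &url)
    {
        auto it = pool.find(url);
        if (it != pool.end()) {
            if (auto existing = it->second.lock()) {
                return existing;  // reuse the decoder that is already running
            }
        }
        auto decoder = std::make_shared<SharedDecoder>(url);
        pool[url] = decoder;  // weak_ptr: the pool itself does not keep decoders alive
        return decoder;
    }

private:
    std::map<std::string, std::weak_ptr<SharedDecoder>> pool;
};
```

When the last display channel releases its shared_ptr, the decoder is destroyed and the next acquire of that URL starts a fresh one.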

4.3. Video controls

  1. Any number of osd tag items can be added dynamically. Tag information includes name, visibility, font size, text content, text color, background color, tag image, tag coordinates, tag format (text, date, time, date time, picture), and tag position (upper left corner, lower left corner, upper right corner, lower right corner, center, custom coordinates).
  2. Any amount of graphic information can be added dynamically, which is very useful. For example, the graphic area information analyzed by the artificial intelligence algorithm can be directly sent to the video control. Graphic information supports arbitrary shapes and is drawn directly on the original image using absolute coordinates.
  3. Graphic information includes name, border size, border color, background color, rectangular area, path set, point coordinate set, etc.
  4. Each graphic information can specify one or more of the three areas, and the specified areas will be drawn.
  5. Built-in floating bar control, the floating bar position supports top, bottom, left and right.
  6. The parameters of the floating bar control include margins, spacing, background transparency, background color, text color, pressed color, position, button icon code set, button name identification set, and button prompt information set.
  7. The row of tool buttons in the floating bar control can be customized. Through the structure parameter settings, the icon can choose graphic fonts or custom pictures.
  8. The floating bar button internally implements functions such as video switching, capturing screenshots, mute switching, and closing videos. You can also add your own corresponding functions in the source code.
  9. Floating bar buttons that already implement a function have corresponding icon-switching handling. For example, after the record button is pressed it switches to a recording icon, and after the sound button is toggled it becomes a mute icon; toggling again restores it.
  10. When the floating bar button is clicked, it is sent out with a unique name as a signal, and can be associated with response processing by itself.
  11. Prompt information can be displayed in the blank area of the floating bar. The current video resolution is shown by default, and information such as frame rate and bit stream size can be added.
  12. Video control parameters include border size, border color, focus color, background color (transparent by default), text color (global text color by default), fill color (the blank space outside the video is filled with black), background text, background image (if set, the image takes priority), whether to copy the image, scale display mode (automatic adjustment, proportional scaling, stretch filling), video display mode (handle, drawing, GPU), whether to enable the floating bar, floating bar size (height when horizontal, width when vertical), and floating bar position (top, bottom, left, right).

5. Related codes

void FFmpegHelper::getTracks(AVFormatContext *formatCtx, QList<int> &audioTracks, QList<int> &videoTracks)
{
    //get the audio and video track information (usually one audio or one video; a ts program file may have several)
    audioTracks.clear();
    videoTracks.clear();
    int count = formatCtx->nb_streams;
    for (int i = 0; i < count; ++i) {
        AVMediaType type = FFmpegHelper::getMediaType(formatCtx->streams[i]);
        if (type == AVMEDIA_TYPE_AUDIO) {
            audioTracks << i;
        } else if (type == AVMEDIA_TYPE_VIDEO) {
            videoTracks << i;
        }
    }
}

bool FFmpegThread::initVideo()
{
    //find the video stream index
    videoIndex = av_find_best_stream(formatCtx, AVMEDIA_TYPE_VIDEO, -1, -1, NULL, 0);
    if (videoIndex < 0) {
        //some files have no video stream, so there is no need to return here
        videoIndex = -1;
        debug(0, "no video stream", "");
    } else {
        //if a track was specified manually then use it (a program stream has multiple tracks, one of which can be specified)
        if (videoTrack >= 0 && videoTracks.contains(videoTrack)) {
            videoIndex = videoTrack;
        }

        //take out the stream and use its information to create the decoder
        int result = -1;
        AVStream *videoStream = formatCtx->streams[videoIndex];

        //if a rotation angle was set manually, write it into the stream info so the saving side applies it too (comment this out if saved files should not be rotated)
        if (rotate != -1) {
            FFmpegHelper::setRotate(videoStream, rotate);
        }

        //get the rotation angle first (hardware acceleration cannot be used when there is a rotation angle)
        this->getRotate();
        if (rotate != 0) {
            hardware = "none";
        }

        //look up the video codec id (not needed if the fifth argument of av_find_best_stream above was used)
        AVCodecID codecID = FFmpegHelper::getCodecId(videoStream);
        if (codecID == AV_CODEC_ID_NONE) {
            debug(result, "no video codec", "");
            return false;
        }

        //get the default decoder (check for null before using it)
        videoCodec = avcodec_find_decoder(codecID);
        if (!videoCodec) {
            debug(result, "no video codec", "");
            return false;
        }
        videoCodecName = videoCodec->name;

        //create the video decoder context
        videoCodecCtx = avcodec_alloc_context3(NULL);
        if (!videoCodecCtx) {
            debug(result, "create decoder", "");
            return false;
        }

        result = FFmpegHelper::copyContext(videoCodecCtx, videoStream, false);
        if (result < 0) {
            debug(result, "video params", "");
            return false;
        }

        //initialize hardware acceleration (also called hard decoding; if the current format does not support it, fall back to software decoding immediately)
        if (hardware != "none" && !initHardware()) {
            hardware = "none";
            videoCodec = avcodec_find_decoder(codecID);
        }

        if (!videoCodec) {
            return false;
        }

        //set low-latency and fast-decode parameters (setting max_lowres will most likely decode at the smallest resolution)
        if (!getIsFile()) {
            //videoCodecCtx->lowres = videoCodec->max_lowres;
            videoCodecCtx->flags |= AV_CODEC_FLAG_LOW_DELAY;
            videoCodecCtx->flags |= AV_CODEC_FLAG_GLOBAL_HEADER;
            videoCodecCtx->flags2 |= AV_CODEC_FLAG2_FAST;
        }

        //open the video decoder
        result = avcodec_open2(videoCodecCtx, videoCodec, NULL);
        if (result < 0) {
            debug(result, "open decoder", "");
            return false;
        }

        if (videoCodecCtx->pix_fmt == AV_PIX_FMT_NONE) {
            debug(0, "format empty", "");
            return false;
        }

        //get the resolution
        FFmpegHelper::getResolution(videoStream, videoWidth, videoHeight);
        //return if the width and height could not be obtained
        if (videoWidth == 0 || videoHeight == 0) {
            debug(0, "no resolution", "");
            return false;
        }

        //record the first-frame start time and the decoder name
        videoFirstPts = videoStream->start_time;
        videoCodecName = videoCodec->name;
        frameRate = FFmpegHelper::getFrameRate(videoStream, formatName);
        QString msg = QString("index: %1 codec: %2 fps: %3 size: %4x%5 rotate: %6").arg(videoIndex).arg(videoCodecName).arg(frameRate).arg(videoWidth).arg(videoHeight).arg(rotate);
        debug(0, "video info", msg);
    }

    return true;
}
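The track-override logic at the top of initVideo (fall back to av_find_best_stream's choice unless the user picked a valid track) can be isolated as a small helper; this is a sketch with illustrative names, not code from the project:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Pick the effective stream index: prefer the user-selected track when it is a
// valid entry in the track list, otherwise keep the index that
// av_find_best_stream (or a similar lookup) already found.
int effectiveIndex(int bestIndex, int userTrack, const std::vector<int> &tracks)
{
    bool valid = userTrack >= 0
        && std::find(tracks.begin(), tracks.end(), userTrack) != tracks.end();
    return valid ? userTrack : bestIndex;
}
```

Keeping this check pure makes it easy to unit-test the switching rule without opening any media file.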

void MdkThread::setAudioTrack(int audioTrack)
{
    this->audioTrack = audioTrack;
    if (audioTrack >= 0 && audioTracks.count() > 1 && audioTracks.contains(audioTrack)) {
        mdkPlayer->setAudioTrack(audioTrack);
    }
}

void MdkThread::setVideoTrack(int videoTrack)
{
    this->videoTrack = videoTrack;
    if (videoTrack >= 0 && videoTracks.count() > 1 && videoTracks.contains(videoTrack)) {
        mdkPlayer->setVideoTrack(videoTrack);
    }
}
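With the raw FFmpeg API (unlike the mdk kernel above, which handles track switching itself), switching a track mid-playback amounts to changing the stream index the demux loop filters on and flushing the decoder. A minimal sketch, where audioIndex and audioCodecCtx are assumed to exist in the surrounding player code:

```cpp
extern "C" {
#include <libavcodec/avcodec.h>
}

// Sketch of switching the audio track during playback with the raw FFmpeg API.
// The demux loop is assumed to send only packets whose stream_index equals
// audioIndex to the decoder; switching just retargets that index and flushes
// the codec so frames buffered from the old track are dropped.
void switchAudioTrack(AVCodecContext *audioCodecCtx, int &audioIndex, int newTrack)
{
    audioIndex = newTrack;                 //demux loop now keeps packets of this stream
    avcodec_flush_buffers(audioCodecCtx);  //discard decoder state from the old track
}
```

If the new track uses a different codec, the decoder context would also need to be recreated rather than just flushed; this sketch assumes the tracks share one codec.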

Origin blog.csdn.net/feiyangqingyun/article/details/134762923