Qt/C++ Audio and Video Development 60 - Coordinate Picking / Press the Mouse to Obtain a Rectangular Area / Convert to the Real Coordinates of the Video Source

1. Introduction

By picking up the coordinates where the mouse is pressed on a channel's picture, then tracking the mouse until it is released, a rectangular area can be drawn from the press and release coordinates. This area can serve as a hotspot or as a region to be electronically magnified, and obtaining it has many uses. The picture inside the area can be enlarged directly, or the area's position can be sent to the device, which then treats it as a focused observation zone for artificial-intelligence analysis: faces appearing in the area can be judged as intrusion, changes to the picture in the area can be flagged as tampering, objects in the area can be judged as illegally moved, and so on. By applying different analysis algorithms, many detection effects can be achieved. All of them share one premise: the user must be able to freely select the area he needs on the video picture. That is the function to be implemented here.

The collected video data may be displayed stretched to fill the UI, or scaled proportionally; either way, the display widget is almost never exactly the same size as the video resolution. So a conversion is involved: based on the widget size and the video size, the coordinates of the mouse click must be mapped to the corresponding coordinates in the video. The conversion formula is: videoX = mouseX / widgetWidth * videoWidth, videoY = mouseY / widgetHeight * videoHeight. It is therefore enough to handle the mouse press/move/release events on the video display control and finally emit a signal carrying the event type (press/move/release) and the QPoint coordinates. Why carry the type? It makes processing convenient for the user: on press, remember the coordinates; on move, draw the selection box; on release, send the filter that performs the cropping, which is the electronic magnification operation.
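A minimal sketch of this mechanism, assuming the stretch-fill display mode (with proportional scaling the offset of the black borders must be subtracted first); the class name, signal name, and videoSize member are illustrative, not the library's actual API:

#include <QWidget>
#include <QMouseEvent>

class SelectWidget : public QWidget
{
    Q_OBJECT
public:
    explicit SelectWidget(QWidget *parent = 0) : QWidget(parent) {}
    void setVideoSize(const QSize &size) { videoSize = size; }

signals:
    //type: 0 = pressed, 1 = moved, 2 = released
    void receivePoint(int type, const QPoint &point);

protected:
    void mousePressEvent(QMouseEvent *e) { emit receivePoint(0, toVideoPoint(e->pos())); }
    void mouseMoveEvent(QMouseEvent *e) { emit receivePoint(1, toVideoPoint(e->pos())); }
    void mouseReleaseEvent(QMouseEvent *e) { emit receivePoint(2, toVideoPoint(e->pos())); }

private:
    //video X = coordinate X / widget width * video width
    //video Y = coordinate Y / widget height * video height
    QPoint toVideoPoint(const QPoint &pos) const {
        int x = (double)pos.x() / this->width() * videoSize.width();
        int y = (double)pos.y() / this->height() * videoSize.height();
        return QPoint(x, y);
    }

    QSize videoSize;
};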

2. Effect screenshot

(Screenshot: selecting a rectangular area on the video picture with the mouse.)

3. Experience address

  1. Domestic site: https://gitee.com/feiyangqingyun
  2. International site: https://github.com/feiyangqingyun
  3. Personal work: https://blog.csdn.net/feiyangqingyun/article/details/97565652
  4. Experience address: https://pan.baidu.com/s/1d7TH_GEYl5nOecuNlWJJ7g Extraction code: 01jf File name: bin_video_demo.
  5. Video homepage: https://space.bilibili.com/687803542

4. Functional features

4.1. Basic functions

  1. Supports various audio and video file formats, such as mp3, wav, mp4, asf, rm, rmvb, mkv, etc.
  2. Supports local camera devices and local desktop capture, including multiple devices and multiple screens.
  3. Supports various video stream formats, such as rtp, rtsp, rtmp, http, udp, etc.
  4. For local and network audio/video files, automatically identifies the duration, playback progress, volume, mute status, etc.
  5. For files, you can specify the playback position, adjust the volume, set the mute status, etc.
  6. Supports multi-speed playback of files, with selectable speeds such as 0.5x, 1.0x, 2.5x, 5.0x, covering both slow and fast playback.
  7. Supports starting, stopping, pausing, and resuming playback.
  8. Supports capturing screenshots; you can specify the file path and choose whether to automatically display a preview after the capture completes.
  9. Supports video recording: manually start and stop recording; some kernels support resuming after pausing a recording, skipping the parts that do not need to be recorded.
  10. Supports mechanisms such as seamless loop playback and automatic reconnection.
  11. Provides signals for playback started, playback finished, decoded picture received, captured picture received, video size changed, recording status changed, etc.
  12. Multi-threaded processing, with one decoding thread per video, so the main interface never freezes.

4.2. Features

  1. Supports multiple decoding kernels at the same time: the qmedia kernel (Qt4/Qt5/Qt6), the ffmpeg kernel (ffmpeg2/ffmpeg3/ffmpeg4/ffmpeg5/ffmpeg6), the vlc kernel (vlc2/vlc3), the mpv kernel (mpv1/mpv2), the mdk kernel, the Hikvision sdk, the easyplayer kernel, etc.
  2. Comes with a very complete set of base classes; adding a new decoding kernel requires only a small amount of code to plug into the whole mechanism, making it easy to extend.
  3. Supports multiple picture display strategies at the same time: automatic adjustment (if the original resolution is smaller than the display control, display at the original resolution, otherwise scale proportionally), proportional scaling (always keep the aspect ratio), and stretch filling (always stretch to fill). All three strategies are supported by every kernel and every video display mode; a sketch of the three strategies follows this list.
  4. Supports multiple video display modes at the same time: handle mode (the control's handle is passed to the kernel, which draws on it directly), drawing mode (a callback receives the data, converts it to QImage, and draws it with QPainter), and GPU mode (a callback receives the data, converts it to YUV, and draws it with QOpenGLWidget).
  5. Supports multiple hardware acceleration types: ffmpeg can choose dxva2, d3d11va, etc.; vlc can choose any, dxva2, d3d11va; mpv can choose auto, dxva2, d3d11va; mdk can choose dxva2, d3d11va, cuda, mft, etc. Different system environments offer different types; for example, Linux has vaapi and vdpau, and macOS has videotoolbox.
  6. The decoding thread is separated from the display widget; any decoding kernel can be mounted to any display widget and switched dynamically.
  7. Supports a shared decoding thread, enabled by default and handled automatically. When the same video address is recognized, one decoding thread is shared, which in a network video environment greatly saves network traffic and reduces the streaming pressure on the source device. The top domestic video manufacturers all adopt this strategy: once a single video stream is pulled, it can be shared to dozens or even hundreds of display channels.
  8. Automatically identifies the video rotation angle and draws accordingly; for example, videos shot on mobile phones are generally rotated 90 degrees and must be rotated automatically during playback, otherwise they are displayed rotated by default.
  9. Automatically identifies resolution changes during video stream playback and adjusts the size of the video control accordingly; for example, a camera's resolution can be reconfigured while in use, and the video control responds to the change synchronously.
  10. Audio and video files switch and loop seamlessly, with no visible switching artifacts such as a black screen.
  11. The video control likewise supports any decoding kernel, any picture display strategy, and any video display mode.
  12. The video control's floating bar supports all three modes (handle, drawing, GPU) at the same time, and it uses non-absolute coordinates so it can be moved around.
  13. Local camera devices support specifying the device name, resolution, and frame rate for playback.
  14. Local desktop capture supports setting the capture area, offset, desktop index, and frame rate, as well as capturing multiple desktops at the same time; it also supports capturing a fixed window by specifying its title.
  15. Recording likewise supports open video files, local cameras, local desktops, network video streams, etc.
  16. Opening and closing respond instantly: whether opening a non-existent video or network stream, detecting whether a device exists, or waiting on a read timeout, a close command immediately interrupts the previous operation and responds.
  17. Supports opening various picture files, and supports drag-and-drop playback of local audio and video files.
  18. The video stream transport can be tcp or udp; some devices only provide one protocol, such as tcp, so the protocol must be specified when opening.
  19. You can set the connection timeout (for detecting the video stream) and the read timeout (during collection).
  20. Supports frame-by-frame playback, providing previous-frame/next-frame interfaces so the captured images can be examined one frame at a time.
  21. For audio files, album information such as the title, artist, album, and album cover is extracted automatically, and the album cover is displayed automatically.
  22. Video latency is extremely low, about 0.2 s, and opening a video stream responds extremely fast, about 0.5 s; both have been specially optimized.
  23. Supports generating video files with H264/H265 encoding (more and more surveillance cameras now use the H265 video stream format), automatically recognizing and switching the encoding format internally.
  24. Supports playback of video streams whose user information contains special characters (such as +#@), with built-in parsing and escaping.
  25. Supports filters, various watermarks and graphic effects, including multiple watermarks and images, and can write OSD label information and graphics into MP4 files.
  26. Supports various audio formats in video streams, including AAC, PCM, G.726, G.711A, G.711Mu, G.711ulaw, G.711alaw, MP2L2, etc.; AAC is recommended for the best cross-platform compatibility.
  27. The ffmpeg kernel uses pure Qt + FFmpeg decoding and does not rely on third-party drawing or playback libraries such as SDL; GPU drawing uses QOpenGLWidget and audio playback uses QAudioOutput.
  28. The ffmpeg and mdk kernels support Android; mdk also supports Android hardware decoding, with outstanding performance.
  29. Audio and video tracks (program channels) can be switched; a ts file may contain multiple audio/video program streams, and you can choose which one to play, either before playback or dynamically during playback.
  30. The video rotation angle can be set before playback and changed dynamically during playback.
  31. The video control's floating bar provides functions such as starting/stopping recording, muting, taking screenshots, and closing the video.
  32. The audio component supports analysis of sound waveform data; waveform curves and bar-style sound level displays can be drawn from these values, and an amplitude signal is provided by default.
  33. Labels and graphics support three drawing methods: drawing onto a mask layer, drawing onto the picture, and drawing at the source (so the corresponding information can be stored into the file).
  34. Information such as the communication protocol, resolution, and frame rate can all be carried in the url address that is passed in, without any other settings.
  35. Three strategies are supported for saving video to file: automatic handling, file only, and full transcoding. The transcoding strategy supports automatic identification, conversion to 264, and conversion to 265. Encoded saving supports scaling to a specified resolution or proportional scaling; for example, if the size of the saved file matters, you can specify a scale before saving.
  36. Supports saving files encrypted and playing them back decrypted, with a configurable key text.
  37. The provided monitoring layout class supports up to 64 channels displayed at once, as well as various irregular layouts, such as a 13-channel layout or a 6-row by 2-column layout for mobile phones; layouts can be freely defined.
  38. Supports electronic magnification: switch to electronic magnification mode on the floating bar, select the area to enlarge on the picture, and it is magnified automatically after the selection; switching the mode again resets it.
  39. Extremely detailed log output in every component, especially for error messages, with a unified, packaged print format. This is extremely useful when testing complex on-site device environments, as it pinpoints the exact channel and step where something went wrong.
  40. Provides separate example forms such as a simple demo, a video player, multi-screen video monitoring, monitoring playback, frame-by-frame playback, and multi-screen rendering, each demonstrating how to use the corresponding functionality.
  41. Monitoring playback can select the manufacturer type, playback time period, user information, and channel, and supports changing the playback progress.
  42. A sound card can be selected from the sound card device drop-down box, and a corresponding interface is provided for switching sound cards.
  43. Supports compiling to a mobile app, with a dedicated mobile layout interface, so it can be used for video surveillance on a phone.
  44. The code framework and structure are highly optimized, with strong performance, detailed comments, and continuous iterative updates and upgrades.
  45. The source code supports windows, linux, mac, android, etc., including various domestic Linux systems such as Tongxin UOS, Winning Kirin, and Galaxy Kirin, as well as embedded linux.
  46. The source code supports Qt4, Qt5, and Qt6, and is compatible with all versions.

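As a supplement to item 3, here is a minimal sketch (not the library's actual code) of how the three picture display strategies can compute the target rectangle for a frame, using only stock Qt classes; the enum and function names are illustrative:

#include <QRect>
#include <QSize>

enum ScaleMode { ScaleAuto, ScaleAspect, ScaleFill };

QRect getDisplayRect(const QSize &videoSize, const QSize &widgetSize, ScaleMode mode)
{
    //stretch filling: always use the whole widget
    if (mode == ScaleFill) {
        return QRect(QPoint(0, 0), widgetSize);
    }

    //automatic adjustment: keep the original resolution when it fits,
    //otherwise fall through to proportional scaling
    QSize size = videoSize;
    if (mode == ScaleAspect || size.width() > widgetSize.width() || size.height() > widgetSize.height()) {
        size.scale(widgetSize, Qt::KeepAspectRatio);
    }

    //center the resulting rectangle inside the widget
    int x = (widgetSize.width() - size.width()) / 2;
    int y = (widgetSize.height() - size.height()) / 2;
    return QRect(QPoint(x, y), size);
}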
4.3. Video controls

  1. Any number of osd labels can be added dynamically. Label information includes the name, visibility, font size, text, text color, background color, label image, label coordinates, label format (text, date, time, date-time, picture), and label position (upper-left corner, lower-left corner, upper-right corner, lower-right corner, center, custom coordinates).
  2. Any number of graphics can be added dynamically; for example, the region information produced by an artificial-intelligence analysis algorithm can be sent directly to the video control. Graphics support arbitrary shapes and are drawn directly on the original image using absolute coordinates (a usage sketch follows this list).
  3. Graphic information includes the name, border width, border color, background color, rectangular area, path set, point coordinate set, etc.
  4. Each graphic can specify one or more of the three drawing methods, and it is drawn with each method specified.
  5. A floating bar control is built in; its position can be top, bottom, left, or right.
  6. The floating bar control's parameters include margins, spacing, background transparency, background color, text color, pressed color, position, button icon code set, button name identifier set, and button tooltip set.
  7. The row of tool buttons in the floating bar can be customized; via a structure parameter, each icon can be a graphic font glyph or a custom picture.
  8. The floating bar buttons internally implement functions such as switching recording, capturing screenshots, toggling mute, and closing the video; you can also add your own functions in the source code.
  9. Buttons whose functions are implemented also switch their icons accordingly; for example, after the record button is pressed it changes to a recording icon, and after the sound button is toggled it becomes a mute icon, switching back when toggled again.
  10. When a floating bar button is clicked, a signal is emitted carrying its unique name, which you can connect to your own handler.
  11. Prompt information can be shown in the blank area of the floating bar; the current video resolution is shown by default, and information such as the frame rate and bitrate can be added.
  12. Video control parameters include the border width, border color, focus color, background color (transparent by default), text color (global text color by default), fill color (the blank space outside the video is filled with black), background text, background image (if set, the image takes priority), whether to copy the picture, scaling display mode (automatic adjustment, proportional scaling, stretch filling), video display mode (handle, drawing, GPU), floating bar enabled, floating bar size (height when horizontal, width when vertical), and floating bar position (top, bottom, left, right).
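To make items 2 to 4 concrete, here is a hedged usage sketch; it assumes only the GraphInfo fields actually used by the code in the next section (name, rect, borderWidth, borderColor, bgColor), and videoWidget stands for any video control instance:

//draw a detection box reported by an analysis algorithm
GraphInfo graph;
graph.name = "person1";
graph.rect = QRect(100, 80, 200, 150); //absolute coordinates on the original image
graph.borderWidth = 2;
graph.borderColor = QColor(255, 0, 0);
graph.bgColor = Qt::transparent;       //transparent means outline only, otherwise the box is filled
videoWidget->appendGraph(graph);

//remove it again by name once the target disappears
videoWidget->removeGraph("person1");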

5. Related code

void VideoWidget::btnClicked(const QString &btnName)
{
    //build a file name from the video flag and the current timestamp
    QString flag = widgetPara.videoFlag;
    QString name = STRDATETIMEMS;
    if (!flag.isEmpty()) {
        name = QString("%1_%2").arg(flag).arg(name);
    }

    if (btnName.endsWith("btnRecord")) {
        //start recording to an mp4 file
        QString fileName = QString("%1/%2.mp4").arg(recordPath).arg(name);
        this->recordStart(fileName);
    } else if (btnName.endsWith("btnStop")) {
        //stop recording
        this->recordStop();
    } else if (btnName.endsWith("btnSound")) {
        //mute the sound
        this->setMuted(true);
    } else if (btnName.endsWith("btnMuted")) {
        //unmute the sound
        this->setMuted(false);
    } else if (btnName.endsWith("btnSnap")) {
        //capture a screenshot
        QString snapName = QString("%1/%2.jpg").arg(snapPath).arg(name);
        this->snap(snapName, false);
    } else if (btnName.endsWith("btnCrop")) {
        //enter electronic magnification mode (currently only the ffmpeg kernel)
        if (videoThread) {
            if (videoPara.videoCore == VideoCore_FFmpeg) {
                QMetaObject::invokeMethod(videoThread, "setCrop", Q_ARG(bool, true));
            }
        }
    } else if (btnName.endsWith("btnReset")) {
        //exit electronic magnification mode and remove the selection rectangle
        if (videoThread) {
            this->removeGraph("crop");
            if (videoPara.videoCore == VideoCore_FFmpeg) {
                QMetaObject::invokeMethod(videoThread, "setCrop", Q_ARG(bool, false));
            }
        }
    } else if (btnName.endsWith("btnAlarm")) {

    } else if (btnName.endsWith("btnClose")) {
        //close the video
        this->stop();
    }
}
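The btnCrop/btnReset branches above only toggle the crop filter; one plausible way (not the library's actual implementation) to wire them to the mouse selection described in the introduction is a slot like the following, where the type codes match the earlier sketch and pressedPoint is a hypothetical member:

void VideoWidget::receivePoint(int type, const QPoint &point)
{
    if (type == 0) {
        //remember the press position in video coordinates
        pressedPoint = point;
    } else if (type == 1 || type == 2) {
        //redraw the selection rectangle while dragging
        this->removeGraph("crop");
        GraphInfo graph;
        graph.name = "crop"; //keyword recognized by FilterHelper::getFilter
        graph.rect = QRect(pressedPoint, point).normalized();
        this->appendGraph(graph);
        if (type == 2 && videoPara.videoCore == VideoCore_FFmpeg) {
            //selection finished: apply the crop filter for electronic magnification
            QMetaObject::invokeMethod(videoThread, "setCrop", Q_ARG(bool, true));
        }
    }
}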

void AbstractVideoWidget::appendGraph(const GraphInfo &graph)
{
    //append a graphic under the mutex and schedule a repaint
    QMutexLocker locker(&mutex);
    listGraph << graph;
    this->update();
    emit sig_graphChanged();
}

void AbstractVideoWidget::removeGraph(const QString &name)
{
    //remove the first graphic whose name matches
    QMutexLocker locker(&mutex);
    int count = listGraph.count();
    for (int i = 0; i < count; ++i) {
        if (listGraph.at(i).name == name) {
            listGraph.removeAt(i);
            break;
        }
    }

    this->update();
    emit sig_graphChanged();
}

void AbstractVideoWidget::clearGraph()
{
    //remove all graphics
    QMutexLocker locker(&mutex);
    listGraph.clear();
    this->update();
    emit sig_graphChanged();
}

QString FilterHelper::getFilter(const GraphInfo &graph, bool hardware)
{
    //drawbox=x=10:y=10:w=100:h=100:c=#ffffff@1:t=2
    QString filter;
    //graphic filters under hardware decoding can distort the colors of the original image
    if (hardware) {
        return filter;
    }

    //only the rectangular area is implemented for now
    QRect rect = graph.rect;
    if (rect.isEmpty()) {
        return filter;
    }

    //the "crop" keyword is reserved for electronic magnification
    if (graph.name == "crop") {
        filter = QString("crop=%1:%2:%3:%4").arg(rect.width()).arg(rect.height()).arg(rect.x()).arg(rect.y());
        return filter;
    }

    QStringList list;
    list << QString("x=%1").arg(rect.x());
    list << QString("y=%1").arg(rect.y());
    list << QString("w=%1").arg(rect.width());
    list << QString("h=%1").arg(rect.height());

    QColor color = graph.borderColor;
    list << QString("c=%1@%2").arg(color.name()).arg(color.alphaF());

    //fill with the background color when it is not transparent
    if (graph.bgColor == Qt::transparent) {
        list << QString("t=%1").arg(graph.borderWidth);
    } else {
        list << QString("t=%1").arg("fill");
    }

    filter = QString("drawbox=%1").arg(list.join(":"));
    return filter;
}
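For reference, with the fields used above, this function produces strings like the following (values illustrative): a graphic named "crop" covering a 400x300 area whose top-left corner is (120, 90) yields the ffmpeg crop filter

crop=400:300:120:90

while any other rectangle with a 2-pixel, fully opaque red border and a transparent background yields a drawbox filter

drawbox=x=120:y=90:w=400:h=300:c=#ff0000@1:t=2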

QString FilterHelper::getFilters(const QStringList &listFilter)
{
    //take out the image and position of each picture filter in turn
    int count = listFilter.count();
    QStringList listImage, listPosition, listTemp;
    for (int i = 0; i < count; ++i) {
        QString filter = listFilter.at(i);
        if (filter.startsWith("movie=")) {
            QStringList list = filter.split(";");
            QString movie = list.first();
            QString overlay = list.last();
            movie.replace("[wm]", "");
            overlay.replace("[wm]", "");
            overlay.replace("[in]", "");
            overlay.replace("[out]", "");
            listImage << movie;
            listPosition << overlay;
        } else {
            listTemp << filter;
        }
    }

    //the picture filter string is reassembled below
    QString filterImage, filterAll;
    QString filterOther = listTemp.join(",");

    //picture watermarks require the filter string to be rearranged
    //1 image: movie=./osd.png[wm0];[in][wm0]overlay=0:0[out]
    //2 images: movie=./osd.png[wm0];movie=./osd.png[wm1];[in][wm0]overlay=0:0[a];[a][wm1]overlay=0:0[out]
    //3 images: movie=./osd.png[wm0];movie=./osd.png[wm1];movie=./osd.png[wm2];[in][wm0]overlay=0:0[a0];[a0][wm1]overlay=0:0[a1];[a1][wm2]overlay=0:0[out]
    count = listImage.count();
    if (count > 0) {
        //add the stream identifiers plus the head and tail identifiers
        for (int i = 0; i < count; ++i) {
            QString flag = QString("[wm%1]").arg(i);
            listImage[i] = listImage.at(i) + flag;
            listPosition[i] = flag + listPosition.at(i);
            listPosition[i] = (i == 0 ? "[in]" : QString("[a%1]").arg(i - 1)) + listPosition.at(i);
            listPosition[i] = listPosition.at(i) + (i == (count - 1) ? "[out]" : QString("[a%1]").arg(i));
        }

        QStringList filters;
        for (int i = 0; i < count; ++i) {
            filters << listImage.at(i);
        }
        for (int i = 0; i < count; ++i) {
            filters << listPosition.at(i);
        }

        //final string of the picture filter collection
        filterImage = filters.join(";");

        //if there are other filters they go in front
        if (listTemp.count() > 0) {
            filterImage.replace("[in]", "[other]");
            filterAll = "[in]" + filterOther + "[other];" + filterImage;
        } else {
            filterAll = filterImage;
        }
    } else {
        filterAll = filterOther;
    }

    return filterAll;
}
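A hedged example of what this reassembly produces, assuming picture filters arrive in the single-image form shown in the comments above:

//illustrative input: one drawbox filter plus two picture watermarks
QStringList listFilter;
listFilter << "drawbox=x=10:y=10:w=100:h=100:c=#ffffff@1:t=2";
listFilter << "movie=./osd1.png[wm];[in][wm]overlay=0:0[out]";
listFilter << "movie=./osd2.png[wm];[in][wm]overlay=10:10[out]";

QString all = FilterHelper::getFilters(listFilter);
//the other filters come first, then the renumbered picture chain:
//[in]drawbox=x=10:y=10:w=100:h=100:c=#ffffff@1:t=2[other];
//movie=./osd1.png[wm0];movie=./osd2.png[wm1];
//[other][wm0]overlay=0:0[a0];[a0][wm1]overlay=10:10[out]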

QStringList FilterHelper::getFilters(const QList<OsdInfo> &listOsd, const QList<GraphInfo> &listGraph, bool noimage, bool hardware)
{
    //collection of filter content strings
    QStringList listFilter;

    //append the label information
    foreach (OsdInfo osd, listOsd) {
        QString filter = FilterHelper::getFilter(osd, noimage);
        if (!filter.isEmpty()) {
            listFilter << filter;
        }
    }

    //append the graphic information
    foreach (GraphInfo graph, listGraph) {
        QString filter = FilterHelper::getFilter(graph, hardware);
        if (!filter.isEmpty()) {
            listFilter << filter;
        }
    }

    //append other filters
    QString filter = FilterHelper::getFilter();
    if (!filter.isEmpty()) {
        listFilter << filter;
    }

    return listFilter;
}

Origin: blog.csdn.net/feiyangqingyun/article/details/135041561