How to stamp PTS with a USB capture card

1. Use the PTS provided by the capture card
2. Stamp PTS manually
   1. Problems with the PTS of USB capture devices
   2. The relationship between capture-card drivers, UVC/UAC, and ffmpeg
   3. How to stamp PTS yourself
4. Audio and video synchronization tuning
5. Out-of-sync problems caused by network time-adjustment tools such as NTP

1. Use the PTS provided by the capture card

Here we use a PC camera and the PC's microphone/sound card; after capturing with ffmpeg, we get PTS for both audio and video.
We should make good use of these PTS, because their intervals are very precise: at a 48000 Hz sample rate with 1024 samples per read, the interval between audio frames is about 21 ms; for 1080p25 video capture, the interval between frames is 40 ms.
Before using them, we need to bring them onto the same timeline. One method is:
dif = wall-clock (epoch) time - the timestamp of the first captured frame
and then, for every packet, pts = pts + dif.
V:

AVDictionary *options = NULL;
av_dict_set(&options, "video_size", "1920x1080", 0);
av_dict_set(&options, "framerate", "30", 0);
// If these options are not set, ffmpeg falls back to its defaults,
// which the camera may not support.
int re = avformat_open_input(&ic, "/dev/video0", ifmt, &options);

A:

av_dict_set(&options, "sample_rate", "48000", 0); // only 48000 is supported; cannot be changed
av_dict_set(&options, "channels", "2", 0);        // fixed; cannot be changed
#include <sys/time.h>

long long GetCurTime()
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    // ZlogInfo("second:%ld\n", tv.tv_sec);                                     // seconds
    // ZlogInfo("millisecond:%lld\n", tv.tv_sec * 1000LL + tv.tv_usec / 1000);  // milliseconds
    // ZlogInfo("microsecond:%lld\n", tv.tv_sec * 1000000LL + tv.tv_usec);      // microseconds
    long long temp_time = tv.tv_sec * 1000000LL + tv.tv_usec; // 1000000LL avoids 32-bit overflow
    return temp_time;
}
if(this->dif_PTS == 0) this->dif_PTS = GetCurTime() - read_pkt.pts;
read_pkt.pts = read_pkt.pts + this->dif_PTS;
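The two lines above can be factored into a small helper. This is a sketch under the assumption that the device PTS is already in microseconds; the names `normalize_pts` and `g_dif` are mine, not from the original code:

```c
#include <stdint.h>

// Offset between the wall clock and the device's PTS timeline,
// computed once from the first packet.
static int64_t g_dif = 0;

// Map a device PTS onto the wall-clock timeline.
// 'now_us' is the current wall-clock time in microseconds (e.g. GetCurTime()).
static int64_t normalize_pts(int64_t device_pts, int64_t now_us)
{
    if (g_dif == 0)
        g_dif = now_us - device_pts;   // the first frame anchors the timeline
    return device_pts + g_dif;
}
```

Only the origin of the timeline is shifted; later packets keep the precise device-side spacing, which is exactly why the capture card's PTS is worth preserving.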

2. Stamp PTS manually

1. Problems with the PTS of USB capture devices.
Blackmagic DeckLink capture cards provide audio and video PTS, and they are reliable. Smaller manufacturers are not necessarily so. I used cards from one Taiwanese and one mainland-Chinese manufacturer: one is a PCI capture card (SDI/HDMI in, PCI out) that supports the UVC and UAC protocols; the other is a USB capture card (HDMI in, USB out). Both cost around 1,000 yuan.

In technical discussions with the original manufacturers of these two cards, the following PTS problems came up:
1. The audio and video PTS are unrelated; they share no common starting point or any other connection.
2. After an overnight test, the PTS were found to jump occasionally.
3. When capturing 1080p30, the video PTS advances by only 20 ms per frame instead of 33 ms; other resolutions do not have this problem.
Because most users of these devices are not developers, these problems go unnoticed. Developers need the original manufacturer to customize the driver.

2. The relationship between capture-card drivers, UVC/UAC, and ffmpeg

After chatting with the manufacturer on WeChat, I gained a preliminary understanding of USB capture cards. For audio, the capture card's driver does not provide PTS to the system: since the sample rate, channel count, and bit depth are fixed, the playback time corresponding to a given number of samples is also fixed. The capture card has a single crystal oscillator controlling the sampling clock, audio is delivered from the card to the system as it is captured, and the delay is negligible.
The video driver does not provide PTS either; it only hands the upper layer a label similar to a serial number (1, 2, 3, ...). There may be a delay of several frames, e.g. 4 frames, between capture and delivery to the system.
So what are the PTS on the audio and video we collect with ffmpeg? The audio PTS is microseconds since 1970, and the video PTS is microseconds since boot. Both are PTS that the UVC/UAC layer stamps onto the frames delivered by the capture-card driver.
The data path is therefore: capture card -> capture-card driver -> system kernel -> UVC/UAC -> ffmpeg.
When a hardware manufacturer fixes one problem, it takes several driver revisions, and other bugs are often introduced along the way. It seems programmers are the same the world over.
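A concrete consequence of the fixed audio clock described above: with a 48000 Hz sample rate and 1024 samples per read, the timestamp of each audio frame is fully determined by the running sample count, so no driver timestamp is needed. A minimal sketch (the function name is mine):

```c
#include <stdint.h>

#define SAMPLE_RATE       48000
#define SAMPLES_PER_FRAME 1024

// PTS in microseconds of the n-th audio frame (n = 0, 1, 2, ...),
// derived purely from the sample count.
static int64_t audio_frame_pts_us(int64_t frame_index)
{
    return frame_index * SAMPLES_PER_FRAME * 1000000LL / SAMPLE_RATE;
}
```

Frame 1 lands at 21333 microseconds, matching the ~21 ms audio interval observed in section 1.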

3. How to stamp PTS yourself

So how should we stamp PTS ourselves?
Collect audio and video separately in two threads:

av_read_frame()

We assign a wall-clock (epoch) PTS to each captured audio and video packet.
A:

re = av_read_frame(ic_a, &read_pkt_a);   
read_pkt_a.pts = GetCurTime();

V:

re = av_read_frame(ic, &read_pkt);   
read_pkt.pts = GetCurTime();
ZlogInfo("dif:%ld\n", (shm_->GetCurTime() - LAST_PTS) / 1000);
LAST_PTS = shm_->GetCurTime();
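The two capture loops above can be structured as follows. This is a self-contained sketch in which the blocking av_read_frame() call is replaced by a stub sleep; the thread functions, context struct, and iteration count are mine, not from the original:

```c
#include <pthread.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/time.h>

#define N_PACKETS 5

static int64_t now_us(void)
{
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000000LL + tv.tv_usec;
}

typedef struct {
    int64_t pts[N_PACKETS];
    int     delay_us;          // stands in for the blocking read
} CaptureCtx;

// One capture loop: "read a packet", then stamp it with wall-clock time.
static void *capture_thread(void *arg)
{
    CaptureCtx *ctx = (CaptureCtx *)arg;
    for (int i = 0; i < N_PACKETS; i++) {
        usleep(ctx->delay_us);      // real code: av_read_frame(ic, &read_pkt)
        ctx->pts[i] = now_us();     // real code: read_pkt.pts = GetCurTime()
    }
    return NULL;
}

// Run the audio and video capture loops in parallel, as the text describes.
int run_capture(CaptureCtx *audio, CaptureCtx *video)
{
    pthread_t ta, tv;
    pthread_create(&ta, NULL, capture_thread, audio);
    pthread_create(&tv, NULL, capture_thread, video);
    pthread_join(ta, NULL);
    pthread_join(tv, NULL);
    return 0;
}
```

Each stream gets its own thread so a blocking read on one device never delays the timestamps of the other.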

The intervals between successive returns of this function for video are:
//1080p25 39 40 39 40 39 40 39 40 39 40 39 40 ms
and for audio:
//1080p25 41 0 42 0 42 0 41 0 41 0 42 0 ms
and sometimes:
//1080p25 21 0 41 21 0 40 22 0 41 1 40 0 42 0 41 22 0 41 0 42 0 42 1 40 22 0 41 21 1 40

A 0 does not mean no interval; it means less than 1 ms (precision is lost in the millisecond division).
The intervals are irregular because av_read_frame() maintains a cache: on each call it may fetch one or more frames of audio or video from the capture card into the cache, split them into packets, and return one frame; the next call returns directly from the cache. If the cache is empty, the call blocks until the capture card delivers data.

We need to make the intervals reasonably uniform: compare each PTS with the previous one, and sleep if the gap is below a threshold.

ZlogInfo("dif:%lld\n", (shm_->GetCurTime() - LAST_PTS) / 1000);
if ((shm_->GetCurTime() - LAST_PTS) / 1000 < 10) usleep(10000); // pad gaps below 10 ms
pkt->pts = shm_->GetCurTime();
ZlogInfo("dif_last:%lld\n", (shm_->GetCurTime() - LAST_PTS) / 1000);
LAST_PTS = shm_->GetCurTime();
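To see what this rule does, here is an offline simulation of it (no real sleeping; the `pace` function and the sample arrival times are mine, chosen to resemble the measured audio pattern). A packet that arrives less than 10 ms after the previous stamp gets its stamp pushed 10 ms later, which is what the usleep(10000) achieves in the real loop:

```c
#include <stdint.h>

#define THRESH_MS 10

// Simulate the compare-and-sleep rule: given a packet's arrival time (ms)
// and the previous stamp, return the time at which it is actually stamped.
static int64_t pace(int64_t arrival_ms, int64_t *last_ms)
{
    int64_t stamp = arrival_ms;
    if (arrival_ms - *last_ms < THRESH_MS)
        stamp = arrival_ms + THRESH_MS;   // models the usleep(10000)
    *last_ms = stamp;
    return stamp;
}
```

Feeding it the bursty pattern (gaps of 21, 0, 42, 0 ms) yields stamped gaps of 21, 10, 32, 10 ms: the sub-millisecond bursts are spread out to the 10 ms floor.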

Then print again as follows:
V:
//39 40 39 40 39 40 39 40 39 39 39 39 40 40
A:
//1080p25 31 12 30 10 32 10 31 11 31 10 32 11 30 10 32 10 32 10 30 10 31 11 31 10 31 10

4. Audio and video synchronization tuning
Because the video may be buffered for a few frames inside the capture card before being handed to ffmpeg, you can subtract about 100 ms from each video PTS and check how playback looks afterwards.

pkt->pts = shm_->GetCurTime() - 100000;

Players generally synchronize to the audio, and audio playback is linear and uniform, so the audio PTS itself carries little meaning: the gap between two audio PTS does not represent the playback interval. Its main purpose is to give the video something to synchronize against.
If you want the audio intervals to be more even, adjust the sleep duration above.

5. Out-of-sync problems caused by network time-adjustment tools such as NTP.
Each time the PC boots and connects to the network, the system's built-in NTP client checks the time and adjusts the system clock, so the PTS you stamp shifts after every boot. Each NTP adjustment may be on the order of 500 ms. You can disable it, or switch to a gentler tool whose adjustments are around 10 ms.

ffmpeg is indispensable for audio/video work, yet even after several years in the industry it still seems to hold endless secrets. If you are interested, add the author on WeChat (YQW1163720468) and join the discussion in the ffmpeg WeChat group. Remember to include the note: "ffmpeg lovers".


Reprinted from: blog.csdn.net/weixin_43466192/article/details/132102221