Live video streaming system and user-experience quality monitoring solution

Reposted from: https://www.upyun.com/opentalk/396.html

Speaker Profile

Senior streaming-media R&D engineer at Huya Live. He entered the live-streaming industry in 2012 and has worked on streaming-media development at Aodianyun and Huya Live. In 2013 he independently produced the Chinese translation of the "RTMP Protocol Specification 1.0". He has studied live-streaming and playback frameworks on Windows, Android, and iOS.

On June 24, the Shanghai stop of the UPYUN Open Talk | 2018 Audio and Video Technology Salon came to a close. It was the second stop of UPYUN's 2018 audio and video technology salon series. As UPYUN's signature sharing event, this Open Talk invited speakers from four companies: NetEase Cloud Music, Guren Cloud, UPYUN, and Huya Live. The four speakers brought out their best material and gave the on-site and live-stream audiences some excellent talks.

Shi Shuo, senior R&D engineer at Huya Live, gave the talk "Live video streaming system and user-experience quality monitoring solution". It focuses on how to build a live-streaming quality monitoring solution, and on optimization in three areas: stalling, latency monitoring, and instant first-screen startup.

  • Live-streaming quality evaluation system
  • Structure and logic of the live-streaming quality monitoring solution
  • Ten strategies for stall optimization
  • Latency monitoring: custom extensions and digital watermarking
  • Three directions for optimizing instant first-screen startup

 


The following is an edited transcript of Shi Shuo's talk.

Hello everyone, I am Shi Shuo, a senior engineer on the streaming-media platform at Huya Live. Today I will focus on two things. The first is the overall user-experience system for live and on-demand video; public talks that cover the whole user-experience system are relatively rare. The second is the "quality monitoring solution", which is also rarely discussed.

In essence, user experience and quality monitoring serve one goal: to make sure, as far as possible, that users get the best experience when watching a live stream.

My talk is divided into four parts:

  1. Live-streaming quality evaluation: what areas the user-experience system covers and what parts the overall user experience consists of.
  1. Stall monitoring and optimization: stall optimization depends on how the monitoring system detects stalls.
  1. Latency monitoring: the industry still lacks good ways to monitor "latency", and I have not seen a particularly effective approach in commercial solutions.
  1. Key points of first-screen optimization: this topic has been covered widely in the industry, so I will only list the main points.

Live-streaming quality evaluation system

In this part on live-streaming quality evaluation, I will talk about the quality evaluation systems for audio and video.

Audio and video quality evaluation started early. As early as 1996, the ITU already had a subjective assessment method for the quality of streaming audio and video, used at the time mainly to measure telephone call quality. In 2003 a separate MOS-based subjective evaluation system was proposed, and in 2012 and 2013 the MOS system was supplemented in several respects, which led to the vMOS system.

Today I will focus on Huawei's U-vMOS subjective quality evaluation system. On the one hand, there is more Chinese-language material available on this system. On the other hand, U-vMOS is an extension: the whole U-vMOS quality system is built on top of vMOS and also includes what vMOS covers.

The main purpose of MOS quality evaluation is to score audio or video quality based on the user's subjective experience. Conventionally the score is on a five-point scale, and the higher the score, the better the quality.

 

Level | MOS score | User satisfaction
Excellent | 5.0 | Very good; heard very clearly, no distortion, no perceptible delay
Good | 4.0 | Slightly worse; heard clearly, small delay, a little noise
Fair | 3.0 | Acceptable; not heard very clearly, some delay, some noise, some distortion
Poor | 2.0 | Barely acceptable; not heard clearly, considerable noise or dropouts, serious distortion
Bad | 1.0 | Very poor; silent or completely unintelligible, heavy noise

image.png 

The U-vMOS quality evaluation model

The MOS quality evaluation system targets audio quality. Video quality evaluation can be built as an extension on that basis; let's look at the U-vMOS quality evaluation system in detail.

U-vMOS video quality evaluation is divided into three parts:

  1. Video quality: the resolution, frame rate, bit rate, and encoding level of the video;
  1. Interactive experience: mainly the time the video takes to load;
  1. Viewing experience: mainly picture corruption and stalling.

Video quality depends on many factors. The ones for which we can obtain "typical values" and "scores" are mainly the screen size of the playback device and the resolution of the video; these two correlate strongly with the score.

Video quality score

image.png

Video quality: the relationship between video resolution and screen size

A score of 4 or above can be regarded as a good viewing experience. According to this table, on 4.5-inch and 5.5-inch phone screens you need at least a 720P stream to reach 4 points or more. For a mobile video service, if users are not especially demanding about picture quality, 720P is usually sufficient. Offering 1080P brings no major improvement in viewing experience: the score only rises from 4.3 to 4.6, while the requirements on bit rate and frame rate, and the decoding difficulty, become much higher.

On a typical TV, reaching a viewing-experience score above 4.0 requires 1080P video. This table is a useful reference when choosing resolutions for a live-streaming business.

Interactive experience: the standard for instant first-screen startup

Interactive experience mainly relates to what is commonly called "instant first-screen startup", meaning the first frame appears almost as soon as the user opens the stream. A first-screen time of less than 100 milliseconds is usually considered perfect.

That requirement assumes a LAN environment. On the public network, a first-screen time of 100 milliseconds is rare. Conventionally we try to bring the first-screen time down to about 1 second, that is, 1000 milliseconds.

As far as we know, existing apps such as Kuaishou, Douyu, and Huya usually keep the first-screen time within 3 seconds. 3 seconds is the upper limit; we are usually at about 2 seconds.

image.png

Typical values for interactive experience (first-screen time)

Viewing experience: picture corruption is now rare, so the focus is on stall optimization

Viewing experience consists of two parts: picture corruption and stalling.

Thanks to the efforts of live-streaming platforms, UPYUN, and other cloud CDN providers, picture corruption ("Huaping") now occurs very rarely, so the main factor affecting viewing experience is stalling. Stalling is measured by how many times playback stalls within one minute, how long each stall lasts, and finally the proportion of time spent stalled. The viewing-experience scores in this evaluation system were obtained in a laboratory environment.

image.png

Typical viewing-experience scores (stall statistics period: 1 minute)
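The three stall measures mentioned above (count, duration, and proportion of time within a 1-minute statistics period) can be sketched as follows. This is a minimal illustration, not the U-vMOS scoring formula; the `Stall` record and the `stall_stats` function are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass
class Stall:
    start_ms: int     # offset of the stall from the start of the window
    duration_ms: int  # how long playback was frozen


def stall_stats(stalls, window_ms=60_000):
    """Return (count, longest_ms, stalled_ratio) for one statistics window."""
    # Only stalls that begin inside the window are counted.
    inside = [s for s in stalls if 0 <= s.start_ms < window_ms]
    # A stall that runs past the end of the window is clipped to the window.
    durations = [min(s.duration_ms, window_ms - s.start_ms) for s in inside]
    total = sum(durations)
    return len(inside), max(durations, default=0), total / window_ms
```

A player would collect one `Stall` per buffering event and report the three numbers once per minute, which keeps the upload small while still matching the 1-minute period used in the table.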

Structure and logic of the live-streaming quality monitoring solution

Outside China there may be more implementations of this whole system. Domestic companies currently care mainly about stalling and instant first-screen startup; latency receives relatively less attention.

We focus on the stall side of the system. The optimization work includes "stall monitoring" and collecting the monitoring results.

image.png

Components of the live-streaming quality monitoring system

Stall monitoring is divided into four parts: data collection, data analysis, data display, and the early-warning system.

Data collection gathers device information and the network environment from both the streamer side and the viewer side. Device information mainly means the device model, the user's IP, the resolution and bit rate of the video stream, and CPU usage, GPU usage, and memory usage during playback. Network environment mainly means the connection. Some of this data has to be probed, for example: first the network from the phone to the local router, then from the phone to the public-network exit, and then from the phone to the CDN node. The third part is the regular monitoring data we need, including stall data, first-screen data, and latency data.
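As a rough illustration of what one collected record might look like, the sketch below groups the three kinds of data listed above into one report. All class and field names are assumptions made for this example; the talk does not describe the actual schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class DeviceInfo:
    model: str
    user_ip: str
    resolution: str
    bitrate_kbps: int
    cpu_usage: float   # 0.0 to 1.0 during playback
    gpu_usage: float   # 0.0 to 1.0 during playback
    memory_mb: float


@dataclass
class NetworkProbe:
    rtt_router_ms: float       # phone to local router
    rtt_public_exit_ms: float  # phone to public-network exit
    rtt_cdn_node_ms: float     # phone to CDN node


@dataclass
class QualityMetrics:
    stall_count: int
    stall_ms: int
    first_screen_ms: int
    latency_ms: int


@dataclass
class PlaybackReport:
    role: str  # "streamer" or "viewer"
    device: DeviceInfo
    network: NetworkProbe
    metrics: QualityMetrics

    def to_json(self) -> str:
        # asdict() converts nested dataclasses recursively.
        return json.dumps(asdict(self))
```

Serializing to JSON keeps the report easy to filter and classify once it reaches the data center.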

Data analysis: once collected, the data goes into a big-data center for filtering and comprehensive analysis. The operational data we actually need to monitor is classified by user ID.

The third part is data display, which is mainly map-based. Showing the stall rate and other data on a map makes it easier to read. This chart is Huya's overall stall monitoring view. The legend in the lower-left corner marks the stall rate from low to high, with a minimum of "0" and a maximum of "15". You can see that Huya's stall rate is within "4", generally a little above 3.

image.png

Schematic of the stall data display

The fourth part is the early-warning system, aimed mainly at operations staff and CDN vendors. Conventionally, alerts go directly to the company's own operations staff. But in live streaming we almost always use cloud CDN vendors for acceleration. If we find users stalling and the analysis shows that the cause is a bad CDN node, we feed that analysis back to the CDN vendor so that it can adjust accordingly.

image.png

Operating logic of the stall monitoring system

The whole monitoring flow can be divided into five parts: the client, the monitoring system, operations support, intelligent scheduling, and CDN vendors.

First, the monitoring system alerts operations support, which then tells the CDN vendor what has happened. Second, it alerts the intelligent scheduling system; these alerts are lower-level and concern individual users. For such an alert we can schedule intelligently according to the user's hardware and network conditions. For example, if we detect that bandwidth is insufficient and stalls have occurred, the scheduling system only needs to send the client a command telling it that bandwidth is insufficient and to drop the bit rate by one level, which may relieve the stalling.
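The two alert paths described above can be sketched as a small routing function. This is an assumption-laden illustration: the alert dictionary layout, the `LADDER_KBPS` bit-rate levels, and the command format are all hypothetical, not the actual protocol.

```python
# Hypothetical bit-rate ladder, lowest to highest.
LADDER_KBPS = [800, 1500, 3000, 6000]


def schedule_user(current_kbps, bandwidth_kbps):
    """Decide the command to send to one client (current_kbps must be in the ladder)."""
    if bandwidth_kbps >= current_kbps:
        return {"action": "keep", "bitrate_kbps": current_kbps}
    idx = LADDER_KBPS.index(current_kbps)
    if idx == 0:
        # Already at the lowest level, nothing left to drop.
        return {"action": "keep", "bitrate_kbps": current_kbps}
    return {
        "action": "lower_bitrate",
        "reason": "insufficient_bandwidth",
        "bitrate_kbps": LADDER_KBPS[idx - 1],  # drop exactly one level
    }


def route_alert(alert):
    """Node-level alerts go to operations; user-level alerts go to scheduling."""
    if alert["scope"] == "node":
        return "ops", {"notify_cdn_about": alert["node"]}
    return "scheduler", schedule_user(alert["bitrate_kbps"], alert["bandwidth_kbps"])
```

Dropping one level at a time keeps the change gentle; if stalls continue, the next alert lowers the bit rate again.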

Ten strategies for stall optimization

For stall optimization, we divide the work into ten parts:

  1. HTTP-DNS scheduling
  1. Local scheduling on the player side
  1. Server-side intelligent scheduling
  1. Server-side manual scheduling
  1. CDN manual scheduling
  1. Pushing streams over UDP
  1. Monitoring smoothness on the push side
  1. Offering multiple quality levels
  1. Player optimization
  1. User feedback system

HTTP-DNS scheduling: DNS pollution is fairly serious in China, and DNS resolution may send a user to the wrong node. For example, the resolver may resolve a domain for a user in Xuhui District, Shanghai, to a node in Pudong New Area. To avoid this kind of error we need HTTP-DNS scheduling: every time, before pulling a stream address, we first resolve it with our own server to make sure the user always gets the nearest service node, which gives smoother playback.
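A minimal sketch of the client-side step, assuming a hypothetical `http_dns_lookup(host)` callable that asks your own resolution service and returns IPv4 addresses ordered nearest first. The function name, the return format, and the example URL are assumptions; a real service defines its own API.

```python
from urllib.parse import urlsplit, urlunsplit


def resolve_pull_url(url, http_dns_lookup):
    """Rewrite a pull URL to target the node chosen by our own resolver.

    Returns (new_url, extra_headers). The original host name is passed on
    in a Host header so the node can still identify the domain.
    """
    parts = urlsplit(url)
    ips = http_dns_lookup(parts.hostname)
    if not ips:
        # No answer from the HTTP-DNS service: fall back to system DNS.
        return url, {}
    netloc = ips[0] if parts.port is None else f"{ips[0]}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc)), {"Host": parts.hostname}
```

Passing the lookup in as a parameter keeps the sketch independent of any particular HTTP client and makes the fallback path easy to exercise.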

Local scheduling on the player side: if playback is stalling, we run some local detection. For example, if the hardware is weak and CPU utilization is very high, we may lower the resolution and bit rate being decoded to relieve the CPU, which eases the stalling. Stalls may also be caused by the local network, in which case we can switch the user to another service node or take other measures.

Server-side intelligent scheduling and server-side manual scheduling: these mainly make adjustments remotely from the back end. Inside the intelligent scheduling system we handle user situations in a unified way. For example, if the user's hardware is not powerful enough, we adjust the stream to help. If the user is stalling, we first determine whether the problem lies with the CDN node or with the user. If it is the CDN node, we automatically switch the user to another node.

CDN manual scheduling: this is manual intervention. For example, if users are stalling and the intelligent scheduling system finds that a service node in Xuhui District, Shanghai, is performing particularly badly, we can manually blacklist that node so that users no longer reach this poor-quality node.

The fourth and fifth points overlap to some extent, because CDN manual scheduling acts on the data collected by the intelligent scheduling system.

Pushing streams over UDP: TCP's fault-tolerance and recovery mechanisms are slow, which makes it relatively poor at resisting network jitter. To solve stalls caused by network jitter, we use a custom protocol built on UDP. Vendors have launched various "UDP-like" SDKs that use UDP instead of TCP to resist jitter; this bypasses TCP's fault-tolerance and recovery mechanisms so that the connection can quickly return to its previous state. UDP is also better at resisting packet loss, because the application can decide for itself how many packets to retransmit.

Monitoring smoothness on the push side: this mainly monitors whether the streamer's push has audio and video out of sync or an insufficient frame rate. If the streamer side stalls, every viewer will stall no matter which node they are scheduled to, so push-side monitoring is more important than the others. We feed the monitoring results back to the streamer in real time so that the streamer can switch networks or make other adjustments.

Offering multiple quality levels: this is mainly for manual operation by the user. We typically offer several levels such as standard definition, high definition, and ultra-high definition. Different resolutions correspond to different bit rates and encoding complexity; the higher the resolution, the harder the decoding. When conditions are good, the user can manually choose a higher quality. When stalling occurs, the user can manually switch to a lower resolution, which solves part of the stalling problem.

Player optimization: for stalling, the player mainly handles two things. The first is stalling caused by audio and video being out of sync. The second is buffer handling; a buffer is quite useful against network jitter.
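To illustrate why a buffer helps against jitter, here is a minimal playout-buffer sketch. It is a toy model, not a real player: `PlayoutBuffer`, its start threshold, and the millisecond bookkeeping are assumptions made for the example.

```python
class PlayoutBuffer:
    """Playback starts (or resumes) only once enough media is buffered."""

    def __init__(self, start_ms=500):
        self.start_ms = start_ms  # media needed before (re)starting playback
        self.buffered_ms = 0
        self.playing = False
        self.stalls = 0

    def on_data(self, media_ms):
        """Media arrived from the network."""
        self.buffered_ms += media_ms
        if not self.playing and self.buffered_ms >= self.start_ms:
            self.playing = True

    def tick(self, elapsed_ms):
        """Advance the playback clock; return how much media was played."""
        if not self.playing:
            return 0
        played = min(elapsed_ms, self.buffered_ms)
        self.buffered_ms -= played
        if self.buffered_ms == 0:
            # Buffer ran dry: this is a stall, wait for a refill.
            self.playing = False
            self.stalls += 1
        return played
```

With 600 ms buffered, a 400 ms gap in arrivals is absorbed without a stall; only a longer gap drains the buffer. A larger `start_ms` absorbs more jitter at the cost of a slower start and higher latency.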

User feedback system: this lets users actively submit comments or report stalling problems. It complements our overall monitoring system and helps us improve it.

Latency monitoring: custom extensions and digital watermarking

Now let's talk about latency monitoring. I will focus on two parts:

The first part is latency calculation and optimization in the development phase;

The second part is latency calculation in the release phase.

Normally, in the development phase, live latency is measured by comparing "Beijing time" clocks.

image.png

The left picture shows the local "Beijing time" on the push side; the right picture shows the "Beijing time" displayed by the player.

For development-phase latency, the push tool and the playback tool run on the same machine. Subtracting the time on the right from the time on the left gives the live latency. In the figure the live latency is 3 seconds and 2 milliseconds.

The development-phase calculation no longer works in the release phase. Normally we cannot watch users' phones in real time to see how much latency there is, and we cannot burn a visible "Beijing time" clock into the video stream either.

Release-phase latency calculation needs other means: one method is custom extensions, the other is digital watermarking.

Latency monitoring with custom extensions

This approach uses custom fields in the live-streaming protocols to monitor latency.

One option is the metadata field of the FLV format. FLV metadata can carry embedded fields: the push side writes the current "GMT" time into the metadata, and after receiving it the player takes the difference from its local "GMT" time.

The second extensible place is the SEI field of H.264 or H.265 encoded streams, which also allows custom extensions. The latency calculation is the same: embed the "GMT" time in this field.

The benefit of both custom-extension approaches is that they are relatively simple to set up.
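The timestamp exchange behind both approaches can be sketched as below. The layout follows the general shape of an SEI user-data payload (a 16-byte identifier followed by custom bytes), but the identifier value and the 8-byte timestamp format are assumptions for this example. The sketch also leaves out everything a real encoder must add around the payload, such as the NAL unit header, payload type and size fields, and emulation-prevention bytes.

```python
import struct
import uuid

# Hypothetical identifier that marks "our" payload among other user data.
LATENCY_UUID = uuid.UUID("00000000-0000-4000-8000-000000000001")


def build_payload(send_utc_ms):
    """Push side: identifier + sender UTC time in ms (big-endian, 8 bytes)."""
    return LATENCY_UUID.bytes + struct.pack(">Q", send_utc_ms)


def latency_ms(payload, recv_utc_ms):
    """Player side: return end-to-end latency, or None for foreign payloads."""
    if len(payload) != 24 or payload[:16] != LATENCY_UUID.bytes:
        return None
    (sent_utc_ms,) = struct.unpack(">Q", payload[16:])
    return recv_utc_ms - sent_utc_ms
```

The result is only as accurate as the clock agreement between the push side and the player, so both ends would need to sync against a common time source.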

Of course there are difficulties. CDNs have their own transcoding and distribution systems, and unless you specifically arrange it with the CDN vendor, all custom fields will be stripped once the stream passes through the CDN.

Another problem is that when distributing streams, a CDN by default makes every video stream "start from zero", no matter when the viewer starts pulling. The "Beijing time" we embed in a field then loses its reference, because we compute latency as the difference among three values: the timestamp of each video frame in the stream, the time embedded in the custom field, and the local time. That calculation is affected.

Latency monitoring with digital watermarking

Given the drawbacks of the two methods above, we extended latency calculation to embed the data with "digital watermarking". Watermarking appeared early and was originally used for audio and video copyright verification: invisible, inaudible data is embedded in the audio or video with relatively little impact on overall quality, and it can be embedded and later extracted by an algorithm.

I will briefly describe the principle. A digital watermark can be embedded in several places.

The easiest to understand is embedding the watermark by modifying raw YUV or raw PCM data. Take a 720P video stream as an example: each frame is 1280 × 720 pixels, and each pixel consists of one Y sample plus 1/4 U and 1/4 V. Each Y sample is typically 8 bits, so its value ranges from 0 to 255. Of these 8 bits we can erase the lowest three and embed the data we want. For example, to embed a "0" or a "1" in the first Y sample, we erase the low 3 bits, which can hold a value from 0 to 7; we write "3" to represent "0" and "6" to represent "1". After embedding, the data is extracted in the same manner and restored, which yields the data we embedded in the YUV. Embedding this way has some impact, because changing the last three bits of Y alters the picture slightly. The more precise the data has to be, the more bits must be erased and the greater the damage to the video. Higher accuracy needs more bits, so there is a trade-off.

PCM works the same way: we change the last, least significant bits, where the data matters least.
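The Y-sample embedding described above can be sketched as follows, using the values from the talk ("3" for bit 0, "6" for bit 1 in the low three bits). The function names, the one-bit-per-sample layout, the 48-bit timestamp width, and the nearest-code decision rule are assumptions made for this example.

```python
BIT_TO_CODE = {0: 3, 1: 6}  # low-3-bit codes taken from the talk


def embed_bits(y_plane, bits):
    """Return a copy of the 8-bit Y samples with one bit written per sample."""
    out = bytearray(y_plane)
    for i, bit in enumerate(bits):
        # Erase the low three bits, then write the code for this bit.
        out[i] = (out[i] & 0b11111000) | BIT_TO_CODE[bit]
    return bytes(out)


def extract_bits(y_plane, count):
    """Read back `count` bits, picking whichever code (3 or 6) is nearer."""
    return [1 if (y_plane[i] & 0b111) >= 5 else 0 for i in range(count)]


def timestamp_to_bits(ms, width=48):
    """Most significant bit first."""
    return [(ms >> (width - 1 - i)) & 1 for i in range(width)]


def bits_to_timestamp(bits):
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value
```

Because extraction picks the nearer code, a sample that drifts by one level during encoding and decoding still decodes correctly, while each marked sample changes by at most 7 levels out of 255. Larger drifts break this simple scheme.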

A second approach is watermarking through AAC quantized sub-band parameters or H.264 DCT block parameters.

AAC audio is made up of a series of quantized sub-band parameters, and we can rewrite those parameters.

H.264 DCT block parameters are modified in the same way, by changing unimportant data.

Objectively speaking, with all the methods just described, embedding and extraction are affected by the CDN's transcoding system. Normally, when YUV data is encoded to H.264 and decoded back to YUV, the YUV data does change somewhat, but the change stays within a certain range, and in the vast majority of cases the difference is not visible.

On top of the algorithms just mentioned, a number of algorithms extend how the data is embedded. What we described so far modifies the data directly. The extended methods first transform the data, for example with the discrete cosine transform that is often used in compression. These algorithms change the original data less, and the extraction success rate is higher.

Three directions for optimizing instant first-screen startup

Finally, a brief run through the main points of first-screen optimization. There is a great deal to say on this topic and I will not explain every item here. I offer three directions: if you need to optimize, working through push, forwarding, and playback will get you the result you want.

Push optimization (streamer side):

  • Reasonable GOP length (2 seconds recommended)
  • Reduce inter-frame dependency; do not use P-frames
  • Zero-latency x264 encoding
  • Reasonable resolution, bit rate, and frame rate
  • Use UDP to resist network jitter

Forwarding optimization (CDN):

  • GOP data caching
  • Pre-warming resources in advance
  • Single-sided TCP acceleration
  • Providing stream multiplexing
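The "GOP data caching" item in the list above can be illustrated with a small sketch: the edge node keeps the frames of the most recent GOP so that a new viewer can be sent a keyframe immediately instead of waiting for the next one. `GopCache` and the `(is_keyframe, payload)` frame tuple are hypothetical names for this example.

```python
class GopCache:
    """Keep the frames of the most recent GOP, starting at its keyframe."""

    def __init__(self):
        self._frames = []

    def push(self, frame):
        """frame is a tuple (is_keyframe, payload)."""
        is_keyframe = frame[0]
        if is_keyframe:
            # A keyframe opens a new GOP; the previous one is dropped.
            self._frames = []
        if is_keyframe or self._frames:
            # Frames seen before the first keyframe cannot be decoded
            # on their own, so they are not cached.
            self._frames.append(frame)

    def snapshot(self):
        """Frames to send first to a newly connected viewer."""
        return list(self._frames)
```

This also shows why the GOP length matters: with a 2-second GOP the cached data adds at most about 2 seconds of latency for a new viewer, and a longer GOP would add more.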

Playback optimization (player side):

  • Load video stream data first
  • Variable buffer: small at first, larger afterwards
  • Use HTTPDNS to assign nodes
  • Optimize FFmpeg stream probing
  • Load network requests in parallel

Q & A section

Q1: In live events and game streaming, the streamer's push can be delayed. Yesterday I met a streamer whose push was already 20 seconds behind, meaning there were 20 seconds between the end of encoding and sending. In that case the digital watermark is not accurate. Huya has many game streamers; how do you solve this problem?

Shi Shuo: There are two aspects. When we calculate latency with our own live-streaming tool, this is controllable. In the situation you describe, the streamer is probably pushing with OBS, the open-source tool. Pushing with OBS needs some special handling: Huya has developed an OBS plug-in that embeds the video watermark. The plug-in obtains OBS's buffering delay and compensates for it when embedding the video watermark. If you stream with your own software, you set the local buffering delay yourself, so you only need to apply the corresponding adjustment.

Q2: Two days ago I finished launching H.265 on our website at a relatively low bit rate, and found that stalling mainly comes from hardware, such as insufficient CPU or GPU. I considered adding something to the web page to read the user's hardware, but found that very little hardware information is available, a completely different situation from H.265 on mobile. Mainstream domestic browsers expose the user's memory but not the GPU or other information, so problems are hard to locate. How does Huya currently handle this?

Shi Shuo: That is a good question. In today's mainstream browsers, if you do live streaming with HTML5, hardware decoding is usually chosen by default. Limited by the browser, we cannot get GPU information directly. In that case we collect peripheral information, for example the CPU model, and infer the GPU from it. Fortunately only a few GPUs cannot decode 1080P; for those we can try lowering the resolution from 1080P to 720P, which the GPU can handle.


Origin www.cnblogs.com/haojile/p/12578538.html