The Bandwidth value in the MPD file generated by mp4dash and its impact on the client's bit rate selection

Phenomenon

While building a DASH video system (server and player), I found that the video bit rate (i.e. the bandwidth attribute) in the MPD file differs depending on whether the file was generated with MP4Box or with mp4dash.

The video bit rates (kbps) output by ffmpeg for each resolution are:

  • 1920x1080(1080p):3988.49
  • 1280x720(720p):1983.08
  • 896x504(480p):1131.42
  • 640x360(360p):676.67
  • 256x144(144p):147.76

The video bit rates (kbps) for each resolution in the MPD file generated by MP4Box are:

  • 1920x1080(1080p):3988.497
  • 1280x720(720p):1983.089
  • 896x504(480p):1131.432
  • 640x360(360p):676.676
  • 256x144(144p):147.765

As can be seen, the video bit rates reported by MP4Box are essentially the same as those of the ffmpeg encoded output (see Installation of GPAC (MP4Box) under Ubuntu | Building a DASH Video System Based on MP4Box for details).

The video bit rates (kbps) for each resolution in the MPD file generated by mp4dash are:

  • 1920x1080(1080p):16079.970
  • 1280x720(720p):7753.362
  • 896x504(480p):4310.870
  • 640x360(360p):2391.544
  • 256x144(144p):402.408

The bit rates in the MPD file generated by mp4dash are all higher than those above, by a factor of roughly 2.7 to 4.
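As a quick check of that factor, the ratios can be computed directly from the two lists above. This is only a small Python sketch; the dictionary and variable names are my own, and the numbers are copied from the lists.

```python
# Bit rates in kbps, copied from the ffmpeg and mp4dash lists above.
ffmpeg_kbps = {
    "1080p": 3988.49,
    "720p": 1983.08,
    "480p": 1131.42,
    "360p": 676.67,
    "144p": 147.76,
}
mp4dash_kbps = {
    "1080p": 16079.970,
    "720p": 7753.362,
    "480p": 4310.870,
    "360p": 2391.544,
    "144p": 402.408,
}

# Ratio of the mp4dash bandwidth value to the ffmpeg average bit rate.
ratios = {name: mp4dash_kbps[name] / ffmpeg_kbps[name] for name in ffmpeg_kbps}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

The ratio is largest for 1080p and smallest for 144p.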


Reason

Fortunately, others have reported a similar issue on Bento4's GitHub project and received answers from the project author:

2016.08.11:

The required bandwidth calculation is somewhat complicated. What this value represents is the bandwidth value for which, if the throughput remains constant as that value there should never be an underflow situation. The client is only required to buffer @minBufferTime worth of data. In theory, a precise calculation for this would require looking at every frame, and taking the possible frame reordering into account. But the current method isn’t quite that complicated. It looks at the minBufferTime value and the individual segment sizes, and finds a value for which the client buffer would never go empty. This is better than just taking the average segment bitrate (which would be wrong, since there are often peaks), but not quite as precise as looking at individual frames.

2017.06.29:

The reason you are seeing a different value for the bandwidth in the MPD and from ffmpeg/mediainfo is because the MPD value is computed in a different way, in order to comply with the specification. The bandwidth reported by ffmpeg or mediainfo is the average bandwidth for the stream (number of bytes divided by duration), whereas for the MPD the calculation is done by segment, and also based on the buffer model, that takes the minBufferTime into account. To be more specific, the value is an indication to the player that if it starts playing after having buffered ‘minBufferTime’, and if the network bandwidth is exactly the value in the MPD, then the buffer will never completely empty. So if your encoder creates segments for which the average bitrate for the segment is higher, or if within a segment you have a higher bitrate at some point of the the segment than other points, you will see the MPD bandwidth value be somewhat different from the stream’s average bandwidth.

2018.01.02:

The bandwidth calculation for DASH streams is actually not that straightforward. The current version of the packager uses method that should be fairly close to what’s expected: it is based on the minBufferTime value for the MPD, and the size of the frames found in the media. The rule is that the bandwidth value should be such that if a player had exactly that constant bandwidth, and respected the minBufferTime value, it would never underflow. So the peak media bandwidth isn’t really a good indicator, nor is the average bandwidth. Bento4 looks at a variable buffer over time, and tries to compute what value of the bandwidth would be able to guarantee never to underflow. If you change the minBufferTime value, the bandwidth calculation will change.

I do not fully understand the details, and the calculation rules seem to have changed slightly from version to version, but the gist is that mp4dash takes minBufferTime and the sizes of the video frames into account when calculating bandwidth in the MPD.

Specifically, the rule is: track how the buffer changes over time and, for the given minBufferTime, compute the constant bandwidth at which the player's buffer never empties. In other words, if the player's throughput always equals the bandwidth value in the MPD, the buffer will never empty once minBufferTime worth of data has been buffered. bandwidth is therefore neither the stream's peak bit rate nor its average bit rate.

Note that mp4dash provides the option --min-buffer-time=<duration> (see mp4dash); by default this value is calculated automatically. Overriding the default with this option will also change the bandwidth value in the MPD file.
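To make the buffer model more concrete, here is a minimal Python sketch of a segment-level calculation. It is not Bento4's actual implementation: the function name `min_constant_bandwidth`, the sample segment sizes, and the simplifying assumption that a segment must be fully downloaded before its playback starts are all mine.

```python
from typing import Sequence


def min_constant_bandwidth(
    segment_bits: Sequence[float],
    segment_durations: Sequence[float],
    min_buffer_time: float,
) -> float:
    """Smallest constant rate (bits/s) that never underflows.

    Assumed model: the client starts downloading at any segment, begins
    playback min_buffer_time seconds later, and needs each segment fully
    downloaded by the moment that segment starts playing.
    """
    if len(segment_bits) != len(segment_durations):
        raise ValueError("sizes and durations must have the same length")
    if min_buffer_time <= 0:
        raise ValueError("min_buffer_time must be positive")

    required = 0.0
    for first in range(len(segment_bits)):
        downloaded_bits = 0.0
        elapsed_media = 0.0  # playback offset of segment k relative to `first`
        for k in range(first, len(segment_bits)):
            downloaded_bits += segment_bits[k]
            deadline = min_buffer_time + elapsed_media
            required = max(required, downloaded_bits / deadline)
            elapsed_media += segment_durations[k]
    return required


# Made-up example: four 2-second segments, one of them three times larger.
bits = [4e6, 12e6, 4e6, 4e6]
durations = [2.0, 2.0, 2.0, 2.0]

average = sum(bits) / sum(durations)
bw_mbt_2 = min_constant_bandwidth(bits, durations, min_buffer_time=2.0)
bw_mbt_4 = min_constant_bandwidth(bits, durations, min_buffer_time=4.0)

print(f"average bit rate:       {average / 1e6:.1f} Mbps")
print(f"bandwidth with MBT=2 s: {bw_mbt_2 / 1e6:.1f} Mbps")
print(f"bandwidth with MBT=4 s: {bw_mbt_4 / 1e6:.1f} Mbps")
```

With these made-up segments the average bit rate is 3 Mbps, yet the required bandwidth is 6 Mbps for a 2-second minBufferTime and falls to 3 Mbps for 4 seconds. This mirrors the author's remark that changing minBufferTime changes the bandwidth value.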


One More Thing

Having written this far, I could not help being curious: what impact does the bandwidth value in the MPD file have on the client's bit rate decision?

To answer this question, I consulted the DASH-IF guidelines (Guidelines for Implementation: DASH-IF Interoperability Points), which happen to cover this topic.

First, the description of minBufferTime and bandwidth in the MPD (p. 40, section 3.2.8):

The MPD contains a pair of values for a bandwidth and buffering description, namely the Minimum Buffer Time (MBT) expressed by the value of MPD@minBufferTime and bandwidth (BW) expressed by the value of Representation@bandwidth. The following holds:

  • the value of the minimum buffer time does not provide any instructions to the client on how long to buffer the media. The value however describes how much buffer a client should have under ideal network conditions. As such, MBT is not describing the burstiness or jitter in the network, it is describing the burstiness or jitter in the content encoding. Together with the BW value, it is a property of the content. Using the “leaky bucket” model, it is the size of the bucket that makes BW true, given the way the content is encoded.
  • The minimum buffer time provides information that for each Stream Access Point (and in the case of DASH-IF therefore each start of the Media Segment), the property of the stream: If the Representation (starting at any segment) is delivered over a constant bitrate channel with bitrate equal to value of the BW attribute, then each presentation time PT is available at the client latest at time with a delay of at most PT + MBT.
  • In the absence of any other guidance, the MBT should be set to the maximum GOP size (coded video sequence) of the content, which quite often is identical to the maximum segment duration for the live profile or the maximum subsegment duration for the On-Demand profile. The MBT may be set to a smaller value than maximum (sub)segment duration, but should not be set to a higher value.

There are several points worthy of attention in the above content:

  • minBufferTime does not instruct the client how long to buffer the video; it tells the client how much buffer it should have under ideal network conditions. minBufferTime describes the bit rate jitter produced by the encoding, not network jitter. My understanding is that when the network bandwidth is constant and equal to bandwidth, and the client has buffered minBufferTime worth of media, playback will not stall.
  • In general, the value of minBufferTime should be less than or equal to the maximum segment duration (live) or subsegment duration (on-demand).
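The "leaky bucket" reading in the quoted text can also be sketched the other way round: fix BW and look for the smallest MBT that makes it true. The Python sketch below is only my interpretation at segment granularity (each segment is treated as arriving when its last bit arrives, and its presentation time is taken as its start time); the function name and the sample data are hypothetical.

```python
from typing import Sequence


def required_min_buffer_time(
    segment_bits: Sequence[float],
    segment_durations: Sequence[float],
    bandwidth: float,
) -> float:
    """Smallest MBT (seconds) for which `bandwidth` never underflows.

    For every possible starting segment, the stream is delivered over a
    constant-rate channel; the result is the worst delay between a
    segment's presentation time and its arrival time.
    """
    if len(segment_bits) != len(segment_durations):
        raise ValueError("sizes and durations must have the same length")
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    worst_delay = 0.0
    for first in range(len(segment_bits)):
        downloaded_bits = 0.0
        presentation_time = 0.0  # start of segment k relative to `first`
        for k in range(first, len(segment_bits)):
            downloaded_bits += segment_bits[k]
            arrival_time = downloaded_bits / bandwidth
            worst_delay = max(worst_delay, arrival_time - presentation_time)
            presentation_time += segment_durations[k]
    return worst_delay


# Made-up example: four 2-second segments with one bit rate peak.
bits = [4e6, 12e6, 4e6, 4e6]
durations = [2.0, 2.0, 2.0, 2.0]

mbt_at_6mbps = required_min_buffer_time(bits, durations, bandwidth=6e6)
mbt_at_3mbps = required_min_buffer_time(bits, durations, bandwidth=3e6)

print(f"MBT needed at 6 Mbps: {mbt_at_6mbps:.1f} s")
print(f"MBT needed at 3 Mbps: {mbt_at_3mbps:.1f} s")
```

In this toy example a higher declared BW needs a smaller bucket (2 s at 6 Mbps) and a lower one needs a larger bucket (4 s at 3 Mbps), which is why the two values only make sense as a pair.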

Then, on how the client selects an appropriate video bit rate level (p. 41, section 3.2.8):

A DASH client decides downloading the next segment based on the following status information:

  • the currently available buffer in the media pipeline, buffer
  • the currently estimated download rate, rate
  • the value of the attribute @minBufferTime, MBT
  • the set of values of the @bandwidth attribute for each Representation i, BW[i]

The task of the client is to select a suitable Representation i.
The relevant issue is that starting from a SAP (Stream Access Point) on, the DASH client can continue to playout the data. This means that at the current time it does have buffer data in the buffer. Based on this model the client can download a Representation i for which BW[i] ≤ rate*buffer/MBT without emptying the buffer.

Note that the formula for the client to select the bit rate level (Representation) is mentioned here, namely:
$$BW[i] \le rate \times buffer / MBT \tag{1}$$

How to understand this formula? Let's transform this formula first:
$$\frac{BW[i]}{rate} \le \frac{buffer}{MBT} \tag{2}$$

There are three cases here:
$$1 \le \frac{BW[i]}{rate} \le \frac{buffer}{MBT} \tag{3}$$

$$\frac{BW[i]}{rate} \le 1 \le \frac{buffer}{MBT} \tag{4}$$

$$\frac{BW[i]}{rate} \le \frac{buffer}{MBT} \le 1 \tag{5}$$

For ease of analysis, start with formula (4), which is equivalent to:
$$buffer \ge MBT, \quad rate \ge BW[i]$$

This is easy to understand: when the client buffer is not lower than MBT and the estimated download rate is not lower than BW[i], select the Representation i corresponding to BW[i].

We know that the client's goal in selecting a bit rate is to choose as high a video bit rate as possible (that is, a Representation with a higher BW) while avoiding stalls [1]. The safest situation for avoiding stalls is the one corresponding to formula (4): the video cached in the client buffer will not be exhausted by fluctuations in the segment bit rate, and the network bandwidth is sufficient to finish downloading each new segment within that segment's playback time. In this case the bit rate selection will never cause a stall.

Having analyzed this special case, let us return to formulas (3) and (5). By the same reasoning, formula (3) gives buffer ≥ MBT and rate ≤ BW[i], while formula (5) gives buffer ≤ MBT and rate ≥ BW[i]. What the two have in common is that each contains both a safety factor that guards against stalling and a risk factor that can cause it. So how can the bit rate selection be guaranteed not to cause a stall? The answer is to make the safety margin outweigh the risk. Two factors are in play: the client buffer and the network bandwidth. When the buffer is small, the bandwidth must be high enough to offset the risk of stalling caused by the small buffer; conversely, when the bandwidth is low, the buffer must be large enough to offset the risk caused by the low bandwidth. This is the full meaning of formula (1).
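The trade-off between the two factors can be checked numerically. The Python sketch below simply evaluates formula (1) for one sample point in each of the three cases, plus one point where neither factor compensates for the other. The function name `is_safe` and all the numbers are made up for illustration.

```python
def is_safe(bw: float, rate: float, buffer: float, mbt: float) -> bool:
    """Formula (1): a Representation is safe when BW <= rate * buffer / MBT."""
    return bw <= rate * buffer / mbt


MBT = 2.0    # seconds, hypothetical minBufferTime
BW = 4000.0  # kbps, hypothetical Representation bandwidth

# (label, estimated rate in kbps, buffered media in seconds)
cases = [
    ("case (3): rate below BW, large buffer", 3000.0, 4.0),
    ("case (4): rate above BW, buffer above MBT", 5000.0, 3.0),
    ("case (5): rate above BW, buffer below MBT", 10000.0, 1.0),
    ("neither: rate below BW, buffer below MBT", 3000.0, 1.0),
]

results = {}
for label, rate, buffer in cases:
    results[label] = is_safe(BW, rate, buffer, MBT)
    verdict = "safe" if results[label] else "risk of stalling"
    print(f"{label}: {verdict}")
```

The first three points satisfy formula (1) because the stronger factor outweighs the weaker one; the last point fails because the rate and the buffer are both below their reference values.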

Finally, back to our question: how does bandwidth in the MPD file affect the client's bit rate selection? From the analysis above, if the bandwidth values in the MPD file are increased, the client will tend to choose a Representation with a lower actual bit rate, which makes stalls easier to avoid but also lowers video quality. On balance, mp4dash's approach has both advantages and disadvantages.
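This effect can be illustrated with the bandwidth values listed at the top of the article. The Python sketch below applies formula (1) to both sets of values. The function name `select_representation`, the 5000 kbps download rate, the assumption that the buffer equals MBT, and the fallback to the lowest Representation are all my own choices; a real player's adaptation logic is more elaborate, and the two MPDs may also declare different minBufferTime values.

```python
from typing import Dict


def select_representation(
    bandwidths_kbps: Dict[str, float],
    rate_kbps: float,
    buffer_s: float,
    mbt_s: float,
) -> str:
    """Pick the highest-bandwidth Representation that satisfies formula (1).

    Falls back to the lowest-bandwidth Representation when none qualifies
    (an assumption made for this sketch).
    """
    budget = rate_kbps * buffer_s / mbt_s
    affordable = {n: bw for n, bw in bandwidths_kbps.items() if bw <= budget}
    if not affordable:
        return min(bandwidths_kbps, key=bandwidths_kbps.get)
    return max(affordable, key=affordable.get)


# bandwidth values (kbps) from the MPD files discussed in this article.
mp4box_kbps = {
    "1080p": 3988.497,
    "720p": 1983.089,
    "480p": 1131.432,
    "360p": 676.676,
    "144p": 147.765,
}
mp4dash_kbps = {
    "1080p": 16079.970,
    "720p": 7753.362,
    "480p": 4310.870,
    "360p": 2391.544,
    "144p": 402.408,
}

# Hypothetical client state: 5000 kbps estimated rate, buffer equal to MBT.
choice_mp4box = select_representation(mp4box_kbps, 5000.0, 2.0, 2.0)
choice_mp4dash = select_representation(mp4dash_kbps, 5000.0, 2.0, 2.0)

print("MP4Box MPD  ->", choice_mp4box)
print("mp4dash MPD ->", choice_mp4dash)
```

Under these assumptions the same client picks 1080p from the MP4Box values but only 480p from the mp4dash values, even though the encoded files are identical.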


  1. For the sake of simplicity, video quality is assumed here to be positively correlated with video bit rate. In practice the relationship is not that simple, because raising the bit rate yields diminishing returns in quality.

Origin blog.csdn.net/LvGreat/article/details/103621022