This article is reproduced from: https://blog.jianchihu.net/video-freeze-rtsp-over-tcp-vlc.html
Thank you for the selfless sharing of the original author.
After finishing my RTMP server, I recently wrote an RTSP server that converts national-standard PS streams, along with streams in other formats and protocols, into RTSP output. I tested with many players during development, most often VLC, and ran into quite a few problems along the way. Today I'll record a strange one.
Playback in RTP over UDP mode was fine, but in RTP over TCP mode the picture suddenly froze a few tens of seconds after VLC started playing. VLC's debug messages showed nothing abnormal. I also tested with ffplay, live555, and PotPlayer, and none of them froze. Trying other VLC versions made things even stranger: 3.0.0 and earlier, and 3.0.5 and later, all played normally. So the affected VLC versions presumably treated RTP over TCP in some special way. I then captured packets and analyzed the RTSP exchange. The problematic VLC version not only sent OPTIONS commands at regular intervals, but also a run of special bytes starting with '$'. Once these had been sent, playback froze. Why did it freeze? The only way to find out was to read the VLC source code.
After reading the relevant source code, I finally located the cause: VLC's keep-alive mechanism. VLC uses live555 for RTSP handling, and the corresponding code is in modules/access/live555.cpp. The explanation below follows the code.
    static void TimeoutPrevention( void *p_data )
    {
        demux_t *p_demux = (demux_t *) p_data;
        demux_sys_t *p_sys = (demux_sys_t *)p_demux->p_sys;
        char *bye = NULL;

        if( var_GetBool( p_demux, "rtsp-tcp" ) )
            return;

        /* Protect Live555 from us calling their functions simultaneously
           with Demux() or Control() */
        vlc::threads::mutex_locker locker( p_sys->timeout_mutex );

        /* If the timer fires while the demuxer owns the lock, and the demuxer
         * then torns the session down, the pointers will become NULL. By the time
         * this timer callback obtains the callback, either a new session was
         * created and the timer is rescheduled, or the pointers are still NULL
         * and the timer is descheduled. In the second case, bail out (then wait
         * for the timer to be rescheduled or destroyed). In the first case, this
         * might send an early refresh - that´s harmless but suboptimal (FIXME). */
        if( p_sys->rtsp == NULL || p_sys->ms == NULL )
            return;

        bool use_get_param = p_sys->b_get_param;

        /* Use GET_PARAMETERS if supported. wmserver dialect supports
         * it, but does not report this properly. */
        if( var_GetBool( p_demux, "rtsp-wmserver" ) )
            use_get_param = true;

        if( use_get_param )
            p_sys->rtsp->sendGetParameterCommand( *p_sys->ms,
                                                  default_live555_callback, bye );
        else
            p_sys->rtsp->sendOptionsCommand( default_live555_callback, NULL );

        if( !wait_Live555_response( p_demux ) )
        {
            msg_Err( p_demux, "keep-alive failed: %s",
                     p_sys->env->getResultMsg() );
            /* Just continue, worst case is we get timed out later */
        }
    }
The function above is VLC's RTSP timeout-prevention (keep-alive) code. The VLC versions that show the problem do not have these two lines:

    if( var_GetBool( p_demux, "rtsp-tcp" ) )
        return;

Let's treat these two lines as commented out and analyze why the picture suddenly freezes.
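1) At the start of the RTSP exchange, the VLC client sends an OPTIONS request, and the server has to reply with the methods it supports. If the reply includes the (optional) GET_PARAMETER method, use_get_param is true and the keep-alive mechanism periodically calls sendGetParameterCommand; otherwise it calls sendOptionsCommand. My server did not implement GET_PARAMETER, so it periodically receives OPTIONS requests from VLC. After sending the OPTIONS request, VLC calls wait_Live555_response(p_demux). Here is that function: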
    /* return true if the RTSP command succeeded */
    static bool wait_Live555_response( demux_t *p_demux, int i_timeout = 0 /* ms */ )
    {
        TaskToken task;
        demux_sys_t * p_sys = (demux_sys_t *)p_demux->p_sys;
        p_sys->event_rtsp = 0;
        if( i_timeout > 0 )
        {
            /* Create a task that will be called if we wait more than timeout ms */
            task = p_sys->scheduler->scheduleDelayedTask( i_timeout*1000,
                                                          TaskInterruptRTSP,
                                                          p_demux );
        }
        p_sys->event_rtsp = 0;
        p_sys->b_error = true;
        p_sys->i_live555_ret = 0;
        p_sys->scheduler->doEventLoop( &p_sys->event_rtsp );
        //here, if b_error is true and i_live555_ret = 0 we didn't receive a response
        if( i_timeout > 0 )
        {
            /* remove the task */
            p_sys->scheduler->unscheduleDelayedTask( task );
        }
        return !p_sys->b_error;
    }
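The i_timeout argument is left at its default value of 0, so there is no timeout and the call waits indefinitely for the server to answer the request.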
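To make step 1 concrete, here is a minimal sketch of how the server's OPTIONS reply could steer the client's choice of keep-alive command. Both helpers (`buildOptionsReply`, `advertisesGetParameter`) are hypothetical names invented for illustration, not part of VLC or live555, and the substring check is only an assumed approximation of how a client might derive a flag like use_get_param.

```cpp
#include <string>
#include <vector>

// Hypothetical server-side helper: build the reply to an OPTIONS request.
// The Public header lists the methods the server is willing to handle.
std::string buildOptionsReply(int cseq, const std::vector<std::string>& methods) {
    std::string publicList;
    for (std::size_t i = 0; i < methods.size(); ++i) {
        if (i != 0) publicList += ", ";
        publicList += methods[i];
    }
    return "RTSP/1.0 200 OK\r\n"
           "CSeq: " + std::to_string(cseq) + "\r\n"
           "Public: " + publicList + "\r\n"
           "\r\n";
}

// Hypothetical client-side check (an assumption, not VLC's actual code):
// treat GET_PARAMETER as supported if the reply mentions it at all.
bool advertisesGetParameter(const std::string& optionsReply) {
    return optionsReply.find("GET_PARAMETER") != std::string::npos;
}
```

A server that leaves GET_PARAMETER out of the list, as mine did, should therefore expect periodic OPTIONS requests as the keep-alive.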
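2) My server has a command parser class that only handles the standard commands (OPTIONS, DESCRIBE, PLAY, and so on). VLC periodically sends data starting with '$', which arrived at my parser mixed in with the OPTIONS request data. The parser failed to parse it, so the server never answered the OPTIONS request sent by VLC's keep-alive mechanism. Looking at TimeoutPrevention again, one of the first things it does on entry is:

    vlc::threads::mutex_locker locker( p_sys->timeout_mutex );

Because my server never answered the OPTIONS request, the function never returns and the lock is never released. Here is where else this lock is used: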
    /*****************************************************************************
     * Demux:
     *****************************************************************************/
    static int Demux( demux_t *p_demux )
    {
        demux_sys_t *p_sys = (demux_sys_t *)p_demux->p_sys;
        TaskToken task;

        bool b_send_pcr = true;
        int i;

        /* Protect Live555 from simultaneous calls in TimeoutPrevention()
           during pause */
        vlc::threads::mutex_locker locker( p_sys->timeout_mutex );

        for( i = 0; i < p_sys->i_track; i++ )
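So while TimeoutPrevention is blocked waiting for a reply, it still holds timeout_mutex. Demux cannot acquire the lock and stops running, and the picture freezes.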
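The starvation described above can be reproduced in isolation. The following is only a sketch under stated assumptions: a `std::timed_mutex` stands in for p_sys->timeout_mutex, a future stands in for the server reply that never arrives, and the function name is hypothetical. It uses `try_lock_for` purely so that the demonstration can terminate; a plain lock, as in Demux, would simply hang.

```cpp
#include <chrono>
#include <future>
#include <mutex>
#include <thread>

// Returns true if the "demux" side could not get the shared lock while the
// "keep-alive" side was still waiting for a reply that had not arrived.
bool demuxStarvedWhileKeepAliveWaits() {
    std::timed_mutex timeoutMutex;    // stands in for p_sys->timeout_mutex
    std::promise<void> lockTaken;     // signals that keep-alive holds the lock
    std::promise<void> replyArrived;  // the server reply; withheld at first
    std::future<void> reply = replyArrived.get_future();

    std::thread keepAlive([&] {
        std::lock_guard<std::timed_mutex> guard(timeoutMutex);
        lockTaken.set_value();
        reply.wait();  // like waiting with i_timeout == 0: no upper bound
    });

    lockTaken.get_future().wait();

    // Demux side: give up after 50 ms instead of blocking forever.
    bool acquired = timeoutMutex.try_lock_for(std::chrono::milliseconds(50));
    if (acquired) timeoutMutex.unlock();

    replyArrived.set_value();  // let the keep-alive side finish so we can exit
    keepAlive.join();
    return !acquired;
}
```

The point of the sketch is the design hazard: an unbounded wait performed while holding a lock that the hot path also needs.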
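Newer VLC versions use

    if( var_GetBool( p_demux, "rtsp-tcp" ) )
        return;

to disable the keep-alive mechanism for RTP over TCP, which is why 3.0.5 and later do not show the problem. I later added handling for '$'-prefixed data to my RTSP server as well, and after testing, everything worked normally.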
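One way to handle '$'-prefixed data on the server is to split the incoming TCP bytes into units before they reach the command parser. This is a minimal sketch, not my server's actual code: `Demuxed` and `demuxRtspTcp` are hypothetical names, the frame layout ('$', one channel byte, two length bytes in network byte order, payload) follows RFC 2326 section 10.12, and RTSP requests are assumed to carry no message body.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Demuxed {
    std::vector<std::string> rtspMessages;             // complete text requests
    std::vector<std::pair<int, std::string>> frames;   // (channel, payload)
};

// Consumes every complete unit from the front of `buf` and leaves any
// incomplete tail in place, so the caller can append more bytes and retry.
Demuxed demuxRtspTcp(std::string& buf) {
    Demuxed out;
    std::size_t pos = 0;
    while (pos < buf.size()) {
        if (buf[pos] == '$') {
            if (buf.size() - pos < 4) break;            // header incomplete
            int channel = static_cast<unsigned char>(buf[pos + 1]);
            std::size_t len =
                (static_cast<unsigned char>(buf[pos + 2]) << 8) |
                 static_cast<unsigned char>(buf[pos + 3]);
            if (buf.size() - pos - 4 < len) break;      // payload incomplete
            out.frames.emplace_back(channel, buf.substr(pos + 4, len));
            pos += 4 + len;
        } else {
            // Assumption: a request ends at the first blank line (no body).
            std::size_t end = buf.find("\r\n\r\n", pos);
            if (end == std::string::npos) break;        // request incomplete
            out.rtspMessages.push_back(buf.substr(pos, end + 4 - pos));
            pos = end + 4;
        }
    }
    buf.erase(0, pos);
    return out;
}
```

With this in front of the parser, interleaved frames from the client can be dropped or routed to RTCP handling, and the OPTIONS keep-alive request reaches the command parser intact.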
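What is '$'-prefixed data for? I had only used it on the server side, when sending RTCP data, and did not expect the client to have a similar mechanism. In RFC 2326, data starting with '$' (0x24) is called Embedded (Interleaved) Binary Data. Of all the players I tested, only VLC sent such data to the server. Embedded (Interleaved) Binary Data also only applies to RTP over TCP. What is this data for? RFC 2326 section 10.12 describes it as follows: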
10.12 Embedded (Interleaved) Binary Data
Certain firewall designs and other circumstances may force a server to interleave RTSP methods and stream data. This interleaving should generally be avoided unless necessary since it complicates client and server operation and imposes additional overhead. Interleaved binary data SHOULD only be used if RTSP is carried over TCP.
Stream data such as RTP packets is encapsulated by an ASCII dollar sign (24 hexadecimal), followed by a one-byte channel identifier, followed by the length of the encapsulated binary data as a binary, two-byte integer in network byte order. The stream data follows immediately afterwards, without a CRLF, but including the upper-layer protocol headers. Each $ block contains exactly one upper-layer protocol data unit, e.g., one RTP packet.
The channel identifier is defined in the Transport header with the interleaved parameter(Section 12.39).
When the transport choice is RTP, RTCP messages are also interleaved by the server over the TCP connection. As a default, RTCP packets are sent on the first available channel higher than the RTP channel. The client MAY explicitly request RTCP packets on another channel. This is done by specifying two channels in the interleaved parameter of the Transport header(Section 12.39).
RTCP is needed for synchronization when two or more streams are interleaved in such a fashion. Also, this provides a convenient way to tunnel RTP/RTCP packets through the TCP control connection when required by the network configuration and transfer them onto UDP when possible.
    C->S: SETUP rtsp://foo.com/bar.file RTSP/1.0
          CSeq: 2
          Transport: RTP/AVP/TCP;interleaved=0-1

    S->C: RTSP/1.0 200 OK
          CSeq: 2
          Date: 05 Jun 1997 18:57:18 GMT
          Transport: RTP/AVP/TCP;interleaved=0-1
          Session: 12345678

    C->S: PLAY rtsp://foo.com/bar.file RTSP/1.0
          CSeq: 3
          Session: 12345678

    S->C: RTSP/1.0 200 OK
          CSeq: 3
          Session: 12345678
          Date: 05 Jun 1997 18:59:15 GMT
          RTP-Info: url=rtsp://foo.com/bar.file;
            seq=232433;rtptime=972948234

    S->C: $\000{2 byte length}{"length" bytes data, w/RTP header}
    S->C: $\000{2 byte length}{"length" bytes data, w/RTP header}
    S->C: $\001{2 byte length}{"length" bytes RTCP packet}
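In RTP over TCP mode, a single socket carries both the control commands and the media stream, unlike RTP over UDP, where separate UDP sockets carry the data. Because of firewalls and other external constraints, RTSP methods and RTP stream data can therefore end up interleaved on the same connection. This framing exists so that the two can be told apart:

    '$' + channel identifier (1 byte, 0 or 1 here) + length (2 bytes, network byte order) + data

It separates control messages from stream data. For more detail, see:
A solution for mixed RTP over RTSP packets: https://blog.csdn.net/myslq/article/details/79819179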
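As an illustration of the framing, this is a minimal sketch of how a sender could wrap one RTP or RTCP packet before writing it to the RTSP TCP connection. The function name `makeInterleavedFrame` is hypothetical; the byte layout follows the RFC 2326 section 10.12 text quoted above.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Wraps one upper-layer packet (RTP or RTCP) in an interleaved frame:
// '$', channel id, 16-bit big-endian length, then the packet bytes.
std::string makeInterleavedFrame(std::uint8_t channel, const std::string& packet) {
    if (packet.size() > 0xFFFF)
        throw std::length_error("packet does not fit in a 16-bit length field");
    std::string frame;
    frame.reserve(4 + packet.size());
    frame.push_back('$');                                             // 0x24 marker
    frame.push_back(static_cast<char>(channel));                      // e.g. 0 = RTP, 1 = RTCP
    frame.push_back(static_cast<char>((packet.size() >> 8) & 0xFF));  // length, high byte
    frame.push_back(static_cast<char>(packet.size() & 0xFF));         // length, low byte
    frame += packet;
    return frame;
}
```

For example, a 300-byte packet on channel 1 yields a 304-byte frame whose header bytes are 0x24 0x01 0x01 0x2C.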
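VLC only starts sending Embedded (Interleaved) Binary Data after the server has answered PLAY and begun streaming. I had not noticed this, which is why my parsing went wrong. Apart from VLC, though, none of the other players supported sending Embedded (Interleaved) Binary Data. Since the server is the side that pushes the stream, and it starts streaming as soon as the command exchange is finished, I don't think the mechanism is of much use on the client side.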