live555 RTSP PAUSE will not resume after two keepalives (~= 2 mins pause)
We've come across a problem around handling of PAUSE and GET_PARAMETER keepalives against our RTSP server. The user symptom is that a pause of more than a minute or so cannot be resumed and VLC appears to lock up if you try.
Version
Revision 2.2.0-rc1-118-g22fda39 (built from 2.2.0 branch today, default config) on Debian Jessie. Same symptom in 2.1.x (which is where we first noticed it) Relevant code has not changed in current master
== To reproduce ==
- Start an RTSP stream against a server which supports GET_PARAMETER for keepalives (e.g. Packet Ship, ask me for an eval copy)
- Pause it (PAUSE command)
- Wait for at least two GET_PARAMETER keepalives to be sent while paused (max 2 minutes with standard session timeout)
- Unpause it (PLAY command)
The video does not restart and VLC will not close.
VLC log (RTSP only)
(normal OPTIONS/DESCRIBE/SETUP and PLAY all fine, removed)
First GET_PARAMETER while playing normally - notice response is logged:
Sending request: GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 6
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
Received 156 new bytes of response data.
Received a complete GET_PARAMETER response:
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 6
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:22:38 GMT
Content-type: text/parameters
PAUSE request:
Sending request: PAUSE rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 7
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
Received 153 new bytes of response data.
Received a complete PAUSE response:
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 7
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:22:43 GMT
Range: npt=4.73564-596.182
GET_PARAMETER requests while paused - note no response logged:
Sending request: GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 8
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
Sending request: GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 9
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
PLAY request on pressing . Note responses to GET_PARAMETERs above are now logged but not the response to PLAY (which was sent, see wireshark log below):
Sending request: PLAY rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 10
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
Received 312 new bytes of response data.
Received a complete GET_PARAMETER response:
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 8
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:23:26 GMT
Content-type: text/parameters
(plus 156 additional bytes)
Received 154 new bytes of response data.
Received a complete GET_PARAMETER response:
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 9
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:24:14 GMT
Content-type: text/parameters
Then after a further timeout period another GET_PARAMETER is sent, and Live555 complains (wrongly) about CSeq error:
Sending request: GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 11
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
Received 157 new bytes of response data.
WARNING: The server did not respond to our "PLAY" request (CSeq: 10). The server appears to be buggy (perhaps not handling pipelined requests properly).
Received a complete GET_PARAMETER response:
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 11
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:25:02 GMT
Content-type: text/parameters
Wireshark log
Note all commands have responses with correct CSeq. This doesn't show the timing but the responses are instant in every case.
GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 6
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 6
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:22:38 GMT
Content-type: text/parameters
PAUSE rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 7
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 7
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:22:43 GMT
Range: npt=4.73564-596.182
GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 8
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 8
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:23:26 GMT
Content-type: text/parameters
GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 9
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 9
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:24:14 GMT
Content-type: text/parameters
PLAY rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 10
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 10
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:24:20 GMT
Range: npt=4.73564-596.182
GET_PARAMETER rtsp://127.0.0.1/media/bunny-mid.ts RTSP/1.0
CSeq: 11
User-Agent: LibVLC/2.2.0-rc2 (LIVE555 Streaming Media v2014.01.13)
Session: 2ab2f048bdd50021
RTSP/1.0 200 OK
Content-Length: 0
CSeq: 11
Server: Packet Ship RTSP Server v3.1.10
Date: Tue, Jun 23 2015 10:25:02 GMT
Content-type: text/parameters
Analysis
I've only just started looking at the code (modules/access/live555.cpp) but there have obviously been some problems in this area before :-)
/* Voodoo (= no) thread safety here! *Ahem* */
Normally the GET_PARAMETER keepalives are sent by Demux() (live555.cpp:1280) based on a signal (b_timeout_call) from the TimeoutPrevention thread, but obviously Demux() isn't being called while paused. Hence on pause the TimeoutPrevention thread (live555.cpp:2105) is told to do it itself (via b_handle_keepalive) instead.
Comment (:1643) confirms this:
/* When we Pause, we'll need the TimeoutPrevention thread to
* handle sending the "Keep Alive" message to the server.
* Unfortunately Live555 isn't thread safe and so can't
* do this normally while the main Demux thread is handling
* a live stream. We end up with the Timeout thread blocking
* waiting for a response from the server. So when we PAUSE
* we set a flag that the TimeoutPrevention function will check
* and if it's set, it will trigger the GET_PARAMETER message */
What I can't figure out is how the response from GET_PARAMETER is logged and handled normally - maybe inside Live555? Every other RTSP command is followed by a call to wait_Live555_response(), but the GET_PARAMETER calls are not, in either case. I'm guessing that this is why the responses are not logged and it then loses the PLAY response.
Happy to try to fix and provide patch but need guidance!