live555 source code analysis (5) - live555 RTSP workflow (2)

In the previous article, we captured the RTSP communication flow with Wireshark. This article analyzes how each step of that flow works in the code.
live555's inheritance hierarchy is quite complicated, so I drew a picture recording the class relationships involved in streaming an H.264 file:
[Figure: class inheritance relationships involved in H.264 file streaming]

1. OPTIONS

OPTIONS is the simplest command: the client asks the server which methods it supports. After receiving the OPTIONS command from the client, the server calls handleCmd_OPTIONS to handle it:

void RTSPServer::RTSPClientConnection::handleCmd_OPTIONS() {
  snprintf((char*)fResponseBuffer, sizeof fResponseBuffer,
	   "RTSP/1.0 200 OK\r\nCSeq: %s\r\n%sPublic: %s\r\n\r\n",
	   fCurrentCSeq, dateHeader(), fOurRTSPServer.allowedCommandNames());
}

The server's handling is simply to send back the list of commands it supports. As you can see, the RTSP commands supported by live555 are
OPTIONS, DESCRIBE, SETUP, TEARDOWN, PLAY, PAUSE, GET_PARAMETER, and SET_PARAMETER:

char const* RTSPServer::allowedCommandNames() {
  return "OPTIONS, DESCRIBE, SETUP, TEARDOWN, PLAY, PAUSE, GET_PARAMETER, SET_PARAMETER";
}
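Putting the two functions together, the on-the-wire exchange looks roughly like this (the URL, CSeq, and Date values are illustrative):

OPTIONS rtsp://192.168.1.10:8554/h264 RTSP/1.0
CSeq: 2

RTSP/1.0 200 OK
CSeq: 2
Date: Sat, Oct 23 2021 08:00:00 GMT
Public: OPTIONS, DESCRIBE, SETUP, TEARDOWN, PLAY, PAUSE, GET_PARAMETER, SET_PARAMETER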

2. DESCRIBE

Many live555 classes have key member variables with identical names, so I drew a picture recording the actual runtime types of the key variables in this H.264 RTP example. If you get confused while analyzing the code below, refer back to it.
[Figure: actual types of the key member variables in the H.264 RTP example]

In the DESCRIBE step, the client requests the media description from the server, and the server replies with the SDP description.
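A typical request looks like this (URL and CSeq are illustrative):

DESCRIBE rtsp://192.168.1.10:8554/h264 RTSP/1.0
CSeq: 3
Accept: application/sdp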

1. Processing the DESCRIBE command

void RTSPServer::RTSPClientConnection
::handleCmd_DESCRIBE(char const* urlPreSuffix, char const* urlSuffix, char const* fullRequestStr) {
  char urlTotalSuffix[2*RTSP_PARAM_STRING_MAX]; // enough space for urlPreSuffix/urlSuffix'\0'
  urlTotalSuffix[0] = '\0';
  if (urlPreSuffix[0] != '\0') {
    strcat(urlTotalSuffix, urlPreSuffix);
    strcat(urlTotalSuffix, "/");
  }
  strcat(urlTotalSuffix, urlSuffix);
    
  if (!authenticationOK("DESCRIBE", urlTotalSuffix, fullRequestStr)) return;
    
  // We should really check that the request contains an "Accept:" #####
  // for "application/sdp", because that's what we're sending back #####
    
  // Begin by looking up the "ServerMediaSession" object for the specified "urlTotalSuffix":
  fOurServer.lookupServerMediaSession(urlTotalSuffix, DESCRIBELookupCompletionFunction, this);
}

As you can see, the server's handleCmd_DESCRIBE first assembles urlTotalSuffix and uses it to look up the corresponding ServerMediaSession; once the lookup completes, DESCRIBELookupCompletionFunction is called:

void RTSPServer::RTSPClientConnection
::DESCRIBELookupCompletionFunction(void* clientData, ServerMediaSession* sessionLookedUp) {
  RTSPServer::RTSPClientConnection* connection = (RTSPServer::RTSPClientConnection*)clientData;
  connection->handleCmd_DESCRIBE_afterLookup(sessionLookedUp);
}

void RTSPServer::RTSPClientConnection
::handleCmd_DESCRIBE_afterLookup(ServerMediaSession* session) {
  char* sdpDescription = NULL;
  char* rtspURL = NULL;
  do {
    if (session == NULL) {
      handleCmd_notFound();
      break;
    }
    
    // Increment the "ServerMediaSession" object's reference count, in case someone removes it
    // while we're using it:
    session->incrementReferenceCount();

    // Then, assemble a SDP description for this session:
    sdpDescription = session->generateSDPDescription(fAddressFamily);
    if (sdpDescription == NULL) {
      // This usually means that a file name that was specified for a
      // "ServerMediaSubsession" does not exist.
      setRTSPResponse("404 File Not Found, Or In Incorrect Format");
      break;
    }
    unsigned sdpDescriptionSize = strlen(sdpDescription);
    
    // Also, generate our RTSP URL, for the "Content-Base:" header
    // (which is necessary to ensure that the correct URL gets used in subsequent "SETUP" requests).
    rtspURL = fOurRTSPServer.rtspURL(session, fClientInputSocket);
    
    snprintf((char*)fResponseBuffer, sizeof fResponseBuffer,
	     "RTSP/1.0 200 OK\r\nCSeq: %s\r\n"
	     "%s"
	     "Content-Base: %s/\r\n"
	     "Content-Type: application/sdp\r\n"
	     "Content-Length: %d\r\n\r\n"
	     "%s",
	     fCurrentCSeq,
	     dateHeader(),
	     rtspURL,
	     sdpDescriptionSize,
	     sdpDescription);
  } while (0);
  
  if (session != NULL) {
    // Decrement its reference count, now that we're done using it:
    session->decrementReferenceCount();
    if (session->referenceCount() == 0 && session->deleteWhenUnreferenced()) {
      fOurServer.removeServerMediaSession(session);
    }
  }

  delete[] sdpDescription;
  delete[] rtspURL;
}

2. Generate SDP information

After the session is found, it is asked to generate the SDP description by calling generateSDPDescription. The relevant loop:

    for (subsession = fSubsessionsHead; subsession != NULL;
	 subsession = subsession->fNext) {
      char const* sdpLines = subsession->sdpLines(addressFamily);
      if (sdpLines == NULL) continue; // the media's not available
      sdpLength += strlen(sdpLines);
    }

When generating the SDP description, generateSDPDescription walks the session's subsessions and concatenates each subsession's SDP lines.
In our example there is only one subsession, an H264VideoFileServerMediaSubsession.
H264VideoFileServerMediaSubsession inherits from OnDemandServerMediaSubsession, so OnDemandServerMediaSubsession::sdpLines is called next to generate the subsession's SDP lines.
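For context, such a subsession is typically registered when the server starts up, along the lines of the testOnDemandRTSPServer demo (a sketch; the stream name and file name are whatever your server uses):

// Sketch of the usual registration (cf. testOnDemandRTSPServer.cpp):
ServerMediaSession* sms
  = ServerMediaSession::createNew(*env, "h264", "h264", "Session streamed by live555");
Boolean reuseFirstSource = False;
sms->addSubsession(H264VideoFileServerMediaSubsession::createNew(*env, "test.264", reuseFirstSource));
rtspServer->addServerMediaSession(sms);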

3. Get the SDP information of the subsession

char const*
OnDemandServerMediaSubsession::sdpLines(int addressFamily) {
  if (fSDPLines == NULL) {
    // We need to construct a set of SDP lines that describe this
    // subsession (as a unicast stream).  To do so, we first create
    // dummy (unused) source and "RTPSink" objects,
    // whose parameters we use for the SDP lines:
    unsigned estBitrate;
    FramedSource* inputSource = createNewStreamSource(0, estBitrate);
    if (inputSource == NULL) return NULL; // file not found

    Groupsock* dummyGroupsock = createGroupsock(nullAddress(addressFamily), 0);
    unsigned char rtpPayloadType = 96 + trackNumber()-1; // if dynamic
    RTPSink* dummyRTPSink = createNewRTPSink(dummyGroupsock, rtpPayloadType, inputSource);
    if (dummyRTPSink != NULL && dummyRTPSink->estimatedBitrate() > 0) estBitrate = dummyRTPSink->estimatedBitrate();

    setSDPLinesFromRTPSink(dummyRTPSink, inputSource, estBitrate);
    Medium::close(dummyRTPSink);
    delete dummyGroupsock;
    closeStreamSource(inputSource);
  }

  return fSDPLines;
}

Since the SPS and PPS of the H.264 file are not known at initialization time, live555 obtains them by simulating the RTP-streaming path: it reads the H264 file and parses it.
First it creates an input source;
then it creates a dummy Groupsock, with a null address and port 0;
then it uses this Groupsock and the input source to create an RTPSink. Because the Groupsock's address and port are fake, no RTP stream is actually sent.
Finally, setSDPLinesFromRTPSink extracts this subsession's SDP information.
Since this whole exercise exists only to obtain the SDP description, all of the simulated media objects are released once the SDP information has been captured.

4. Create a media input source

This subsession is of type H264VideoFileServerMediaSubsession, so its createNewStreamSource override is called:

FramedSource* H264VideoFileServerMediaSubsession::createNewStreamSource(unsigned /*clientSessionId*/, unsigned& estBitrate) {
  estBitrate = 500; // kbps, estimate

  // Create the video source:
  ByteStreamFileSource* fileSource = ByteStreamFileSource::createNew(envir(), fFileName);
  if (fileSource == NULL) return NULL;
  fFileSize = fileSource->fileSize();

  // Create a framer for the Video Elementary Stream:
  return H264VideoStreamFramer::createNew(envir(), fileSource);
}

This function first creates the file source: a ByteStreamFileSource, which reads raw bytes from the file.
It then wraps that file source in an H264VideoStreamFramer:

H264VideoStreamFramer
::H264VideoStreamFramer(UsageEnvironment& env, FramedSource* inputSource, Boolean createParser,
			Boolean includeStartCodeInOutput, Boolean insertAccessUnitDelimiters)
  : H264or5VideoStreamFramer(264, env, inputSource, createParser,
			includeStartCodeInOutput, insertAccessUnitDelimiters) {
}

H264VideoStreamFramer inherits from H264or5VideoStreamFramer.

H264or5VideoStreamFramer
::H264or5VideoStreamFramer(int hNumber, UsageEnvironment& env, FramedSource* inputSource,
			   Boolean createParser,
			   Boolean includeStartCodeInOutput, Boolean insertAccessUnitDelimiters)
  : MPEGVideoStreamFramer(env, inputSource),
    fHNumber(hNumber), fIncludeStartCodeInOutput(includeStartCodeInOutput),
    fInsertAccessUnitDelimiters(insertAccessUnitDelimiters),
    fLastSeenVPS(NULL), fLastSeenVPSSize(0),
    fLastSeenSPS(NULL), fLastSeenSPSSize(0),
    fLastSeenPPS(NULL), fLastSeenPPSSize(0) {
  fParser = createParser
    ? new H264or5VideoStreamParser(hNumber, this, inputSource, includeStartCodeInOutput)
    : NULL;
  fFrameRate = 30.0; // We assume a frame rate of 30 fps, unless we learn otherwise (from parsing a VPS or SPS NAL unit)
}

A parser, of type H264or5VideoStreamParser, is created here to parse the H.264 byte stream into NAL units.
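The parser's basic job is to locate Annex-B start codes (0x000001, often preceded by an extra zero byte) and treat the bytes between two start codes as one NAL unit. A minimal illustration of that idea (the real parser works incrementally through a StreamParser buffer rather than over a flat array):

#include <cstddef>
#include <cstdint>

// Return the offset of the next 0x000001 start code at or after 'pos',
// or 'len' if there is none. A 4-byte 0x00000001 start code is matched
// by its last three bytes.
static size_t findStartCode(const uint8_t* data, size_t len, size_t pos) {
  for (size_t i = pos; i + 3 <= len; ++i) {
    if (data[i] == 0 && data[i+1] == 0 && data[i+2] == 1) return i;
  }
  return len;
}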

5. Get the SDP lines

void OnDemandServerMediaSubsession
::setSDPLinesFromRTPSink(RTPSink* rtpSink, FramedSource* inputSource, unsigned estBitrate) {
  if (rtpSink == NULL) return;

  char const* mediaType = rtpSink->sdpMediaType();
  unsigned char rtpPayloadType = rtpSink->rtpPayloadType();
  struct sockaddr_storage const& addressForSDP = rtpSink->groupsockBeingUsed().groupAddress();
  portNumBits portNumForSDP = ntohs(rtpSink->groupsockBeingUsed().port().num());

  AddressString ipAddressStr(addressForSDP);
  char* rtpmapLine = rtpSink->rtpmapLine();
  char const* rtcpmuxLine = fMultiplexRTCPWithRTP ? "a=rtcp-mux\r\n" : "";
  char const* rangeLine = rangeSDPLine();
  char const* auxSDPLine = getAuxSDPLine(rtpSink, inputSource);
  if (auxSDPLine == NULL) auxSDPLine = "";

  char const* const sdpFmt =
    "m=%s %u RTP/AVP %d\r\n"
    "c=IN %s %s\r\n"
    "b=AS:%u\r\n"
    "%s"
    "%s"
    "%s"
    "%s"
    "a=control:%s\r\n";
  unsigned sdpFmtSize = strlen(sdpFmt)
    + strlen(mediaType) + 5 /* max short len */ + 3 /* max char len */
    + 3/*IP4 or IP6*/ + strlen(ipAddressStr.val())
    + 20 /* max int len */
    + strlen(rtpmapLine)
    + strlen(rtcpmuxLine)
    + strlen(rangeLine)
    + strlen(auxSDPLine)
    + strlen(trackId());
  char* sdpLines = new char[sdpFmtSize];
  sprintf(sdpLines, sdpFmt,
	  mediaType, // m= <media>
	  portNumForSDP, // m= <port>
	  rtpPayloadType, // m= <fmt list>
	  addressForSDP.ss_family == AF_INET ? "IP4" : "IP6", ipAddressStr.val(), // c= address
	  estBitrate, // b=AS:<bandwidth>
	  rtpmapLine, // a=rtpmap:... (if present)
	  rtcpmuxLine, // a=rtcp-mux:... (if present)
	  rangeLine, // a=range:... (if present)
	  auxSDPLine, // optional extra SDP line
	  trackId()); // a=control:<track-id>
  delete[] (char*)rangeLine; delete[] rtpmapLine;

  delete[] fSDPLines; fSDPLines = strDup(sdpLines);
  delete[] sdpLines;
}

In this way, the media-level SDP information is assembled: media type, payload type, address, bandwidth, rtpmap line, and control name.
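For our single H.264 track the media-level lines come out looking something like this (the dummy Groupsock contributes the null address and port 0, estBitrate was 500, and live555 names the first track "track1"; the fmtp values are filled in below):

m=video 0 RTP/AVP 96
c=IN IP4 0.0.0.0
b=AS:500
a=rtpmap:96 H264/90000
a=fmtp:96 packetization-mode=1;profile-level-id=...;sprop-parameter-sets=...
a=control:track1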
The key call is getAuxSDPLine, which generates the SDP attributes specific to the H.264 stream.
Since our subsession is an H264VideoFileServerMediaSubsession, its getAuxSDPLine override is the one that runs:

char const* H264VideoFileServerMediaSubsession::getAuxSDPLine(RTPSink* rtpSink, FramedSource* inputSource) {
  if (fAuxSDPLine != NULL) return fAuxSDPLine; // it's already been set up (for a previous client)

  if (fDummyRTPSink == NULL) { // we're not already setting it up for another, concurrent stream
    // Note: For H264 video files, the 'config' information ("profile-level-id" and "sprop-parameter-sets") isn't known
    // until we start reading the file.  This means that "rtpSink"s "auxSDPLine()" will be NULL initially,
    // and we need to start reading data from our file until this changes.
    fDummyRTPSink = rtpSink;

    // Start reading the file:
    fDummyRTPSink->startPlaying(*inputSource, afterPlayingDummy, this);

    // Check whether the sink's 'auxSDPLine()' is ready:
    checkForAuxSDPLine(this);
  }

  envir().taskScheduler().doEventLoop(&fDoneFlag);

  return fAuxSDPLine;
}

This function checks whether the aux SDP line has already been generated. If not, it starts the dummy RTPSink playing from the input source and uses checkForAuxSDPLine to poll until the line becomes available; meanwhile doEventLoop(&fDoneFlag) blocks until fDoneFlag is set.
Let's look at the polling function first:

void H264VideoFileServerMediaSubsession::checkForAuxSDPLine1() {
  nextTask() = NULL;

  char const* dasl;
  if (fAuxSDPLine != NULL) {
    // Signal the event loop that we're done:
    setDoneFlag();
  } else if (fDummyRTPSink != NULL && (dasl = fDummyRTPSink->auxSDPLine()) != NULL) {
    fAuxSDPLine = strDup(dasl);
    fDummyRTPSink = NULL;

    // Signal the event loop that we're done:
    setDoneFlag();
  } else if (!fDoneFlag) {
    // try again after a brief delay:
    int uSecsToDelay = 100000; // 100 ms
    nextTask() = envir().taskScheduler().scheduleDelayedTask(uSecsToDelay,
			      (TaskFunc*)checkForAuxSDPLine, this);
  }
}

As you can see, this function keeps checking whether fAuxSDPLine, or the dummy RTPSink's auxSDPLine(), has been generated. If not, it reschedules itself 100 ms later; once the line is available it sets fDoneFlag, which makes the doEventLoop call in getAuxSDPLine return.
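Two small details make this polling work: the checkForAuxSDPLine passed to scheduleDelayedTask is a static trampoline that forwards to the member function above, and setDoneFlag sets the variable that doEventLoop(&fDoneFlag) is watching. In the live555 sources they look essentially like this:

static void checkForAuxSDPLine(void* clientData) {
  H264VideoFileServerMediaSubsession* subsess = (H264VideoFileServerMediaSubsession*)clientData;
  subsess->checkForAuxSDPLine1();
}

// In the subsession class declaration:
void setDoneFlag() { fDoneFlag = ~0; }

Next, the auxSDPLine() that the polling function keeps asking the dummy sink for: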

char const* H264VideoRTPSink::auxSDPLine() {
  // Generate a new "a=fmtp:" line each time, using our SPS and PPS (if we have them),
  // otherwise parameters from our framer source (in case they've changed since the last time that
  // we were called):
  H264or5VideoStreamFramer* framerSource = NULL;
  u_int8_t* vpsDummy = NULL; unsigned vpsDummySize = 0;
  u_int8_t* sps = fSPS; unsigned spsSize = fSPSSize;
  u_int8_t* pps = fPPS; unsigned ppsSize = fPPSSize;
  if (sps == NULL || pps == NULL) {
    // We need to get SPS and PPS from our framer source:
    if (fOurFragmenter == NULL) return NULL; // we don't yet have a fragmenter (and therefore not a source)
    framerSource = (H264or5VideoStreamFramer*)(fOurFragmenter->inputSource());
    if (framerSource == NULL) return NULL; // we don't yet have a source

    framerSource->getVPSandSPSandPPS(vpsDummy, vpsDummySize, sps, spsSize, pps, ppsSize);
    if (sps == NULL || pps == NULL) return NULL; // our source isn't ready
  }

  // Set up the "a=fmtp:" SDP line for this stream:
  u_int8_t* spsWEB = new u_int8_t[spsSize]; // "WEB" means "Without Emulation Bytes"
  unsigned spsWEBSize = removeH264or5EmulationBytes(spsWEB, spsSize, sps, spsSize);
  if (spsWEBSize < 4) { // Bad SPS size => assume our source isn't ready
    delete[] spsWEB;
    return NULL;
  }
  u_int32_t profileLevelId = (spsWEB[1]<<16) | (spsWEB[2]<<8) | spsWEB[3];
  delete[] spsWEB;

  char* sps_base64 = base64Encode((char*)sps, spsSize);
  char* pps_base64 = base64Encode((char*)pps, ppsSize);

  char const* fmtpFmt =
    "a=fmtp:%d packetization-mode=1"
    ";profile-level-id=%06X"
    ";sprop-parameter-sets=%s,%s\r\n";
  unsigned fmtpFmtSize = strlen(fmtpFmt)
    + 3 /* max char len */
    + 6 /* 3 bytes in hex */
    + strlen(sps_base64) + strlen(pps_base64);
  char* fmtp = new char[fmtpFmtSize];
  sprintf(fmtp, fmtpFmt,
          rtpPayloadType(),
	  profileLevelId,
          sps_base64, pps_base64);

  delete[] sps_base64;
  delete[] pps_base64;

  delete[] fFmtpSDPLine; fFmtpSDPLine = fmtp;
  return fFmtpSDPLine;
}

As you can see, this function checks whether SPS and PPS information is available (asking the framer source if the sink does not have it yet), and once it is, builds the corresponding a=fmtp SDP line from the SPS and PPS.
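An example of the resulting line, with illustrative values (a profile-level-id of 42C01E would mean Baseline profile, level 3.0; the two base64 strings carry the raw SPS and PPS):

a=fmtp:96 packetization-mode=1;profile-level-id=42C01E;sprop-parameter-sets=<base64 SPS>,<base64 PPS>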
Now let's go back and look at how the SPS and PPS actually get generated once the RTPSink is started.

6. Start RTPSink

Boolean MediaSink::startPlaying(MediaSource& source,
				afterPlayingFunc* afterFunc,
				void* afterClientData) {
  // Make sure we're not already being played:
  if (fSource != NULL) {
    envir().setResultMsg("This sink is already being played");
    return False;
  }

  // Make sure our source is compatible:
  if (!sourceIsCompatibleWithUs(source)) {
    envir().setResultMsg("MediaSink::startPlaying(): source is not compatible!");
    return False;
  }
  fSource = (FramedSource*)&source;

  fAfterFunc = afterFunc;
  fAfterClientData = afterClientData;
  return continuePlaying();
}

startPlaying ends by calling the virtual function continuePlaying. Our sink is an H264VideoRTPSink, which inherits continuePlaying from H264or5VideoRTPSink, so H264or5VideoRTPSink::continuePlaying runs:

Boolean H264or5VideoRTPSink::continuePlaying() {
  // First, check whether we have a 'fragmenter' class set up yet.
  // If not, create it now:
  if (fOurFragmenter == NULL) {
    fOurFragmenter = new H264or5Fragmenter(fHNumber, envir(), fSource, OutPacketBuffer::maxSize,
					   ourMaxPacketSize() - 12/*RTP hdr size*/);
  } else {
    fOurFragmenter->reassignInputSource(fSource);
  }
  fSource = fOurFragmenter;

  // Then call the parent class's implementation:
  return MultiFramedRTPSink::continuePlaying();
}

A fragmenter for H.264/H.265 NAL units, an H264or5Fragmenter, is created here and installed as the sink's source, so everything the sink reads from now on passes through it.
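The fragmenter passes small NAL units through as single packets and splits NAL units larger than the maximum payload size into FU-A fragments. A sketch of the two FU-A header bytes it prepends for H.264 (per RFC 6184; names here are illustrative):

#include <cstdint>

// Build the two FU-A header bytes (RFC 6184) for one fragment of a large
// H.264 NAL unit. 'nalHeader' is the first byte of the original NAL unit.
static void makeFUAHeader(uint8_t nalHeader, bool first, bool last, uint8_t out[2]) {
  out[0] = (nalHeader & 0xE0) | 28;  // FU indicator: original F+NRI bits, type 28 (FU-A)
  out[1] = (first ? 0x80 : 0x00)     // S bit on the first fragment
         | (last  ? 0x40 : 0x00)     // E bit on the last fragment
         | (nalHeader & 0x1F);       // original NAL unit type
}

With the fragmenter in place, the parent class's MultiFramedRTPSink::continuePlaying takes over: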

Boolean MultiFramedRTPSink::continuePlaying() {
  // Send the first packet.
  // (This will also schedule any future sends.)
  buildAndSendPacket(True);
  return True;
}
...
void MultiFramedRTPSink::buildAndSendPacket(Boolean isFirstPacket) {
  nextTask() = NULL;
  fIsFirstPacket = isFirstPacket;

  // Set up the RTP header:
  unsigned rtpHdr = 0x80000000; // RTP version 2; marker ('M') bit not set (by default; it can be set later)
  rtpHdr |= (fRTPPayloadType<<16);
  rtpHdr |= fSeqNo; // sequence number
  fOutBuf->enqueueWord(rtpHdr);

  // Note where the RTP timestamp will go.
  // (We can't fill this in until we start packing payload frames.)
  fTimestampPosition = fOutBuf->curPacketSize();
  fOutBuf->skipBytes(4); // leave a hole for the timestamp

  fOutBuf->enqueueWord(SSRC());

  // Allow for a special, payload-format-specific header following the
  // RTP header:
  fSpecialHeaderPosition = fOutBuf->curPacketSize();
  fSpecialHeaderSize = specialHeaderSize();
  fOutBuf->skipBytes(fSpecialHeaderSize);

  // Begin packing as many (complete) frames into the packet as we can:
  fTotalFrameSpecificHeaderSizes = 0;
  fNoFramesLeft = False;
  fNumFramesUsedSoFar = 0;
  packFrame();
}
void MultiFramedRTPSink::packFrame() {
  // Get the next frame.

  // First, skip over the space we'll use for any frame-specific header:
  fCurFrameSpecificHeaderPosition = fOutBuf->curPacketSize();
  fCurFrameSpecificHeaderSize = frameSpecificHeaderSize();
  fOutBuf->skipBytes(fCurFrameSpecificHeaderSize);
  fTotalFrameSpecificHeaderSizes += fCurFrameSpecificHeaderSize;

  // See if we have an overflow frame that was too big for the last pkt
  if (fOutBuf->haveOverflowData()) {
    // Use this frame before reading a new one from the source
    unsigned frameSize = fOutBuf->overflowDataSize();
    struct timeval presentationTime = fOutBuf->overflowPresentationTime();
    unsigned durationInMicroseconds = fOutBuf->overflowDurationInMicroseconds();
    fOutBuf->useOverflowData();

    afterGettingFrame1(frameSize, 0, presentationTime, durationInMicroseconds);
  } else {
    // Normal case: we need to read a new frame from the source
    if (fSource == NULL) return;
    fSource->getNextFrame(fOutBuf->curPtr(), fOutBuf->totalBytesAvailable(),
			  afterGettingFrame, this, ourHandleClosure, this);
  }
}
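As an aside, the rtpHdr word assembled at the top of buildAndSendPacket packs the first 32 bits of the RTP header defined by RFC 3550. A worked example, assuming payload type 96 and sequence number 0x1234:

unsigned rtpHdr = 0x80000000; // top bits 10: RTP version 2; P, X, CC all zero
rtpHdr |= (96 << 16);         // bits 16-22: payload type (bit 23 is the marker bit)
rtpHdr |= 0x1234;             // bits 0-15: sequence number
// rtpHdr == 0x80601234; the 4-byte timestamp and 4-byte SSRC words follow it.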

continuePlaying thus reaches packFrame, which calls fSource->getNextFrame when there is no leftover overflow data from the previous packet. Remember that fSource is now the fragmenter created above, of type H264or5Fragmenter.
getNextFrame calls the virtual function doGetNextFrame, which each subclass implements.
So H264or5Fragmenter::doGetNextFrame is called here:

  if (fNumValidDataBytes == 1) {
    // We have no NAL unit data currently in the buffer.  Read a new one:
    fInputSource->getNextFrame(&fInputBuffer[1], fInputBufferSize - 1,
			       afterGettingFrame, this,
			       FramedSource::handleClosure, this);
  }

H264or5Fragmenter::doGetNextFrame in turn calls fInputSource->getNextFrame.
The input source here is the H264VideoStreamFramer (refer back to the key-variable type diagram).
So H264VideoStreamFramer::doGetNextFrame would be called; since H264VideoStreamFramer inherits from H264or5VideoStreamFramer and does not override doGetNextFrame itself, H264or5VideoStreamFramer::doGetNextFrame runs:

 else {
    // Do the normal delivery of a NAL unit from the parser:
    MPEGVideoStreamFramer::doGetNextFrame();
  }

It then calls MPEGVideoStreamFramer::doGetNextFrame():

void MPEGVideoStreamFramer::doGetNextFrame() {
  fParser->registerReadInterest(fTo, fMaxSize);
  continueReadProcessing();
}

continueReadProcessing then invokes the parser's parse function:

unsigned acquiredFrameSize = fParser->parse();

fParser is of type H264or5VideoStreamParser, so H264or5VideoStreamParser::parse is called.
In parse you can see the file data being read and parsed into NAL units; each parsed NAL unit is checked to see whether it is an SPS or PPS, and if so, a copy is saved. At this point we have obtained the SPS and PPS information:

usingSource()->saveCopyOfSPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);
...
usingSource()->saveCopyOfPPS(fStartOfFrame + fOutputStartCodeSize, curFrameSize() - fOutputStartCodeSize);
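For H.264, a NAL unit's type is the low 5 bits of its first byte after the start code; SPS is type 7 and PPS is type 8. In essence the check is (firstNALByte is an illustrative name for the byte the parser reads out of its buffer):

u_int8_t nal_unit_type = firstNALByte & 0x1F; // H.264: 7 = SPS, 8 = PPS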

With the SPS and PPS saved, the H264VideoRTPSink::auxSDPLine() function shown earlier can build the corresponding a=fmtp line; the full SDP description is then assembled and sent to the client, completing the DESCRIBE process.
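Putting the pieces together, the DESCRIBE response finally sent back has roughly this shape (header values illustrative; session-level SDP lines elided):

RTSP/1.0 200 OK
CSeq: 3
Date: Sat, Oct 23 2021 08:00:00 GMT
Content-Base: rtsp://192.168.1.10:8554/h264/
Content-Type: application/sdp
Content-Length: <length of the SDP body>

v=0
... (session-level lines, then the media-level lines shown in step 5)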

Origin blog.csdn.net/qq_36383272/article/details/121008143