WebRTC audio system audio sending and receiving


A complete audio transmission chain involves audio acquisition, audio enhancement, audio mixing, audio format handling (sampling rate, number of channels), encoding/decoding, RTP packetization, and network transmission. WebRTC keeps this pipeline structurally clear and decoupled, with each layer behind an abstraction. Because the capture side produces data continuously, capture runs on its own dedicated thread.

3.1 Audio data stream sending process

The modules involved in the audio sending process are shown in the figure below.
Figure 3-1 WebRTC audio sending data flow

3.2 Encoding and RTP packaging in sending

webrtc::AudioTransport is more like a two-way bridge: it supports both sending and receiving. On the sending side it connects to one or more data-stream sending modules, the webrtc::AudioSendStream objects, which sit between data capture and transmission. Before sending, the data has to go through noise suppression, echo cancellation, encoding, and RTP packetization; the noise-reduction enhancement is done by the APM module, and the details of each APM algorithm can be found in the "Real-time Voice Processing Practice Guide". This section looks at audio data encoding, RTP packetization, and the design and implementation of send control in webrtc::AudioSender / webrtc::AudioSendStream. The implementation of webrtc::AudioSender / webrtc::AudioSendStream lives in webrtc/audio/audio_send_stream.h / webrtc/audio/audio_send_stream.cc, and the related class hierarchy is shown below.
Figure 3-2 SendStream UML relationship diagram

In interactive real-time communication, the encoding and sending of audio differs from streaming schemes such as RTMP used in live broadcast: real-time communication typically uses UDP/RTP, while live streaming uses RTMP over TCP. Real-time communication prioritizes latency, but that does not mean quality requirements are low; packet loss, jitter, and reordering over UDP degrade communication quality. WebRTC therefore uses NetEQ at the receiving end, while the sending end dynamically adjusts the encoding bitrate according to the detected network conditions and the RTCP packets fed back from the receiver.
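As a concrete (and deliberately simplified) illustration of that send-side control, the sketch below shows how a bandwidth estimate derived from receiver feedback could be turned into a new target bitrate for the audio encoder. webrtc::AudioEncoder::OnReceivedTargetAudioBitrate() is the same call used later in SetupSendCodec(); the helper function name OnBandwidthEstimateUpdated and the clamping bounds are illustrative assumptions, not WebRTC code.

#include <algorithm>

#include "api/audio_codecs/audio_encoder.h"

// Conceptual sketch only: react to a feedback-derived bandwidth estimate by
// updating the encoder's target bitrate. Helper name and bounds are made up.
void OnBandwidthEstimateUpdated(webrtc::AudioEncoder* encoder,
                                int estimated_bitrate_bps) {
  const int kMinAudioBitrateBps = 6000;    // illustrative lower bound
  const int kMaxAudioBitrateBps = 128000;  // illustrative upper bound
  const int target_bps = std::clamp(estimated_bitrate_bps,
                                    kMinAudioBitrateBps, kMaxAudioBitrateBps);
  encoder->OnReceivedTargetAudioBitrate(target_bps);
}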

webrtc::AudioSendStream is an interface class that implements streaming audio sending. Its main responsibilities, listed as methods in the UML diagram, are summarized below (a simplified interface sketch follows the list):

  • Use bool SetupSendCodec(const Config& new_config) to set the encoder type and configure the encoder's target encoding bitrate;
  • Set the maximum, minimum, and default priority bitrates of the SendStream, as well as the dynamically updated allocated bitrate;
  • Control the life cycle of webrtc::AudioSendStream through the Start() and Stop() methods shown in the diagram;
  • Perform volume control, noise reduction, and encoding on the data captured by the ADM module;
  • Receive and process the returned RTCP packets and adjust the encoding bitrate accordingly.
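For orientation, here is a much-abbreviated sketch of the interface, condensed from call/audio_send_stream.h. The member set and exact signatures differ between WebRTC versions, so treat it as a reading aid rather than the authoritative definition.

// Abbreviated sketch; see call/audio_send_stream.h in your WebRTC checkout.
class AudioSendStream : public AudioSender {
 public:
  struct Config {
    explicit Config(Transport* send_transport);
    Rtp rtp;                                        // SSRC, header extensions, payload types
    Transport* send_transport = nullptr;            // where RTP/RTCP packets are written
    absl::optional<SendCodecSpec> send_codec_spec;  // codec, payload type, target bitrate
    int min_bitrate_bps = -1;
    int max_bitrate_bps = -1;
    // ... encryption (frame_encryptor, crypto_options) and other fields omitted
  };

  virtual void Reconfigure(const Config& config, SetParametersCallback callback) = 0;
  virtual void Start() = 0;  // registers the stream with AudioState and starts sending
  virtual void Stop() = 0;
  // Inherited from AudioSender:
  // virtual void SendAudioData(std::unique_ptr<AudioFrame> audio_frame) = 0;
};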

Encoding, RTP packetization, and pacing in the sending process are all implemented under webrtc::AudioSendStream. The data flow follows the numbers in the figure: the PCM data captured by the ADM is processed by APM inside webrtc::AudioTransport and passed to webrtc::AudioSendStream, which first calls into the ACM module to encode it. The packetization interface of the rtp_rtcp module then packs the encoded bitstream into RTP packets, the pacing module applies smoothing and priority-based send control, and finally the send interface of the rtp_rtcp module calls webrtc::Transport to deliver the packets.

3.3 AudioSendStream class relationship

webrtc::AudioSendStream is an interface class; its derived class webrtc::internal::AudioSendStream implements the audio data processing pipeline of Figure X.2. The definition and construction of this processing pipeline are described below. The relationship between webrtc::AudioSendStream and webrtc::AudioTransport is shown in the following UML diagram.
Figure 3-3 AudioSendStream UML diagram

In Figure 3-3, the audio_senders_ member of AudioTransportImpl is a vector of AudioSender, which confirms that one AudioTransport can correspond to multiple AudioSendStreams, as mentioned above. After webrtc::AudioTransport data has been captured, the RecordedDataIsAvailable method is called. Its implementation is as follows:

// Not used in Chromium. Process captured audio and distribute to all sending
// streams, and try to do this at the lowest possible sample rate.
int32_t AudioTransportImpl::RecordedDataIsAvailable(
    const void* audio_data,
    const size_t number_of_frames,
    const size_t bytes_per_sample,
    const size_t number_of_channels,
    const uint32_t sample_rate,
    const uint32_t audio_delay_milliseconds,
    const int32_t /*clock_drift*/,
    const uint32_t /*volume*/,
    const bool key_pressed,
    uint32_t& /*new_mic_volume*/,
    const int64_t
        estimated_capture_time_ns) {  // NOLINT: to avoid changing APIs
  RTC_DCHECK(audio_data);
  RTC_DCHECK_GE(number_of_channels, 1);
  RTC_DCHECK_LE(number_of_channels, 2);
  RTC_DCHECK_EQ(2 * number_of_channels, bytes_per_sample);
  RTC_DCHECK_GE(sample_rate, AudioProcessing::NativeRate::kSampleRate8kHz);
  // 100 = 1 second / data duration (10 ms).
  RTC_DCHECK_EQ(number_of_frames * 100, sample_rate);
  RTC_DCHECK_LE(bytes_per_sample * number_of_frames * number_of_channels,
                AudioFrame::kMaxDataSizeBytes);

  int send_sample_rate_hz = 0;
  size_t send_num_channels = 0;
  bool swap_stereo_channels = false;
  {
    MutexLock lock(&capture_lock_);
    send_sample_rate_hz = send_sample_rate_hz_;
    send_num_channels = send_num_channels_;
    swap_stereo_channels = swap_stereo_channels_;
  }

  std::unique_ptr<AudioFrame> audio_frame(new AudioFrame());
  InitializeCaptureFrame(sample_rate, send_sample_rate_hz, number_of_channels,
                         send_num_channels, audio_frame.get());
  voe::RemixAndResample(static_cast<const int16_t*>(audio_data),
                        number_of_frames, number_of_channels, sample_rate,
                        &capture_resampler_, audio_frame.get());
  ProcessCaptureFrame(audio_delay_milliseconds, key_pressed,
                      swap_stereo_channels, audio_processing_,
                      audio_frame.get());
  audio_frame->set_absolute_capture_timestamp_ms(estimated_capture_time_ns /
                                                 1000000);

  RTC_DCHECK_GT(audio_frame->samples_per_channel_, 0);
  if (async_audio_processing_)
    async_audio_processing_->Process(std::move(audio_frame));
  else
    SendProcessedData(std::move(audio_frame));

  return 0;
}

RecordedDataIsAvailable() eventually calls SendAudioData() on each webrtc::AudioSender in the audio_senders_ vector (via SendProcessedData()). SendAudioData() selects the appropriate encoder, encodes the data, and sends the encoded bitstream, which means a single piece of recorded data can be encoded with different encoders and sent with different send-control strategies, for example as RTP packets over UDP in a video conference or as RTMP packets over TCP in live streaming. Figure 3-3 shows how these two core APIs relate webrtc::AudioTransport and webrtc::internal::AudioSendStream. One question remains: when is audio_senders_ populated? That happens when the webrtc::AudioSendStream lifecycle function Start() is called; the process of adding a sender is roughly as follows:

//webrtc/audio/audio_transport_impl.cc
#0 webrtc::AudioTransportImpl::UpdateAudioSenders(std::vector<webrtc::AudioSender*, std::allocator<webrtc::AudioSender*> >, int, unsigned long) ()
//webrtc/audio/audio_state.cc
#1  webrtc::internal::AudioState::UpdateAudioTransportWithSendingStreams() () 
//webrtc/audio/audio_state.cc
#2 webrtc::internal::AudioState::AddSendingStream(webrtc::AudioSendStream*, int, unsigned long) ()
//webrtc/audio/audio_send_stream.cc
#3  webrtc::internal::AudioSendStream::Start() () 

webrtc::AudioSendStream adds itself to webrtc::AudioState, and webrtc::AudioState passes the newly added webrtc::AudioSendStream together with the previously added ones to webrtc::AudioTransport via UpdateAudioSenders. If the newly added stream is the first webrtc::AudioSendStream, webrtc::AudioState also initializes the recording device and starts recording.
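A simplified sketch of that registration step, condensed from audio/audio_state.cc, is shown below; member names and details vary between WebRTC versions, so treat it as an approximation. The important points are the call to UpdateAudioTransportWithSendingStreams(), which refreshes audio_senders_ in AudioTransportImpl, and the start of recording when the first sending stream appears.

// Simplified sketch of webrtc::internal::AudioState::AddSendingStream().
void AudioState::AddSendingStream(webrtc::AudioSendStream* stream,
                                  int sample_rate_hz,
                                  size_t num_channels) {
  // Remember the stream and its preferred capture format.
  sending_streams_[stream] = {sample_rate_hz, num_channels};
  // Push the full list of senders down to AudioTransportImpl::audio_senders_.
  UpdateAudioTransportWithSendingStreams();

  // If this is the first sending stream, make sure the capture device is running.
  auto* adm = config_.audio_device_module.get();
  if (!adm->Recording() && adm->InitRecording() == 0) {
    adm->StartRecording();
  }
}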

3.4 webrtc::AudioSendStream creation and initialization

The internal data processing components of webrtc::AudioSendStream are shown in Figure X.1. From the code, creating a webrtc::AudioSendStream object eventually constructs a webrtc::voe::(anonymous namespace)::ChannelSend object, which creates a number of key objects/modules and establishes the connections between them. The calling process is shown in the figure below:
Figure 3-4 webrtc::AudioSendStream initialization function call stack

In WebRTC, audio starts from the VoiceEngine, in which WebRtcVoiceMediaChannel is a very important object. It calls webrtc::Call to create a new AudioSendStream object. The parameter passed at creation time, config_, is of type webrtc::AudioSendStream::Config and contains the codec, bitrate, RTP, encryption, and webrtc::Transport configuration.
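To make that creation step concrete, here is a hedged sketch of how such a config might be filled in and handed to webrtc::Call. The field names follow call/audio_send_stream.h, but the concrete values (payload type 111, Opus 48 kHz stereo, the SSRC, and the bitrate bounds) are only examples, not WebRTC defaults.

// Illustrative sketch: building an AudioSendStream from a Config.
webrtc::AudioSendStream::Config config(/*send_transport=*/transport);
config.rtp.ssrc = 0x12345678;  // example SSRC
config.send_codec_spec = webrtc::AudioSendStream::Config::SendCodecSpec(
    /*payload_type=*/111, webrtc::SdpAudioFormat("opus", 48000, 2));
config.min_bitrate_bps = 6000;   // example bounds; adjusted later from feedback
config.max_bitrate_bps = 32000;
webrtc::AudioSendStream* send_stream = call_->CreateAudioSendStream(config);
send_stream->Start();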

The constructor of the webrtc::voe::(anonymous namespace)::ChannelSend object creates most of the send-related objects/modules; the constructor is as follows:

//webrtc/audio/channel_send.cc
ChannelSend::ChannelSend(
    Clock* clock,
    TaskQueueFactory* task_queue_factory,
    Transport* rtp_transport,
    RtcpRttStats* rtcp_rtt_stats,
    RtcEventLog* rtc_event_log,
    FrameEncryptorInterface* frame_encryptor,
    const webrtc::CryptoOptions& crypto_options,
    bool extmap_allow_mixed,
    int rtcp_report_interval_ms,
    uint32_t ssrc,
    rtc::scoped_refptr<FrameTransformerInterface> frame_transformer,
    TransportFeedbackObserver* feedback_observer,
    const FieldTrialsView& field_trials)
    : ssrc_(ssrc),
      event_log_(rtc_event_log),
      _timeStamp(0),  // This is just an offset, RTP module will add it's own
                      // random offset
      input_mute_(false),
      previous_frame_muted_(false),
      _includeAudioLevelIndication(false),
      rtcp_observer_(new VoERtcpObserver(this)),
      feedback_observer_(feedback_observer),
      // Create an RtpPacketSenderProxy object
      rtp_packet_pacer_proxy_(new RtpPacketSenderProxy()),
      retransmission_rate_limiter_(
          new RateLimiter(clock, kMaxRetransmissionWindowMs)),
      frame_encryptor_(frame_encryptor),
      crypto_options_(crypto_options),
      fixing_timestamp_stall_(
          field_trials.IsDisabled("WebRTC-Audio-FixTimestampStall")),
      encoder_queue_(task_queue_factory->CreateTaskQueue(
          "AudioEncoder",
          TaskQueueFactory::Priority::NORMAL)) {
  // Create a webrtc::AudioCodingModule object
  audio_coding_.reset(AudioCodingModule::Create(AudioCodingModule::Config()));

  RtpRtcpInterface::Configuration configuration;
  configuration.bandwidth_callback = rtcp_observer_.get();
  configuration.transport_feedback_callback = feedback_observer_;
  configuration.clock = (clock ? clock : Clock::GetRealTimeClock());
  configuration.audio = true;
  configuration.outgoing_transport = rtp_transport;

  configuration.paced_sender = rtp_packet_pacer_proxy_.get();

  configuration.event_log = event_log_;
  configuration.rtt_stats = rtcp_rtt_stats;
  configuration.retransmission_rate_limiter =
      retransmission_rate_limiter_.get();
  configuration.extmap_allow_mixed = extmap_allow_mixed;
  configuration.rtcp_report_interval_ms = rtcp_report_interval_ms;
  configuration.rtcp_packet_type_counter_observer = this;

  configuration.local_media_ssrc = ssrc;
  // Create a webrtc::ModuleRtpRtcpImpl2 object
  rtp_rtcp_ = ModuleRtpRtcpImpl2::Create(configuration);
  rtp_rtcp_->SetSendingMediaStatus(false);
  // Create the webrtc::RTPSenderAudio object
  rtp_sender_audio_ = std::make_unique<RTPSenderAudio>(configuration.clock,
                                                       rtp_rtcp_->RtpSender());

  // Ensure that RTCP is enabled by default for the created channel.
  rtp_rtcp_->SetRTCPStatus(RtcpMode::kCompound);

  int error = audio_coding_->RegisterTransportCallback(this);
  RTC_DCHECK_EQ(0, error);
  if (frame_transformer)
    InitFrameTransformerDelegate(std::move(frame_transformer));
}

The main tasks of the webrtc::voe::(anonymous namespace)::ChannelSend constructor are as follows:

  • Create an RtpPacketSenderProxy object; its EnqueuePackets method adds RTP packets to the pacing queue, from which they are sent according to the target send bitrate and the send scheduling priority;
  • Create a webrtc::AudioCodingModule object, corresponding to the connection toward AudioCodingModule labeled 7 in Figure X.1;
  • Create a webrtc::ModuleRtpRtcpImpl2 object; the outgoing_transport item of the configuration parameter passed to its Create method points to the webrtc::Transport passed in, corresponding to the connection labeled 13 in Figure X.1, and the paced_sender item points to the RtpPacketSenderProxy object created earlier, corresponding to the connection labeled 10 in Figure X.1;
  • Create the webrtc::RTPSenderAudio object rtp_sender_audio_, passing in the webrtc::RTPSender obtained from the webrtc::ModuleRtpRtcpImpl2 object, corresponding to the connections labeled 8 and 9 in Figure X.1;
  • Register this as the webrtc::AudioPacketizationCallback of the webrtc::AudioCodingModule object, corresponding to the connection labeled 7 in Figure X.1 in the direction of ChannelSendInterface.

At this stage the acm2 and rtp_rtcp modules have been connected to webrtc::voe::(anonymous namespace)::ChannelSend; the connection to the pacing module is made in ChannelSend's RegisterSenderCongestionControlObjects() function, whose call stack is shown in Figure X.4. RegisterSenderCongestionControlObjects() is implemented as follows:

//webrtc/audio/channel_send.cc
706 void ChannelSend::RegisterSenderCongestionControlObjects(
707    RtpTransportControllerSendInterface* transport,
708    RtcpBandwidthObserver* bandwidth_observer) {
709  RTC_DCHECK_RUN_ON(&worker_thread_checker_);
710  RtpPacketSender* rtp_packet_pacer = transport->packet_sender();
711  PacketRouter* packet_router = transport->packet_router();

713  RTC_DCHECK(rtp_packet_pacer);
714  RTC_DCHECK(packet_router);
715  RTC_DCHECK(!packet_router_);
716  rtcp_observer_->SetBandwidthObserver(bandwidth_observer);
717  rtp_packet_pacer_proxy_->SetPacketPacer(rtp_packet_pacer);
718  rtp_rtcp_->SetStorePacketsStatus(true, 600);
719  packet_router_ = packet_router;
720}

Lines 710 and 711 of ChannelSend::RegisterSenderCongestionControlObjects extract the webrtc::RtpPacketSender and webrtc::PacketRouter instances from the webrtc::RtpTransportControllerSendInterface object and set the webrtc::RtpPacketSender instance on the RtpPacketSenderProxy object created earlier. This establishes the actual connection marked 11 in the previous figure (from this point on, 11 is no longer a dotted line), and the obtained webrtc::PacketRouter is saved for later use.

Next, let's look at how the ACM module (acm2) in Figure 3-1 associates the encoder interface layer with a specific encoder. The audio encoder is created when the webrtc::AudioSendStream configuration interface Reconfigure() is called, and the encoder is then registered with webrtc::AudioCodingModule. The process is as follows:

//third_party/webrtc/audio/audio_send_stream.cc
void AudioSendStream::ConfigureStream(
    const webrtc::AudioSendStream::Config& new_config,
    bool first_time,
    SetParametersCallback callback) {
  // ...
  if (!ReconfigureSendCodec(new_config)) {
    RTC_LOG(LS_ERROR) << "Failed to set up send codec state.";

    webrtc::InvokeSetParametersCallback(
        callback, webrtc::RTCError(webrtc::RTCErrorType::INTERNAL_ERROR,
                                   "Failed to set up send codec state."));
  }

}

Figure 3-5 Encoder initialization process

webrtc::AudioSendStream calls MakeAudioEncoder to create an audio encoder. Besides the specific codecs (such as Opus, AAC, G.7xx), there are also the comfort-noise encoder and the redundant-frame (RED) encoder. At this point the generic encoder interface in the acm2 box has been connected to a concrete encoder implementation; which codec is actually enabled depends on the result of the SDP negotiation.

// Apply current codec settings to a single voe::Channel used for sending.
bool AudioSendStream::SetupSendCodec(const Config& new_config) {
    
    
  RTC_DCHECK(new_config.send_codec_spec);
  const auto& spec = *new_config.send_codec_spec;

  RTC_DCHECK(new_config.encoder_factory);
  // Create the concrete audio encoder of the type specified by payload_type (e.g. 113 is the mono Opus encoder)
  std::unique_ptr<AudioEncoder> encoder =
      new_config.encoder_factory->MakeAudioEncoder(
          spec.payload_type, spec.format, new_config.codec_pair_id);

  if (!encoder) {
    
    
    RTC_DLOG(LS_ERROR) << "Unable to create encoder for "
                       << rtc::ToString(spec.format);
    return false;
  }

  // If a bitrate has been specified for the codec, use it over the
  // codec's default.
  if (spec.target_bitrate_bps) {
    
    
    encoder->OnReceivedTargetAudioBitrate(*spec.target_bitrate_bps);
  }

  // Enable ANA if configured (currently only used by Opus).
  if (new_config.audio_network_adaptor_config) {
    
    
    if (encoder->EnableAudioNetworkAdaptor(
            *new_config.audio_network_adaptor_config, event_log_)) {
    
    
      RTC_LOG(LS_INFO) << "Audio network adaptor enabled on SSRC "
                       << new_config.rtp.ssrc;
    } else {
    
    
      RTC_LOG(LS_INFO) << "Failed to enable Audio network adaptor on SSRC "
                       << new_config.rtp.ssrc;
    }
  }

  // Wrap the encoder in an AudioEncoderCNG, if VAD is enabled.
  if (spec.cng_payload_type) {
    
    
    AudioEncoderCngConfig cng_config;
    cng_config.num_channels = encoder->NumChannels();
    cng_config.payload_type = *spec.cng_payload_type;
    cng_config.speech_encoder = std::move(encoder);
    cng_config.vad_mode = Vad::kVadNormal;
    encoder = CreateComfortNoiseEncoder(std::move(cng_config));

    RegisterCngPayloadType(*spec.cng_payload_type,
                           new_config.send_codec_spec->format.clockrate_hz);
  }

  // Wrap the encoder in a RED encoder, if RED is enabled.
  if (spec.red_payload_type) {
    
    
    AudioEncoderCopyRed::Config red_config;
    red_config.payload_type = *spec.red_payload_type;
    red_config.speech_encoder = std::move(encoder);
    encoder = std::make_unique<AudioEncoderCopyRed>(std::move(red_config),
                                                    field_trials_);
  }

  // Set currently known overhead (used in ANA, opus only).
  // If overhead changes later, it will be updated in UpdateOverheadForEncoder.
  {
    
    
    MutexLock lock(&overhead_per_packet_lock_);
    size_t overhead = GetPerPacketOverheadBytes();
    if (overhead > 0) {
    
    
      encoder->OnReceivedOverhead(overhead);
    }
  }

  StoreEncoderProperties(encoder->SampleRateHz(), encoder->NumChannels());
  channel_send_->SetEncoder(new_config.send_codec_spec->payload_type,
                            std::move(encoder));

  return true;
}

webrtc::PacketRouter and webrtc::ModuleRtpRtcpImpl2 are connected when the webrtc::AudioSendStream lifecycle function Start() is called; that is, webrtc::internal::AudioSendStream::Start() calls webrtc::voe::(anonymous namespace)::ChannelSend::StartSend(), which connects the two. ChannelSend::StartSend() is implemented as follows:

void ChannelSend::StartSend() {
  RTC_DCHECK_RUN_ON(&worker_thread_checker_);
  RTC_DCHECK(!sending_);
  sending_ = true;

  RTC_DCHECK(packet_router_);
  packet_router_->AddSendRtpModule(rtp_rtcp_.get(), /*remb_candidate=*/false);
  rtp_rtcp_->SetSendingMediaStatus(true);
  int ret = rtp_rtcp_->SetSendingStatus(true);
  RTC_DCHECK_EQ(0, ret);

  // It is now OK to start processing on the encoder task queue.
  encoder_queue_.PostTask([this] {
    RTC_DCHECK_RUN_ON(&encoder_queue_);
    encoder_queue_is_active_ = true;
  });
}

At this point, the analysis of the relationship established between the various modules in Figure 3-1 is completed.

3.5 CreateChannels

This is again based on WebRTC's Native-layer peerconnection_client example; audio is sent and received using the RTP protocol.

webrtc::PeerConnection::Initialize(const webrtc::PeerConnectionInterface::RTCConfiguration&, webrtc::PeerConnectionDependencies) calls the InitializeTransportController_n method, which finally calls webrtc::JsepTransportController::JsepTransportController to create the JsepTransportController. JSEP is short for JavaScript Session Establishment Protocol: WebRTC grew out of the web, where session establishment is driven from JavaScript, so WebRTC provides an implementation of this web-oriented protocol (here in C++) that web applications can drive directly from JS, which makes writing such programs much easier. With the JsepTransportController in place, the callback webrtc::JsepTransportCollection::RegisterTransport creates and registers the transport. The transport is created at this layer because WebRTC supports the DTLS (Datagram Transport Layer Security) transport protocol, beneath which ICE and other protocols are hidden. What follows are the details of how the multimedia data stream uses this transport to actually send and receive.
During SDP negotiation, the offer side calls SdpOfferAnswerHandler::ApplyLocalDescription() and the answer side calls SdpOfferAnswerHandler::ApplyRemoteDescription(). Both call SdpOfferAnswerHandler::UpdateTransceiversAndDataChannels() to create the audio channel. In SdpOfferAnswerHandler::CreateChannels(), the PeerConnection obtains the RTP transport created via JSEP according to the mid (media ID).

//webrtc/pc/sdp_offer_answer.cc
//The desc parameter contains the media types; WebRTC currently supports creating three kinds of channels: video, audio, and data.
RTCError SdpOfferAnswerHandler::CreateChannels(const SessionDescription& desc) {
    
    
  TRACE_EVENT0("webrtc", "SdpOfferAnswerHandler::CreateChannels");
  // Creating the media channels. Transports should already have been created
  // at this point.
  RTC_DCHECK_RUN_ON(signaling_thread());
  //For voice content the returned type is MEDIA_TYPE_AUDIO
  const cricket::ContentInfo* voice = cricket::GetFirstAudioContent(&desc);
  if (voice && !voice->rejected &&
      //For a transceiver whose local/remote description has not been set yet, the channel has not been created, so the CreateChannel() call below is executed
      !rtp_manager()->GetAudioTransceiver()->internal()->channel()) {
    
    
    //The last parameter of CreateChannel(), std::function<RtpTransportInternal*(absl::string_view)> transport_lookup, is supplied via a lambda expression whose return value is a reference to the transport.
    auto error =
        rtp_manager()->GetAudioTransceiver()->internal()->CreateChannel(
            voice->name, pc_->call_ptr(), pc_->configuration()->media_config,
            pc_->SrtpRequired(), pc_->GetCryptoOptions(), audio_options(),
            video_options(), video_bitrate_allocator_factory_.get(),
            [&](absl::string_view mid) {
    
    
              RTC_DCHECK_RUN_ON(network_thread());
              return transport_controller_n()->GetRtpTransport(mid);
            });
    if (!error.ok()) {
    
    
      return error;
    }
  }

  //Create the channel for the video media type
  const cricket::ContentInfo* video = cricket::GetFirstVideoContent(&desc);
  if (video && !video->rejected &&
      !rtp_manager()->GetVideoTransceiver()->internal()->channel()) {
    
    
    auto error =
        rtp_manager()->GetVideoTransceiver()->internal()->CreateChannel(
            video->name, pc_->call_ptr(), pc_->configuration()->media_config,
            pc_->SrtpRequired(), pc_->GetCryptoOptions(),

            audio_options(), video_options(),
            video_bitrate_allocator_factory_.get(), [&](absl::string_view mid) {
    
    
              RTC_DCHECK_RUN_ON(network_thread());
              return transport_controller_n()->GetRtpTransport(mid);
            });
    if (!error.ok()) {
    
    
      return error;
    }
  }
//Create the channel for the data media type
  const cricket::ContentInfo* data = cricket::GetFirstDataContent(&desc);
  if (data && !data->rejected &&
      !data_channel_controller()->data_channel_transport()) {
    
    
    if (!CreateDataChannel(data->name)) {
    
    
      return RTCError(RTCErrorType::INTERNAL_ERROR,
                      "Failed to create data channel.");
    }
  }

  return RTCError::OK();
}

The last parameter of CreateChannel() is supplied using a C++11 lambda expression that returns a reference to the transport. JsepTransportController obtains the RTP transport object according to the mid; its implementation is as follows:

//webrtc/pc/jsep_transport_controller.cc
RtpTransportInternal* JsepTransportController::GetRtpTransport(
    absl::string_view mid) const {
    
    
  RTC_DCHECK_RUN_ON(network_thread_);
  auto jsep_transport = GetJsepTransportForMid(mid);
  if (!jsep_transport) {
    
    
    return nullptr;
  }
  //Return the transport object matching the security transport protocol in use, or the unencrypted transport:
  //const std::unique_ptr<webrtc::RtpTransport> unencrypted_rtp_transport_;
  //const std::unique_ptr<webrtc::SrtpTransport> sdes_transport_;
  //const std::unique_ptr<webrtc::DtlsSrtpTransport> dtls_srtp_transport_;
  return jsep_transport->rtp_transport();
}

The implementation of RtpTransceiver::CreateChannel() is as follows; this function is used for both video and audio:

//pc/rtp_transceiver.cc
RTCError RtpTransceiver::CreateChannel(
    absl::string_view mid,
    Call* call_ptr,
    const cricket::MediaConfig& media_config,
    bool srtp_required,
    CryptoOptions crypto_options,
    const cricket::AudioOptions& audio_options,
    const cricket::VideoOptions& video_options,
    VideoBitrateAllocatorFactory* video_bitrate_allocator_factory,
    std::function<RtpTransportInternal*(absl::string_view)> transport_lookup) {
    
    
  RTC_DCHECK_RUN_ON(thread_);
  //Check whether the media_engine_ object already exists
  if (!media_engine()) {
    
    
    // TODO(hta): Must be a better way
    return RTCError(RTCErrorType::INTERNAL_ERROR,
                    "No media engine for mid=" + std::string(mid));
  }
  std::unique_ptr<cricket::ChannelInterface> new_channel;
  //Create the audio channel
  if (media_type() == cricket::MEDIA_TYPE_AUDIO) {
    
    
    // TODO(bugs.webrtc.org/11992): CreateVideoChannel internally switches to
    // the worker thread. We shouldn't be using the `call_ptr_` hack here but
    // simply be on the worker thread and use `call_` (update upstream code).
    RTC_DCHECK(call_ptr);
    RTC_DCHECK(media_engine());
    // TODO(bugs.webrtc.org/11992): Remove this workaround after updates in
    // PeerConnection and add the expectation that we're already on the right
    // thread.
    context()->worker_thread()->BlockingCall([&] {
    
    
      RTC_DCHECK_RUN_ON(context()->worker_thread());

      cricket::VoiceMediaChannel* media_channel =
          media_engine()->voice().CreateMediaChannel(
              call_ptr, media_config, audio_options, crypto_options);
      if (!media_channel) {
    
    
        return;
      }

      new_channel = std::make_unique<cricket::VoiceChannel>(
          context()->worker_thread(), context()->network_thread(),
          context()->signaling_thread(), absl::WrapUnique(media_channel), mid,
          srtp_required, crypto_options, context()->ssrc_generator());
    });
    //Create the video channel
  } else {
    
    
    RTC_DCHECK_EQ(cricket::MEDIA_TYPE_VIDEO, media_type());

    // TODO(bugs.webrtc.org/11992): CreateVideoChannel internally switches to
    // the worker thread. We shouldn't be using the `call_ptr_` hack here but
    // simply be on the worker thread and use `call_` (update upstream code).
    context()->worker_thread()->BlockingCall([&] {
    
    
      RTC_DCHECK_RUN_ON(context()->worker_thread());
      cricket::VideoMediaChannel* media_channel =
          media_engine()->video().CreateMediaChannel(
              call_ptr, media_config, video_options, crypto_options,
              video_bitrate_allocator_factory);
      if (!media_channel) {
    
    
        return;
      }

      new_channel = std::make_unique<cricket::VideoChannel>(
          context()->worker_thread(), context()->network_thread(),
          context()->signaling_thread(), absl::WrapUnique(media_channel), mid,
          srtp_required, crypto_options, context()->ssrc_generator());
    });
  }
  if (!new_channel) {
    
    
    // TODO(hta): Must be a better way
    return RTCError(RTCErrorType::INTERNAL_ERROR,
                    "Failed to create channel for mid=" + std::string(mid));
  }
  SetChannel(std::move(new_channel), transport_lookup);
  return RTCError::OK();
}

3.6 Set the transport

After the audio channel has been created as described in Section 3.5, the SetChannel function is called to complete the setup of the transport object.

//webrtc/pc/rtp_transceiver.cc
void RtpTransceiver::SetChannel(
    std::unique_ptr<cricket::ChannelInterface> channel,
    std::function<RtpTransportInternal*(const std::string&)> transport_lookup) {
    
    
  RTC_DCHECK_RUN_ON(thread_);
  RTC_DCHECK(channel);
  RTC_DCHECK(transport_lookup);
  RTC_DCHECK(!channel_);
  // Cannot set a channel on a stopped transceiver.
  if (stopped_) {
    
    
    return;
  }

  RTC_LOG_THREAD_BLOCK_COUNT();

  RTC_DCHECK_EQ(media_type(), channel->media_type());
  signaling_thread_safety_ = PendingTaskSafetyFlag::Create();

  std::unique_ptr<cricket::ChannelInterface> channel_to_delete;

  // An alternative to this, could be to require SetChannel to be called
  // on the network thread. The channel object operates for the most part
  // on the network thread, as part of its initialization being on the network
  // thread is required, so setting a channel object as part of the construction
  // (without thread hopping) might be the more efficient thing to do than
  // how SetChannel works today.
  // Similarly, if the channel() accessor is limited to the network thread, that
  // helps with keeping the channel implementation requirements being met and
  // avoids synchronization for accessing the pointer or network related state.
  context()->network_thread()->BlockingCall([&]() {
    
    
    if (channel_) {
    
    
      channel_->SetFirstPacketReceivedCallback(nullptr);
      channel_->SetRtpTransport(nullptr);
      channel_to_delete = std::move(channel_);
    }
//Save the channel object
    channel_ = std::move(channel);
//Set the transport object. This is the RTP-level transport, and it can be one of the following types:
  //   * An RtpTransport without encryption.
  //   * An SrtpTransport for SDES.
  //   * A DtlsSrtpTransport for DTLS-SRTP.
    channel_->SetRtpTransport(transport_lookup(channel_->mid()));
    //Register OnFirstPacketReceived() as the callback for the first received packet, via a lambda expression
    channel_->SetFirstPacketReceivedCallback(
        [thread = thread_, flag = signaling_thread_safety_, this]() mutable {
    
    
          thread->PostTask(
              SafeTask(std::move(flag), [this]() {
    
     OnFirstPacketReceived(); }));
        });
  });
  PushNewMediaChannelAndDeleteChannel(nullptr);

  RTC_DCHECK_BLOCK_COUNT_NO_MORE_THAN(2);
}

The core implementation of setting the transport is as follows:

bool BaseChannel::SetRtpTransport(webrtc::RtpTransportInternal* rtp_transport) {
    
    
  TRACE_EVENT0("webrtc", "BaseChannel::SetRtpTransport");
  RTC_DCHECK_RUN_ON(network_thread());
  if (rtp_transport == rtp_transport_) {
    
    
    return true;
  }

  if (rtp_transport_) {
    
    
    DisconnectFromRtpTransport_n();
    // Clear the cached header extensions on the worker.
    worker_thread_->PostTask(SafeTask(alive_, [this] {
    
    
      RTC_DCHECK_RUN_ON(worker_thread());
      rtp_header_extensions_.clear();
    }));
  }

  rtp_transport_ = rtp_transport;
  if (rtp_transport_) {
    
    
    if (!ConnectToRtpTransport_n()) {
    
    
      return false;
    }

    RTC_DCHECK(!media_channel_->HasNetworkInterface());
    //The parameter of SetInterface() is a MediaChannelNetworkInterface* iface
    media_channel_->SetInterface(this);


    media_channel_->OnReadyToSend(rtp_transport_->IsReadyToSend());
    UpdateWritableState_n();

    // Set the cached socket options.
    for (const auto& pair : socket_options_) {
    
    
      rtp_transport_->SetRtpOption(pair.first, pair.second);
    }
    if (!rtp_transport_->rtcp_mux_enabled()) {
    
    
      for (const auto& pair : rtcp_socket_options_) {
    
    
        rtp_transport_->SetRtcpOption(pair.first, pair.second);
      }
    }
  }

  return true;
}

The parameter of MediaChannel::SetInterface(MediaChannelNetworkInterface* iface) is an implementation of MediaChannel::NetworkInterface, which MediaChannel uses to send and receive data packets.

void MediaChannel::SetInterface(MediaChannelNetworkInterface* iface) {
    
    
  RTC_DCHECK_RUN_ON(network_thread_);
  iface ? network_safety_->SetAlive() : network_safety_->SetNotAlive();
  network_interface_ = iface;
  UpdateDscp();
}

bool MediaChannel::DoSendPacket(rtc::CopyOnWriteBuffer* packet,
                                bool rtcp,
                                const rtc::PacketOptions& options) {
    
    
  RTC_DCHECK_RUN_ON(network_thread_);
  if (!network_interface_)
    return false;

  return (!rtcp) ? network_interface_->SendPacket(packet, options)
                 : network_interface_->SendRtcp(packet, options);
}

BaseChannel implements the MediaChannel::NetworkInterface interface, and BaseChannel::SetRtpTransport() connects the three components MediaChannel, BaseChannel, and RtpTransportInternal.

3.7 Audio data packet sending processing

The sending of audio data packets goes through capture, encoding, packetization/pacing, and socket transmission. The functions involved in the capture stage on the Linux platform, and the module call relationships between them, are shown in the figure below.
(Figure: capture-path functions and their module call relationships)

Capture mainly involves two modules, webrtc/modules/audio_device and webrtc/audio, and it starts from the initialization of the Linux platform device. During initialization the capture and playout threads are created; the capture thread then invokes the transport-layer callback to hand the captured data up to the transport layer.

3.7.1 Audio Data Acquisition

Since data is produced continuously, the capture thread fetches data every 10 ms and hands it to the upper layer. Because different system platforms provide different APIs (Android has AAudio, Linux uses ALSA or PulseAudio, macOS and Windows use Apple's and Microsoft's APIs respectively), WebRTC wraps them behind the ADM interface classes so that the upper layers call device-related methods through the same API, shielding platform differences. Here the Linux platform is taken as the example; the other platforms are not repeated one by one. When going through the PulseAudio interface, webrtc::AudioDeviceLinuxPulse::Init() creates a capture thread running webrtc::AudioDeviceLinuxPulse::RecThreadProcess(). The capture thread reads data in webrtc::AudioDeviceLinuxPulse::ReadRecordedData(void const*, unsigned long), which calls webrtc::AudioDeviceLinuxPulse::ProcessRecordedData(signed char*, unsigned int, unsigned int); the data is then handed to the transport layer through webrtc::AudioDeviceBuffer::DeliverRecordedData(), and webrtc::AudioTransportImpl::RecordedDataIsAvailable(void const*, unsigned long, unsigned long, unsigned long, unsigned int, unsigned int, int, unsigned int, bool, unsigned int&), where APM processing is applied, sends the data on to the channel for encoding and transmission.

Device initialization and the creation of the capture thread are implemented as follows:

//webrtc/modules/audio_device/linux/audio_device_pulse_linux.cc
AudioDeviceGeneric::InitStatus AudioDeviceLinuxPulse::Init() {
    
    
  RTC_DCHECK(thread_checker_.IsCurrent());
  if (_initialized) {
    
    
    return InitStatus::OK;
  }

  // Initialize PulseAudio
  if (InitPulseAudio() < 0) {
    
    
    RTC_LOG(LS_ERROR) << "failed to initialize PulseAudio";
    if (TerminatePulseAudio() < 0) {
    
    
      RTC_LOG(LS_ERROR) << "failed to terminate PulseAudio";
    }
    return InitStatus::OTHER_ERROR;
  }

  // RECORDING
  const auto attributes =
      rtc::ThreadAttributes().SetPriority(rtc::ThreadPriority::kRealtime);
  _ptrThreadRec = rtc::PlatformThread::SpawnJoinable(
  // Start the capture thread via a C++ lambda expression
      [this] {
    
    
        while (RecThreadProcess()) {
    
    
        }
      },
      "webrtc_audio_module_rec_thread", attributes);

  // PLAYOUT
  _ptrThreadPlay = rtc::PlatformThread::SpawnJoinable(
  // Start the playout thread via a C++ lambda expression
      [this] {
    
    
        while (PlayThreadProcess()) {
    
    
        }
      },
      "webrtc_audio_module_play_thread", attributes);
  _initialized = true;

  return InitStatus::OK;
}

After the capture thread has been started through the lambda expression, RecThreadProcess() starts to do its work:

bool AudioDeviceLinuxPulse::RecThreadProcess() {
    
    
  if (!_timeEventRec.Wait(TimeDelta::Seconds(1))) {
    
    
    return true;
  }

  MutexLock lock(&mutex_);
  if (quit_) {
    
    
    return false;
  }
  if (_startRec) {
    
    
    RTC_LOG(LS_VERBOSE) << "_startRec true, performing initial actions";

    _recDeviceName = NULL;

    // Set if not default device
    if (_inputDeviceIndex > 0) {
    
    
      // Get the recording device name
      _recDeviceName = new char[kAdmMaxDeviceNameSize];
      _deviceIndex = _inputDeviceIndex;
      RecordingDevices();
    }

    PaLock();

    RTC_LOG(LS_VERBOSE) << "connecting stream";

    // Connect the stream to a source
    if (LATE(pa_stream_connect_record)(
            _recStream, _recDeviceName, &_recBufferAttr,
            (pa_stream_flags_t)_recStreamFlags) != PA_OK) {
    
    
      RTC_LOG(LS_ERROR) << "failed to connect rec stream, err="
                        << LATE(pa_context_errno)(_paContext);
    }

    RTC_LOG(LS_VERBOSE) << "connected";

    // Wait for state change
    while (LATE(pa_stream_get_state)(_recStream) != PA_STREAM_READY) {
    
    
      LATE(pa_threaded_mainloop_wait)(_paMainloop);
    }

    RTC_LOG(LS_VERBOSE) << "done";

    // We can now handle read callbacks
    EnableReadCallback();

    PaUnLock();

    // Clear device name
    if (_recDeviceName) {
    
    
      delete[] _recDeviceName;
      _recDeviceName = NULL;
    }

    _startRec = false;
    _recording = true;
    _recStartEvent.Set();

    return true;
  }

  if (_recording) {
    
    
    // Read data and provide it to VoiceEngine
    if (ReadRecordedData(_tempSampleData, _tempSampleDataSize) == -1) {
    
    
      return true;
    }

    _tempSampleData = NULL;
    _tempSampleDataSize = 0;

    PaLock();
    while (true) {
    
    
      // Ack the last thing we read
      if (LATE(pa_stream_drop)(_recStream) != 0) {
    
    
        RTC_LOG(LS_WARNING)
            << "failed to drop, err=" << LATE(pa_context_errno)(_paContext);
      }

      if (LATE(pa_stream_readable_size)(_recStream) <= 0) {
    
    
        // Then that was all the data
        break;
      }

      // Else more data.
      const void* sampleData;
      size_t sampleDataSize;

      if (LATE(pa_stream_peek)(_recStream, &sampleData, &sampleDataSize) != 0) {
    
    
        RTC_LOG(LS_ERROR) << "RECORD_ERROR, error = "
                          << LATE(pa_context_errno)(_paContext);
        break;
      }

      // Drop lock for sigslot dispatch, which could take a while.
      PaUnLock();
      // Read data and provide it to VoiceEngine
      if (ReadRecordedData(sampleData, sampleDataSize) == -1) {
    
    
        return true;
      }
      PaLock();

      // Return to top of loop for the ack and the check for more data.
    }

    EnableReadCallback();
    PaUnLock();

  }  // _recording

  return true;
}

The captured data is encoded by the corresponding channel's ProcessAndEncodeAudio function and then sent out. The function call flow is as follows:

//audio/audio_transport_impl.cc
webrtc::AudioTransportImpl::SendProcessedData(std::unique_ptr<webrtc::AudioFrame, std::default_delete<webrtc::AudioFrame> >)
//audio/audio_send_stream.cc
webrtc::internal::AudioSendStream::SendAudioData(std::unique_ptr<webrtc::AudioFrame, std::default_delete<webrtc::AudioFrame> >)
//audio/channel_send.cc
ChannelSend::ProcessAndEncodeAudio(std::unique_ptr<webrtc::AudioFrame, std::default_delete<webrtc::AudioFrame> >)
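For completeness, a simplified version of the first function in this chain, AudioTransportImpl::SendProcessedData(), is sketched below (condensed from audio/audio_transport_impl.cc; details vary between WebRTC versions). It shows the fan-out mentioned in section 3.3: every registered AudioSender receives the processed 10 ms frame, so one capture can feed several send streams.

// Simplified sketch of AudioTransportImpl::SendProcessedData().
void AudioTransportImpl::SendProcessedData(std::unique_ptr<AudioFrame> audio_frame) {
  RTC_DCHECK_GT(audio_senders_.size(), 0);
  // All senders except the first receive their own copy of the frame.
  auto it = audio_senders_.begin();
  while (++it != audio_senders_.end()) {
    auto frame_copy = std::make_unique<AudioFrame>();
    frame_copy->CopyFrom(*audio_frame);
    (*it)->SendAudioData(std::move(frame_copy));
  }
  // The first sender receives the original frame without copying.
  (*audio_senders_.begin())->SendAudioData(std::move(audio_frame));
}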

The implementation of AudioSendStream::SendAudioData is as follows:

void AudioSendStream::SendAudioData(std::unique_ptr<AudioFrame> audio_frame) {
    
    
  RTC_CHECK_RUNS_SERIALIZED(&audio_capture_race_checker_);
  RTC_DCHECK_GT(audio_frame->sample_rate_hz_, 0);
  TRACE_EVENT0("webrtc", "AudioSendStream::SendAudioData");
  double duration = static_cast<double>(audio_frame->samples_per_channel_) /
                    audio_frame->sample_rate_hz_;
  {
    
    
    // Note: SendAudioData() passes the frame further down the pipeline and it
    // may eventually get sent. But this method is invoked even if we are not
    // connected, as long as we have an AudioSendStream (created as a result of
    // an O/A exchange). This means that we are calculating audio levels whether
    // or not we are sending samples.
    // TODO(https://crbug.com/webrtc/10771): All "media-source" related stats
    // should move from send-streams to the local audio sources or tracks; a
    // send-stream should not be required to read the microphone audio levels.
    MutexLock lock(&audio_level_lock_);
    audio_level_.ComputeLevel(*audio_frame, duration);
  }
  channel_send_->ProcessAndEncodeAudio(std::move(audio_frame));
}

3.7.2 Encode and add to pacer queue

Following on from channel_send_->ProcessAndEncodeAudio(std::move(audio_frame)) in the SendAudioData function described in section 3.7.1: this call hands the frame to the WebRTC encoding module and the resulting packets are added to the pacer send queue. A simplified sketch of ProcessAndEncodeAudio() is shown below.
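ChannelSend::ProcessAndEncodeAudio() is not reproduced in full here; the sketch below is simplified rather than verbatim WebRTC code. It shows the essence of the step: the frame is posted to the encoder task queue, stamped with the running RTP timestamp, and fed to the ACM via Add10MsData(). The encoded bitstream then comes back through the AudioPacketizationCallback that ChannelSend registered in its constructor and ends up in SendRtpAudio(), shown next.

// Simplified sketch of ChannelSend::ProcessAndEncodeAudio() (audio/channel_send.cc).
void ChannelSend::ProcessAndEncodeAudio(std::unique_ptr<AudioFrame> audio_frame) {
  encoder_queue_.PostTask([this, frame = std::move(audio_frame)]() mutable {
    if (!encoder_queue_is_active_)
      return;
    // Assign the RTP timestamp for this 10 ms frame and advance the counter.
    frame->timestamp_ = _timeStamp;
    _timeStamp += static_cast<uint32_t>(frame->samples_per_channel_);
    // Hand the frame to the ACM; encoded packets come back through
    // AudioPacketizationCallback::SendData() -> ChannelSend::SendRtpAudio().
    if (audio_coding_->Add10MsData(*frame) < 0) {
      RTC_DLOG(LS_ERROR) << "ACM::Add10MsData() failed.";
    }
  });
}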
The connection between the audio module and the rtp_rtcp module passes through the following function:

int32_t ChannelSend::SendRtpAudio(AudioFrameType frameType,
                                  uint8_t payloadType,
                                  uint32_t rtp_timestamp,
                                  rtc::ArrayView<const uint8_t> payload,
                                  int64_t absolute_capture_timestamp_ms) {
    
    
  if (_includeAudioLevelIndication) {
    
    
    // Store current audio level in the RTP sender.
    // The level will be used in combination with voice-activity state
    // (frameType) to add an RTP header extension
    rtp_sender_audio_->SetAudioLevel(rms_level_.Average());
  }

  // E2EE Custom Audio Frame Encryption (This is optional).
  // Keep this buffer around for the lifetime of the send call.
  rtc::Buffer encrypted_audio_payload;
  // We don't invoke encryptor if payload is empty, which means we are to send
  // DTMF, or the encoder entered DTX.
  // TODO(minyue): see whether DTMF packets should be encrypted or not. In
  // current implementation, they are not.
  if (!payload.empty()) {
    
    
    if (frame_encryptor_ != nullptr) {
    
    
      // TODO([email protected]) - Allocate enough to always encrypt inline.
      // Allocate a buffer to hold the maximum possible encrypted payload.
      size_t max_ciphertext_size = frame_encryptor_->GetMaxCiphertextByteSize(
          cricket::MEDIA_TYPE_AUDIO, payload.size());
      encrypted_audio_payload.SetSize(max_ciphertext_size);

      // Encrypt the audio payload into the buffer.
      size_t bytes_written = 0;
      int encrypt_status = frame_encryptor_->Encrypt(
          cricket::MEDIA_TYPE_AUDIO, rtp_rtcp_->SSRC(),
          /*additional_data=*/nullptr, payload, encrypted_audio_payload,
          &bytes_written);
      if (encrypt_status != 0) {
    
    
        RTC_DLOG(LS_ERROR)
            << "Channel::SendData() failed encrypt audio payload: "
            << encrypt_status;
        return -1;
      }
      // Resize the buffer to the exact number of bytes actually used.
      encrypted_audio_payload.SetSize(bytes_written);
      // Rewrite the payloadData and size to the new encrypted payload.
      payload = encrypted_audio_payload;
    } else if (crypto_options_.sframe.require_frame_encryption) {
    
    
      RTC_DLOG(LS_ERROR) << "Channel::SendData() failed sending audio payload: "
                            "A frame encryptor is required but one is not set.";
      return -1;
    }
  }

  // Push data from ACM to RTP/RTCP-module to deliver audio frame for
  // packetization.
  if (!rtp_rtcp_->OnSendingRtpFrame(rtp_timestamp,
                                    // Leaving the time when this frame was
                                    // received from the capture device as
                                    // undefined for voice for now.
                                    -1, payloadType,
                                    /*force_sender_report=*/false)) {
    
    
    return -1;
  }

  // RTCPSender has it's own copy of the timestamp offset, added in
  // RTCPSender::BuildSR, hence we must not add the in the offset for the above
  // call.
  // TODO(nisse): Delete RTCPSender:timestamp_offset_, and see if we can confine
  // knowledge of the offset to a single place.

  // This call will trigger Transport::SendPacket() from the RTP/RTCP module.
  if (!rtp_sender_audio_->SendAudio(
          frameType, payloadType, rtp_timestamp + rtp_rtcp_->StartTimestamp(),
          payload.data(), payload.size(), absolute_capture_timestamp_ms)) {
    
    
    RTC_DLOG(LS_ERROR)
        << "ChannelSend::SendData() failed to send data to RTP/RTCP module";
    return -1;
  }

  return 0;
}

The core function for adding packets to the pacer queue is as follows:

void PacingController::EnqueuePacket(std::unique_ptr<RtpPacketToSend> packet) {
    
    
  RTC_DCHECK(pacing_rate_ > DataRate::Zero())
      << "SetPacingRate must be called before InsertPacket.";
  RTC_CHECK(packet->packet_type());

  prober_.OnIncomingPacket(DataSize::Bytes(packet->payload_size()));

  const Timestamp now = CurrentTime();
  if (packet_queue_.Empty()) {
    
    
    // If queue is empty, we need to "fast-forward" the last process time,
    // so that we don't use passed time as budget for sending the first new
    // packet.
    Timestamp target_process_time = now;
    Timestamp next_send_time = NextSendTime();
    if (next_send_time.IsFinite()) {
    
    
      // There was already a valid planned send time, such as a keep-alive.
      // Use that as last process time only if it's prior to now.
      target_process_time = std::min(now, next_send_time);
    }
    UpdateBudgetWithElapsedTime(UpdateTimeAndGetElapsed(target_process_time));
  }
  packet_queue_.Push(now, std::move(packet));
  seen_first_packet_ = true;

  // Queue length has increased, check if we need to change the pacing rate.
  MaybeUpdateMediaRateDueToLongQueue(now);
}

The core of this is to push the audio packet onto packet_queue_; the pacer's sending task then drains packet_queue_ and sends the packets.
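To make the pacing idea itself concrete, here is a tiny stand-alone model; it is not WebRTC code, just the underlying concept: packets are queued, and on each processing tick a byte budget that refills at the pacing rate decides how many of them may leave. The real PacingController layers prioritization, probing (prober_) and queue-length-based rate updates (MaybeUpdateMediaRateDueToLongQueue()) on top of this basic budget idea.

#include <cstddef>
#include <queue>

// Minimal stand-alone model of a pacer: drain the queue no faster than a
// byte budget that refills at the configured pacing rate.
struct Packet {
  size_t size_bytes;
};

class TinyPacer {
 public:
  explicit TinyPacer(double pacing_rate_bps) : pacing_rate_bps_(pacing_rate_bps) {}

  void Enqueue(Packet packet) { queue_.push(packet); }

  // Call periodically (e.g. every 5 ms); `send` is invoked for each packet allowed out.
  template <typename SendFn>
  void Process(double elapsed_seconds, SendFn send) {
    budget_bytes_ += pacing_rate_bps_ / 8.0 * elapsed_seconds;
    while (!queue_.empty() && budget_bytes_ >= queue_.front().size_bytes) {
      budget_bytes_ -= queue_.front().size_bytes;
      send(queue_.front());
      queue_.pop();
    }
  }

 private:
  double pacing_rate_bps_;
  double budget_bytes_ = 0.0;
  std::queue<Packet> queue_;
};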

3.7.3 pacedSender sends RTP packets

After the RTP packet has been added to the pacer queue in Section 3.7.2, it is sent out by the pacer's sending task. The startup of that task is triggered by TaskQueuePacedSender::EnsureStarted(), which is invoked when the stream is created (e.g. from Call::CreateAudioReceiveStream()).

//webrtc/modules/pacing/task_queue_paced_sender.cc
void TaskQueuePacedSender::EnsureStarted() {
    
    
  task_queue_.RunOrPost([this]() {
    
    
    RTC_DCHECK_RUN_ON(&task_queue_);
    is_started_ = true;
    MaybeProcessPackets(Timestamp::MinusInfinity());
  });
}

//webrtc/modules/utility/maybe_worker_thread.cc
void MaybeWorkerThread::RunOrPost(absl::AnyInvocable<void() &&> task) {
    
    
  if (owned_task_queue_) {
    
    
    owned_task_queue_->PostTask(std::move(task));
  } else {
    
    
    RTC_DCHECK_RUN_ON(&sequence_checker_);
    std::move(task)();
  }
}

After passing through cricket::MediaChannel::DoSendPacket(), the RTP packet is handed from the media layer to the channel layer and sent out through the channel layer's socket interface.

3.7.4 Send data packets through the socket interface

(Figure: call relationships from MediaChannel down to the socket)
The call relationship from MediaChannel down to the socket is shown in the figure above.

3.8 Reception and processing of audio data packets

The receiving and processing of audio data packets is divided into three main modules: receiving RTP packets from the network, inserting RTP packets into the NetEQ module, and playing after NetEQ decoding.

3.8.1 Receive audio RTP packets from the network


The arrival of an RTP packet is notified asynchronously through a signal: cricket::DtlsTransport::OnReadPacket raises the signal, and the function bound to it is webrtc::RtpTransport::OnReadPacket. The binding is established as follows:

//third_party/webrtc/pc/rtp_transport.cc
void RtpTransport::SetRtpPacketTransport(
    rtc::PacketTransportInternal* new_packet_transport) {
    
    
  if (new_packet_transport == rtp_packet_transport_) {
    
    
    return;
  }
  if (rtp_packet_transport_) {
    
    
    rtp_packet_transport_->SignalReadyToSend.disconnect(this);
    rtp_packet_transport_->SignalReadPacket.disconnect(this);
    rtp_packet_transport_->SignalNetworkRouteChanged.disconnect(this);
    rtp_packet_transport_->SignalWritableState.disconnect(this);
    rtp_packet_transport_->SignalSentPacket.disconnect(this);
    // Reset the network route of the old transport.
    SignalNetworkRouteChanged(absl::optional<rtc::NetworkRoute>());
  }
  if (new_packet_transport) {
    
    
    new_packet_transport->SignalReadyToSend.connect(
        this, &RtpTransport::OnReadyToSend);
    new_packet_transport->SignalReadPacket.connect(this,
                                                   &RtpTransport::OnReadPacket);
    new_packet_transport->SignalNetworkRouteChanged.connect(
        this, &RtpTransport::OnNetworkRouteChanged);
    new_packet_transport->SignalWritableState.connect(
        this, &RtpTransport::OnWritableState);
    new_packet_transport->SignalSentPacket.connect(this,
                                                   &RtpTransport::OnSentPacket);
    // Set the network route for the new transport.
    SignalNetworkRouteChanged(new_packet_transport->network_route());
  }

  rtp_packet_transport_ = new_packet_transport;
  // Assumes the transport is ready to send if it is writable. If we are wrong,
  // ready to send will be updated the next time we try to send.
  SetReadyToSend(false,
                 rtp_packet_transport_ && rtp_packet_transport_->writable());
}

Both the path from cricket::UDPPort::HandleIncomingPacket() to cricket::UDPPort::OnReadPacket() and the path from cricket::UDPPort::OnReadPacket() to cricket::P2PTransportChannel::OnReadPacket() are handled asynchronously and promptly through signals. Finally, at the voice-engine layer, WebRtcVoiceMediaChannel::OnPacketReceived() is implemented as follows. This function forwards the received RTP packet to the call layer through a lambda expression, and that work is executed on the worker thread via PostTask; the call to call_->Receiver()->DeliverRtpPacket() inside the lambda is the important one.

void WebRtcVoiceMediaChannel::OnPacketReceived(
    const webrtc::RtpPacketReceived& packet) {
    
    
  RTC_DCHECK_RUN_ON(&network_thread_checker_);

  // TODO(bugs.webrtc.org/11993): This code is very similar to what
  // WebRtcVideoChannel::OnPacketReceived does. For maintainability and
  // consistency it would be good to move the interaction with
  // call_->Receiver() to a common implementation and provide a callback on
  // the worker thread for the exception case (DELIVERY_UNKNOWN_SSRC) and
  // how retry is attempted.
  worker_thread_->PostTask(
      SafeTask(task_safety_.flag(), [this, packet = packet]() mutable {
    
    
        RTC_DCHECK_RUN_ON(worker_thread_);

        // TODO(bugs.webrtc.org/7135): extensions in `packet` is currently set
        // in RtpTransport and does not neccessarily include extensions specific
        // to this channel/MID. Also see comment in
        // BaseChannel::MaybeUpdateDemuxerAndRtpExtensions_w.
        // It would likely be good if extensions where merged per BUNDLE and
        // applied directly in RtpTransport::DemuxPacket;
        packet.IdentifyExtensions(recv_rtp_extension_map_);
        if (!packet.arrival_time().IsFinite()) {
    
    
          packet.set_arrival_time(webrtc::Timestamp::Micros(rtc::TimeMicros()));
        }

        call_->Receiver()->DeliverRtpPacket(
            webrtc::MediaType::AUDIO, std::move(packet),
            absl::bind_front(
                &WebRtcVoiceMediaChannel::MaybeCreateDefaultReceiveStream,
                this));
      }));
}

3.8.2 Asynchronous insertion of audio RTP packets into NetEQ

At the end of section 3.8.1, call_->Receiver()->DeliverRtpPacket() is invoked on another thread, which passes the RTP packet to the voice-engine layer. Because of non-ideal network behavior such as packet loss and jitter, real-time operation requires resilience against loss and jitter: the NetEQ module handles part of the jitter and packet loss. The codec may also have its own packet-loss concealment (for example the G.7xx and Opus encoders), in which case NetEQ's loss-concealment algorithm may not be activated, but de-jittering always relies on NetEQ. The call flow for adding RTP packets into the NetEQ buffer is roughly as follows:
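The chain below is reconstructed from the receive-path call sites rather than copied from the original figure; the intermediate frame names are taken from the WebRTC source tree but may differ between versions.

//call/call.cc
webrtc::internal::Call::DeliverRtpPacket()
//call/rtp_stream_receiver_controller.cc, routes the packet by SSRC/MID
webrtc::RtpStreamReceiverController::OnRtpPacket()
//audio/channel_receive.cc
webrtc::voe::(anonymous namespace)::ChannelReceive::OnRtpPacket()
webrtc::voe::(anonymous namespace)::ChannelReceive::OnReceivedPayloadData()
//modules/audio_coding/acm2/acm_receiver.cc
webrtc::acm2::AcmReceiver::InsertPacket()
//modules/audio_coding/neteq/neteq_impl.cc
webrtc::NetEqImpl::InsertPacket()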

3.8.3 Obtain NetEQ audio package and decode and play

Since sound is continuous, playback, like capture, is streaming. In WebRTC, audio data is played out in 10 ms chunks, so a separate playout thread is used. On the Linux platform the playout thread is created during audio-device initialization; the initialization of the PulseAudio-based audio device management is shown below, where lambda expressions are used to create the capture and playout threads.

AudioDeviceGeneric::InitStatus AudioDeviceLinuxPulse::Init() {
    
    
  RTC_DCHECK(thread_checker_.IsCurrent());
  if (_initialized) {
    
    
    return InitStatus::OK;
  }

  // Initialize PulseAudio
  if (InitPulseAudio() < 0) {
    
    
    RTC_LOG(LS_ERROR) << "failed to initialize PulseAudio";
    if (TerminatePulseAudio() < 0) {
    
    
      RTC_LOG(LS_ERROR) << "failed to terminate PulseAudio";
    }
    return InitStatus::OTHER_ERROR;
  }

  // RECORDING
  const auto attributes =
      rtc::ThreadAttributes().SetPriority(rtc::ThreadPriority::kRealtime);
  _ptrThreadRec = rtc::PlatformThread::SpawnJoinable(
      [this] {
    
    
        while (RecThreadProcess()) {
    
    
        }
      },
      "webrtc_audio_module_rec_thread", attributes);

  // PLAYOUT
  _ptrThreadPlay = rtc::PlatformThread::SpawnJoinable(
      [this] {
    
    
        while (PlayThreadProcess()) {
    
    
        }
      },
      "webrtc_audio_module_play_thread", attributes);
  _initialized = true;

  return InitStatus::OK;
}

In the final decoding step of playback, AudioDecoderOpusImpl::DecodeInternal() is called; it could equally be AudioDecoderG722Impl::DecodeInternal(), since which decoder is used depends on the result of the SDP negotiation. For the Opus encoder, see the audio encoder opus analysis column.
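The pull path that ends in that decode call is roughly the following. Like the chain in section 3.8.2, it is reconstructed from the source rather than taken from a figure, and the exact intermediate frames (mixer, receive-stream wrapper) vary between WebRTC versions.

//modules/audio_device/linux/audio_device_pulse_linux.cc, playout thread (every 10 ms)
webrtc::AudioDeviceLinuxPulse::PlayThreadProcess()
//modules/audio_device/audio_device_buffer.cc
webrtc::AudioDeviceBuffer::RequestPlayoutData()
//audio/audio_transport_impl.cc
webrtc::AudioTransportImpl::NeedMorePlayData()
//modules/audio_mixer/audio_mixer_impl.cc
webrtc::AudioMixerImpl::Mix()
//audio/channel_receive.cc
webrtc::voe::(anonymous namespace)::ChannelReceive::GetAudioFrameWithInfo()
//modules/audio_coding/acm2/acm_receiver.cc
webrtc::acm2::AcmReceiver::GetAudio()
//modules/audio_coding/neteq/neteq_impl.cc
webrtc::NetEqImpl::GetAudio()  // invokes the decoder, e.g. AudioDecoderOpusImpl::DecodeInternal()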

Origin blog.csdn.net/shichaog/article/details/128884487