Article Directory
- 3.1 Audio data stream sending process
- 3.2 Encoding and RTP packaging in sending
- 3.3 AudioSendStream class relationship
- 3.4 `webrtc::AudioSendStream` creation and initialization
- 3.5 Create `CreateChannels`
- 3.6 Set the transport
- 3.7 Audio data packet sending processing
- 3.8 Reception and processing of audio data packets
A complete audio transmission involves audio capture, audio enhancement, mixing, audio format conversion (sample rate, channel count), encoding, RTP packetization, and network transmission. WebRTC keeps this pipeline structurally clear and decoupled: each layer is abstracted behind an interface, and because capture produces data continuously, data collection runs on its own dedicated thread.
3.1 Audio data stream sending process
The modules involved in the audio sending process are shown in the figure below.
Figure 3-1 WebRTC audio sending data flow
3.2 Encoding and RTP packaging in sending
webrtc::AudioTransport is more like a two-way bridge, supporting both sending and receiving: it connects audio capture on one end and one or more stream-sending modules (webrtc::AudioSendStream) on the other. Before sending, the captured data must go through noise suppression, echo cancellation, encoding, and RTP packetization. Noise suppression and enhancement are handled by the APM module; the details of the individual APM algorithms can be found in the "Real-time Voice Processing Practice Guide". This section looks at audio encoding, RTP packetization, and send control as designed and implemented in webrtc::AudioSender / webrtc::AudioSendStream. The implementation lives in webrtc/audio/audio_send_stream.h / webrtc/audio/audio_send_stream.cc, and the related class hierarchy is shown below.
Figure 3-2 SendStream UML relationship diagram
In the interactive real-time communication scenario, encoding and sending audio differs from streaming schemes such as RTMP in live broadcasting: real-time communication uses UDP/RTP, while live streaming uses RTMP over TCP. Real-time communication prioritizes latency, but that does not mean quality requirements are low: packet loss, jitter, and reordering over UDP degrade communication quality. WebRTC therefore applies NetEQ technology at the receiving end, while the sending end must dynamically adjust and control the encoding bitrate according to detected network conditions and the RTCP packets fed back from the receiver.
webrtc::AudioSendStream is an interface class for streaming audio transmission. Its main responsibilities, reflected in the methods of the UML diagram, are:
- Set the encoder type and configure the encoder's target bitrate via bool SetupSendCodec(const Config& new_config);
- Set the stream's maximum, minimum, and default priority bitrates, as well as the dynamically updated allocated bitrate;
- Control the stream's life cycle through the Start() and Stop() methods shown in the diagram;
- Apply volume control, noise suppression, and encoding to the data captured by the ADM module;
- Receive and process returned RTCP packets, and adjust the encoding bitrate accordingly.
Encoding, RTP packetization, and pacing in the sending process are all implemented within webrtc::AudioSendStream. The data flow follows the numbered steps in the figure: PCM data captured by the ADM is processed by APM, then passed through webrtc::AudioTransport to webrtc::AudioSendStream, which internally calls the ACM module to encode it. The rtp_rtcp module's packetization interface then packs the encoded bitstream into RTP packets, the pacing module applies smoothing and priority-based send scheduling, and finally the rtp_rtcp module's send path delivers the packets through the webrtc::Transport interface.
3.3 AudioSendStream class relationship
webrtc::AudioSendStream is an interface class; its derived class webrtc::internal::AudioSendStream implements the audio data processing stream structure shown in Figure X.2. The relationship between webrtc::AudioSendStream and webrtc::AudioTransport is shown in the following UML diagram, Figure 3-3 (AudioSendStream UML diagram). In Figure 3-3, the audio_senders_ member of AudioTransportImpl is a vector of AudioSender, which confirms the point made above that one AudioTransport can correspond to multiple AudioSendStreams. After data is captured, the RecordedDataIsAvailable method of webrtc::AudioTransport is called. Its implementation is as follows:
// Not used in Chromium. Process captured audio and distribute to all sending
// streams, and try to do this at the lowest possible sample rate.
int32_t AudioTransportImpl::RecordedDataIsAvailable(
    const void* audio_data,
    const size_t number_of_frames,
    const size_t bytes_per_sample,
    const size_t number_of_channels,
    const uint32_t sample_rate,
    const uint32_t audio_delay_milliseconds,
    const int32_t /*clock_drift*/,
    const uint32_t /*volume*/,
    const bool key_pressed,
    uint32_t& /*new_mic_volume*/,
    const int64_t estimated_capture_time_ns) {  // NOLINT: to avoid changing APIs
  RTC_DCHECK(audio_data);
  RTC_DCHECK_GE(number_of_channels, 1);
  RTC_DCHECK_LE(number_of_channels, 2);
  RTC_DCHECK_EQ(2 * number_of_channels, bytes_per_sample);
  RTC_DCHECK_GE(sample_rate, AudioProcessing::NativeRate::kSampleRate8kHz);
  // 100 = 1 second / data duration (10 ms).
  RTC_DCHECK_EQ(number_of_frames * 100, sample_rate);
  RTC_DCHECK_LE(bytes_per_sample * number_of_frames * number_of_channels,
                AudioFrame::kMaxDataSizeBytes);

  int send_sample_rate_hz = 0;
  size_t send_num_channels = 0;
  bool swap_stereo_channels = false;
  {
    MutexLock lock(&capture_lock_);
    send_sample_rate_hz = send_sample_rate_hz_;
    send_num_channels = send_num_channels_;
    swap_stereo_channels = swap_stereo_channels_;
  }

  std::unique_ptr<AudioFrame> audio_frame(new AudioFrame());
  InitializeCaptureFrame(sample_rate, send_sample_rate_hz, number_of_channels,
                         send_num_channels, audio_frame.get());
  voe::RemixAndResample(static_cast<const int16_t*>(audio_data),
                        number_of_frames, number_of_channels, sample_rate,
                        &capture_resampler_, audio_frame.get());
  ProcessCaptureFrame(audio_delay_milliseconds, key_pressed,
                      swap_stereo_channels, audio_processing_,
                      audio_frame.get());
  audio_frame->set_absolute_capture_timestamp_ms(estimated_capture_time_ns /
                                                 1000000);
  RTC_DCHECK_GT(audio_frame->samples_per_channel_, 0);

  if (async_audio_processing_)
    async_audio_processing_->Process(std::move(audio_frame));
  else
    SendProcessedData(std::move(audio_frame));

  return 0;
}
RecordedDataIsAvailable() then calls the SendAudioData() method of each entry in the audio_senders_ vector one by one. SendAudioData() invokes the appropriate encoder on the data and sends the encoded bitstream, which means that a single piece of recorded data can be encoded by different encoders and delivered under different send-control strategies: for example, RTP packets over UDP in a video conference, or RTMP packets over TCP in live streaming. Figure 3-3 shows the relationship between these two classes and their two core APIs. One question remains: when is audio_senders_ set in webrtc::AudioTransport? This happens when the lifecycle function Start() of webrtc::internal::AudioSendStream is called. The call path for adding a webrtc::AudioSendStream is roughly as follows:
//webrtc/audio/audio_transport_impl.cc
#0 webrtc::AudioTransportImpl::UpdateAudioSenders(std::vector<webrtc::AudioSender*, std::allocator<webrtc::AudioSender*> >, int, unsigned long) ()
//webrtc/audio/audio_state.cc
#1 webrtc::internal::AudioState::UpdateAudioTransportWithSendingStreams() ()
//webrtc/audio/audio_state.cc
#2 webrtc::internal::AudioState::AddSendingStream(webrtc::AudioSendStream*, int, unsigned long) ()
//webrtc/audio/audio_send_stream.cc
#3 webrtc::internal::AudioSendStream::Start() ()
webrtc::AudioSendStream adds itself to webrtc::AudioState, and webrtc::AudioState passes the newly added webrtc::AudioSendStream, together with all previously added ones, to webrtc::AudioTransport via UpdateAudioSenders. If the newly added stream is the first webrtc::AudioSendStream, webrtc::AudioState also initializes the device and starts recording.
3.4 webrtc::AudioSendStream creation and initialization
The internal data processing components of webrtc::AudioSendStream are shown in Figure X.1. As the code shows, creating a webrtc::AudioSendStream object ultimately causes the webrtc::voe::(anonymous namespace)::ChannelSend object to create several key objects/modules and establish the connections between them. The calling process is shown in the figure below:
Figure 3-4 webrtc::AudioSendStream initialization function call stack
In WebRTC, audio starts from the VoiceEngine, in which WebRTCVoiceMediaChannel is a very important object. It calls webrtc::Call to create a new AudioSendStream object. The parameter passed at creation time, config_, is of type webrtc::AudioSendStream::Config and includes the codec, bitrate, RTP, encryption, webrtc::Transport, and other configuration.
The constructor of the webrtc::voe::(anonymous namespace)::ChannelSend object creates most of the send-related objects/modules. Its constructor is as follows:
ChannelSend::ChannelSend(
    Clock* clock,
    TaskQueueFactory* task_queue_factory,
    Transport* rtp_transport,
    RtcpRttStats* rtcp_rtt_stats,
    RtcEventLog* rtc_event_log,
    FrameEncryptorInterface* frame_encryptor,
    const webrtc::CryptoOptions& crypto_options,
    bool extmap_allow_mixed,
    int rtcp_report_interval_ms,
    uint32_t ssrc,
    rtc::scoped_refptr<FrameTransformerInterface> frame_transformer,
    TransportFeedbackObserver* feedback_observer,
    const FieldTrialsView& field_trials)
    : ssrc_(ssrc),
      event_log_(rtc_event_log),
      _timeStamp(0),  // This is just an offset, RTP module will add its own
                      // random offset
      input_mute_(false),
      previous_frame_muted_(false),
      _includeAudioLevelIndication(false),
      rtcp_observer_(new VoERtcpObserver(this)),
      feedback_observer_(feedback_observer),
      // Create an `RtpPacketSenderProxy` object.
      rtp_packet_pacer_proxy_(new RtpPacketSenderProxy()),
      retransmission_rate_limiter_(
          new RateLimiter(clock, kMaxRetransmissionWindowMs)),
      frame_encryptor_(frame_encryptor),
      crypto_options_(crypto_options),
      fixing_timestamp_stall_(
          field_trials.IsDisabled("WebRTC-Audio-FixTimestampStall")),
      encoder_queue_(task_queue_factory->CreateTaskQueue(
          "AudioEncoder",
          TaskQueueFactory::Priority::NORMAL)) {
  // Create a webrtc::AudioCodingModule object.
  audio_coding_.reset(AudioCodingModule::Create(AudioCodingModule::Config()));

  RtpRtcpInterface::Configuration configuration;
  configuration.bandwidth_callback = rtcp_observer_.get();
  configuration.transport_feedback_callback = feedback_observer_;
  configuration.clock = (clock ? clock : Clock::GetRealTimeClock());
  configuration.audio = true;
  configuration.outgoing_transport = rtp_transport;
  configuration.paced_sender = rtp_packet_pacer_proxy_.get();
  configuration.event_log = event_log_;
  configuration.rtt_stats = rtcp_rtt_stats;
  configuration.retransmission_rate_limiter =
      retransmission_rate_limiter_.get();
  configuration.extmap_allow_mixed = extmap_allow_mixed;
  configuration.rtcp_report_interval_ms = rtcp_report_interval_ms;
  configuration.rtcp_packet_type_counter_observer = this;
  configuration.local_media_ssrc = ssrc;

  // Create a webrtc::ModuleRtpRtcpImpl2 object.
  rtp_rtcp_ = ModuleRtpRtcpImpl2::Create(configuration);
  rtp_rtcp_->SetSendingMediaStatus(false);

  // Create a `webrtc::RTPSenderAudio` object.
  rtp_sender_audio_ = std::make_unique<RTPSenderAudio>(configuration.clock,
                                                       rtp_rtcp_->RtpSender());

  // Ensure that RTCP is enabled by default for the created channel.
  rtp_rtcp_->SetRTCPStatus(RtcpMode::kCompound);

  int error = audio_coding_->RegisterTransportCallback(this);
  RTC_DCHECK_EQ(0, error);
  if (frame_transformer)
    InitFrameTransformerDelegate(std::move(frame_transformer));
}
The main tasks of the webrtc::voe::(anonymous namespace)::ChannelSend constructor are as follows:
- Create an RtpPacketSenderProxy object; its EnqueuePackets method adds RTP packets to the pacing queue, which then sends them according to the target send bitrate and scheduling priority;
- Create a webrtc::AudioCodingModule object, corresponding to the connection toward AudioCodingModule labeled 7 in Figure X.1;
- Create a webrtc::ModuleRtpRtcpImpl2 object; the outgoing_transport item of the configuration parameter of its Create method points to the passed-in webrtc::Transport, corresponding to the connection labeled 13 in Figure X.1, and the paced_sender item of configuration points to the previously created RtpPacketSenderProxy object, corresponding to the connection labeled 10 in Figure X.1;
- Create the webrtc::RTPSenderAudio object rtp_sender_audio_, passing in the webrtc::RTPSender obtained from the webrtc::ModuleRtpRtcpImpl2 object, corresponding to the connections labeled 8 and 9 in Figure X.1;
- Register this as the webrtc::AudioPacketizationCallback of the webrtc::AudioCodingModule object, corresponding to the connection labeled 7 in Figure X.1 in the direction of ChannelSendInterface.
At this stage, the acm2 and rtp_rtcp modules have been connected to ChannelSend, while the connection to the pacing module is established in ChannelSend's RegisterSenderCongestionControlObjects() function, whose call stack is shown in Figure X.4. The function is implemented as follows:
//webrtc/audio/channel_send.cc
706 void ChannelSend::RegisterSenderCongestionControlObjects(
707     RtpTransportControllerSendInterface* transport,
708     RtcpBandwidthObserver* bandwidth_observer) {
709   RTC_DCHECK_RUN_ON(&worker_thread_checker_);
710   RtpPacketSender* rtp_packet_pacer = transport->packet_sender();
711   PacketRouter* packet_router = transport->packet_router();
713   RTC_DCHECK(rtp_packet_pacer);
714   RTC_DCHECK(packet_router);
715   RTC_DCHECK(!packet_router_);
716   rtcp_observer_->SetBandwidthObserver(bandwidth_observer);
717   rtp_packet_pacer_proxy_->SetPacketPacer(rtp_packet_pacer);
718   rtp_rtcp_->SetStorePacketsStatus(true, 600);
719   packet_router_ = packet_router;
720 }
Lines 710 and 711 of ChannelSend::RegisterSenderCongestionControlObjects extract the webrtc::RtpPacketSender and webrtc::PacketRouter instances from the webrtc::RtpTransportControllerSendInterface object. The webrtc::RtpPacketSender instance is set on the previously created RtpPacketSenderProxy object, establishing the actual connection labeled 11 in the earlier figure (from this point on, 11 is no longer a dotted line), while the obtained webrtc::PacketRouter is saved for later use.
Next, let's look at how the ACM module (acm2) in Figure 3-1 associates the encoder interface layer with a specific encoder. The audio encoder is created when the Reconfigure() configuration interface of webrtc::AudioSendStream is called, and the encoder is then registered with webrtc::AudioCodingModule. The process is as follows:
//third_party/webrtc/audio/audio_send_stream.cc
void AudioSendStream::ConfigureStream(
    const webrtc::AudioSendStream::Config& new_config,
    bool first_time,
    SetParametersCallback callback) {
  if (!ReconfigureSendCodec(new_config)) {
    RTC_LOG(LS_ERROR) << "Failed to set up send codec state.";
    webrtc::InvokeSetParametersCallback(
        callback, webrtc::RTCError(webrtc::RTCErrorType::INTERNAL_ERROR,
                                   "Failed to set up send codec state."));
  }
}
Figure 3-5 Encoder initialization process
webrtc::AudioSendStream calls MakeAudioEncoder to create an audio encoder. Besides the specific codecs (such as Opus, AAC, G7xx), there are also a comfort-noise encoder and a redundant-frame (RED) encoder. At this point the generic encoder interface inside the acm2 box is connected to a concrete encoder implementation; which codec is actually enabled depends on the result of the SDP negotiation.
// Apply current codec settings to a single voe::Channel used for sending.
bool AudioSendStream::SetupSendCodec(const Config& new_config) {
  RTC_DCHECK(new_config.send_codec_spec);
  const auto& spec = *new_config.send_codec_spec;
  RTC_DCHECK(new_config.encoder_factory);
  // Create a concrete audio encoder of the type identified by payload_type
  // (e.g. 113 is the mono Opus audio encoder).
  std::unique_ptr<AudioEncoder> encoder =
      new_config.encoder_factory->MakeAudioEncoder(
          spec.payload_type, spec.format, new_config.codec_pair_id);
  if (!encoder) {
    RTC_DLOG(LS_ERROR) << "Unable to create encoder for "
                       << rtc::ToString(spec.format);
    return false;
  }

  // If a bitrate has been specified for the codec, use it over the
  // codec's default.
  if (spec.target_bitrate_bps) {
    encoder->OnReceivedTargetAudioBitrate(*spec.target_bitrate_bps);
  }

  // Enable ANA if configured (currently only used by Opus).
  if (new_config.audio_network_adaptor_config) {
    if (encoder->EnableAudioNetworkAdaptor(
            *new_config.audio_network_adaptor_config, event_log_)) {
      RTC_LOG(LS_INFO) << "Audio network adaptor enabled on SSRC "
                       << new_config.rtp.ssrc;
    } else {
      RTC_LOG(LS_INFO) << "Failed to enable Audio network adaptor on SSRC "
                       << new_config.rtp.ssrc;
    }
  }

  // Wrap the encoder in an AudioEncoderCNG, if VAD is enabled.
  if (spec.cng_payload_type) {
    AudioEncoderCngConfig cng_config;
    cng_config.num_channels = encoder->NumChannels();
    cng_config.payload_type = *spec.cng_payload_type;
    cng_config.speech_encoder = std::move(encoder);
    cng_config.vad_mode = Vad::kVadNormal;
    encoder = CreateComfortNoiseEncoder(std::move(cng_config));

    RegisterCngPayloadType(*spec.cng_payload_type,
                           new_config.send_codec_spec->format.clockrate_hz);
  }

  // Wrap the encoder in a RED encoder, if RED is enabled.
  if (spec.red_payload_type) {
    AudioEncoderCopyRed::Config red_config;
    red_config.payload_type = *spec.red_payload_type;
    red_config.speech_encoder = std::move(encoder);
    encoder = std::make_unique<AudioEncoderCopyRed>(std::move(red_config),
                                                    field_trials_);
  }

  // Set currently known overhead (used in ANA, opus only).
  // If overhead changes later, it will be updated in UpdateOverheadForEncoder.
  {
    MutexLock lock(&overhead_per_packet_lock_);
    size_t overhead = GetPerPacketOverheadBytes();
    if (overhead > 0) {
      encoder->OnReceivedOverhead(overhead);
    }
  }

  StoreEncoderProperties(encoder->SampleRateHz(), encoder->NumChannels());
  channel_send_->SetEncoder(new_config.send_codec_spec->payload_type,
                            std::move(encoder));

  return true;
}
webrtc::PacketRouter and webrtc::ModuleRtpRtcpImpl2 are connected when the Start() lifecycle function of webrtc::AudioSendStream is called; that is, webrtc::internal::AudioSendStream::Start() calls webrtc::voe::(anonymous namespace)::ChannelSend::StartSend() to connect the two. ChannelSend::StartSend() is implemented as follows:
void ChannelSend::StartSend() {
  RTC_DCHECK_RUN_ON(&worker_thread_checker_);
  RTC_DCHECK(!sending_);
  sending_ = true;

  RTC_DCHECK(packet_router_);
  packet_router_->AddSendRtpModule(rtp_rtcp_.get(), /*remb_candidate=*/false);
  rtp_rtcp_->SetSendingMediaStatus(true);
  int ret = rtp_rtcp_->SetSendingStatus(true);
  RTC_DCHECK_EQ(0, ret);

  // It is now OK to start processing on the encoder task queue.
  encoder_queue_.PostTask([this] {
    RTC_DCHECK_RUN_ON(&encoder_queue_);
    encoder_queue_is_active_ = true;
  });
}
At this point, the analysis of the relationship established between the various modules in Figure 3-1 is completed.
3.5 Create CreateChannels
We again use the peerconnection_client example based on WebRTC's Native layer; audio is sent and received over the RTP protocol. webrtc::PeerConnection::Initialize(const webrtc::PeerConnectionInterface::RTCConfiguration&, webrtc::PeerConnectionDependencies) calls the InitializeTransportController_n method, which ultimately calls webrtc::JsepTransportController::JsepTransportController to create the JsepTransportController. JSEP stands for JavaScript Session Establishment Protocol: because WebRTC originates from the Web, where session setup is driven from JavaScript, WebRTC provides a Web-compatible implementation of this protocol, which makes Web-based development much easier; here the JSEP functionality is implemented in C++. With the JsepTransportController in place, the callback webrtc::JsepTransportCollection::RegisterTransport creates and registers the transport. The transport is created at this layer because WebRTC supports the DTLS (Datagram Transport Layer Security) transport protocol, under which ICE and the other protocols are hidden. What remains is the detail of how the multimedia data streams use this transport for actual sending and receiving.
During SDP negotiation, the offer side calls SdpOfferAnswerHandler::ApplyLocalDescription() and the answer side calls SdpOfferAnswerHandler::ApplyRemoteDescription(). Both APIs call SdpOfferAnswerHandler::UpdateTransceiversAndDataChannels(), which creates the audio channel via SdpOfferAnswerHandler::CreateChannels(). In PeerConnection, the RTP transport created by the JSEP layer is obtained according to the mid (media ID).
//webrtc/pc/sdp_offer_answer.cc
// The desc parameter carries the media types; WebRTC currently supports
// creating channels of three types: video, audio, and data.
RTCError SdpOfferAnswerHandler::CreateChannels(const SessionDescription& desc) {
  TRACE_EVENT0("webrtc", "SdpOfferAnswerHandler::CreateChannels");
  // Creating the media channels. Transports should already have been created
  // at this point.
  RTC_DCHECK_RUN_ON(signaling_thread());
  // For voice-type media, this returns the MEDIA_TYPE_AUDIO content.
  const cricket::ContentInfo* voice = cricket::GetFirstAudioContent(&desc);
  if (voice && !voice->rejected &&
      // A transceiver whose local/remote description has not been set yet has
      // no channel, so the CreateChannel() call below is executed.
      !rtp_manager()->GetAudioTransceiver()->internal()->channel()) {
    // The last parameter of CreateChannel(),
    // std::function<RtpTransportInternal*(absl::string_view)> transport_lookup,
    // is supplied as a lambda expression that returns the transport.
    auto error =
        rtp_manager()->GetAudioTransceiver()->internal()->CreateChannel(
            voice->name, pc_->call_ptr(), pc_->configuration()->media_config,
            pc_->SrtpRequired(), pc_->GetCryptoOptions(), audio_options(),
            video_options(), video_bitrate_allocator_factory_.get(),
            [&](absl::string_view mid) {
              RTC_DCHECK_RUN_ON(network_thread());
              return transport_controller_n()->GetRtpTransport(mid);
            });
    if (!error.ok()) {
      return error;
    }
  }

  // Channel creation for the video media type.
  const cricket::ContentInfo* video = cricket::GetFirstVideoContent(&desc);
  if (video && !video->rejected &&
      !rtp_manager()->GetVideoTransceiver()->internal()->channel()) {
    auto error =
        rtp_manager()->GetVideoTransceiver()->internal()->CreateChannel(
            video->name, pc_->call_ptr(), pc_->configuration()->media_config,
            pc_->SrtpRequired(), pc_->GetCryptoOptions(),
            audio_options(), video_options(),
            video_bitrate_allocator_factory_.get(), [&](absl::string_view mid) {
              RTC_DCHECK_RUN_ON(network_thread());
              return transport_controller_n()->GetRtpTransport(mid);
            });
    if (!error.ok()) {
      return error;
    }
  }

  // Channel creation for the data media type.
  const cricket::ContentInfo* data = cricket::GetFirstDataContent(&desc);
  if (data && !data->rejected &&
      !data_channel_controller()->data_channel_transport()) {
    if (!CreateDataChannel(data->name)) {
      return RTCError(RTCErrorType::INTERNAL_ERROR,
                      "Failed to create data channel.");
    }
  }
  return RTCError::OK();
}
The last parameter of CreateChannel() is supplied as a C++11 lambda expression that returns a reference to the transport. JsepTransportController obtains the RTP transport object according to the mid; its implementation is as follows:
//webrtc/pc/jsep_transport_controller.cc
RtpTransportInternal* JsepTransportController::GetRtpTransport(
    absl::string_view mid) const {
  RTC_DCHECK_RUN_ON(network_thread_);
  auto jsep_transport = GetJsepTransportForMid(mid);
  if (!jsep_transport) {
    return nullptr;
  }
  // Returns the transport object matching the chosen secure (or unencrypted)
  // transport protocol:
  //   const std::unique_ptr<webrtc::RtpTransport> unencrypted_rtp_transport_;
  //   const std::unique_ptr<webrtc::SrtpTransport> sdes_transport_;
  //   const std::unique_ptr<webrtc::DtlsSrtpTransport> dtls_srtp_transport_;
  return jsep_transport->rtp_transport();
}
RtpTransceiver::CreateChannel() is implemented as follows; this function handles both video and audio:
//pc/rtp_transceiver.cc
RTCError RtpTransceiver::CreateChannel(
    absl::string_view mid,
    Call* call_ptr,
    const cricket::MediaConfig& media_config,
    bool srtp_required,
    CryptoOptions crypto_options,
    const cricket::AudioOptions& audio_options,
    const cricket::VideoOptions& video_options,
    VideoBitrateAllocatorFactory* video_bitrate_allocator_factory,
    std::function<RtpTransportInternal*(absl::string_view)> transport_lookup) {
  RTC_DCHECK_RUN_ON(thread_);
  // Check whether the media_engine_ object already exists.
  if (!media_engine()) {
    // TODO(hta): Must be a better way
    return RTCError(RTCErrorType::INTERNAL_ERROR,
                    "No media engine for mid=" + std::string(mid));
  }
  std::unique_ptr<cricket::ChannelInterface> new_channel;
  // Audio channel creation.
  if (media_type() == cricket::MEDIA_TYPE_AUDIO) {
    // TODO(bugs.webrtc.org/11992): CreateVideoChannel internally switches to
    // the worker thread. We shouldn't be using the `call_ptr_` hack here but
    // simply be on the worker thread and use `call_` (update upstream code).
    RTC_DCHECK(call_ptr);
    RTC_DCHECK(media_engine());
    // TODO(bugs.webrtc.org/11992): Remove this workaround after updates in
    // PeerConnection and add the expectation that we're already on the right
    // thread.
    context()->worker_thread()->BlockingCall([&] {
      RTC_DCHECK_RUN_ON(context()->worker_thread());
      cricket::VoiceMediaChannel* media_channel =
          media_engine()->voice().CreateMediaChannel(
              call_ptr, media_config, audio_options, crypto_options);
      if (!media_channel) {
        return;
      }
      new_channel = std::make_unique<cricket::VoiceChannel>(
          context()->worker_thread(), context()->network_thread(),
          context()->signaling_thread(), absl::WrapUnique(media_channel), mid,
          srtp_required, crypto_options, context()->ssrc_generator());
    });
    // Video channel creation.
  } else {
    RTC_DCHECK_EQ(cricket::MEDIA_TYPE_VIDEO, media_type());
    // TODO(bugs.webrtc.org/11992): CreateVideoChannel internally switches to
    // the worker thread. We shouldn't be using the `call_ptr_` hack here but
    // simply be on the worker thread and use `call_` (update upstream code).
    context()->worker_thread()->BlockingCall([&] {
      RTC_DCHECK_RUN_ON(context()->worker_thread());
      cricket::VideoMediaChannel* media_channel =
          media_engine()->video().CreateMediaChannel(
              call_ptr, media_config, video_options, crypto_options,
              video_bitrate_allocator_factory);
      if (!media_channel) {
        return;
      }
      new_channel = std::make_unique<cricket::VideoChannel>(
          context()->worker_thread(), context()->network_thread(),
          context()->signaling_thread(), absl::WrapUnique(media_channel), mid,
          srtp_required, crypto_options, context()->ssrc_generator());
    });
  }
  if (!new_channel) {
    // TODO(hta): Must be a better way
    return RTCError(RTCErrorType::INTERNAL_ERROR,
                    "Failed to create channel for mid=" + std::string(mid));
  }
  SetChannel(std::move(new_channel), transport_lookup);
  return RTCError::OK();
}
3.6 Set the transport
After the audio channel is created as described in the previous section, the SetChannel function is called to set the transport object.
//webrtc/pc/rtp_transceiver.cc
void RtpTransceiver::SetChannel(
    std::unique_ptr<cricket::ChannelInterface> channel,
    std::function<RtpTransportInternal*(const std::string&)> transport_lookup) {
  RTC_DCHECK_RUN_ON(thread_);
  RTC_DCHECK(channel);
  RTC_DCHECK(transport_lookup);
  RTC_DCHECK(!channel_);
  // Cannot set a channel on a stopped transceiver.
  if (stopped_) {
    return;
  }
  RTC_LOG_THREAD_BLOCK_COUNT();
  RTC_DCHECK_EQ(media_type(), channel->media_type());
  signaling_thread_safety_ = PendingTaskSafetyFlag::Create();

  std::unique_ptr<cricket::ChannelInterface> channel_to_delete;

  // An alternative to this, could be to require SetChannel to be called
  // on the network thread. The channel object operates for the most part
  // on the network thread, as part of its initialization being on the network
  // thread is required, so setting a channel object as part of the construction
  // (without thread hopping) might be the more efficient thing to do than
  // how SetChannel works today.
  // Similarly, if the channel() accessor is limited to the network thread, that
  // helps with keeping the channel implementation requirements being met and
  // avoids synchronization for accessing the pointer or network related state.
  context()->network_thread()->BlockingCall([&]() {
    if (channel_) {
      channel_->SetFirstPacketReceivedCallback(nullptr);
      channel_->SetRtpTransport(nullptr);
      channel_to_delete = std::move(channel_);
    }
    // Save the channel object.
    channel_ = std::move(channel);
    // Set the transport object. This is the RTP-level transport, which can be
    // one of the following:
    //  * An RtpTransport without encryption.
    //  * An SrtpTransport for SDES.
    //  * A DtlsSrtpTransport for DTLS-SRTP.
    channel_->SetRtpTransport(transport_lookup(channel_->mid()));
    // Use a lambda expression to install OnFirstPacketReceived() as the
    // callback for the first received packet.
    channel_->SetFirstPacketReceivedCallback(
        [thread = thread_, flag = signaling_thread_safety_, this]() mutable {
          thread->PostTask(
              SafeTask(std::move(flag), [this]() { OnFirstPacketReceived(); }));
        });
  });
  PushNewMediaChannelAndDeleteChannel(nullptr);

  RTC_DCHECK_BLOCK_COUNT_NO_MORE_THAN(2);
}
The core of the transport-layer setup is implemented as follows:
bool BaseChannel::SetRtpTransport(webrtc::RtpTransportInternal* rtp_transport) {
  TRACE_EVENT0("webrtc", "BaseChannel::SetRtpTransport");
  RTC_DCHECK_RUN_ON(network_thread());
  if (rtp_transport == rtp_transport_) {
    return true;
  }

  if (rtp_transport_) {
    DisconnectFromRtpTransport_n();
    // Clear the cached header extensions on the worker.
    worker_thread_->PostTask(SafeTask(alive_, [this] {
      RTC_DCHECK_RUN_ON(worker_thread());
      rtp_header_extensions_.clear();
    }));
  }

  rtp_transport_ = rtp_transport;
  if (rtp_transport_) {
    if (!ConnectToRtpTransport_n()) {
      return false;
    }

    RTC_DCHECK(!media_channel_->HasNetworkInterface());
    // SetInterface() takes a MediaChannelNetworkInterface* iface parameter.
    media_channel_->SetInterface(this);

    media_channel_->OnReadyToSend(rtp_transport_->IsReadyToSend());
    UpdateWritableState_n();

    // Set the cached socket options.
    for (const auto& pair : socket_options_) {
      rtp_transport_->SetRtpOption(pair.first, pair.second);
    }
    if (!rtp_transport_->rtcp_mux_enabled()) {
      for (const auto& pair : rtcp_socket_options_) {
        rtp_transport_->SetRtcpOption(pair.first, pair.second);
      }
    }
  }
  return true;
}
The parameter of MediaChannel::SetInterface(MediaChannelNetworkInterface* iface) is an implementation of MediaChannel::NetworkInterface, which MediaChannel uses to send and receive data packets.
void MediaChannel::SetInterface(MediaChannelNetworkInterface* iface) {
  RTC_DCHECK_RUN_ON(network_thread_);
  iface ? network_safety_->SetAlive() : network_safety_->SetNotAlive();
  network_interface_ = iface;
  UpdateDscp();
}

bool MediaChannel::DoSendPacket(rtc::CopyOnWriteBuffer* packet,
                                bool rtcp,
                                const rtc::PacketOptions& options) {
  RTC_DCHECK_RUN_ON(network_thread_);
  if (!network_interface_)
    return false;

  return (!rtcp) ? network_interface_->SendPacket(packet, options)
                 : network_interface_->SendRtcp(packet, options);
}
BaseChannel implements the MediaChannel::NetworkInterface interface, and BaseChannel::SetRtpTransport() connects the three components: MediaChannel, BaseChannel, and RtpTransportInternal.
3.7 Audio data packet sending processing
Sending audio data packets involves capture, encoding, packetization, and socket transmission. The relationships between the functions involved in the capture process on the Linux platform, and their module calls, are shown in the figure below.
Capture mainly involves two modules, webrtc/modules/audio_device and webrtc/audio, and starts with the initialization of the Linux platform device. During initialization, the capture and playback threads are created, and the transport layer's callback function is invoked to pass the data up to the transport layer.
3.7.1 Audio Data Acquisition
Since the data is produced continuously, the capture thread fetches it at 10 ms intervals and hands it to the upper layer. Because each platform exposes different APIs (Android has AAudio, Linux has ALSA/PulseAudio, while macOS and Windows use Apple's and Microsoft's APIs respectively), WebRTC wraps them behind the ADM (Audio Device Module) interface so that the upper layers use one uniform API to call device-related methods, shielding platform differences. The Linux platform is taken as an example here; the other platforms follow the same pattern. On the PulseAudio path, webrtc::AudioDeviceLinuxPulse::Init() creates a capture thread running webrtc::AudioDeviceLinuxPulse::RecThreadProcess(). The thread reads data via webrtc::AudioDeviceLinuxPulse::ReadRecordedData(), which calls webrtc::AudioDeviceLinuxPulse::ProcessRecordedData() to run the audio through APM; the APM-processed data is then delivered by webrtc::AudioDeviceBuffer::DeliverRecordedData() to the transport layer via webrtc::AudioTransportImpl::RecordedDataIsAvailable(), which forwards it to the channel for encoding and sending.
The creation function of device initialization and acquisition thread is implemented as follows:
//webrtc/modules/audio_device/linux/audio_device_pulse_linux.cc
AudioDeviceGeneric::InitStatus AudioDeviceLinuxPulse::Init() {
RTC_DCHECK(thread_checker_.IsCurrent());
if (_initialized) {
return InitStatus::OK;
}
// Initialize PulseAudio
if (InitPulseAudio() < 0) {
RTC_LOG(LS_ERROR) << "failed to initialize PulseAudio";
if (TerminatePulseAudio() < 0) {
RTC_LOG(LS_ERROR) << "failed to terminate PulseAudio";
}
return InitStatus::OTHER_ERROR;
}
// RECORDING
const auto attributes =
rtc::ThreadAttributes().SetPriority(rtc::ThreadPriority::kRealtime);
_ptrThreadRec = rtc::PlatformThread::SpawnJoinable(
// Start the capture thread via a C++ lambda
[this] {
while (RecThreadProcess()) {
}
},
"webrtc_audio_module_rec_thread", attributes);
// PLAYOUT
_ptrThreadPlay = rtc::PlatformThread::SpawnJoinable(
// Start the playout thread via a C++ lambda
[this] {
while (PlayThreadProcess()) {
}
},
"webrtc_audio_module_play_thread", attributes);
_initialized = true;
return InitStatus::OK;
}
After the acquisition thread is started through the lambda expression, the acquisition thread RecThreadProcess() starts to work:
bool AudioDeviceLinuxPulse::RecThreadProcess() {
if (!_timeEventRec.Wait(TimeDelta::Seconds(1))) {
return true;
}
MutexLock lock(&mutex_);
if (quit_) {
return false;
}
if (_startRec) {
RTC_LOG(LS_VERBOSE) << "_startRec true, performing initial actions";
_recDeviceName = NULL;
// Set if not default device
if (_inputDeviceIndex > 0) {
// Get the recording device name
_recDeviceName = new char[kAdmMaxDeviceNameSize];
_deviceIndex = _inputDeviceIndex;
RecordingDevices();
}
PaLock();
RTC_LOG(LS_VERBOSE) << "connecting stream";
// Connect the stream to a source
if (LATE(pa_stream_connect_record)(
_recStream, _recDeviceName, &_recBufferAttr,
(pa_stream_flags_t)_recStreamFlags) != PA_OK) {
RTC_LOG(LS_ERROR) << "failed to connect rec stream, err="
<< LATE(pa_context_errno)(_paContext);
}
RTC_LOG(LS_VERBOSE) << "connected";
// Wait for state change
while (LATE(pa_stream_get_state)(_recStream) != PA_STREAM_READY) {
LATE(pa_threaded_mainloop_wait)(_paMainloop);
}
RTC_LOG(LS_VERBOSE) << "done";
// We can now handle read callbacks
EnableReadCallback();
PaUnLock();
// Clear device name
if (_recDeviceName) {
delete[] _recDeviceName;
_recDeviceName = NULL;
}
_startRec = false;
_recording = true;
_recStartEvent.Set();
return true;
}
if (_recording) {
// Read data and provide it to VoiceEngine
if (ReadRecordedData(_tempSampleData, _tempSampleDataSize) == -1) {
return true;
}
_tempSampleData = NULL;
_tempSampleDataSize = 0;
PaLock();
while (true) {
// Ack the last thing we read
if (LATE(pa_stream_drop)(_recStream) != 0) {
RTC_LOG(LS_WARNING)
<< "failed to drop, err=" << LATE(pa_context_errno)(_paContext);
}
if (LATE(pa_stream_readable_size)(_recStream) <= 0) {
// Then that was all the data
break;
}
// Else more data.
const void* sampleData;
size_t sampleDataSize;
if (LATE(pa_stream_peek)(_recStream, &sampleData, &sampleDataSize) != 0) {
RTC_LOG(LS_ERROR) << "RECORD_ERROR, error = "
<< LATE(pa_context_errno)(_paContext);
break;
}
// Drop lock for sigslot dispatch, which could take a while.
PaUnLock();
// Read data and provide it to VoiceEngine
if (ReadRecordedData(sampleData, sampleDataSize) == -1) {
return true;
}
PaLock();
// Return to top of loop for the ack and the check for more data.
}
EnableReadCallback();
PaUnLock();
} // _recording
return true;
}
The collected data is encoded by the channel's ProcessAndEncodeAudio() function and then sent out. The function call flow is as follows:
//audio/audio_transport_impl.cc
webrtc::AudioTransportImpl::SendProcessedData(std::unique_ptr<webrtc::AudioFrame, std::default_delete<webrtc::AudioFrame> >)
//audio/audio_send_stream.cc
webrtc::internal::AudioSendStream::SendAudioData(std::unique_ptr<webrtc::AudioFrame, std::default_delete<webrtc::AudioFrame> >)
//audio/channel_send.cc
ChannelSend::ProcessAndEncodeAudio(std::unique_ptr<webrtc::AudioFrame, std::default_delete<webrtc::AudioFrame> >)
The implementation of AudioSendStream::SendAudioData() is as follows:
void AudioSendStream::SendAudioData(std::unique_ptr<AudioFrame> audio_frame) {
RTC_CHECK_RUNS_SERIALIZED(&audio_capture_race_checker_);
RTC_DCHECK_GT(audio_frame->sample_rate_hz_, 0);
TRACE_EVENT0("webrtc", "AudioSendStream::SendAudioData");
double duration = static_cast<double>(audio_frame->samples_per_channel_) /
audio_frame->sample_rate_hz_;
{
// Note: SendAudioData() passes the frame further down the pipeline and it
// may eventually get sent. But this method is invoked even if we are not
// connected, as long as we have an AudioSendStream (created as a result of
// an O/A exchange). This means that we are calculating audio levels whether
// or not we are sending samples.
// TODO(https://crbug.com/webrtc/10771): All "media-source" related stats
// should move from send-streams to the local audio sources or tracks; a
// send-stream should not be required to read the microphone audio levels.
MutexLock lock(&audio_level_lock_);
audio_level_.ComputeLevel(*audio_frame, duration);
}
channel_send_->ProcessAndEncodeAudio(std::move(audio_frame));
}
3.7.2 Encode and add to pacer queue
Following the SendAudioData() function described in section 3.7.1, channel_send_->ProcessAndEncodeAudio(std::move(audio_frame)) calls into the WebRTC encoding module to encode the data and adds the result to the pacer send queue. The main modules and calling functions involved are shown in the figure below:
The audio module connects to the rtp module through the following function:
int32_t ChannelSend::SendRtpAudio(AudioFrameType frameType,
uint8_t payloadType,
uint32_t rtp_timestamp,
rtc::ArrayView<const uint8_t> payload,
int64_t absolute_capture_timestamp_ms) {
if (_includeAudioLevelIndication) {
// Store current audio level in the RTP sender.
// The level will be used in combination with voice-activity state
// (frameType) to add an RTP header extension
rtp_sender_audio_->SetAudioLevel(rms_level_.Average());
}
// E2EE Custom Audio Frame Encryption (This is optional).
// Keep this buffer around for the lifetime of the send call.
rtc::Buffer encrypted_audio_payload;
// We don't invoke encryptor if payload is empty, which means we are to send
// DTMF, or the encoder entered DTX.
// TODO(minyue): see whether DTMF packets should be encrypted or not. In
// current implementation, they are not.
if (!payload.empty()) {
if (frame_encryptor_ != nullptr) {
// TODO([email protected]) - Allocate enough to always encrypt inline.
// Allocate a buffer to hold the maximum possible encrypted payload.
size_t max_ciphertext_size = frame_encryptor_->GetMaxCiphertextByteSize(
cricket::MEDIA_TYPE_AUDIO, payload.size());
encrypted_audio_payload.SetSize(max_ciphertext_size);
// Encrypt the audio payload into the buffer.
size_t bytes_written = 0;
int encrypt_status = frame_encryptor_->Encrypt(
cricket::MEDIA_TYPE_AUDIO, rtp_rtcp_->SSRC(),
/*additional_data=*/nullptr, payload, encrypted_audio_payload,
&bytes_written);
if (encrypt_status != 0) {
RTC_DLOG(LS_ERROR)
<< "Channel::SendData() failed encrypt audio payload: "
<< encrypt_status;
return -1;
}
// Resize the buffer to the exact number of bytes actually used.
encrypted_audio_payload.SetSize(bytes_written);
// Rewrite the payloadData and size to the new encrypted payload.
payload = encrypted_audio_payload;
} else if (crypto_options_.sframe.require_frame_encryption) {
RTC_DLOG(LS_ERROR) << "Channel::SendData() failed sending audio payload: "
"A frame encryptor is required but one is not set.";
return -1;
}
}
// Push data from ACM to RTP/RTCP-module to deliver audio frame for
// packetization.
if (!rtp_rtcp_->OnSendingRtpFrame(rtp_timestamp,
// Leaving the time when this frame was
// received from the capture device as
// undefined for voice for now.
-1, payloadType,
/*force_sender_report=*/false)) {
return -1;
}
// RTCPSender has it's own copy of the timestamp offset, added in
// RTCPSender::BuildSR, hence we must not add the in the offset for the above
// call.
// TODO(nisse): Delete RTCPSender:timestamp_offset_, and see if we can confine
// knowledge of the offset to a single place.
// This call will trigger Transport::SendPacket() from the RTP/RTCP module.
if (!rtp_sender_audio_->SendAudio(
frameType, payloadType, rtp_timestamp + rtp_rtcp_->StartTimestamp(),
payload.data(), payload.size(), absolute_capture_timestamp_ms)) {
RTC_DLOG(LS_ERROR)
<< "ChannelSend::SendData() failed to send data to RTP/RTCP module";
return -1;
}
return 0;
}
The core functions added to the pacer queue are as follows:
void PacingController::EnqueuePacket(std::unique_ptr<RtpPacketToSend> packet) {
RTC_DCHECK(pacing_rate_ > DataRate::Zero())
<< "SetPacingRate must be called before InsertPacket.";
RTC_CHECK(packet->packet_type());
prober_.OnIncomingPacket(DataSize::Bytes(packet->payload_size()));
const Timestamp now = CurrentTime();
if (packet_queue_.Empty()) {
// If queue is empty, we need to "fast-forward" the last process time,
// so that we don't use passed time as budget for sending the first new
// packet.
Timestamp target_process_time = now;
Timestamp next_send_time = NextSendTime();
if (next_send_time.IsFinite()) {
// There was already a valid planned send time, such as a keep-alive.
// Use that as last process time only if it's prior to now.
target_process_time = std::min(now, next_send_time);
}
UpdateBudgetWithElapsedTime(UpdateTimeAndGetElapsed(target_process_time));
}
packet_queue_.Push(now, std::move(packet));
seen_first_packet_ = true;
// Queue length has increased, check if we need to change the pacing rate.
MaybeUpdateMediaRateDueToLongQueue(now);
}
The core is to push the audio packet onto packet_queue_; the pacer thread then sends the data packets waiting on packet_queue_.
3.7.3 pacedSender sends RTP packets
After the RTP packet is added to the pacer queue in section 3.7.2, it is sent out by the pacer queue's sending task. That task is started when the pacer is created: Call::CreateAudioReceiveStream leads to a call of TaskQueuePacedSender::EnsureStarted().
//webrtc/modules/pacing/task_queue_paced_sender.cc
void TaskQueuePacedSender::EnsureStarted() {
task_queue_.RunOrPost([this]() {
RTC_DCHECK_RUN_ON(&task_queue_);
is_started_ = true;
MaybeProcessPackets(Timestamp::MinusInfinity());
});
}
//webrtc/modules/utility/maybe_worker_thread.cc
void MaybeWorkerThread::RunOrPost(absl::AnyInvocable<void() &&> task) {
if (owned_task_queue_) {
owned_task_queue_->PostTask(std::move(task));
} else {
RTC_DCHECK_RUN_ON(&sequence_checker_);
std::move(task)();
}
}
After passing through cricket::MediaChannel::DoSendPacket(), the RTP data packet is handed from the media layer to the channel layer and sent out through the channel layer's socket interface.
3.7.4 Send data packets through the socket interface
The call relationship from MediaChannel to the socket is shown in the figure above.
3.8 Reception and processing of audio data packets
The receiving and processing of audio data packets is divided into three main modules: receiving RTP packets from the network, inserting RTP packets into the NetEQ module, and playing after NetEQ decoding.
3.8.1 Receive audio RTP packets from the network
The arrival of an RTP packet is signalled asynchronously: cricket::DtlsTransport::OnReadPacket fires, and the function bound to that signal is webrtc::RtpTransport::OnReadPacket. The binding is established in RtpTransport::SetRtpPacketTransport():
//third_party/webrtc/pc/rtp_transport.cc
void RtpTransport::SetRtpPacketTransport(
rtc::PacketTransportInternal* new_packet_transport) {
if (new_packet_transport == rtp_packet_transport_) {
return;
}
if (rtp_packet_transport_) {
rtp_packet_transport_->SignalReadyToSend.disconnect(this);
rtp_packet_transport_->SignalReadPacket.disconnect(this);
rtp_packet_transport_->SignalNetworkRouteChanged.disconnect(this);
rtp_packet_transport_->SignalWritableState.disconnect(this);
rtp_packet_transport_->SignalSentPacket.disconnect(this);
// Reset the network route of the old transport.
SignalNetworkRouteChanged(absl::optional<rtc::NetworkRoute>());
}
if (new_packet_transport) {
new_packet_transport->SignalReadyToSend.connect(
this, &RtpTransport::OnReadyToSend);
new_packet_transport->SignalReadPacket.connect(this,
&RtpTransport::OnReadPacket);
new_packet_transport->SignalNetworkRouteChanged.connect(
this, &RtpTransport::OnNetworkRouteChanged);
new_packet_transport->SignalWritableState.connect(
this, &RtpTransport::OnWritableState);
new_packet_transport->SignalSentPacket.connect(this,
&RtpTransport::OnSentPacket);
// Set the network route for the new transport.
SignalNetworkRouteChanged(new_packet_transport->network_route());
}
rtp_packet_transport_ = new_packet_transport;
// Assumes the transport is ready to send if it is writable. If we are wrong,
// ready to send will be updated the next time we try to send.
SetReadyToSend(false,
rtp_packet_transport_ && rtp_packet_transport_->writable());
}
Both the path from cricket::UDPPort::HandleIncomingPacket() to cricket::UDPPort::OnReadPacket(), and the path from cricket::UDPPort::OnReadPacket() to cricket::P2PTransportChannel::OnReadPacket(), are handled asynchronously and promptly through signals. Finally, at the voice-engine layer, WebRtcVoiceMediaChannel::OnPacketReceived() is implemented as follows. It forwards the received RTP packet to the call layer inside a lambda expression, which is posted to the worker thread via PostTask; the call to call_->Receiver()->DeliverRtpPacket() inside the lambda is the key step.
void WebRtcVoiceMediaChannel::OnPacketReceived(
const webrtc::RtpPacketReceived& packet) {
RTC_DCHECK_RUN_ON(&network_thread_checker_);
// TODO(bugs.webrtc.org/11993): This code is very similar to what
// WebRtcVideoChannel::OnPacketReceived does. For maintainability and
// consistency it would be good to move the interaction with
// call_->Receiver() to a common implementation and provide a callback on
// the worker thread for the exception case (DELIVERY_UNKNOWN_SSRC) and
// how retry is attempted.
worker_thread_->PostTask(
SafeTask(task_safety_.flag(), [this, packet = packet]() mutable {
RTC_DCHECK_RUN_ON(worker_thread_);
// TODO(bugs.webrtc.org/7135): extensions in `packet` is currently set
// in RtpTransport and does not neccessarily include extensions specific
// to this channel/MID. Also see comment in
// BaseChannel::MaybeUpdateDemuxerAndRtpExtensions_w.
// It would likely be good if extensions where merged per BUNDLE and
// applied directly in RtpTransport::DemuxPacket;
packet.IdentifyExtensions(recv_rtp_extension_map_);
if (!packet.arrival_time().IsFinite()) {
packet.set_arrival_time(webrtc::Timestamp::Micros(rtc::TimeMicros()));
}
call_->Receiver()->DeliverRtpPacket(
webrtc::MediaType::AUDIO, std::move(packet),
absl::bind_front(
&WebRtcVoiceMediaChannel::MaybeCreateDefaultReceiveStream,
this));
}));
}
3.8.2 Asynchronous insertion of audio RTP packets into NetEQ
At the end of section 3.8.1, call_->Receiver()->DeliverRtpPacket is invoked on the worker thread, which passes the RTP data packet on to the voice-engine layer. Because of non-ideal network conditions such as packet loss and jitter, the receiver must counter them while still meeting real-time requirements. The NetEQ module can handle a good part of the jitter and packet loss. The codec may additionally have its own packet-loss concealment (for example G.7xx and Opus), in which case NetEQ's loss-concealment algorithm may not be activated, but de-jittering always relies on NetEQ. The call flow for adding RTP packets to the NetEQ buffer is shown in the figure below:
3.8.3 Obtain NetEQ audio package and decode and play
Since sound is continuous, playback, like capture, is streaming: WebRTC plays audio data every 10 ms, using a dedicated playout thread. On the Linux platform, the playout thread is created during audio-device initialization. The initialization of the PulseAudio-based audio device management is shown below; lambda expressions are used to create the capture and playout threads.
AudioDeviceGeneric::InitStatus AudioDeviceLinuxPulse::Init() {
RTC_DCHECK(thread_checker_.IsCurrent());
if (_initialized) {
return InitStatus::OK;
}
// Initialize PulseAudio
if (InitPulseAudio() < 0) {
RTC_LOG(LS_ERROR) << "failed to initialize PulseAudio";
if (TerminatePulseAudio() < 0) {
RTC_LOG(LS_ERROR) << "failed to terminate PulseAudio";
}
return InitStatus::OTHER_ERROR;
}
// RECORDING
const auto attributes =
rtc::ThreadAttributes().SetPriority(rtc::ThreadPriority::kRealtime);
_ptrThreadRec = rtc::PlatformThread::SpawnJoinable(
[this] {
while (RecThreadProcess()) {
}
},
"webrtc_audio_module_rec_thread", attributes);
// PLAYOUT
_ptrThreadPlay = rtc::PlatformThread::SpawnJoinable(
[this] {
while (PlayThreadProcess()) {
}
},
"webrtc_audio_module_play_thread", attributes);
_initialized = true;
return InitStatus::OK;
}
The final decoding step of playback calls AudioDecoderOpusImpl::DecodeInternal(), or alternatively AudioDecoderG722Impl::DecodeInternal(). Which decoder is used depends on the outcome of the SDP negotiation. For the Opus encoder, see the "audio encoder opus analysis" column.