Introduction to WebRTC
WebRTC (Web Real-Time Communications) is a real-time communication technology that allows web applications and sites to establish peer-to-peer connections between browsers without an intermediary, enabling the transmission of video streams, audio streams, or other arbitrary data.
Today, WebRTC is no longer limited to browsers. Through the official SDK, we can easily implement audio and video transmission between native applications. On Android, integrating the WebRTC framework is also straightforward, and powerful, reliable audio and video transmission can be achieved with very little code.
In what follows, we will build a WebRTC demo for Android together, implementing screen sharing between two ends of a LAN, with support for sending message data between the two as well.
Import the official WebRTC aar
Google officially provides a packaged aar containing the prebuilt so libraries and the Java-layer SDK code, which can be imported directly:
implementation 'org.webrtc:google-webrtc:1.0.32006'
What if you have modified the API layer or the underlying so libraries and want to publish a compiled aar yourself? The official source tree contains the scripts build_aar.py and release_aar.py under src/tools_webrtc/android/ for generating a local aar and publishing an aar to a Maven repository.
Of course, you can also compile the so libraries yourself and import the Java-layer SDK code into your project. However, the SDK source that goes into the aar is not kept in a single location but is scattered across WebRTC's modules. You can check the relevant code dependencies through the dist_jar("libwebrtc") task in src/sdk/android/BUILD.gn in the source code.
Initialize PeerConnectionFactory
Before using PeerConnectionFactory for the first time, you must call the static method initialize() to perform global initialization and load resources. Its InitializationOptions parameter is built through an internal Builder, which lets you configure the LibraryLoader, Tracer, Logger, and so on. It is generally recommended to call this in your Application class:
PeerConnectionFactory.initialize(PeerConnectionFactory
        .InitializationOptions
        .builder(this)
        .setEnableInternalTracer(true)
        .createInitializationOptions());
Create the PeerConnectionFactory object
After global initialization is complete, we can create a PeerConnectionFactory instance. This factory class is very important: it produces the various key components needed later when creating connections and capturing/encoding/decoding audio and video, such as PeerConnection, VideoSource, and VideoTrack. It is initialized with the Builder pattern, which makes it convenient to configure the codecs:
final VideoEncoderFactory encoderFactory = new DefaultVideoEncoderFactory(mEglBase.getEglBaseContext(), true, true);
final VideoDecoderFactory decoderFactory = new DefaultVideoDecoderFactory(mEglBase.getEglBaseContext());
mPeerConnectionFactory = PeerConnectionFactory.builder()
        .setVideoEncoderFactory(encoderFactory)
        .setVideoDecoderFactory(decoderFactory)
        .createPeerConnectionFactory();
Here we can use the default DefaultVideoEncoderFactory and DefaultVideoDecoderFactory. A quick look at the internal implementation (taking the decoder as an example) shows that both software and hardware decoding are supported, with hardware preferred: if hardware decoding is unavailable, it falls back to software:
public class DefaultVideoDecoderFactory implements VideoDecoderFactory {
    ......
    @Override
    public @Nullable VideoDecoder createDecoder(VideoCodecInfo codecType) {
        VideoDecoder softwareDecoder = softwareVideoDecoderFactory.createDecoder(codecType);
        final VideoDecoder hardwareDecoder = hardwareVideoDecoderFactory.createDecoder(codecType);
        if (softwareDecoder == null && platformSoftwareVideoDecoderFactory != null) {
            softwareDecoder = platformSoftwareVideoDecoderFactory.createDecoder(codecType);
        }
        if (hardwareDecoder != null && softwareDecoder != null) {
            // Both hardware and software supported, wrap it in a software fallback
            return new VideoDecoderFallback(
                    /* fallback= */ softwareDecoder, /* primary= */ hardwareDecoder);
        }
        return hardwareDecoder != null ? hardwareDecoder : softwareDecoder;
    }
    ......
}
Create the PeerConnection object through the Factory
Once the Factory is created, we can create the PeerConnection object. As the name implies, this class represents a peer-to-peer connection, from which audio and video streams and other data can be obtained from the remote end. Before creating it, you can configure the connection in detail through RTCConfiguration, and then complete the creation with the createPeerConnection() method:
PeerConnection.RTCConfiguration rtcConfig =
new PeerConnection.RTCConfiguration(iceServers);
// TCP candidates are only useful when connecting to a server that supports
// ICE-TCP.
rtcConfig.tcpCandidatePolicy = PeerConnection.TcpCandidatePolicy.DISABLED;
rtcConfig.bundlePolicy = PeerConnection.BundlePolicy.MAXBUNDLE;
rtcConfig.rtcpMuxPolicy = PeerConnection.RtcpMuxPolicy.REQUIRE;
rtcConfig.continualGatheringPolicy = PeerConnection.ContinualGatheringPolicy.GATHER_CONTINUALLY;
// Use ECDSA encryption.
rtcConfig.keyType = PeerConnection.KeyType.ECDSA;
// Enable DTLS for normal calls and disable for loopback calls.
rtcConfig.enableDtlsSrtp = true;
rtcConfig.sdpSemantics = PeerConnection.SdpSemantics.UNIFIED_PLAN;
mPeerConnection = peerConnectionFactory.createPeerConnection(rtcConfig, this);
Create audio and video data sources
In addition, we can use the Factory to create audio and video data sources, quickly, through the createVideoSource() and createAudioSource() methods. But the source here is only an abstract representation, so where does the actual data come from?
For audio, data is captured from the recording device when the AudioSource is created. For video, WebRTC defines the abstract interface VideoCapturer and provides three implementations: ScreenCapturerAndroid, CameraCapturer, and FileVideoCapturer, which obtain the video stream from screen recording, the camera, and a file respectively. Capture begins once startCapture() is called.
// Create video source
SurfaceTextureHelper surfaceTextureHelper = SurfaceTextureHelper.create("CaptureThread", mEglBase.getEglBaseContext());
mVideoSource = mPeerConnectionFactory.createVideoSource(capturer.isScreencast());
capturer.initialize(surfaceTextureHelper, this, mVideoSource.getCapturerObserver());
capturer.startCapture(1920, 1080, 30);

// Create audio source
mAudioSource = mPeerConnectionFactory.createAudioSource(new MediaConstraints());
VideoCapturer uses the observer pattern here: when video frames are captured, they are delivered through the CapturerObserver passed in at initialization, completing the association with the VideoSource.
public interface CapturerObserver {
    void onCapturerStarted(boolean success);
    void onCapturerStopped();
    void onFrameCaptured(VideoFrame frame);
}
Finally, the sources are wrapped into tracks through createVideoTrack() and createAudioTrack(). For the VideoTrack, we can pass a SurfaceViewRenderer to addSink() to render and display the video stream locally (similar to showing the local preview in a video conference):
// Create video track
VideoTrack videoTrack = mPeerConnectionFactory.createVideoTrack(VIDEO_TRACK_ID, mVideoSource);
videoTrack.setEnabled(true);
videoTrack.addSink(mLocalSurfaceView);
// Create audio track
AudioTrack audioTrack = mPeerConnectionFactory.createAudioTrack(AUDIO_TRACK_ID, mAudioSource);
audioTrack.setEnabled(true);
SurfaceViewRenderer is an implementation of the VideoSink interface. VideoSink can be thought of as the receiver of the video stream, deciding how to process it. When SurfaceViewRenderer receives the onFrame() callback, it renders the frame internally with OpenGL.
public interface VideoSink {
    @CalledByNative
    void onFrame(VideoFrame frame);
}
Add MediaStreamTrack to PeerConnection
After the VideoTrack and AudioTrack are created, we can add them to the PeerConnection. This lets WebRTC generate an SDP containing the corresponding media information, which can then be used for the media capability negotiation later. Note that addTrack() must be called before the negotiation stage that follows, otherwise the other end will not receive the related audio and video data.
mPeerConnection.addTrack(videoTrack, mediaStreamLabels);
mPeerConnection.addTrack(audioTrack, mediaStreamLabels);
Create a signaling server
Before establishing a connection, we have to exchange SDP information through a signaling server. For simplicity, our demo transmits over the LAN. Following the official demo, we implement it directly with a Java Socket (you could also use Netty, socket.io, or another third-party framework); see TCPChannelClient for details. The code is fairly simple: if the given IP is a local address, it acts as the server, otherwise as the client, and it exposes an interface to the upper layer for sending data.
public TCPChannelClient(
        ExecutorService executor, TCPChannelEvents eventListener, String ip, int port) {
    this.executor = executor;
    executorThreadCheck = new ThreadUtils.ThreadChecker();
    executorThreadCheck.detachThread();
    this.eventListener = eventListener;

    InetAddress address;
    try {
        address = InetAddress.getByName(ip);
    } catch (UnknownHostException e) {
        reportError("Invalid IP address.");
        return;
    }

    if (address.isAnyLocalAddress()) {
        socket = new TCPSocketServer(address, port);
    } else {
        socket = new TCPSocketClient(address, port);
    }

    socket.start();
}
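The role decision above can be exercised in plain Java, independent of WebRTC and Android. The sketch below is hypothetical (the class name and the one-shot message exchange are illustrative, only the JDK is assumed): a wildcard address means "listen as server", anything else means "connect as client", and a short loopback exchange stands in for sending SDP signaling.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.UnknownHostException;

public class SignalingRoleDemo {
    // Mirrors TCPChannelClient's role decision: a wildcard/any-local address
    // means this endpoint should act as the server.
    static boolean shouldActAsServer(String ip) {
        try {
            return InetAddress.getByName(ip).isAnyLocalAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Invalid IP address.", e);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("0.0.0.0 is server: " + shouldActAsServer("0.0.0.0"));
        System.out.println("192.168.1.10 is server: " + shouldActAsServer("192.168.1.10"));

        // Minimal loopback exchange standing in for relaying an SDP message.
        try (ServerSocket server = new ServerSocket(0)) {
            Thread serverThread = new Thread(() -> {
                try (Socket s = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(s.getInputStream()))) {
                    System.out.println("server got: " + in.readLine());
                } catch (IOException ignored) {
                }
            });
            serverThread.start();
            try (Socket client = new Socket("127.0.0.1", server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                out.println("OFFER_SDP");
            }
            serverThread.join();
        }
    }
}
```

The same process can therefore host either role depending only on the IP it is given, which is why the demo needs no dedicated signaling machine on a LAN.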
Conduct media negotiation
Similar to the Miracast RTSP protocol analyzed earlier, capability negotiation is required before audio and video streams can be transmitted. In essence, information such as the audio and video codecs your device supports, the transport protocol to use, and the SSRCs is relayed to the other party through the signaling server. If both parties support it, the negotiation succeeds.
- Offer: the SDP message sent by the caller
- Answer: the SDP message sent by the callee
The whole process of negotiation between the two parties is shown in the figure below:
Here, the connecting client acts as the caller and initiates the Offer request: it creates the Offer SDP with createOffer(). When creation succeeds, the onCreateSuccess() callback of SdpObserver fires; there we call setLocalDescription() to save the Offer in the local domain, and then send the Offer to the other party.
public class PeerConnectionWrapper implements PeerConnection.Observer, SdpObserver {
    ......
    public void createOffer() {
        mIsInitiator = true;
        mPeerConnection.createOffer(this, mSdpMediaConstraints);
    }

    public void createAnswer() {
        mIsInitiator = false;
        mPeerConnection.createAnswer(this, mSdpMediaConstraints);
    }
    ......
    @Override
    public void onCreateSuccess(SessionDescription sessionDescription) {
        Log.d(TAG, "onCreateSuccess: " + sessionDescription.description);
        if (mIsInitiator) {
            mRTCClient.sendOfferSdp(sessionDescription);
        } else {
            mRTCClient.sendAnswerSdp(sessionDescription);
        }
        mPeerConnection.setLocalDescription(this, sessionDescription);
    }
    ......
}
After the called party receives the Offer, it saves it in its remote domain through setRemoteDescription() and creates an Answer SDP with createAnswer(). Once that succeeds, it likewise calls setLocalDescription() to save the Answer in its own local domain, and then replies to the caller.
Finally, the caller receives the Answer message and saves it in its remote domain through setRemoteDescription(). At this point, the whole media negotiation process is complete.
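The ordering of the four SDP operations can be summarized with a tiny simulation. The Peer class below is purely illustrative (it only records which description lands in which domain, with no real WebRTC involved), but it captures the offer/answer sequence described above:

```java
public class SdpNegotiationDemo {
    // Hypothetical minimal peer: just records its local/remote descriptions.
    static class Peer {
        String local, remote;
        String createOffer()  { return "offer-sdp"; }
        String createAnswer() { return "answer-sdp"; }
        void setLocalDescription(String sdp)  { local = sdp; }
        void setRemoteDescription(String sdp) { remote = sdp; }
    }

    public static void main(String[] args) {
        Peer caller = new Peer();
        Peer callee = new Peer();

        // 1. Caller creates an Offer, saves it locally, sends it via signaling.
        String offer = caller.createOffer();
        caller.setLocalDescription(offer);

        // 2. Callee receives the Offer, saves it in its remote domain,
        //    then creates and locally saves an Answer to send back.
        callee.setRemoteDescription(offer);
        String answer = callee.createAnswer();
        callee.setLocalDescription(answer);

        // 3. Caller receives the Answer and saves it in its remote domain.
        //    Negotiation is now complete on both sides.
        caller.setRemoteDescription(answer);

        System.out.println("caller: local=" + caller.local + " remote=" + caller.remote);
        System.out.println("callee: local=" + callee.local + " remote=" + callee.remote);
    }
}
```

Each side ends up holding its own SDP in the local domain and the peer's SDP in the remote domain, which is exactly the invariant the real setLocalDescription()/setRemoteDescription() calls maintain.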
mRTCClient = new DirectRTCClient(new AppRTCClient.SignalingCallback() {
    ......
    @Override
    public void onRemoteAnswer(SessionDescription sdp) {
        mPeerConnectionWrapper.getPeerConnection().setRemoteDescription(mPeerConnectionWrapper, sdp);
    }

    @Override
    public void onRemoteOffer(SessionDescription sdp) {
        mPeerConnectionWrapper.getPeerConnection().setRemoteDescription(mPeerConnectionWrapper, sdp);
        mPeerConnectionWrapper.createAnswer();
    }
});
Establish a peer-to-peer connection
After media negotiation, the peer-to-peer connection is not yet truly established. At this point, the PeerConnection.Observer passed to createPeerConnection() will receive onIceCandidate() callbacks carrying IceCandidate objects. We assemble each candidate into signaling and send it to the signaling server, which relays it to the other end.
public class PeerConnectionWrapper implements PeerConnection.Observer, SdpObserver {
    ......
    @Override
    public void onIceCandidate(IceCandidate candidate) {
        Log.d(TAG, "onIceCandidate:");
        mRTCClient.sendLocalIceCandidate(candidate);
    }
}
After the remote side receives it, it reconstructs the IceCandidate object and adds it to the PeerConnection through the addIceCandidate() method.
mRTCClient = new DirectRTCClient(new AppRTCClient.SignalingCallback() {
    ......
    @Override
    public void onRemoteIceCandidate(IceCandidate candidate) {
        mPeerConnectionWrapper.getPeerConnection().addIceCandidate(candidate);
    }
});
Next, once both parties have obtained each other's candidates, WebRTC starts connectivity checks, with priority host > srflx > relay. Connectivity checks between host candidates are checks within the intranet. In our scenario, both parties are on the same LAN, so they will connect through host candidates.
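That preference order can be expressed as a simple comparator. The sketch below is only an illustration of the host > srflx > relay ordering; the type-preference values are the conventional ones, but real ICE priorities follow the full formula in RFC 8445 (type preference, local preference, and component ID combined), not this simplification:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CandidatePriorityDemo {
    // Illustrative type preferences only (conventional values); the real ICE
    // priority is computed from more fields per RFC 8445.
    static int typePreference(String type) {
        switch (type) {
            case "host":  return 126; // direct local interface address
            case "srflx": return 100; // server-reflexive, discovered via STUN
            case "relay": return 0;   // relayed through a TURN server
            default: throw new IllegalArgumentException("unknown type: " + type);
        }
    }

    public static void main(String[] args) {
        List<String> candidates = new ArrayList<>(Arrays.asList("relay", "host", "srflx"));
        // Check host pairs first, relay pairs last.
        Comparator<String> byPreference =
                Comparator.comparingInt(CandidatePriorityDemo::typePreference);
        candidates.sort(byPreference.reversed());
        System.out.println("check order: " + candidates);
    }
}
```

On a single LAN the host pair succeeds immediately, so the lower-priority srflx and relay pairs never need to be promoted.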
Display the remote video stream
When the peer-to-peer connection is established, we can start receiving the audio and video stream data. The PeerConnection.Observer passed to createPeerConnection() earlier will receive the onAddStream() callback (note that this fires after the remote SDP is received and setRemoteDescription() is called, without waiting for the connection to actually be established, the same as onAddTrack()), which provides a MediaStream object containing the remote audioTracks and videoTracks. Earlier we added a single screen-recording video track, so we just take the first VideoTrack and, as before, bind it to a SurfaceViewRenderer through addSink() to render the video stream.
@Override
public void onAddStream(MediaStream mediaStream) {
    Log.d(TAG, "onAddStream audio tracks size:" + mediaStream.audioTracks.size() + " video" + mediaStream.videoTracks.size());
    if (mediaStream.videoTracks.size() >= 1) {
        // Assuming there is only one video track.
        VideoTrack remoteVideoTrack = mediaStream.videoTracks.get(0);
        remoteVideoTrack.setEnabled(true);
        remoteVideoTrack.addSink(mRemoteVideoSink);
    }
}
About audio recording and playback
In WebRTC, audio recording and playback are generally implemented by JavaAudioDeviceModule, which uses AudioRecord for recording and AudioTrack for playback underneath. Build an instance with its Builder, and set it through setAudioDeviceModule() when creating the PeerConnectionFactory.
private AudioDeviceModule createJavaAudioDevice() {
    ......
    return JavaAudioDeviceModule.builder(getApplicationContext())
            .setUseHardwareAcousticEchoCanceler(false)
            .setUseHardwareNoiseSuppressor(false)
            .setAudioRecordErrorCallback(audioRecordErrorCallback)
            .setAudioTrackErrorCallback(audioTrackErrorCallback)
            .setAudioRecordStateCallback(audioRecordStateCallback)
            .setAudioTrackStateCallback(audioTrackStateCallback)
            .createAudioDeviceModule();
}

private void initPeerConnection() {
    final AudioDeviceModule adm = createJavaAudioDevice();
    mPeerConnectionFactory = PeerConnectionFactory.builder()
            .setAudioDeviceModule(adm)
            ......
            .createPeerConnectionFactory();
    ......
    adm.release();
}
Send messages using DataChannel
WebRTC's data channel, DataChannel, is dedicated to transmitting arbitrary data other than audio and video streams, so its applications are very broad: real-time text chat, file transfer, and so on. There are two ways to create a DataChannel: the default in-band negotiation, and out-of-band negotiation, selected through the negotiated field at initialization.
In-band negotiation
One end calls createDataChannel() to create the DataChannel object, with negotiated set to false (the default):
DataChannel.Init init = new DataChannel.Init();
init.ordered = true;
init.negotiated = false;
mDataChannel = mPeerConnection.createDataChannel("dataChannel", init);
mDataChannel.registerObserver(new DataChannel.Observer() {
    ...
    @Override
    public void onMessage(DataChannel.Buffer buffer) {
        // Receive message from remote
        ......
    }
});
When media negotiation completes and the connection is established, the other end obtains the corresponding data channel through the onDataChannel() callback of PeerConnection.Observer, and can reply through its DataChannel parameter:
@Override
public void onDataChannel(final DataChannel dataChannel) {
    // Triggered when a remote peer opens a DataChannel
    dataChannel.registerObserver(new DataChannel.Observer() {
        ...
        @Override
        public void onMessage(DataChannel.Buffer buffer) {
            // Reply to the remote end
            sendDataChannelMessage("Reply message from: " + dataChannel, dataChannel);
            ......
        }
    });
}
The two parties can then send data to each other through the DataChannel.send() method:
public void sendDataChannelMessage(String message, DataChannel dataChannel) {
    byte[] msg = message.getBytes();
    DataChannel.Buffer buffer = new DataChannel.Buffer(
            ByteBuffer.wrap(msg), false);
    dataChannel.send(buffer);
}
In the onMessage() callback of the DataChannel.Observer on the other end, we can retrieve the data sent by the remote peer:
ByteBuffer data = buffer.data;
final byte[] bytes = new byte[data.capacity()];
data.get(bytes);
String strData = new String(bytes, Charset.forName("UTF-8"));
Log.d(TAG, "Got msg: " + strData + " over " + mDataChannel.label() + " id:" + mDataChannel.id());
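The encode/decode pair can be verified in plain Java without WebRTC. The sketch below (class and method names are hypothetical) wraps a UTF-8 string the same way sendDataChannelMessage() does before constructing DataChannel.Buffer, then reads it back as in onMessage(); note it uses remaining() rather than capacity(), which also stays correct if the buffer's position is non-zero:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class DataChannelCodecDemo {
    // Encode a String the way sendDataChannelMessage() does before wrapping
    // the bytes in a DataChannel.Buffer.
    static ByteBuffer encode(String message) {
        return ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
    }

    // Decode as in the onMessage() callback: copy the readable bytes out of
    // the buffer and interpret them as UTF-8 text.
    static String decode(ByteBuffer data) {
        byte[] bytes = new byte[data.remaining()];
        data.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer buffer = encode("hello over DataChannel");
        System.out.println("decoded: " + decode(buffer));
    }
}
```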
Out-of-band negotiation
Both ends call createDataChannel() to create the DataChannel object with negotiated set to true, and bind the two channels by ID. The advantage of this approach is that neither side needs to worry about timing when sending data, and the code is more concise. Note that the bound IDs must match:
DataChannel.Init init = new DataChannel.Init();
init.ordered = true;     // Whether messages are delivered in order
init.negotiated = true;  // Out-of-band negotiation
init.id = 0;             // Channel ID
// init.maxPacketLifeTime // Maximum time to attempt retransmissions
// init.maxRetransmits    // Maximum number of retransmissions
mDataChannel = mPeerConnection.createDataChannel("dataChannel", init);
mDataChannel.registerObserver(new DataChannel.Observer() {
    ...
    @Override
    public void onMessage(DataChannel.Buffer buffer) {
        // Receive message from remote
        ......
    }
});
Reprinted from: https://codezjx.com/posts/webrtc-android-demo/#more