(Android-RTC-1) Android WebRTC: First Experience

As an introduction, the project repository can be found at https://github.com/MrZhaozhirong/AppWebRTC . The build environment is Gradle 4.0.x + AndroidX. The entire WebRTC demo was re-created by hand from the official example; the official source code is here.

Preface

When I officially started on Android-WebRTC, all I could find online were reposts of the WebRTC Codelab tutorial, whose learning demos are just code snippets; otherwise, experienced hands simply install an Nginx + coturn + webrtc.js API stack. None of this felt comprehensive to me, and I had no clear picture of the overall structure, so I decided to dig in step by step, from shallow to deep. Even now I dare not claim to fully understand everything in WebRTC, but at least I have a clear cognitive map. WebRTC is not just a set of APIs, nor should it be understood as an SDK; it is a set of protocols (or a solution?) for RTC (Real-Time Communications) built on the browser web. And since it can run in the browser, there must be corresponding implementations on other platforms.

So what is the difference between RTC and WebRTC? The two cannot simply be equated. In terms of the functional pipeline, RTC covers capture, encoding, pre/post-processing, transmission, decoding, buffering, rendering and many other links, and each link has its own more specialized technical modules. For example, pre/post-processing includes beautification, filters, echo cancellation and noise suppression; capture includes microphone arrays; codecs include VP8, VP9, H.264, H.265, and so on.

WebRTC is only part of RTC: it is Google's standard and open-source project specifically for real-time communication in web pages, and it only provides the basic client-side functionality such as encoding, decoding and jitter buffering. Developers who want to build commercial products on top of WebRTC still need to implement and deploy their own server side, choose and implement their own signaling, adapt to different phones, and handle a series of other concrete tasks; on top of that, a lot of polishing is needed for usability and quality, which demands strong in-house engineering capability. A professional RTC service not only covers all the links above, but also needs a dedicated communication network to compensate for the instability of the public Internet, plus audio/video processing algorithms that tolerate the Internet channel well. The usual cloud-service concerns of high availability, quality-of-service guarantees and monitoring/maintenance tooling are merely the baseline for a professional provider. So WebRTC is just a combination of a few small pieces of the RTC technology stack, not a full-stack solution.

After clarifying these relationships, the theoretical foundation is still indispensable. I strongly recommend that beginners first grasp the theory of WebRTC (the two long theory articles I posted earlier are very good, go read them) and only then read the code. Theory, then practice, then theory again, then optimization: that is the only shortcut to becoming a technical expert.

How to get started?

According to the technical direction of RTC, it can be simply classified into the following major areas:

  • End-to-end link (signaling, STUN hole punching, TURN relaying);
  • Video capture / encoding / filter processing;
  • Audio capture / encoding / echo cancellation and noise suppression;
  • Real-time transmission / QoE quality assurance;

This article starts from the first area (the end-to-end link), analyzes the composition of the Android WebRTC demo, and extends briefly into interpreting the official API suite.

The Android WebRTC demo project has two dependencies: one is the official package, and the other is libs/autobanh.jar, which is mainly responsible for WebSocket communication.

dependencies {
    implementation fileTree(dir: "libs", include: ["*.jar"]) // libs/autobanh.jar
    implementation 'androidx.appcompat:appcompat:1.1.0'
    implementation 'org.webrtc:google-webrtc:1.0.32006'
}

By inheritance relationship, the files and their main responsibilities can be grouped as follows (classes inside the org.webrtc:google-webrtc package are not shown yet).

The core of the demo is essentially two pieces, AppRTCClient and PeerConnectionClient, with CallActivity acting as the carrier for their logical interaction. Everything else is additional parameter configuration around these parts.

Next, let's walk through the logic of a 1v1 video-call test and work out how to understand signaling.

What is signaling?

Let's first look at how to test with the demo:

1. Open https://appr.tc in any browser, generate a room id, and enter the room.
2. Start the Android app, enter the same room id and press Call; the video call should start.

After steps 1–2, ConnectActivity jumps to CallActivity.onCreate, which contains the following code snippet:

peerConnectionParameters =
	new PeerConnectionClient.PeerConnectionParameters(
			intent.getBooleanExtra(EXTRA_VIDEO_CALL, true),
			loopback, tracing, videoWidth, videoHeight,
			intent.getIntExtra(EXTRA_VIDEO_FPS, 0),
			intent.getIntExtra(EXTRA_VIDEO_BITRATE, 0),
			intent.getStringExtra(EXTRA_VIDEOCODEC),
			intent.getBooleanExtra(EXTRA_HWCODEC_ENABLED, true),
			intent.getBooleanExtra(EXTRA_FLEXFEC_ENABLED, false),
			intent.getIntExtra(EXTRA_AUDIO_BITRATE, 0),
			intent.getStringExtra(EXTRA_AUDIOCODEC),
			intent.getBooleanExtra(EXTRA_NOAUDIOPROCESSING_ENABLED, false),
			intent.getBooleanExtra(EXTRA_AECDUMP_ENABLED, false),
			intent.getBooleanExtra(EXTRA_SAVE_INPUT_AUDIO_TO_FILE_ENABLED, false),
			intent.getBooleanExtra(EXTRA_OPENSLES_ENABLED, false),
			intent.getBooleanExtra(EXTRA_DISABLE_BUILT_IN_AEC, false),
			intent.getBooleanExtra(EXTRA_DISABLE_BUILT_IN_AGC, false),
			intent.getBooleanExtra(EXTRA_DISABLE_BUILT_IN_NS, false),
			intent.getBooleanExtra(EXTRA_DISABLE_WEBRTC_AGC_AND_HPF, false),
			intent.getBooleanExtra(EXTRA_ENABLE_RTCEVENTLOG, false),
			dataChannelParameters);

Uri roomUri = intent.getData(); // Defaults to https://appr.tc
String roomId = intent.getStringExtra(EXTRA_ROOMID);   // Room number
String urlParameters = intent.getStringExtra(EXTRA_URLPARAMETERS);
roomConnectionParameters = new AppRTCClient.RoomConnectionParameters(
                                roomUri.toString(), roomId, loopback, urlParameters);

From the names and parameter lists we can roughly tell that PeerConnectionParameters holds the configuration related to the audio/video modules, while RoomConnectionParameters is simpler still: just the room URL and room id. The next step is to instantiate the AppRTCClient and the PeerConnectionClient.

// Create connection client. Use DirectRTCClient if room name is an IP 
// otherwise use the standard WebSocketRTCClient.
if (loopback || !DirectRTCClient.IP_PATTERN.matcher(roomId).matches()) {
    appRtcClient = new WebSocketRTCClient(this); // AppRTCClient.SignalingEvents Callback
} else {
    Log.i(TAG, "Using DirectRTCClient because room name looks like an IP.");
    appRtcClient = new DirectRTCClient(this);
}

// Create peer connection client.
peerConnectionClient = new PeerConnectionClient(getApplicationContext(), eglBase, peerConnectionParameters, CallActivity.this);

PeerConnectionFactory.Options options = new PeerConnectionFactory.Options();
if (loopback) {
    options.networkIgnoreMask = 0;
}
peerConnectionClient.createPeerConnectionFactory(options);
if (screencaptureEnabled) {
    startScreenCapture();
} else {
    startCall();
}

1. The comment describes it clearly: when the room name in the test is an IP address, DirectRTCClient is used; in all other cases (including loopback) WebSocketRTCClient is used.

2. PeerConnectionClient is a class that wraps PeerConnection. Anyone with the theoretical basics of WebRTC knows that WebRTC exposes the following three sets of APIs:

  • MediaStream (obtained via getUserMedia)
  • RTCPeerConnection
  • RTCDataChannel

PeerConnection is the core of WebRTC's network connectivity. The Android API uses the factory pattern to configure and create a PeerConnection, which is analyzed below.

3. The third detail worth mentioning is that WebRTC also uses an EGL environment, which indicates that the underlying video capture and rendering are done with OpenGL ES. This is also left for later analysis.

4. Finally, the Android version of WebRTC also supports screen capture on Android L and above (ouch, not bad). Next, let's analyze startCall:

    private void startCall() {
        if (appRtcClient == null) {
            Log.e(TAG, "AppRTC client is not allocated for a call.");
            return;
        }
        callStartedTimeMs = System.currentTimeMillis();
        // Start room connection.
        logAndToast(getString(R.string.connecting_to, roomConnectionParameters.roomUrl));
        appRtcClient.connectToRoom(roomConnectionParameters); // <- This line is the key point.

        // Create an audio manager that will take care of audio routing,
        // audio modes, audio device enumeration etc.
        audioManager = AppRTCAudioManager.create(getApplicationContext());
        // Store existing audio settings and change audio mode to
        // MODE_IN_COMMUNICATION for best possible VoIP performance.
        // This method will be called each time the number of available audio devices has changed.
        audioManager.start((device, availableDevices) -> {
            //onAudioManagerDevicesChanged(device, availableDevices);
            Log.d(TAG, "onAudioManagerDevicesChanged: " + availableDevices + ", " + "selected: " + device);
            // TODO: add callback handler.
        });
    }

It looks quite simple. AppRTCAudioManager creates and manages the audio devices and is responsible for audio routing, audio modes, audio device enumeration and so on. It covers a lot: the proximity sensor, Bluetooth headsets, traditional wired headsets, etc. But it is not the focus of this article; interested readers can dig into it on their own later.
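
The TODO in the callback above could be filled in roughly along these lines. This is only a sketch: AudioDevice and selectAudioDevice are helpers defined in the demo's own AppRTCAudioManager, and the speakerphone policy shown here is just an example, not something the demo itself does.

// Sketch: reacting to audio-device changes reported by AppRTCAudioManager.
audioManager.start((selectedDevice, availableDevices) -> {
    Log.d(TAG, "Audio devices changed: " + availableDevices + ", selected: " + selectedDevice);
    // Example policy: force speakerphone for a video call when no headset is attached.
    if (!availableDevices.contains(AppRTCAudioManager.AudioDevice.WIRED_HEADSET)
            && !availableDevices.contains(AppRTCAudioManager.AudioDevice.BLUETOOTH)) {
        audioManager.selectAudioDevice(AppRTCAudioManager.AudioDevice.SPEAKER_PHONE);
    }
});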

The key point is appRtcClient.connectToRoom(roomConnectionParameters). From the analysis above, the AppRTCClient instance here is a WebSocketRTCClient, so let's jump to the corresponding code.

// Connects to room - function runs on a local looper thread.
private void connectToRoomInternal() {
    String connectionUrl = getConnectionUrl(connectionParameters);
    // connectionUrl = https://appr.tc/join/roomId;
   
    wsClient = new WebSocketChannelClient(handler, this);

    RoomParametersFetcher.RoomParametersFetcherEvents callbacks =
            new RoomParametersFetcher.RoomParametersFetcherEvents() {
        @Override
        public void onSignalingParametersReady(final SignalingParameters params) {
            WebSocketRTCClient.this.handler.post(new Runnable() {
                @Override
                public void run() {
                    WebSocketRTCClient.this.signalingParametersReady(params);
                }
            });
        }
        @Override
        public void onSignalingParametersError(String description) {
            WebSocketRTCClient.this.reportError(description);
        }
    };
    new RoomParametersFetcher(connectionUrl, null, callbacks).makeRequest();
}
// Callback issued when room parameters are extracted. Runs on local looper thread.
private void signalingParametersReady(final SignalingParameters signalingParameters) {
    Log.d(TAG, "Room connection completed.");
    if (!signalingParameters.initiator || signalingParameters.offerSdp != null) {
        reportError("Loopback room is busy.");
        return;
    }
    if (!signalingParameters.initiator && signalingParameters.offerSdp == null) {
        Log.w(TAG, "No offer SDP in room response.");
    }
    initiator = signalingParameters.initiator;
    messageUrl = getMessageUrl(connectionParameters, signalingParameters);
    // https://appr.tc/message/roomId/clientId
    leaveUrl = getLeaveUrl(connectionParameters, signalingParameters);
    // https://appr.tc/leave/roomId/clientId
    
    events.onConnectedToRoom(signalingParameters);
    
    wsClient.connect(signalingParameters.wssUrl, signalingParameters.wssPostUrl);
    wsClient.register(connectionParameters.roomId, signalingParameters.clientId);
}

1. Request https://appr.tc/join/roomId to join the room corresponding to roomId; the response returns the signaling parameters of the current room.

2. From the returned signaling parameters, obtain the URLs for the two room events (message, leave) as well as the two WebSocket-related URLs. You will see logs similar to the following:

com.zzrblog.appwebrtc D/RoomRTCClient: RoomId: 028711912. ClientId: 74648260
com.zzrblog.appwebrtc D/RoomRTCClient: Initiator: false
com.zzrblog.appwebrtc D/RoomRTCClient: WSS url: wss://apprtc-ws.webrtc.org:443/ws
com.zzrblog.appwebrtc D/RoomRTCClient: WSS POST url: https://apprtc-ws.webrtc.org:443
com.zzrblog.appwebrtc D/RoomRTCClient: Request TURN from: https://appr.tc/v1alpha/iceconfig?key=
com.zzrblog.appwebrtc D/WebSocketRTCClient: Room connection completed.
com.zzrblog.appwebrtc D/WebSocketRTCClient: Message URL: https://appr.tc/message/028711912/74648260
com.zzrblog.appwebrtc D/WebSocketRTCClient: Leave URL: https://appr.tc/leave/028711912/74648260

3. Of the four URLs, only signalingParameters.wssUrl is a real WebSocket URL; the others are plain HTTPS requests. It is a fair guess that wssUrl is the address responsible for signaling interaction between the participants in a room, while the others are used to maintain room state.

So we now know that signaling is really just a set of logical parameters used to maintain the state of the current client/room before the end-to-end call is established. This is also why WebRTC cannot define or implement signaling APIs itself: signaling parameters vary with business requirements, which makes a unified protocol impractical.
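
To make that concrete, here is one such business-defined signaling message in this demo: WebSocketRTCClient.sendLocalIceCandidate packs a locally gathered ICE candidate into JSON before handing it to the room server or the WebSocket channel (trimmed sketch; the field names follow the demo/appr.tc convention and are not part of any WebRTC standard).

// Trimmed sketch of WebSocketRTCClient.sendLocalIceCandidate -- one concrete "signaling" message.
public void sendLocalIceCandidate(final IceCandidate candidate) {
    handler.post(() -> {
        JSONObject json = new JSONObject();
        jsonPut(json, "type", "candidate");
        jsonPut(json, "label", candidate.sdpMLineIndex);
        jsonPut(json, "id", candidate.sdpMid);
        jsonPut(json, "candidate", candidate.sdp);
        if (initiator) {
            // The caller POSTs candidates to the room server (messageUrl).
            sendPostMessage(MessageType.MESSAGE, messageUrl, json.toString());
        } else {
            // The callee pushes candidates over the WebSocket channel (wssUrl).
            wsClient.send(json.toString());
        }
    });
}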

4. Analyzing connect and register in wsClient (WebSocketChannelClient) confirms this further: wssUrl is where the commands for signaling interaction are sent, while wssPostUrl is only requested when leaving the room. A trimmed sketch of register follows.
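
For reference, this is roughly how register looks in WebSocketChannelClient: once the socket to wssUrl is open, the client announces its roomId/clientId with a small JSON command, and from then on signaling messages for this room flow over that socket (error handling and the flush of queued messages are omitted here).

// Trimmed sketch of WebSocketChannelClient.register -- runs on the looper thread.
public void register(final String roomID, final String clientID) {
    this.roomID = roomID;
    this.clientID = clientID;
    if (state != WebSocketConnectionState.CONNECTED) {
        Log.w(TAG, "WebSocket register() in state " + state);
        return; // connect() has not finished yet; register is issued again once connected.
    }
    JSONObject json = new JSONObject();
    try {
        json.put("cmd", "register");
        json.put("roomid", roomID);
        json.put("clientid", clientID);
        ws.sendTextMessage(json.toString()); // ws is the autobanh WebSocketConnection
        state = WebSocketConnectionState.REGISTERED;
    } catch (JSONException e) {
        reportError("WebSocket register JSON error: " + e.getMessage());
    }
}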

5. After this part of the logic completes, don't forget that events.onConnectedToRoom(signalingParameters) calls the signaling parameters back into the host CallActivity.

private void onConnectedToRoomInternal(final AppRTCClient.SignalingParameters params) {
        signalingParameters = params;
        VideoCapturer videoCapturer = null;
        if (peerConnectionParameters.videoCallEnabled) {
            videoCapturer = createVideoCapturer();
        }
        peerConnectionClient.createPeerConnection(
                localProxyVideoSink, remoteSinks, videoCapturer, signalingParameters);

        if (signalingParameters.initiator) {
            // The first person to create the room (the initiator) goes here.
            peerConnectionClient.createOffer();
        } else {
            if (params.offerSdp != null) {
                peerConnectionClient.setRemoteDescription(params.offerSdp);
                // If we are not the first person in the room, check whether the signaling
                // parameters already contain an offer SDP. If so, someone has entered the
                // room and sent an offer, so set the remote SDP and create an answer.
                peerConnectionClient.createAnswer();
            }
            if (params.iceCandidates != null) {
                // Add remote ICE candidates from room.
                for (IceCandidate iceCandidate : params.iceCandidates) {
                    peerConnectionClient.addRemoteIceCandidate(iceCandidate);
                }
            }
        }
    }

This section, CallActivity.onConnectedToRoomInternal, should not be hard to follow with the comments above. In the normal test flow, the createAnswer branch is usually taken after createPeerConnection, because the browser that opened https://appr.tc first became the room creator, so the Android client is not the initiator. Let's first analyze the creation of PeerConnection, following the steps of the process.

Creation of PeerConnection

Now let's analyze how PeerConnectionClient creates a PeerConnection object. So far PeerConnectionClient has gone through three steps: 1. the PeerConnectionClient constructor; 2. createPeerConnectionFactory; 3. createPeerConnection. Following this process, here is the detailed code.

1. PeerConnectionClient constructor

public PeerConnectionClient(Context appContext, EglBase eglBase,
                            PeerConnectionParameters peerConnectionParameters,
                            PeerConnectionEvents events)
{
    this.rootEglBase = eglBase; this.appContext = appContext; this.events = events;
    this.peerConnectionParameters = peerConnectionParameters;
    this.dataChannelEnabled = peerConnectionParameters.dataChannelParameters != null;

    final String fieldTrials = getFieldTrials(peerConnectionParameters);
    executor.execute(() -> {
        Log.d(TAG, "Initialize WebRTC. Field trials: " + fieldTrials);
        PeerConnectionFactory.initialize(
                PeerConnectionFactory.InitializationOptions.builder(appContext)
                .setFieldTrials(fieldTrials)
                .setEnableInternalTracer(true)
                .createInitializationOptions());
    });
}

// WebRTC API code //
public static class InitializationOptions.Builder {
    private final Context applicationContext;
    private String fieldTrials = "";
    private boolean enableInternalTracer;
    private NativeLibraryLoader nativeLibraryLoader = new DefaultLoader();
    private String nativeLibraryName = "jingle_peerconnection_so";
    @Nullable private Loggable loggable;
    @Nullable private Severity loggableSeverity;
    ... ...
}

PeerConnectionFactory is initialized through the factory's configuration builder. Two things caught my attention here: fieldTrials and nativeLibraryName = "jingle_peerconnection_so". I'll just flag them for now (see the sketch below for fieldTrials) and explain them in detail in a later, deeper source-code analysis.
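
For the curious, fieldTrials is nothing mysterious: it is just a concatenated string of experiment switches handed to the native layer at initialization time. A sketch of what getFieldTrials assembles in this demo follows; the field names mirror the demo source, but the exact trial strings are assumptions and can differ between WebRTC versions.

// Sketch of how the field-trial string is assembled (trial string values are assumptions).
private static final String VIDEO_FLEXFEC_FIELDTRIAL =
        "WebRTC-FlexFEC-03-Advertised/Enabled/WebRTC-FlexFEC-03/Enabled/";
private static final String VIDEO_VP8_INTEL_HW_ENCODER_FIELDTRIAL = "WebRTC-IntelVP8/Enabled/";
private static final String DISABLE_WEBRTC_AGC_FIELDTRIAL =
        "WebRTC-Audio-MinimizeResamplingOnMobile/Enabled/";

private String getFieldTrials(PeerConnectionParameters peerConnectionParameters) {
    String fieldTrials = "";
    if (peerConnectionParameters.videoFlexfecEnabled) {
        fieldTrials += VIDEO_FLEXFEC_FIELDTRIAL; // advertise FlexFEC protection for video
    }
    fieldTrials += VIDEO_VP8_INTEL_HW_ENCODER_FIELDTRIAL;
    if (peerConnectionParameters.disableWebRtcAGCAndHPF) {
        fieldTrials += DISABLE_WEBRTC_AGC_FIELDTRIAL;
    }
    // The resulting string is passed to PeerConnectionFactory.initialize via setFieldTrials().
    return fieldTrials;
}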

2. createPeerConnectionFactory

private void createPeerConnectionFactoryInternal(PeerConnectionFactory.Options options) {
    // Check if ISAC is used by default.
    preferIsac = peerConnectionParameters.audioCodec!=null
            && peerConnectionParameters.audioCodec.equals(AUDIO_CODEC_ISAC);
    // Create peer connection factory.
    final boolean enableH264HighProfile =
            VIDEO_CODEC_H264_HIGH.equals(peerConnectionParameters.videoCodec);
    final VideoEncoderFactory encoderFactory;
    final VideoDecoderFactory decoderFactory;
    if (peerConnectionParameters.videoCodecHwAcceleration) {
        encoderFactory = new DefaultVideoEncoderFactory(
                rootEglBase.getEglBaseContext(), true, enableH264HighProfile);
        decoderFactory = new DefaultVideoDecoderFactory(rootEglBase.getEglBaseContext());
    } else {
        encoderFactory = new SoftwareVideoEncoderFactory();
        decoderFactory = new SoftwareVideoDecoderFactory();
    }
	final AudioDeviceModule adm = createJavaAudioDevice();
    factory = PeerConnectionFactory.builder()
            .setOptions(options)
            .setAudioDeviceModule(adm)
            .setVideoEncoderFactory(encoderFactory)
            .setVideoDecoderFactory(decoderFactory)
            .createPeerConnectionFactory();
    Log.d(TAG, "Peer connection factory created.");
    adm.release();
    // For brevity, only the key code is shown.
}

AudioDeviceModule createJavaAudioDevice() {
    if (!peerConnectionParameters.useOpenSLES) {
        Log.w(TAG, "External OpenSLES ADM not implemented yet.");
        // TODO: Add support for external OpenSLES ADM.
    }
    // Set audio record error callbacks.
    JavaAudioDeviceModule.AudioRecordErrorCallback audioRecordErrorCallback;
	// Set audio track error callbacks.
    JavaAudioDeviceModule.AudioTrackErrorCallback audioTrackErrorCallback;
    // Set audio record state callbacks.
    JavaAudioDeviceModule.AudioRecordStateCallback audioRecordStateCallback;
    // Set audio track state callbacks.
    JavaAudioDeviceModule.AudioTrackStateCallback audioTrackStateCallback;
    // For brevity, the full code is not pasted here.
    return JavaAudioDeviceModule.builder(appContext)
            .setSamplesReadyCallback(saveRecordedAudioToFile)
            .setUseHardwareAcousticEchoCanceler(!peerConnectionParameters.disableBuiltInAEC)
            .setUseHardwareNoiseSuppressor(!peerConnectionParameters.disableBuiltInNS)
            .setAudioRecordErrorCallback(audioRecordErrorCallback)
            .setAudioTrackErrorCallback(audioTrackErrorCallback)
            .setAudioRecordStateCallback(audioRecordStateCallback)
            .setAudioTrackStateCallback(audioTrackStateCallback)
            .createAudioDeviceModule();
}

A few points are worth marking here. The VideoEncoderFactory/VideoDecoderFactory are chosen depending on whether hardware acceleration is enabled. AudioDeviceModule is the WebRTC Java-layer interface that represents the audio module; the demo currently only wires up the Java-based ADM (an external OpenSL ES ADM is not implemented yet). And there are two settings I care a lot about: setUseHardwareAcousticEchoCanceler / setUseHardwareNoiseSuppressor. As we all know, before being acquired by Google, the engine that became WebRTC was best known for its audio processing; we will have to dig into that in a later article.
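
On those two switches: whether the hardware effects can actually be used also depends on the device. The library exposes static support checks, so a reasonable pattern (a sketch, not something the demo does) is to fall back to WebRTC's software AEC/NS when the hardware effect is unavailable:

// Sketch: enable the built-in (hardware) effects only when the device reports support for them.
boolean useHwAec = JavaAudioDeviceModule.isBuiltInAcousticEchoCancelerSupported()
        && !peerConnectionParameters.disableBuiltInAEC;
boolean useHwNs = JavaAudioDeviceModule.isBuiltInNoiseSuppressorSupported()
        && !peerConnectionParameters.disableBuiltInNS;

AudioDeviceModule adm = JavaAudioDeviceModule.builder(appContext)
        .setUseHardwareAcousticEchoCanceler(useHwAec) // false -> WebRTC's software AEC takes over
        .setUseHardwareNoiseSuppressor(useHwNs)       // false -> WebRTC's software NS takes over
        .createAudioDeviceModule();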

(Figure: the composition of PeerConnectionFactory.)

3. createPeerConnection

Finally we reach the point where the PeerConnection is actually created. Without further ado, let's look at the code.

public void createPeerConnection(final VideoSink localRender,
                                 final List<VideoSink> remoteSinks,
                                 final VideoCapturer videoCapturer,
                                 final AppRTCClient.SignalingParameters signalingParameters)
{
    this.localRender = localRender; // VideoSink that renders the local video
    this.remoteSinks = remoteSinks; // VideoSinks rendering remote video; possibly several, hence a List
    this.videoCapturer = videoCapturer; // Local video source
    this.signalingParameters = signalingParameters;
    executor.execute(() -> {
            createMediaConstraintsInternal();
            createPeerConnectionInternal();
            maybeCreateAndStartRtcEventLog();
    });
}

private void createMediaConstraintsInternal() {
    // Create video constraints if video call is enabled.
    if (isVideoCallEnabled()) {
        videoWidth = peerConnectionParameters.videoWidth;
        videoHeight = peerConnectionParameters.videoHeight;
        videoFps = peerConnectionParameters.videoFps;
    }
    // Create audio constraints.
    audioConstraints = new MediaConstraints();
    // added for audio performance measurements
    if (peerConnectionParameters.noAudioProcessing) {
        Log.d(TAG, "Audio constraints disable audio processing");
        audioConstraints.mandatory.add(
                new MediaConstraints.KeyValuePair(AUDIO_ECHO_CANCELLATION_CONSTRAINT, "false"));
        audioConstraints.mandatory.add(
                new MediaConstraints.KeyValuePair(AUDIO_AUTO_GAIN_CONTROL_CONSTRAINT, "false"));
        audioConstraints.mandatory.add(
                new MediaConstraints.KeyValuePair(AUDIO_HIGH_PASS_FILTER_CONSTRAINT, "false"));
        audioConstraints.mandatory.add(
                new MediaConstraints.KeyValuePair(AUDIO_NOISE_SUPPRESSION_CONSTRAINT, "false"));
    }
    // Create SDP constraints.
    sdpMediaConstraints = new MediaConstraints();
    sdpMediaConstraints.mandatory.add(
            new MediaConstraints.KeyValuePair("OfferToReceiveAudio", "true"));
    sdpMediaConstraints.mandatory.add(new MediaConstraints.KeyValuePair(
            "OfferToReceiveVideo", Boolean.toString(isVideoCallEnabled())));
}

private void createPeerConnectionInternal() {
    PeerConnection.RTCConfiguration rtcConfig =
            new PeerConnection.RTCConfiguration(signalingParameters.iceServers);
    // TCP candidates are only useful when connecting to a server that supports ICE-TCP.
    rtcConfig.tcpCandidatePolicy = PeerConnection.TcpCandidatePolicy.DISABLED;
    rtcConfig.bundlePolicy = PeerConnection.BundlePolicy.MAXBUNDLE;
    rtcConfig.rtcpMuxPolicy = PeerConnection.RtcpMuxPolicy.REQUIRE;
    rtcConfig.continualGatheringPolicy = PeerConnection.ContinualGatheringPolicy.GATHER_CONTINUALLY;
    rtcConfig.keyType = PeerConnection.KeyType.ECDSA;  //Use ECDSA encryption.
    // Enable DTLS for normal calls and disable for loopback calls.
    rtcConfig.enableDtlsSrtp = !peerConnectionParameters.loopback;
    rtcConfig.sdpSemantics = PeerConnection.SdpSemantics.UNIFIED_PLAN;
    // Pay attention to this line.
    peerConnection = factory.createPeerConnection(rtcConfig, pcObserver);

    List<String> mediaStreamLabels = Collections.singletonList("ARDAMS");
    if (isVideoCallEnabled()) {
        peerConnection.addTrack(createVideoTrack(videoCapturer), mediaStreamLabels);
        // We can add the renderers right away because we don't need to wait for an
        // answer to get the remote track.
        remoteVideoTrack = getRemoteVideoTrack();
        if (remoteVideoTrack != null) {
            remoteVideoTrack.setEnabled(renderVideo);
            for (VideoSink remoteSink : remoteSinks) {
                remoteVideoTrack.addSink(remoteSink);
            }
        }
    }
    peerConnection.addTrack(createAudioTrack(), mediaStreamLabels);
    if (isVideoCallEnabled()) {
        for (RtpSender sender : peerConnection.getSenders()) {
            if (sender.track() != null) {
                String trackType = sender.track().kind();
                if (trackType.equals(VIDEO_TRACK_TYPE)) {
                    Log.d(TAG, "Found video sender.");
                    localVideoSender = sender;
                }
            }
        }
    }

    if (peerConnectionParameters.aecDump) {
        try {
            ParcelFileDescriptor aecDumpFileDescriptor = ParcelFileDescriptor.open(
                    new File(Environment.getExternalStorageDirectory(), "Download/audio.aecdump"),
                    ParcelFileDescriptor.MODE_READ_WRITE | ParcelFileDescriptor.MODE_CREATE
                            | ParcelFileDescriptor.MODE_TRUNCATE);
            factory.startAecDump(aecDumpFileDescriptor.detachFd(), -1);
        } catch (IOException e) {
            Log.e(TAG, "Can not open aecdump file", e);
        }
    }
}

private void maybeCreateAndStartRtcEventLog() {
    rtcEventLog = new RtcEventLog(peerConnection);
    rtcEventLog.start(createRtcEventLogOutputFile());
}

The code is a bit long even after being trimmed down to the useful parts. Readers with basic WebRTC knowledge should recognize the logic of createMediaConstraintsInternal: it corresponds to the constraint settings of getUserMedia (MediaStreamConstraints).

Look more closely at the logic of createPeerConnectionInternal and how PeerConnection.RTCConfiguration is set up. The key is the iceServers inside SignalingParameters, which record the STUN (hole-punching) server URL and the TURN (relay) server URL provided by the business server (that is, by us programmers). Then PeerConnectionFactory.createPeerConnection(rtcConfig, pcObserver) creates the PeerConnection and reports its various state changes through pcObserver. For space reasons that code is not pasted in full; follow it yourself, and just remember the ICE candidate and ICE connection-state callbacks (see the trimmed sketch below).
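
For reference, the pcObserver mentioned above is PeerConnectionClient's inner PCObserver; its ICE-candidate callbacks essentially just forward to PeerConnectionEvents on the executor thread (trimmed sketch; the real observer implements many more callbacks).

// Trimmed sketch of PeerConnectionClient's inner PCObserver -- only the ICE-candidate callbacks
// are shown; connection-state changes are forwarded to PeerConnectionEvents in the same way.
private class PCObserver implements PeerConnection.Observer {
    @Override
    public void onIceCandidate(final IceCandidate candidate) {
        // A local candidate has been gathered; CallActivity forwards it to the signaling
        // channel via appRtcClient.sendLocalIceCandidate(candidate).
        executor.execute(() -> events.onIceCandidate(candidate));
    }

    @Override
    public void onIceCandidatesRemoved(final IceCandidate[] candidates) {
        executor.execute(() -> events.onIceCandidatesRemoved(candidates));
    }

    // onSignalingChange, onIceConnectionChange, onAddTrack, onDataChannel, ... omitted.
}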

Then there are the two addTrack calls on PeerConnection; let's interpret them together:

List<String> mediaStreamLabels = Collections.singletonList("ARDAMS");
peerConnection.addTrack(createVideoTrack(videoCapturer), mediaStreamLabels);
peerConnection.addTrack(createAudioTrack(), mediaStreamLabels);

private @Nullable AudioTrack createAudioTrack() {
    audioSource = factory.createAudioSource(audioConstraints);
    localAudioTrack = factory.createAudioTrack(AUDIO_TRACK_ID, audioSource);
    localAudioTrack.setEnabled(enableAudio);
    return localAudioTrack;
}
private @Nullable VideoTrack createVideoTrack(VideoCapturer capturer) {
    surfaceTextureHelper =
            SurfaceTextureHelper.create("CaptureThread", rootEglBase.getEglBaseContext());
    videoSource = factory.createVideoSource(capturer.isScreencast());
    capturer.initialize(surfaceTextureHelper, appContext, videoSource.getCapturerObserver());
    capturer.startCapture(videoWidth, videoHeight, videoFps);
    localVideoTrack = factory.createVideoTrack(VIDEO_TRACK_ID, videoSource);
    localVideoTrack.setEnabled(renderVideo);
    localVideoTrack.addSink(localRender);
    return localVideoTrack;
}
///     Internal code of WebRTC PeerConnection
public RtpSender addTrack(MediaStreamTrack track, List<String> streamIds) {
    if (track != null && streamIds != null) {
        RtpSender newSender = this.nativeAddTrack(track.getNativeMediaStreamTrack(), streamIds);
        if (newSender == null) {
            throw new IllegalStateException("C++ addTrack failed.");
        } else {
            this.senders.add(newSender);
            return newSender;
        }
    } else {
        throw new NullPointerException("No MediaStreamTrack specified in addTrack.");
    }
}

The first important point is that inside PeerConnection.addTrack, an RtpSender object is returned by nativeAddTrack and kept in the Java layer.

Curious and attentive readers may also notice that besides RtpSender there are RtpReceiver and RtpTransceiver, where RtpTransceiver = RtpSender + RtpReceiver. It looks a bit messy; for now we just take note of it (digging the hole) and will analyze it in depth later (filling the hole).

public class PeerConnection {
    private final List<MediaStream> localStreams;
    private final long nativePeerConnection;
    private List<RtpSender> senders;
    private List<RtpReceiver> receivers; 
    private List<RtpTransceiver> transceivers;
    ... ...
}
public class RtpTransceiver {
    private long nativeRtpTransceiver;
    private RtpSender cachedSender;
    private RtpReceiver cachedReceiver;
    ... ...
}
public class RtpReceiver {
    private long nativeRtpReceiver;
    private long nativeObserver;
    @Nullable private MediaStreamTrack cachedTrack;
    ... ...
}
public class RtpSender {
    private long nativeRtpSender;
    @Nullable private MediaStreamTrack cachedTrack;
    @Nullable private final DtmfSender dtmfSender;
    ... ...
}

The second key point: add the local videoTrack and audioTrack to the PeerConnection, then try to obtain the remote remoteVideoTrack from the PeerConnection's RtpTransceivers and attach the VideoSinks that render the remote video to it. The corresponding code snippet:

remoteVideoTrack = getRemoteVideoTrack();
if (remoteVideoTrack != null) {
    for (VideoSink remoteSink : remoteSinks) {
        remoteVideoTrack.addSink(remoteSink);
    }
}

// Returns the remote VideoTrack, assuming there is only one.
private @Nullable VideoTrack getRemoteVideoTrack() {
    for (RtpTransceiver transceiver : peerConnection.getTransceivers()) {
        MediaStreamTrack track = transceiver.getReceiver().track();
        if (track instanceof VideoTrack) {
            return (VideoTrack) track;
        }
    }
    return null;
}

At this point the logic of createPeerConnection has been basically covered. Two points to summarize:

1. How to understand the relationship between Track, Source and Sink, and how are they connected?

Answer: It can largely be understood literally. A source is the data source, i.e. the encapsulation of data input; for video it is usually a camera or a file object. The source is fed into a track, the transport belt that bridges source and sink. A sink is the terminal at the end of that belt. One source can feed multiple tracks, and one track can eventually flow into multiple sinks, each of which processes the data before output. In the WebRTC code, a VideoSink is simply a rendering carrier such as an Android SurfaceView, or something that writes to a local file. Keeping an abstract picture of these relationships helps with the deeper code analysis later; a minimal sketch follows.
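
A minimal sketch of that relationship, reusing the names from the createVideoTrack snippet above (the second, lambda-based sink is hypothetical, just to show that one track can feed several sinks):

// Sketch: one source feeds one track, and the track fans out to multiple sinks.
VideoSource videoSource = factory.createVideoSource(/* isScreencast= */ false);
capturer.initialize(surfaceTextureHelper, appContext, videoSource.getCapturerObserver());
capturer.startCapture(videoWidth, videoHeight, videoFps);

VideoTrack videoTrack = factory.createVideoTrack(VIDEO_TRACK_ID, videoSource);
videoTrack.addSink(localRender); // sink 1: the on-screen preview (e.g. a SurfaceViewRenderer)
videoTrack.addSink(frame -> {
    // sink 2 (hypothetical): VideoSink is a single-method interface, so every captured
    // VideoFrame arrives here and could be encoded, analyzed or written to a file.
});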

2. (Figure: the composition of PeerConnection, following the PeerConnectionFactory diagram above.)

onConnectedToRoom

Do you remember where the creation of PeerConnection was triggered? (See the onConnectedToRoomInternal snippet.) Before connecting to the room we request https://appr.tc/join/roomid, and once the signaling parameters come back the SignalingEvents.onConnectedToRoom callback fires, which brings us back there. In the normal test, a browser opens https://appr.tc first, generates a random roomid and enters the room, so the browser becomes the room creator; the Android client therefore usually takes the createAnswer branch after createPeerConnection. In the code we can see createOffer / createAnswer / setRemoteDescription, but setLocalDescription seems to be missing. With these questions in mind, let's continue following the logic of PeerConnectionClient.

public void createAnswer() {
    executor.execute(() -> {
        if (peerConnection != null && !isError) {
            isInitiator = false;
            peerConnection.createAnswer(sdpObserver, sdpMediaConstraints);
        }
    });
}
public void createOffer() {
    executor.execute(() -> {
        if (peerConnection != null && !isError) {
            isInitiator = true;
            peerConnection.createOffer(sdpObserver, sdpMediaConstraints);
        }
    });
}
public void setRemoteDescription(final SessionDescription sdp) {
    executor.execute(() -> {
        // Modify the SDP as needed (codec preference, bitrate limits, etc.) before applying it.
        String sdpDescription = sdp.description;
        SessionDescription sdpRemote = new SessionDescription(sdp.type, sdpDescription);
        peerConnection.setRemoteDescription(sdpObserver, sdpRemote);
    });
}

private class SDPObserver implements SdpObserver {
    @Override
    public void onCreateSuccess(SessionDescription sdp) {
        if (localSdp != null) {
            reportError("LocalSdp has created.");
            return;
        }
        String sdpDescription = sdp.description;
        if (preferIsac) {
            sdpDescription = preferCodec(sdpDescription, AUDIO_CODEC_ISAC, true);
        }
        if (isVideoCallEnabled()) {
            sdpDescription =
                    preferCodec(sdpDescription, getSdpVideoCodecName(peerConnectionParameters), false);
        }
        final SessionDescription renewSdp = new SessionDescription(sdp.type, sdpDescription);
        localSdp = renewSdp;
        executor.execute(() -> {
            if (peerConnection != null && !isError) {
                Log.d(TAG, "Set local SDP from " + sdp.type);
                peerConnection.setLocalDescription(sdpObserver, renewSdp);
            }
        });
    }

    @Override
    public void onSetSuccess() {
        executor.execute(() -> {
            if (peerConnection == null || isError) {
                return;
            }
            if (isInitiator) {
                // For offering peer connection we first create offer and set
                // local SDP, then after receiving answer set remote SDP.
                if (peerConnection.getRemoteDescription() == null) {
                    // We've just set our local SDP so time to send it.
                    Log.d(TAG, "Local SDP set successfully");
                    events.onLocalDescription(localSdp);
                } else {
                    // We've just set remote description,
                    // so drain remote and send local ICE candidates.
                    Log.d(TAG, "Remote SDP set successfully");
                    drainCandidates();
                }
            } else {
                // For answering peer connection we set remote SDP and then
                // create answer and set local SDP.
                if (peerConnection.getLocalDescription() != null) {
                    // We've just set our local SDP so time to send it, drain
                    // remote and send local ICE candidates.
                    Log.d(TAG, "Local SDP set successfully");
                    events.onLocalDescription(localSdp);
                    drainCandidates();
                } else {
                    Log.d(TAG, "Remote SDP set succesfully");
                }
            }
        });
    }

    @Override
    public void onCreateFailure(String error) {
        reportError("createSDP error: " + error);
    }
    @Override
    public void onSetFailure(String error) {
        reportError("setSDP error: " + error);
    }
}

From the implementation you can see that the createOffer/createAnswer events of PeerConnection are all handled by a single SDPObserver with two success callbacks, onCreateSuccess and onSetSuccess. As the names suggest, onCreateSuccess fires after createOffer/createAnswer succeeds, and onSetSuccess fires after setLocalDescription/setRemoteDescription succeeds.

Interpret according to the normal process:

1. createAnswer is called after createPeerConnection (and setRemoteDescription), which triggers SDPObserver.onCreateSuccess. At this point the member localSdp == null, so localSdp is created and setLocalDescription is called.

2. After localSdp is created and setLocalDescription is called, SDPObserver.onSetSuccess fires. Because we are not the room creator, the isInitiator == false branch is taken; and because setLocalDescription was already applied in step 1, peerConnection.getLocalDescription() != null, so PeerConnectionEvents.onLocalDescription(localSdp) is called back into CallActivity.

// CallActivity implements PeerConnectionClient.PeerConnectionEvents
@Override
public void onLocalDescription(SessionDescription sdp) {
    runOnUiThread(() -> {
        if (appRtcClient != null) {
            if (signalingParameters!=null && signalingParameters.initiator) {
                appRtcClient.sendOfferSdp(sdp);
            } else {
                appRtcClient.sendAnswerSdp(sdp);
            }
            // ... ...
        }
    });
}

3. Because we are not the first person to create the room, the signalingParameters.initiator == false branch is taken, which triggers sendAnswerSdp on the AppRTCClient instance (WebSocketRTCClient).

@Override
public void sendAnswerSdp(SessionDescription sdp) {
    handler.post(new Runnable() {
        @Override
        public void run() {
            JSONObject json = new JSONObject();
            jsonPut(json, "sdp", sdp.description);
            jsonPut(json, "type", "answer");
            wsClient.send(json.toString());
        }
    });
}

public void onWebSocketMessage(String message) {

        JSONObject json = new JSONObject(message);
        String msgText = json.getString("msg");
        String errorText = json.optString("error");
        if (msgText.length() > 0) {
            json = new JSONObject(msgText);
            String type = json.optString("type");
            if (type.equals("candidate")) {
                events.onRemoteIceCandidate(toJavaCandidate(json));
            }else if (type.equals("remove-candidates")) {
                JSONArray candidateArray = json.getJSONArray("candidates");
                IceCandidate[] candidates = new IceCandidate[candidateArray.length()];
                for (int i = 0; i < candidateArray.length(); ++i) {
                    candidates[i] = toJavaCandidate(candidateArray.getJSONObject(i));
                }
                events.onRemoteIceCandidatesRemoved(candidates);
            } else if (type.equals("answer")) {
                if (initiator) {
                    SessionDescription sdp = new SessionDescription(
                            SessionDescription.Type.fromCanonicalForm(type), json.getString("sdp"));
                    events.onRemoteDescription(sdp);
                } else {
                    reportError("Received answer for call initiator: " + message);
                }
            } else if (type.equals("offer")) {
                if (!initiator) {
                    SessionDescription sdp = new SessionDescription(
                            SessionDescription.Type.fromCanonicalForm(type), json.getString("sdp"));
                    events.onRemoteDescription(sdp);
                } else {
                    reportError("Received offer for call receiver: " + message);
                }
            } else if (type.equals("bye")) {
                events.onChannelClose();
            } else {
                reportError("Unexpected WebSocket message: " + message);
            }
        } else {
            if (errorText.length() > 0) {
                reportError("WebSocket error message: " + errorText);
            } else {
                reportError("Unexpected WebSocket message: " + message);
            }
        }
}

4. wsClient is the WebSocket connection established to wssUrl. As analyzed earlier, the service behind wssUrl is used to exchange signaling parameters between the participants in a room; the various message types are processed and dispatched as callbacks in onWebSocketMessage. But what arrives here is not an "answer"-type message. Interested readers can print out everything received in this callback; it helps a lot in understanding the whole signaling exchange.

Here is the answer directly: it may be a "candidate" or "offer" message, or nothing at all. "candidate" messages are the most likely, because the "offer" usually arrives immediately after the wssUrl connection is established, which calls back CallActivity.onRemoteDescription and then PeerConnectionClient.createAnswer(). Some readers may be confused: wait, wasn't createAnswer already called in step 1? Yes, but this time, when SDPObserver.onCreateSuccess fires again, localSdp != null, so nothing more is sent out. At this point the local and remote SDPs have both been set successfully.
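
To close the loop, this is roughly how CallActivity reacts to those two message types when the signaling events come back (trimmed sketch of the demo's SignalingEvents callbacks):

// Trimmed sketch of CallActivity's SignalingEvents callbacks for remote SDP / ICE candidates.
@Override
public void onRemoteDescription(final SessionDescription sdp) {
    runOnUiThread(() -> {
        if (peerConnectionClient == null) {
            return;
        }
        peerConnectionClient.setRemoteDescription(sdp);
        if (!signalingParameters.initiator) {
            // The answering side creates its answer here; it is then sent out
            // via PeerConnectionEvents.onLocalDescription -> sendAnswerSdp.
            peerConnectionClient.createAnswer();
        }
    });
}

@Override
public void onRemoteIceCandidate(final IceCandidate candidate) {
    runOnUiThread(() -> {
        if (peerConnectionClient != null) {
            peerConnectionClient.addRemoteIceCandidate(candidate);
        }
    });
}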

Finished?

This article is over, but Android-WebRTC still has a lot of content worth digging into. There will be a series of articles in the follow-up to record my learning process. The focus is on modules such as network connection transmission, video encoding and decoding, and audio processing (echo cancellation and noise reduction).

The next article will mostly cover analysis of the WebRTC Java API and start opening up the source code of jingle_peerconnection_so.

That is all.


Original article: https://blog.csdn.net/a360940265a/article/details/115393626