Foreword
This article describes how to implement audio and video recording with WebRTC.
1. What is WebRTC
" WebRTC
( Web Real-Time Communications
) is a real-time communication technology that allows network applications or sites to establish a peer-to-peer (Peer-to-Peer) connection between browsers without intermediaries to achieve video streaming and/or audio Streaming or other arbitrary data transmission. WebRTC
The inclusion of these standards makes it possible for users to create peer-to-peer (Peer-to-Peer) data sharing and teleconferencing without installing any plug-ins or third-party software." In summary, In fact, there are four points:
- Cross-platform
- (mainly) for browsers
- real-time transmission
- audio and video engine
After studying WebRTC a little more deeply, I found that it can be used not only for audio/video recording and video calls, but also in scenarios such as cameras, music players, shared remote desktops, instant messaging tools, P2P network acceleration, file transfer, and real-time face recognition (of course, in combination with many other technologies). A few of these should sound familiar: camera, face recognition, shared desktop. That is because WebRTC's API is based on audio and video streams!
2. WebRTC audio and video data capture
The most important prerequisite for data transmission is data capture. A very important API comes in here:
let promise = navigator.mediaDevices.getUserMedia(constraints);
This API prompts the user for permission to use the media input, which produces a MediaStream containing a track of the requested media type. This stream can contain a video track (from hardware or virtual video sources, such as cameras, video capture devices, screen sharing services, etc.), an audio track (also from hardware or virtual audio sources, such as microphones, A/D converters, etc. etc.), and possibly other track types.
It returns a Promise object. On success, the resolve callback receives a MediaStream object. If the user denies permission, or the required media source is not available, the reject callback receives a PermissionDeniedError (named NotAllowedError in the current spec) or a NotFoundError.
The key concept here is the "track", because everything is based on "streams": what you get back is a MediaStream object (a stream object)! That means if you want to use this result, you must either assign it to the srcObject attribute or convert it to a URL with URL.createObjectURL()!
/**
* params:
* object: the File, Blob, or MediaSource object used to create the URL.
* return: a DOMString containing an object URL that can be used to reference the content of the source object.
*/
objectURL = URL.createObjectURL(object);
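To make the "track" idea above concrete, here is a tiny sketch (the helper name is my own; it relies only on the standard getTracks() method of MediaStream) that lists the kind of each track a stream carries:

```javascript
// List the kind ('audio' / 'video') of each track carried by a stream-like
// object that exposes the standard getTracks() method.
function trackKinds(stream) {
  return stream.getTracks().map(track => track.kind);
}

// In the browser you would use it like this:
// navigator.mediaDevices.getUserMedia({ audio: true, video: true })
//   .then(s => console.log(trackKinds(s)));  // e.g. ['audio', 'video']
```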
Back to the API itself, it can be understood like this: through it, you can obtain all the audio and video sources available to the current page, divided into different "tracks" by source, and you can configure them and consume their output. You can also see this API used in the part of the article that implements "taking a photo".
Let's first add a video element to the page:
<video autoplay playsinline id="player"></video>
Because we need to obtain both audio and video streams, and the related HTML5 elements are video and audio, the only element that can carry both at once is video (if you only need the audio stream, you can use an audio element).
Following the getUserMedia API usage above, it is not hard to write:
navigator.mediaDevices.getUserMedia(constraints)
.then(gotMediaStream)
.catch(handleError);
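As a sketch of how the failure branch can distinguish the error cases mentioned earlier (the mapping helper is my own addition, not part of the API; the error names are the standard DOMException names):

```javascript
// Map getUserMedia rejection names to user-facing hints.
// 'NotAllowedError' is the current spec name for the legacy 'PermissionDeniedError'.
function describeMediaError(name) {
  switch (name) {
    case 'NotAllowedError':
    case 'PermissionDeniedError':
      return 'The user denied permission to use the camera/microphone.';
    case 'NotFoundError':
      return 'No media device matching the constraints was found.';
    case 'NotReadableError':
      return 'The device is already in use or a hardware error occurred.';
    default:
      return 'getUserMedia failed: ' + name;
  }
}

// Usage inside the .catch handler:
// .catch(err => console.log(describeMediaError(err.name)));
```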
3. WebRTC capture constraints
"Constraint" refers to some configuration items of the control object. Constraints fall into two categories: video constraints and audio constraints.
In WebRTC, the commonly used video constraints are:
- width
- height
- aspectRatio: aspect ratio
- frameRate: frame rate
- facingMode: which camera to use (front or rear; this option mainly matters on mobile, since desktop browsers usually only have a front-facing camera)
- resizeMode: whether to crop/scale the frames
The commonly used audio constraints are:
- volume: volume
- sampleRate: sampling rate
- sampleSize: sample size
- echoCancellation: echo cancellation
- autoGainControl: automatic gain control
- noiseSuppression: noise suppression
- latency: delay (scenario-dependent; generally, the smaller the latency, the better the experience)
- channelCount: mono/stereo
…
So what is constraints? Officially it is called "MediaStreamConstraints". We can see it more clearly through the code:
dictionary MediaStreamConstraints {
(boolean or MediaTrackConstraints) video = false;
(boolean or MediaTrackConstraints) audio = false;
}
It is responsible for collecting the audio and video constraints:
- If video/audio is simply set to a bool, it just decides whether to capture at all
- If video/audio is a MediaTrackConstraints object, you can further set things such as video resolution, frame rate, audio volume, sampling rate, etc.
Following the description above, we can flesh out the configuration in the code. Of course, the first thing to do when using this API is usually to check whether the user's current browser supports it:
if(!navigator.mediaDevices || !navigator.mediaDevices.getUserMedia){
console.log('getUserMedia is not supported!');
return;
}else{
var constraints={
// This could also simply be video:false/true, which just means do not capture / capture video
video:{
width:640,
height:480,
frameRate:60
},
// This could also simply be audio:false/true, which just means do not capture / capture audio
audio:{
noiseSuppression:true,
echoCancellation:true
}
}
navigator.mediaDevices.getUserMedia(constraints)
.then(gotMediaStream)
.catch(handleError);
}
Now it is time to implement the success and failure callbacks for the getUserMedia API. Here we need to think about "what to do in the callback". In fact, it is nothing more than handling the acquired "stream": save it as a global variable or assign it directly as the video/audio element's srcObject value (on success), or log the error (on failure).
var videoplay=document.querySelector('#player');
function gotMediaStream(stream){
// 【1】
videoplay.srcObject=stream;
// 【2】
// return navigator.mediaDevices.enumerateDevices();
}
function handleError(err){
console.log('getUserMedia error:',err);
}
The commented-out line uses another API: mediaDevices.enumerateDevices(). This MediaDevices method enumerates all available media input and output devices (such as microphones, cameras, headsets, etc.). The returned promise is resolved with an array of MediaDeviceInfo objects describing the devices.
It can also be seen as the "track collection" mentioned above. For example, if you log the promise's resolved value following the commented method here, you will find an array containing three entries: two audio (input, output) and one video.
Likewise, if you log the stream parameter, you can inspect its audio and video tracks.
In some scenarios, this API lets us expose the available device choices to users.
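As a sketch of that idea (the grouping helper is my own addition; the kind values come from the standard MediaDeviceInfo interface), the devices returned by enumerateDevices() can be grouped by kind to build such a selection UI:

```javascript
// Group a MediaDeviceInfo list by its `kind` field:
// 'audioinput' (microphones), 'audiooutput' (speakers/headsets), 'videoinput' (cameras).
// Falls back to deviceId when the label is empty (labels are hidden until
// the user has granted media permission).
function groupDevices(deviceInfos) {
  const groups = { audioinput: [], audiooutput: [], videoinput: [] };
  deviceInfos.forEach(info => {
    if (groups[info.kind]) {
      groups[info.kind].push(info.label || info.deviceId);
    }
  });
  return groups;
}

// In the browser:
// navigator.mediaDevices.enumerateDevices()
//   .then(infos => console.log(groupDevices(infos)));
```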
So far, the first effect in the video, real-time capture of audio and video, is complete. This is also the basis for everything that follows: if you cannot capture it, how can you record it?
First add some needed nodes:
<video id="recplayer" playsinline></video>
<button id="record">Start Record</button>
<button id="recplay" disabled>Play</button>
<button id="download" disabled>Download</button>
As mentioned earlier, what capture produces is a "stream" object, so the receiving side must also be an API that can take a stream and operate on it: the MediaStream Recording API!
MDN introduces it as follows: "MediaRecorder is an interface provided by the MediaStream Recording API for easy recording of media; it is instantiated by calling the MediaRecorder() constructor." Because it is an interface provided by the MediaStream Recording API, it has a constructor:
- MediaRecorder(): creates a new MediaRecorder object that records the specified MediaStream. Supported options include the MIME type of the container (such as "video/webm" or "video/mp4") and the bit rates of audio and video (or a single bit rate for both)
Following MDN's description above, the idea becomes clearer: first call the constructor, passing in the stream obtained earlier, to get a MediaRecorder object; then, under a specified time slice, push the recorded chunks into an array one by one (since the capture work is basically done by this point, an array makes it convenient to convert everything into a Blob object later):
function startRecord(){
// Define a receiving array
buffer=[];
var options={
mimeType:'video/webm;codecs=vp8'
}
// isTypeSupported returns a Boolean indicating whether the configured MIME type is supported by the current user's device.
if(!MediaRecorder.isTypeSupported(options.mimeType)){
console.error(`${options.mimeType} is not supported`);
return;
}
try{
mediaRecorder=new MediaRecorder(window.stream,options);
}catch(e){
console.error('Fail to create');
return;
}
mediaRecorder.ondataavailable=handleDataAvailable;
// Time slice:
// start() begins recording; it can be given a timeslice value in milliseconds, and if set, the recorded media is split into separate chunks of that duration
mediaRecorder.start(10);
}
There are two points to note here:
- The acquired stream: because the earlier capture logic is wrapped in a function, to reuse the stream here it must be made a global object. I mounted it directly on window, and at the place commented 【1】 in the earlier code added the receiving line: window.stream=stream;
- A buffer array is defined, and likewise a mediaRecorder object. Considering their use in the ondataavailable handler, and the need to stop capturing and recording the stream when finished, both variables are also declared globally:
var buffer;
var mediaRecorder;
The ondataavailable handler fires whenever a chunk of recorded data is ready. In it, following the plan above, all we need to do is store the sliced data in order:
function handleDataAvailable(e){
// console.log(e)
if(e && e.data && e.data.size>0){
buffer.push(e.data);
}
}
// Stop recording
function stopRecord(){
mediaRecorder.stop();
}
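Before wiring up the buttons, it can be useful to sanity-check the recording. Here is a small helper (my own addition, not part of the original code) that totals the byte sizes of the chunks collected in buffer; each chunk is Blob-like and exposes a size property:

```javascript
// Sum the byte sizes of the recorded chunks. Works on the `buffer` array
// filled by handleDataAvailable, since each pushed chunk has a `size` field.
function recordedBytes(chunks) {
  return chunks.reduce((total, chunk) => total + chunk.size, 0);
}

// e.g. before enabling the Play/Download buttons:
// if (recordedBytes(buffer) === 0) console.warn('Nothing was recorded');
```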
Then call:
let recvideo=document.querySelector('#recplayer');
let btnRecord=document.querySelector('#record');
let btnPlay=document.querySelector('#recplay');
let btnDownload=document.querySelector('#download');
// Start/stop recording
btnRecord.onclick=()=>{
if(btnRecord.textContent==='Start Record'){
startRecord();
btnRecord.textContent='Stop Record';
btnPlay.disabled=true;
btnDownload.disabled=true;
}else{
stopRecord();
btnRecord.textContent='Start Record';
btnPlay.disabled=false;
btnDownload.disabled=false;
}
}
// Play
btnPlay.onclick=()=>{
var blob=new Blob(buffer,{type: 'video/webm'});
recvideo.src=window.URL.createObjectURL(blob);
recvideo.srcObject=null;
recvideo.controls=true;
recvideo.play();
}
// Download
btnDownload.onclick=()=>{
var blob=new Blob(buffer,{type:'video/webm'});
var url=window.URL.createObjectURL(blob);
var a=document.createElement('a');
a.href=url;
a.style.display='none';
a.download='recording.webm';
a.click();
}
At this point, the features shown in the video are basically implemented, but one problem remains: if you only want to record audio, then while you are speaking there are actually two sound sources, because the capture stream keeps playing back at the same time: your voice, and the captured playback of your voice. So we need to mute the sound during capture! But if you add volume:0 to the getUserMedia constraints as above, it has no effect, because so far no browser supports this constraint.
However, since what carries our audio and video is a video/audio container, we can simply control the volume property of its corresponding DOM node instead (setting the element's muted attribute would work as well):
// Add this at the place commented 【2】 in the earlier code
videoplay.volume=0;