An example of audio capture and push for GB28181 device-side access on the Android platform

Technical background

GB/T 28181 is a standard protocol specification widely used in the video surveillance industry; it enables interconnection and interoperability between devices from different vendors. This article focuses on the audio capture part on the Android platform.

Let's first look at how to obtain the audio data source and capture audio on the Android platform. The commonly used approaches are:

1. Use the MediaRecorder class: MediaRecorder provides a set of APIs for recording audio. You can use the MediaRecorder.AudioSource.MIC source to capture audio from the microphone, set the output file format with MediaRecorder.setOutputFormat(), set the audio encoder with MediaRecorder.setAudioEncoder(), and so on. Once configured, call MediaRecorder.prepare() to prepare for recording, MediaRecorder.start() to start recording, MediaRecorder.stop() to stop, and finally MediaRecorder.release() to release resources.
2. Use the AudioRecord class: AudioRecord provides a set of APIs for capturing raw audio data in real time. You can use the MediaRecorder.AudioSource.MIC source to capture audio from the microphone and set parameters such as sample rate, channel count, and sample format. Once configured, call AudioRecord.read() to read audio data and process it (a minimal capture sketch follows this list).
3. Use third-party SDKs: some third-party SDKs also provide audio capture functions, such as OpenCV, OpenAL, etc. You can find the audio capture API that suits your needs in these SDKs, and use and configure it according to their documentation.
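
For option 2, here is a minimal sketch of what direct AudioRecord capture can look like. It assumes the RECORD_AUDIO permission has already been granted; the class name, sample rate, and buffer sizing are illustrative, and the captured frames would be handed to an encoder or publisher SDK at the marked spot.

import android.media.AudioFormat;
import android.media.AudioRecord;
import android.media.MediaRecorder;

// Minimal AudioRecord capture loop: 16-bit mono PCM from the microphone, read in 10 ms chunks.
public class PcmCaptureSketch {
    private static final int SAMPLE_RATE = 44100;

    public void capture() {
        int minBuf = AudioRecord.getMinBufferSize(SAMPLE_RATE,
                AudioFormat.CHANNEL_IN_MONO, AudioFormat.ENCODING_PCM_16BIT);

        AudioRecord recorder = new AudioRecord(MediaRecorder.AudioSource.MIC,
                SAMPLE_RATE, AudioFormat.CHANNEL_IN_MONO,
                AudioFormat.ENCODING_PCM_16BIT, minBuf * 2);

        short[] buffer = new short[SAMPLE_RATE / 100]; // 10 ms of mono samples

        recorder.startRecording();
        try {
            while (!Thread.currentThread().isInterrupted()) {
                int read = recorder.read(buffer, 0, buffer.length);
                if (read > 0) {
                    // hand the 10 ms PCM frame to the encoder / publisher SDK here
                }
            }
        } finally {
            recorder.stop();
            recorder.release();
        }
    }
}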

Technical implementation

This article takes the GB28181 device access module of the Daniu Live SDK on the Android platform as an example. Here we use the AudioRecord class to capture the audio data source, then encode the captured data as PCMA or AAC (AAC encoding is explicitly specified in GB/T 28181-2022).

Set the audio encoding type:

    /**
     * Set audio encoder type
     *
     * @param type: 1: AAC, 2: SPEEX, 3: PCMA
     *
     * @return {0} if successful
     */
    public native int SmartPublisherSetAudioCodecType(long handle, int type);

If AAC is used, you can also set the encoding bit rate:

    /**
     * Set audio encoder bit-rate (currently only effective for AAC encoding)
     *
     * @param kbit_rate: bit rate in kbps; if 0, the default bit rate is used; must be >= 0
     *
     * @return {0} if successful
     */
    public native int SmartPublisherSetAudioBitRate(long handle, int kbit_rate);
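
For example, the two settings might be applied together before starting the push. In this sketch, libPublisher, publisherHandle and is_pcma_ are the SDK wrapper instance, publisher handle and codec flag used elsewhere in this article, and the 64 kbps AAC bit rate is only an illustrative value:

    // 3 = PCMA (G.711 A-law), 1 = AAC, following the type values documented above
    libPublisher.SmartPublisherSetAudioCodecType(publisherHandle, is_pcma_ ? 3 : 1);

    if (!is_pcma_) {
        // Only effective for AAC; passing 0 would keep the SDK's default bit rate
        libPublisher.SmartPublisherSetAudioBitRate(publisherHandle, 64);
    }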

The Android project call is as follows:

void CheckInitAudioRecorder() {
    if (audioRecord_ == null) {
        audioRecord_ = new NTAudioRecordV2(this);
    }

    if (audioRecord_ != null) {
        Log.i(TAG, "CheckInitAudioRecorder call audioRecord_.start()+++...");

        audioRecordCallback_ = new NTAudioRecordV2CallbackImpl();

        // audioRecord_.IsMicSource(true);              // if the captured audio is too quiet, this option can be enabled
        // audioRecord_.IsRemoteSubmixSource(true);

        audioRecord_.AddCallback(audioRecordCallback_);

        // G.711 A-law (PCMA) requires 8 kHz sampling; otherwise capture at 44.1 kHz
        audioRecord_.Start(is_pcma_ ? 8000 : 44100, 1);

        Log.i(TAG, "CheckInitAudioRecorder call audioRecord_.start()---...");
    }
}

Since GB28181 involves voice broadcast and voice intercom, echo cancellation needs to be enabled, along with noise suppression and other related settings.

    /**
     * Set audio noise suppression
     *
     * @param isNS: 1: enable noise suppression, 0: disable
     *
     * @return {0} if successful
     */
    public native int SmartPublisherSetNoiseSuppression(long handle, int isNS);


    /**
     * Set audio automatic gain control (AGC)
     *
     * @param isAGC: 1: enable AGC, 0: disable
     *
     * @return {0} if successful
     */
    public native int SmartPublisherSetAGC(long handle, int isAGC);


    /**
     * Set audio echo cancellation
     *
     * @param isCancel: 1: enable echo cancellation, 0: disable
     *
     * @param delay: echo delay (ms); if 0, the SDK estimates the delay automatically
     *
     * @return {0} if successful
     */
    public native int SmartPublisherSetEchoCancellation(long handle, int isCancel, int delay);
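
In a voice intercom scenario these switches would typically be enabled together before starting capture. A short sketch (whether AGC is needed depends on the device; passing 0 for delay selects automatic delay estimation):

    // Enable echo cancellation and let the SDK estimate the echo delay (delay = 0)
    libPublisher.SmartPublisherSetEchoCancellation(publisherHandle, 1, 0);

    // Enable noise suppression; AGC is optional and device-dependent
    libPublisher.SmartPublisherSetNoiseSuppression(publisherHandle, 1);
    libPublisher.SmartPublisherSetAGC(publisherHandle, 1);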

If you need to adjust the capture-side audio volume, you can use the following interface:

    /**
     * Set the input (capture-side) volume. Calling this interface is generally not recommended;
     * it may be needed in some special cases, and amplifying the volume is usually not advised.
     *
     * @param index: usually 0 or 1; use only 0 when there is no audio mixing, otherwise 0 and 1 set the volume of the two inputs separately
     *
     * @param volume: volume, default 1.0, range [0.0, 5.0]; 0 mutes, 1 leaves the volume unchanged
     *
     * @return {0} if successful
     */
    public native int SmartPublisherSetInputAudioVolume(long handle, int index, float volume);
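
For example, to attenuate the primary capture input to half volume (a sketch; index 0 is the non-mixed input described above):

    // volume 1.0 keeps the level unchanged, values below 1.0 attenuate, 0 mutes
    libPublisher.SmartPublisherSetInputAudioVolume(publisherHandle, 0, 0.5f);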

The interfaces for delivering audio data to the SDK before encoding are as follows:

    /**
     * Deliver PCM audio data to the SDK; call once every 10 ms of audio data
     *
     *  @param pcmdata: PCM data; must be allocated with ByteBuffer.allocateDirect, i.e. ByteBuffer.isDirect() must be true
     *  @param size: PCM data size
     *  @param sample_rate: sample rate; currently only {44100, 8000, 16000, 24000, 32000, 48000} are supported, 44100 is recommended
     *  @param channel: channels; currently mono (1) and stereo (2) are supported, mono (1) is recommended
     *  @param per_channel_sample_number: pass sample_rate / 100
     */
    public native int SmartPublisherOnPCMData(long handle, ByteBuffer pcmdata, int size, int sample_rate, int channel, int per_channel_sample_number);


    /**
     * Deliver PCM audio data to the SDK; call once every 10 ms of audio data
     *
     *  @param pcmdata: PCM data; must be allocated with ByteBuffer.allocateDirect, i.e. ByteBuffer.isDirect() must be true
     *  @param offset: offset into pcmdata
     *  @param size: PCM data size
     *  @param sample_rate: sample rate; currently only {44100, 8000, 16000, 24000, 32000, 48000} are supported, 44100 is recommended
     *  @param channel: channels; currently mono (1) and stereo (2) are supported, mono (1) is recommended
     *  @param per_channel_sample_number: pass sample_rate / 100
     */
    public native int SmartPublisherOnPCMDataV2(long handle, ByteBuffer pcmdata, int offset, int size, int sample_rate, int channel, int per_channel_sample_number);


    /**
     * Deliver PCM audio data to the SDK; call once every 10 ms of audio data
     *
     *  @param pcm_short_array: PCM data; shorts are in native endian order
     *  @param offset: array offset
     *  @param len: number of array elements
     *  @param sample_rate: sample rate; currently only {44100, 8000, 16000, 24000, 32000, 48000} are supported, 44100 is recommended
     *  @param channel: channels; currently mono (1) and stereo (2) are supported, mono (1) is recommended
     *  @param per_channel_sample_number: pass sample_rate / 100
     */
    public native int SmartPublisherOnPCMShortArray(long handle, short[] pcm_short_array, int offset, int len, int sample_rate, int channel, int per_channel_sample_number);


    /**
     * Deliver PCM audio data to the SDK; call once every 10 ms of audio data
     *
     *  @param pcm_float_array: PCM data
     *  @param offset: array offset
     *  @param len: number of array elements
     *  @param sample_rate: sample rate; currently only {44100, 8000, 16000, 24000, 32000, 48000} are supported, 44100 is recommended
     *  @param channel: channels; currently mono (1) and stereo (2) are supported, mono (1) is recommended
     *  @param per_channel_sample_number: pass sample_rate / 100
     */
    public native int SmartPublisherOnPCMFloatArray(long handle, float[] pcm_float_array, int offset, int len, int sample_rate, int channel, int per_channel_sample_number);


    /**
     * See SmartPublisherOnPCMFloatArray
     */
    public native int SmartPublisherOnPCMFloatNative(long handle, long pcm_float_data, int offset, int len, int sample_rate, int channel, int per_channel_sample_number);

    /**
     * Set far end pcm data
     *
     * @param pcmdata : 16-bit PCM data
     * @param sampleRate: audio sample rate
     * @param channel: audio channel
     * @param per_channel_sample_number: samples per channel
     * @param is_low_latency: 0: not low latency, 1: low latency
     * @return {0} if successful
     */
    public native int SmartPublisherOnFarEndPCMData(long handle,  ByteBuffer pcmdata, int sampleRate, int channel, int per_channel_sample_number, int is_low_latency);
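
During voice broadcast or intercom, the audio that is about to be played out can be fed in as the far-end reference so the echo canceller can subtract it from the microphone signal. A hedged sketch, where playbackPcm is assumed to be a direct ByteBuffer holding 10 ms of the 16-bit PCM being played and the sample rate is illustrative:

    int farEndSampleRate = 8000;                // must match the played-out audio (illustrative)
    int farEndChannels = 1;
    int farEndSamples = farEndSampleRate / 100; // 10 ms per channel
    libPublisher.SmartPublisherOnFarEndPCMData(publisherHandle, playbackPcm,
            farEndSampleRate, farEndChannels, farEndSamples, 0 /* not low latency */);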

For audio data that has already been encoded externally, the following interface can be used to deliver it:

    /**
     * Deliver encoded audio data (AAC/PCMA/PCMU/SPEEX)
     *
     * @param codec_id:
     *
     *  NT_MEDIA_CODEC_ID_AUDIO_BASE = 0x10000,
     *  NT_MEDIA_CODEC_ID_PCMA = NT_MEDIA_CODEC_ID_AUDIO_BASE,
     *  NT_MEDIA_CODEC_ID_PCMU,
     *  NT_MEDIA_CODEC_ID_AAC,
     *  NT_MEDIA_CODEC_ID_SPEEX,
     *  NT_MEDIA_CODEC_ID_SPEEX_NB,
     *  NT_MEDIA_CODEC_ID_SPEEX_WB,
     *  NT_MEDIA_CODEC_ID_SPEEX_UWB,
     *
     * @param data audio data
     *
     * @param offset offset into data
     *
     * @param size data length
     *
     * @param is_key_frame whether this is a key frame: 1 for a key frame, 0 otherwise; ignored for audio
     *
     * @param timestamp timestamp
     *
     * @param parameter_info used to pass the AAC specific config information
     *
     * @param parameter_info_size parameter info size
     *
     * @param sample_rate sample rate; a correct value must be passed if recording is needed
     *
     * @param channels channel count; a correct value must be passed if recording is needed, usually 1 or 2
     *
     * @return {0} if successful
     */
    public native int SmartPublisherPostAudioEncodedData(long handle, int codec_id,
                                                           ByteBuffer data, int offset, int size,
                                                           int is_key_frame, long timestamp,
                                                           byte[] parameter_info, int parameter_info_size,
                                                           int sample_rate, int channels);
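
For example, delivering one externally encoded AAC frame might look like the sketch below. Here aacFrame, aacFrameSize, aacSpecificConfig (the AudioSpecificConfig bytes) and timestampMs are hypothetical variables supplied by your own encoder; the codec ID is derived from the enumeration in the comment above:

    final int NT_MEDIA_CODEC_ID_AUDIO_BASE = 0x10000;
    final int NT_MEDIA_CODEC_ID_AAC = NT_MEDIA_CODEC_ID_AUDIO_BASE + 2;  // PCMA, PCMU, then AAC

    // aacFrame: direct ByteBuffer containing one encoded AAC frame of aacFrameSize bytes
    libPublisher.SmartPublisherPostAudioEncodedData(publisherHandle, NT_MEDIA_CODEC_ID_AAC,
            aacFrame, 0, aacFrameSize,
            0 /* key-frame flag, ignored for audio */, timestampMs,
            aacSpecificConfig, aacSpecificConfig.length,
            44100, 1);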

An example of delivering the captured PCM data:

class NTAudioRecordV2CallbackImpl implements NTAudioRecordV2Callback {
  @Override
  public void onNTAudioRecordV2Frame(ByteBuffer data, int size, int sampleRate, int channel, int per_channel_sample_number) {
    /*
    Log.i(TAG, "onNTAudioRecordV2Frame size=" + size + " sampleRate=" + sampleRate + " channel=" + channel
            + " per_channel_sample_number=" + per_channel_sample_number);
    */

    if (publisherHandle != 0) {
      libPublisher.SmartPublisherOnPCMData(publisherHandle, data, size, sampleRate, channel, per_channel_sample_number);
    }
  }
}

Stop audio capture:

if (audioRecord_ != null) {
  Log.i(TAG, "stopPush, call audioRecord_.StopRecording..");

  audioRecord_.Stop();

  if (audioRecordCallback_ != null) {
    audioRecord_.RemoveCallback(audioRecordCallback_);
    audioRecordCallback_ = null;
  }

  audioRecord_ = null;
}

Summary

On the GB28181 device access side, G.711 A-law (PCMA) or AAC encoding is generally used. The audio data may be captured directly through AudioRecord, or it may be audio data that has already been encoded externally; which path to use can be chosen according to the scenario.

Origin: blog.csdn.net/renhui1112/article/details/131775649