Exploring simultaneous monitoring and recording on Android

This article assumes some basic audio knowledge. If you are not familiar with it, it is recommended to read the previous two articles first.

  1. Audio basics for beginners
  2. An introduction to a clever mixing method on Android

Scenario

A single voice or instrument in a piece of music often sounds thin, so we frequently want to layer different sounds together. When recording, however, the tracks must be strictly synchronized: the time difference between the two sounds has to stay within the limits of human hearing to get the result we want. Another point: to keep the playback from bleeding into the vocal or instrument track, the performer usually wears headphones and monitors while recording.

To make two or more sounds fit together well, besides synchronizing recording and playback, we also have to account for the latency of sound traveling from the phone to the headphones. Beyond professional music software, everyday karaoke (K-song) apps inevitably run into this problem too.

A silver lining: MediaSyncEvent?

Conclusion first: it does not solve the problem.

Naturally I started with the SDK, and found the method startRecording(MediaSyncEvent syncEvent) on AudioRecord. Reading its documentation felt like seeing a light in the dark.

The MediaSyncEvent class defines events that can be used to synchronize playback or capture actions between different players and recorders.

However, there is very little information on how to use it; there is a Stack Overflow question about it with zero answers. After a long search on Google, I finally found an example in the official CTS (Compatibility Test Suite): the testSynchronizedRecord method of AudioRecordTest. Incidentally, these unit tests are excellent official learning material; if you cannot find an answer elsewhere, it is worth looking there.

Having studied testSynchronizedRecord, let's come back and see what MediaSyncEvent is actually for.

A MediaSyncEvent is constructed via MediaSyncEvent.createEvent(), which supports two event types:

    /**
     * No sync event specified. When used with a synchronized playback or capture method, the
     * behavior is equivalent to calling the corresponding non synchronized method.
     */
    public static final int SYNC_EVENT_NONE = AudioSystem.SYNC_EVENT_NONE;

    /**
     * The corresponding action is triggered only when the presentation is completed
     * (meaning the media has been presented to the user) on the specified session.
     * A synchronization of this type requires a source audio session ID to be set via
     * {@link #setAudioSessionId(int)} method.
     */
    public static final int SYNC_EVENT_PRESENTATION_COMPLETE = AudioSystem.SYNC_EVENT_PRESENTATION_COMPLETE;

In practice only one of them matters: SYNC_EVENT_NONE is equivalent to having no synchronization event at all, and the conventional AudioRecord.startRecording() uses it. From the AudioRecordTest.testSynchronizedRecord test case we can see that SYNC_EVENT_PRESENTATION_COMPLETE actually triggers the AudioRecord capture at the moment the AudioTrack finishes playing, which is clearly not what we need. I am not sure which scenarios call for this parameter; if you have an idea, please leave me a message.

CyclicBarrier to the rescue

With that road closed, we need to find another way. Before a race, the athletes line up on the same starting line and set off together at the signal. Here, likewise, we need AudioTrack and AudioRecord to wait on the same starting line, then set off together and go their separate ways. In the Java world, CyclicBarrier is well suited to this.

// a CyclicBarrier shared by the play and record threads
CyclicBarrier recordBarrier = new CyclicBarrier(2);

AudioTrack audioTrack;
AudioRecord audioRecord;

// called on the UI thread
public void start() {
    recordBarrier.reset();
    audioTrack.play();
    audioRecord.startRecording();
    new RecordThread().start();
    new PlayThread().start();
}

class RecordThread extends Thread {
    public void run() {
        try {
            // wait until the play thread is about to write, then read
            recordBarrier.await();
        } catch (Exception e) {
            return;
        }
        // read loop: audioRecord.read(buffer, 0, buffer.length) ...
    }
}

class PlayThread extends Thread {
    public void run() {
        try {
            // wait until the record thread is about to read, then write
            recordBarrier.await();
        } catch (Exception e) {
            return;
        }
        // write loop: audioTrack.write(buffer, 0, buffer.length) ...
    }
}

With CyclicBarrier putting AudioTrack's write and AudioRecord's read on the same starting line, the problem seems solved, but it is not: even though write starts sending data toward the headphones, it takes a while before the headphones receive the signal and actually produce sound.
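Outside of Android, the barrier behavior itself is easy to verify in plain Java (this demo is my own illustration, not from the original code): two threads blocked on the same CyclicBarrier are released essentially together, so any skew measured below is only thread scheduling, not audio latency.

```java
import java.util.concurrent.CyclicBarrier;

public class BarrierSkewDemo {

    /** How far apart, in milliseconds, two threads cross the same barrier. */
    static long measureSkewMs() throws InterruptedException {
        CyclicBarrier barrier = new CyclicBarrier(2);
        long[] crossedAt = new long[2];

        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int idx = i;
            threads[i] = new Thread(() -> {
                try {
                    barrier.await();                // both threads block here
                } catch (Exception e) {
                    return;
                }
                crossedAt[idx] = System.nanoTime(); // ...and are released together
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();

        return Math.abs(crossedAt[0] - crossedAt[1]) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("barrier skew in ms: " + measureSkewMs());
    }
}
```

On a typical machine the skew is well under a millisecond, which is exactly why the remaining delay has to come from the audio path, not from thread start order.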

Dealing with recording delay issues

Let's go back to the user's real usage scenario to see where the problem comes from.

recording delay

The playback source is the ground truth. A block of accompaniment data at, say, the 1 ms mark may take on the order of 100 ms from the moment AudioTrack starts writing it to the moment the headphones play it, and only then does the user start singing along. On top of that, there is a delay on the capture side, from the device picking up the sound to the data landing in the buffer. With Bluetooth headphones the latency problem is even more pronounced.

Let's listen to the delay. The recording was made in a cafe and is rather noisy, but it is not hard to hear that the recorded track lags behind the original.

<audio controls="controls" height="40" width="100"> <source src="http://ohb4y25jk.bkt.clouddn.com/audio-sync-test-1.mp3" /> </audio>

Take a look at the waveform:

Waveform showing the delay

Solution

Once started, recording and playback run in parallel on the same timeline. From the nature of the delay it is not hard to derive:

Recording duration = delay duration + playback duration + extra duration (free recording after playback ends)

As long as we know the delay duration, we simply cut that much data off the beginning of what AudioRecord captured. So how do we find out how many bytes to cut? Here is a simple trick I came up with, shared for reference.
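Converting a delay in milliseconds into a byte count depends on the PCM format. A minimal sketch (my own illustration, assuming raw PCM with a known sample rate, channel count, and bytes per sample):

```java
import java.util.Arrays;

public class DelayTrim {

    /** Number of bytes that delayMs of PCM audio occupies. */
    static int delayBytes(int sampleRate, int channels, int bytesPerSample, int delayMs) {
        return sampleRate * channels * bytesPerSample * delayMs / 1000;
    }

    /** Drop the first delayMs worth of recorded PCM so it lines up with playback. */
    static byte[] trimLeadingDelay(byte[] recorded, int sampleRate, int channels,
                                   int bytesPerSample, int delayMs) {
        int offset = Math.min(delayBytes(sampleRate, channels, bytesPerSample, delayMs),
                              recorded.length);
        return Arrays.copyOfRange(recorded, offset, recorded.length);
    }

    public static void main(String[] args) {
        // 100 ms at 44.1 kHz, 16-bit (2 bytes per sample), stereo:
        // 44100 * 2 * 2 * 100 / 1000 = 17640 bytes
        System.out.println(delayBytes(44100, 2, 2, 100)); // -> 17640
    }
}
```

The hard part, of course, is measuring delayMs in the first place, which is what the metronome trick below is for.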

From the metronome waveform above, each peak corresponds to a beat, and the offset between the peaks on the recording track and on the metronome track is exactly the delay duration we want. Based on this, we can design a way to measure it:

  1. Have the user wear headphones and record along with a fixed-tempo metronome (beats at a known interval); they just sing "la.. la.. la" on the beat.
  2. Compare the recorded data against the original metronome track. I take 8 peak positions, one per beat; if they agree with each other to within a small error range, the measurement is considered valid.

The algorithm is roughly as follows:

// ANALYZE_BEAT_LEN = 8
int[] maxPositions = new int[ANALYZE_BEAT_LEN];
for (int i = 0; i != maxPositions.length; i++) {
    byte[] segBytes = getSegBytes();               // data for one beat
    maxPositions[i] = getMaxSamplePos(segBytes);   // approximate peak position within the beat
}

// sort ascending
Arrays.sort(maxPositions);

// take the middle half of the values; if they agree with their mean
// to within 10 ms, consider the measurement valid
int sampleTotalValue = 0;
int sampleLen = ANALYZE_BEAT_LEN / 2;
int[] sampleValues = new int[sampleLen];

for (int beginIndex = sampleLen / 2, i = 0; i != sampleLen; i++) {
    sampleValues[i] = maxPositions[i + beginIndex];
    sampleTotalValue += sampleValues[i];
}

int averSampleValue = sampleTotalValue / sampleLen;

boolean isValid = true;
for (int sampleValue : sampleValues) {
    // errorRangeByteLen: the byte length of 10 ms of audio
    if (Math.abs(averSampleValue - sampleValue) > errorRangeByteLen) {
        isValid = false;
    }
}

if (isValid) {
    stopPlay = true;
    // the measured delay, as a byte offset
    int result = averSampleValue;
}

Results

Waveform diagram:

Sound result:

<audio controls="controls" height="40" width="100"> <source src="http://ohb4y25jk.bkt.clouddn.com/audio-sync-test-2.mp3" /> </audio>

After the adjustment, things improved a lot; to the ear there is basically no delay. However, this approach is somewhat inconvenient for the user, since the calibration has to be redone whenever the headphones are changed. My personal knowledge is limited: this may be an effective method, but it is surely not best practice. I am curious how apps like Changba handle it. Feel free to share your ideas!

References

  1. Latency in wireless audio: http://www.memchina.cn/News/9733.html

  2. MediaSyncEvent test case: AudioRecordTest in the Android CTS

Technical exchange QQ group: 70948803. The group is quiet most of the time and only discusses technical topics; ads and trolls are not welcome.

If you don't play music, you can stop reading here.

A brightly colored ad break:

If you do play music: I made a music learning and recording aid, Voice Notes+, which is finally available in the app market. It is still rough, but I look forward to your support!

