Android audio architecture and technology selection

1. Android audio architecture

The Android system provides four levels of audio API:

1. Java layer: the MediaRecorder & MediaPlayer series;
2. Java layer: the AudioTrack & AudioRecord series;
3. JNI layer: OpenSL ES;
4. JNI layer: AAudio (introduced in Android O).

Let’s start with this classic Android system architecture diagram:
[Figure: Android system architecture diagram]
From the diagram, the entire Android system is divided into the following four layers from bottom to top:
1. Linux Kernel
2. Hardware abstraction layer (HAL)
3. Framework layer (can be divided into Java layer and C++ layer)
4. APP layer
The four levels of audio APIs introduced above are all implemented in the Framework layer. So what audio-related functions do the other layers provide, and how does a call to one of these APIs ultimately drive the hardware? Let's walk through the audio-related modules and responsibilities of each layer of the system.
[Figure: audio-related modules in each layer of the Android system]
1.1 Java layer
The Java layer provides the android.media API to interact with audio hardware. Internally, this code calls the appropriate JNI class to access the native code that interacts with the audio hardware.

- Source code directory: frameworks/base/media/java/android/media/

  • AudioManager: the audio manager, covering volume management, AudioFocus management, audio device management, and mode management (a short usage sketch follows this list);
  • Recording: AudioRecord, MediaRecorder;
  • Playback: AudioTrack, MediaPlayer, SoundPool, ToneGenerator;
  • Codec: MediaCodec, the interface for encoding and decoding audio and video data.
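
To make these entry points concrete, here is a minimal Java sketch of typical AudioManager usage: reading the music-stream volume and requesting audio focus before playback. The class and method names (VolumeAndFocusHelper, logMusicVolume, requestFocus) are illustrative and not part of any framework API.

```java
import android.content.Context;
import android.media.AudioAttributes;
import android.media.AudioFocusRequest;
import android.media.AudioManager;
import android.os.Build;
import android.util.Log;

// Illustrative helper (hypothetical class name) showing typical AudioManager calls.
public class VolumeAndFocusHelper {

    private final AudioManager audioManager;

    public VolumeAndFocusHelper(Context context) {
        // AudioManager is obtained as a system service; its calls go through
        // JNI and Binder IPC to the audio services described in later sections.
        audioManager = (AudioManager) context.getSystemService(Context.AUDIO_SERVICE);
    }

    /** Reads the current and maximum volume of the music stream. */
    public void logMusicVolume() {
        int current = audioManager.getStreamVolume(AudioManager.STREAM_MUSIC);
        int max = audioManager.getStreamMaxVolume(AudioManager.STREAM_MUSIC);
        Log.d("VolumeAndFocusHelper", "music volume: " + current + "/" + max);
    }

    /** Requests transient audio focus before playback; returns true if granted. */
    public boolean requestFocus(AudioManager.OnAudioFocusChangeListener listener) {
        if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.O) {
            // API 26+ path: AudioFocusRequest carries the AudioAttributes of the playback.
            AudioFocusRequest request =
                    new AudioFocusRequest.Builder(AudioManager.AUDIOFOCUS_GAIN_TRANSIENT)
                            .setAudioAttributes(new AudioAttributes.Builder()
                                    .setUsage(AudioAttributes.USAGE_MEDIA)
                                    .setContentType(AudioAttributes.CONTENT_TYPE_MUSIC)
                                    .build())
                            .setOnAudioFocusChangeListener(listener)
                            .build();
            return audioManager.requestAudioFocus(request)
                    == AudioManager.AUDIOFOCUS_REQUEST_GRANTED;
        } else {
            // Legacy (pre-O) path, kept for completeness.
            return audioManager.requestAudioFocus(listener,
                    AudioManager.STREAM_MUSIC, AudioManager.AUDIOFOCUS_GAIN_TRANSIENT)
                    == AudioManager.AUDIOFOCUS_REQUEST_GRANTED;
        }
    }
}
```

Every call here ends up going through the JNI layer and Binder IPC to the audio services discussed in the sections below.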

1.2 JNI layer
The JNI code associated with android.media can call lower-level native code to access the audio hardware. JNI is located in frameworks/base/core/jni/ and frameworks/base/media/jni.
The AAudio and OpenSL ES interfaces are called here.

1.3 Native framework layer
Both the Java layer and the JNI layer only provide interfaces to the outside world; the real implementation lives in the native framework layer. The native framework provides a native equivalent of the android.media package and calls Binder IPC proxies to access the audio-specific services of the media server. The native framework code is located in frameworks/av/media/libmedia or frameworks/av/media/libaudioclient (the location varies across versions).

1.4 Binder IPC
The Binder IPC proxies facilitate communication across process boundaries. They are located in frameworks/av/media/libmedia or frameworks/av/media/libaudioclient, and their names start with the letter "I".

1.5 Audio Server
The audio system is responsible for audio data streaming and control in Android, as well as for the management of audio devices. It serves as the input/output layer of the Android audio system: it plays back PCM output, captures PCM input, and manages sound devices and settings. (Note: decoding is not implemented here. In Android, audio and video decoding is handled by OpenCore or Stagefright; the audio system interfaces are called after decoding to create and play the audio stream.) Before Android N (7.0) the audio services lived inside mediaserver; starting with Android N they run as a separate audioserver process. These audio services are the actual code that interacts with the HAL implementation. The servers are located in frameworks/av/services/audioflinger and frameworks/av/services/audiopolicy.

The audio server includes AudioFlinger and AudioPolicyService:

  • AudioFlinger: mainly responsible for managing audio stream devices and for processing and transporting audio stream data, including volume computation, resampling, mixing, sound effects, and so on.
  • AudioPolicyService: mainly responsible for audio policy, including volume adjustment, output device selection, audio routing selection, and so on (a device-listing sketch follows this list).
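
The policy decisions themselves are made inside AudioPolicyService, but their results are visible to applications through android.media. As a loosely related sketch (the class name OutputDeviceLister is hypothetical), the following Java snippet lists the output devices currently available for routing, via AudioManager.getDevices() (API 23+):

```java
import android.content.Context;
import android.media.AudioDeviceInfo;
import android.media.AudioManager;
import android.util.Log;

// Hypothetical helper: logs the output devices the system currently exposes.
public class OutputDeviceLister {

    public static void logOutputDevices(Context context) {
        AudioManager am = (AudioManager) context.getSystemService(Context.AUDIO_SERVICE);
        // GET_DEVICES_OUTPUTS requires API 23+; routing among these devices is
        // ultimately decided by AudioPolicyService on the server side.
        AudioDeviceInfo[] devices = am.getDevices(AudioManager.GET_DEVICES_OUTPUTS);
        for (AudioDeviceInfo device : devices) {
            Log.d("OutputDeviceLister",
                    "type=" + device.getType() + " product=" + device.getProductName());
        }
    }
}
```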

1.6 HAL layer
The HAL defines the standard interface that the audio services call and that device vendors must implement to ensure the audio hardware functions correctly. The audio HAL interfaces are located in hardware/libhardware/include/hardware; see audio.h for details.

1.7 Kernel Driver Layer
The audio driver interacts with the hardware and with the HAL implementation. We can use the Advanced Linux Sound Architecture (ALSA), the Open Sound System (OSS), or a custom driver (the HAL is driver-agnostic).

Note: If you are using ALSA, it is recommended to use external/tinyalsa for the user part of the driver, as it has a compatible license (the standard user-mode library is licensed under the GPL).

2. Technology selection and its advantages and disadvantages

The Android system provides developers with several audio rendering methods at the SDK and NDK layers. Each method is designed for different scenarios, so you need to understand the best practices of each one in order to use them well in development.

Audio rendering at the SDK layer

  • MediaPlayer: suitable for playing local music files or online streaming media in the background for long periods. It is essentially an end-to-end player that can handle both audio and video; its level of encapsulation is high and it is simple to use.
  • SoundPool: also an end-to-end audio player. Its advantage is low latency, which makes it well suited to interactive feedback sounds and to relatively short clips such as game sounds, button sounds, and ringtone fragments; it can play multiple sounds at the same time (a sketch follows this list).
  • AudioTrack: an audio rendering API that works directly with PCM data. It is a lower-level API that offers very strong control and is suitable for scenarios such as low-latency playback and streaming audio rendering. Because it renders raw PCM directly, it generally needs to be used together with a decoder.
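
To make the SoundPool case concrete, here is a minimal Java sketch (the class name ClickSoundPlayer and the raw resource ID are illustrative) that preloads a short clip and plays it with low latency, allowing several overlapping streams:

```java
import android.content.Context;
import android.media.AudioAttributes;
import android.media.SoundPool;

// Minimal sketch of SoundPool for short, low-latency UI/game sounds.
public class ClickSoundPlayer {

    private final SoundPool soundPool;
    private final int clickSoundId;
    private volatile boolean loaded;

    public ClickSoundPlayer(Context context, int rawResId) {  // rawResId: e.g. a hypothetical R.raw.click
        soundPool = new SoundPool.Builder()
                .setMaxStreams(4)  // up to 4 overlapping sounds
                .setAudioAttributes(new AudioAttributes.Builder()
                        .setUsage(AudioAttributes.USAGE_GAME)
                        .setContentType(AudioAttributes.CONTENT_TYPE_SONIFICATION)
                        .build())
                .build();
        // Loading is asynchronous; wait for the callback before playing.
        soundPool.setOnLoadCompleteListener((pool, sampleId, status) -> loaded = (status == 0));
        clickSoundId = soundPool.load(context, rawResId, /* priority */ 1);
    }

    public void playClick() {
        if (loaded) {
            // leftVolume, rightVolume, priority, loop (0 = no loop), rate (1.0 = normal speed)
            soundPool.play(clickSoundId, 1f, 1f, 1, 0, 1f);
        }
    }

    public void release() {
        soundPool.release();
    }
}
```

Keeping the SoundPool instance alive for the lifetime of the screen (and releasing it when the screen goes away) avoids reloading the samples before every play.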

Audio rendering at the NDK layer
The Android system provides two commonly used audio rendering methods at the NDK layer (APIs provided by the native layer, i.e. callable from C or C++): OpenSL ES and AAudio. Both are designed for Android's low-latency scenarios (real-time ear monitoring, RTC, real-time interactive feedback). Let's look at them in turn.

  • OpenSL ES: an implementation of the Khronos Group's OpenSL ES API specification, dedicated to low-latency, high-performance audio scenarios on Android. Its API design is somewhat obscure and complex, and Google no longer recommends it for new application development. However, it has better compatibility on Android 8.0 and older systems and on some fragmented Android devices, so mastering this rendering method is still important.
  • AAudio: designed specifically for low-latency, high-performance audio applications, with a streamlined API. It is the interface Google recommends for building audio in new applications, and mastering it is very useful for adding this rendering capability to existing applications. However, it is only available on Android 8.0 and above, and it does not adapt especially well to some vendors' customized ROM versions.

The following points need attention:

  1. The two sets of audio rendering methods in the NDK layer are suitable for different Android versions and can be applied in different scenarios.
  2. The most common method of rendering PCM is AudioTrack.
  3. Since AudioTrack is the lowest-level audio playback API provided by the Android SDK, it only accepts raw PCM data. Compared with MediaPlayer, playing a compressed audio file (such as MP3 or AAC) with AudioTrack requires the developer to implement the decoding and buffer control themselves (a sketch follows this list).
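
The following sketch illustrates point 3: a simplified, synchronous decode-and-play loop that uses MediaExtractor and MediaCodec to turn a compressed file into PCM and then renders it with AudioTrack. It is a rough sketch under simplifying assumptions (16-bit PCM output, no error handling, no handling of output-format changes), not production code; the class name DecodeAndPlay is illustrative.

```java
import android.media.AudioFormat;
import android.media.AudioTrack;
import android.media.MediaCodec;
import android.media.MediaExtractor;
import android.media.MediaFormat;

import java.io.IOException;
import java.nio.ByteBuffer;

// Rough sketch: decode a compressed audio file with MediaCodec and render the PCM with AudioTrack.
public class DecodeAndPlay {

    public void play(String filePath) throws IOException {
        // 1. Find the first audio track in the file.
        MediaExtractor extractor = new MediaExtractor();
        extractor.setDataSource(filePath);
        MediaFormat format = null;
        String mime = null;
        for (int i = 0; i < extractor.getTrackCount(); i++) {
            MediaFormat f = extractor.getTrackFormat(i);
            String m = f.getString(MediaFormat.KEY_MIME);
            if (m != null && m.startsWith("audio/")) {
                extractor.selectTrack(i);
                format = f;
                mime = m;
                break;
            }
        }
        if (format == null) throw new IOException("no audio track found");

        int sampleRate = format.getInteger(MediaFormat.KEY_SAMPLE_RATE);
        int channelCount = format.getInteger(MediaFormat.KEY_CHANNEL_COUNT);
        int channelMask = (channelCount == 1)
                ? AudioFormat.CHANNEL_OUT_MONO : AudioFormat.CHANNEL_OUT_STEREO;

        // 2. Configure the decoder and the PCM sink.
        MediaCodec codec = MediaCodec.createDecoderByType(mime);
        codec.configure(format, null, null, 0);
        codec.start();

        int minBuf = AudioTrack.getMinBufferSize(sampleRate, channelMask, AudioFormat.ENCODING_PCM_16BIT);
        AudioTrack track = new AudioTrack.Builder()
                .setAudioFormat(new AudioFormat.Builder()
                        .setSampleRate(sampleRate)
                        .setChannelMask(channelMask)
                        .setEncoding(AudioFormat.ENCODING_PCM_16BIT)
                        .build())
                .setTransferMode(AudioTrack.MODE_STREAM)
                .setBufferSizeInBytes(minBuf * 2)
                .build();
        track.play();

        // 3. Synchronous decode loop: feed compressed samples in, write decoded PCM out.
        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        boolean inputDone = false;
        boolean outputDone = false;
        while (!outputDone) {
            if (!inputDone) {
                int inIndex = codec.dequeueInputBuffer(10_000);
                if (inIndex >= 0) {
                    ByteBuffer inBuf = codec.getInputBuffer(inIndex);
                    int size = extractor.readSampleData(inBuf, 0);
                    if (size < 0) {
                        codec.queueInputBuffer(inIndex, 0, 0, 0, MediaCodec.BUFFER_FLAG_END_OF_STREAM);
                        inputDone = true;
                    } else {
                        codec.queueInputBuffer(inIndex, 0, size, extractor.getSampleTime(), 0);
                        extractor.advance();
                    }
                }
            }
            int outIndex = codec.dequeueOutputBuffer(info, 10_000);
            if (outIndex >= 0) {
                ByteBuffer outBuf = codec.getOutputBuffer(outIndex);
                byte[] pcm = new byte[info.size];
                outBuf.get(pcm);
                track.write(pcm, 0, pcm.length);
                codec.releaseOutputBuffer(outIndex, false);
                if ((info.flags & MediaCodec.BUFFER_FLAG_END_OF_STREAM) != 0) outputDone = true;
            }
        }

        // 4. Release everything once the end of the stream is reached.
        track.stop();
        track.release();
        codec.stop();
        codec.release();
        extractor.release();
    }
}
```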
