Analysis of audiomixer implementation in Android

The Android framework's audio processing module library libaudioprocessing (located in frameworks/av/media/libaudioprocessing ) provides a mixer component AudioMixer, which is mainly used in audioflinger to mix multi-channel audio source data to facilitate sending to the audio device for playback. come out.

The audio mixing operation itself generally just adds up the sampling data of each audio source. However, when designing a mixer, you generally have to consider how to solve the following problems:

  • How to represent the audio source data to be mixed?
  • How to set the audio data source for an audio source to be mixed?
  • How to set the data buffer for data output after mixing, or how to obtain the data after mixing?
  • How to determine the output data format and configuration of the mixer, that is, is it set by an external user through the provided interface, or is it dynamically calculated based on the data format and configuration of each audio source?
  • When the data format and configuration of the audio source are different from the output data format and configuration, how to convert the data format and configuration?
  • How to add or create an audio source to the mixer?
  • How to modify the parameters of an audio source of the mixer?
  • How to delete an audio source from the mixer?
  • When mixing, what should I do if the data after mixing crosses the boundary?
  • How is the mixing operation driven? Does the mixer start a thread inside to perform the mixing operation and throw the result through the callback, or does the user actively call the mixing operation?

Here we mainly look at the implementation of Android's mixer component based on the above issues AudioMixer. AudioMixerThe definition of the class (located in frameworks/av/media/libaudioprocessing/include/media/AudioMixer.h ) is as follows:

class AudioMixer : public AudioMixerBase
{
public:
    // maximum number of channels supported for the content
    static const uint32_t MAX_NUM_CHANNELS_TO_DOWNMIX = AUDIO_CHANNEL_COUNT_MAX;

    enum { // extension of AudioMixerBase parameters
        DOWNMIX_TYPE    = 0x4004,
        // for haptic
        HAPTIC_ENABLED  = 0x4007, // Set haptic data from this track should be played or not.
        HAPTIC_INTENSITY = 0x4008, // Set the intensity to play haptic data.
        HAPTIC_MAX_AMPLITUDE = 0x4009, // Set the max amplitude allowed for haptic data.
        // for target TIMESTRETCH
        PLAYBACK_RATE   = 0x4300, // Configure timestretch on this track name;
                                  // parameter 'value' is a pointer to the new playback rate.
    };

    AudioMixer(size_t frameCount, uint32_t sampleRate)
            : AudioMixerBase(frameCount, sampleRate) {
        pthread_once(&sOnceControl, &sInitRoutine);
    }

    bool isValidChannelMask(audio_channel_mask_t channelMask) const override;

    void setParameter(int name, int target, int param, void *value) override;
    void setBufferProvider(int name, AudioBufferProvider* bufferProvider);

private:
 . . . . . .
    inline std::shared_ptr<Track> getTrack(int name) {
        return std::static_pointer_cast<Track>(mTracks[name]);
    }

    std::shared_ptr<TrackBase> preCreateTrack() override;
    status_t postCreateTrack(TrackBase *track) override;

    void preProcess() override;
    void postProcess() override;

    bool setChannelMasks(int name,
            audio_channel_mask_t trackChannelMask, audio_channel_mask_t mixerChannelMask) override;

    static void sInitRoutine();

    static pthread_once_t sOnceControl; // initialized in constructor by first new
};

AudioMixerClass inherits from class AudioMixerBase, which contains an internal struct struct Trackthat struct Trackinherits from AudioMixerBase::TrackBase. AudioMixerBaseThe definition of the class (located in frameworks/av/media/libaudioprocessing/include/media/AudioMixerBase.h ) is as follows:

class AudioMixerBase
{
public:
    // Do not change these unless underlying code changes.
    static constexpr uint32_t MAX_NUM_CHANNELS = FCC_LIMIT;
    static constexpr uint32_t MAX_NUM_VOLUMES = FCC_2; // stereo volume only

    static const uint16_t UNITY_GAIN_INT = 0x1000;
    static const CONSTEXPR float UNITY_GAIN_FLOAT = 1.0f;

    enum { // names
        // setParameter targets
        TRACK           = 0x3000,
        RESAMPLE        = 0x3001,
        RAMP_VOLUME     = 0x3002, // ramp to new volume
        VOLUME          = 0x3003, // don't ramp
        TIMESTRETCH     = 0x3004,

        // set Parameter names
        // for target TRACK
        CHANNEL_MASK    = 0x4000,
        FORMAT          = 0x4001,
        MAIN_BUFFER     = 0x4002,
        AUX_BUFFER      = 0x4003,
        // 0x4004 reserved
        MIXER_FORMAT    = 0x4005, // AUDIO_FORMAT_PCM_(FLOAT|16_BIT)
        MIXER_CHANNEL_MASK = 0x4006, // Channel mask for mixer output
        // for target RESAMPLE
        SAMPLE_RATE     = 0x4100, // Configure sample rate conversion on this track name;
                                  // parameter 'value' is the new sample rate in Hz.
                                  // Only creates a sample rate converter the first time that
                                  // the track sample rate is different from the mix sample rate.
                                  // If the new sample rate is the same as the mix sample rate,
                                  // and a sample rate converter already exists,
                                  // then the sample rate converter remains present but is a no-op.
        RESET           = 0x4101, // Reset sample rate converter without changing sample rate.
                                  // This clears out the resampler's input buffer.
        REMOVE          = 0x4102, // Remove the sample rate converter on this track name;
                                  // the track is restored to the mix sample rate.
        // for target RAMP_VOLUME and VOLUME (8 channels max)
        // FIXME use float for these 3 to improve the dynamic range
        VOLUME0         = 0x4200,
        VOLUME1         = 0x4201,
        AUXLEVEL        = 0x4210,
    };

    AudioMixerBase(size_t frameCount, uint32_t sampleRate)
        : mSampleRate(sampleRate)
        , mFrameCount(frameCount) {
    }

    virtual ~AudioMixerBase() {}

    virtual bool isValidFormat(audio_format_t format) const;
    virtual bool isValidChannelMask(audio_channel_mask_t channelMask) const;

    // Create a new track in the mixer.
    //
    // \param name        a unique user-provided integer associated with the track.
    //                    If name already exists, the function will abort.
    // \param channelMask output channel mask.
    // \param format      PCM format
    // \param sessionId   Session id for the track. Tracks with the same
    //                    session id will be submixed together.
    //
    // \return OK        on success.
    //         BAD_VALUE if the format does not satisfy isValidFormat()
    //                   or the channelMask does not satisfy isValidChannelMask().
    status_t    create(
            int name, audio_channel_mask_t channelMask, audio_format_t format, int sessionId);

    bool        exists(int name) const {
        return mTracks.count(name) > 0;
    }

    // Free an allocated track by name.
    void        destroy(int name);

    // Enable or disable an allocated track by name
    void        enable(int name);
    void        disable(int name);

    virtual void setParameter(int name, int target, int param, void *value);

    void        process() {
        preProcess();
        (this->*mHook)();
        postProcess();
    }

    size_t      getUnreleasedFrames(int name) const;

    std::string trackNames() const;

  protected:
    // Set kUseNewMixer to true to use the new mixer engine always. Otherwise the
    // original code will be used for stereo sinks, the new mixer for everything else.
    static constexpr bool kUseNewMixer = true;

    // Set kUseFloat to true to allow floating input into the mixer engine.
    // If kUseNewMixer is false, this is ignored or may be overridden internally
    static constexpr bool kUseFloat = true;

#ifdef FLOAT_AUX
    using TYPE_AUX = float;
    static_assert(kUseNewMixer && kUseFloat,
            "kUseNewMixer and kUseFloat must be true for FLOAT_AUX option");
#else
    using TYPE_AUX = int32_t; // q4.27
#endif

    /* For multi-format functions (calls template functions
     * in AudioMixerOps.h).  The template parameters are as follows:
     *
     *   MIXTYPE     (see AudioMixerOps.h MIXTYPE_* enumeration)
     *   USEFLOATVOL (set to true if float volume is used)
     *   ADJUSTVOL   (set to true if volume ramp parameters needs adjustment afterwards)
     *   TO: int32_t (Q4.27) or float
     *   TI: int32_t (Q4.27) or int16_t (Q0.15) or float
     *   TA: int32_t (Q4.27)
     */

    enum {
        // FIXME this representation permits up to 8 channels
        NEEDS_CHANNEL_COUNT__MASK   = 0x00000007,
    };

    enum {
        NEEDS_CHANNEL_1             = 0x00000000,   // mono
        NEEDS_CHANNEL_2             = 0x00000001,   // stereo

        // sample format is not explicitly specified, and is assumed to be AUDIO_FORMAT_PCM_16_BIT

        NEEDS_MUTE                  = 0x00000100,
        NEEDS_RESAMPLE              = 0x00001000,
        NEEDS_AUX                   = 0x00010000,
    };

    // hook types
    enum {
        PROCESSTYPE_NORESAMPLEONETRACK, // others set elsewhere
    };

    enum {
        TRACKTYPE_NOP,
        TRACKTYPE_RESAMPLE,
        TRACKTYPE_RESAMPLEMONO,
        TRACKTYPE_RESAMPLESTEREO,
        TRACKTYPE_NORESAMPLE,
        TRACKTYPE_NORESAMPLEMONO,
        TRACKTYPE_NORESAMPLESTEREO,
    };

    // process hook functionality
    using process_hook_t = void(AudioMixerBase::*)();

    static bool isAudioChannelPositionMask(audio_channel_mask_t channelMask) {
        return audio_channel_mask_get_representation(channelMask)
                == AUDIO_CHANNEL_REPRESENTATION_POSITION;
    }

    struct TrackBase;
    using hook_t = void(TrackBase::*)(
            int32_t* output, size_t numOutFrames, int32_t* temp, int32_t* aux);

    struct TrackBase {
        TrackBase()
            : bufferProvider(nullptr)
        {
            // TODO: move additional initialization here.
        }
        virtual ~TrackBase() {}

        virtual uint32_t getOutputChannelCount() { return channelCount; }
        virtual uint32_t getMixerChannelCount() { return mMixerChannelCount; }

        bool        needsRamp() { return (volumeInc[0] | volumeInc[1] | auxInc) != 0; }
        bool        setResampler(uint32_t trackSampleRate, uint32_t devSampleRate);
        bool        doesResample() const { return mResampler.get() != nullptr; }
        void        recreateResampler(uint32_t devSampleRate);
        void        resetResampler() { if (mResampler.get() != nullptr) mResampler->reset(); }
        void        adjustVolumeRamp(bool aux, bool useFloat = false);
        size_t      getUnreleasedFrames() const { return mResampler.get() != nullptr ?
                                                    mResampler->getUnreleasedFrames() : 0; };

        bool        useStereoVolume() const { return channelMask == AUDIO_CHANNEL_OUT_STEREO
                                        && isAudioChannelPositionMask(mMixerChannelMask); }

        static hook_t getTrackHook(int trackType, uint32_t channelCount,
                audio_format_t mixerInFormat, audio_format_t mixerOutFormat);

        void track__nop(int32_t* out, size_t numFrames, int32_t* temp, int32_t* aux);

        template <int MIXTYPE, bool USEFLOATVOL, bool ADJUSTVOL,
            typename TO, typename TI, typename TA>
        void volumeMix(TO *out, size_t outFrames, const TI *in, TA *aux, bool ramp);

        uint32_t    needs;

        // TODO: Eventually remove legacy integer volume settings
        union {
        int16_t     volume[MAX_NUM_VOLUMES]; // U4.12 fixed point (top bit should be zero)
        int32_t     volumeRL;
        };

        int32_t     prevVolume[MAX_NUM_VOLUMES];
        int32_t     volumeInc[MAX_NUM_VOLUMES];
        int32_t     auxInc;
        int32_t     prevAuxLevel;
        int16_t     auxLevel;       // 0 <= auxLevel <= MAX_GAIN_INT, but signed for mul performance

        uint16_t    frameCount;

        uint8_t     channelCount;   // 1 or 2, redundant with (needs & NEEDS_CHANNEL_COUNT__MASK)
        uint8_t     unused_padding; // formerly format, was always 16
        uint16_t    enabled;        // actually bool
        audio_channel_mask_t channelMask;

        // actual buffer provider used by the track hooks
        AudioBufferProvider*                bufferProvider;

        mutable AudioBufferProvider::Buffer buffer; // 8 bytes

        hook_t      hook;
        const void  *mIn;             // current location in buffer

        std::unique_ptr<AudioResampler> mResampler;
        uint32_t    sampleRate;
        int32_t*    mainBuffer;
        int32_t*    auxBuffer;

        int32_t     sessionId;

        audio_format_t mMixerFormat;     // output mix format: AUDIO_FORMAT_PCM_(FLOAT|16_BIT)
        audio_format_t mFormat;          // input track format
        audio_format_t mMixerInFormat;   // mix internal format AUDIO_FORMAT_PCM_(FLOAT|16_BIT)
                                         // each track must be converted to this format.

        float          mVolume[MAX_NUM_VOLUMES];     // floating point set volume
        float          mPrevVolume[MAX_NUM_VOLUMES]; // floating point previous volume
        float          mVolumeInc[MAX_NUM_VOLUMES];  // floating point volume increment

        float          mAuxLevel;                     // floating point set aux level
        float          mPrevAuxLevel;                 // floating point prev aux level
        float          mAuxInc;                       // floating point aux increment

        audio_channel_mask_t mMixerChannelMask;
        uint32_t             mMixerChannelCount;

      protected:

        // hooks
        void track__genericResample(int32_t* out, size_t numFrames, int32_t* temp, int32_t* aux);
        void track__16BitsStereo(int32_t* out, size_t numFrames, int32_t* temp, int32_t* aux);
        void track__16BitsMono(int32_t* out, size_t numFrames, int32_t* temp, int32_t* aux);

        void volumeRampStereo(int32_t* out, size_t frameCount, int32_t* temp, int32_t* aux);
        void volumeStereo(int32_t* out, size_t frameCount, int32_t* temp, int32_t* aux);

        // multi-format track hooks
        template <int MIXTYPE, typename TO, typename TI, typename TA>
        void track__Resample(TO* out, size_t frameCount, TO* temp __unused, TA* aux);
        template <int MIXTYPE, typename TO, typename TI, typename TA>
        void track__NoResample(TO* out, size_t frameCount, TO* temp __unused, TA* aux);
    };

    // preCreateTrack must create an instance of a proper TrackBase descendant.
    // postCreateTrack is called after filling out fields of TrackBase. It can
    // abort track creation by returning non-OK status. See the implementation
    // of create() for details.
    virtual std::shared_ptr<TrackBase> preCreateTrack();
    virtual status_t postCreateTrack(TrackBase *track __unused) { return OK; }

    // preProcess is called before the process hook, postProcess after,
    // see the implementation of process() method.
    virtual void preProcess() {}
    virtual void postProcess() {}

    virtual bool setChannelMasks(int name,
            audio_channel_mask_t trackChannelMask, audio_channel_mask_t mixerChannelMask);

    // Called when track info changes and a new process hook should be determined.
    void invalidate() {
        mHook = &AudioMixerBase::process__validate;
    }

    void process__validate();
    void process__nop();
    void process__genericNoResampling();
    void process__genericResampling();
    void process__oneTrack16BitsStereoNoResampling();

    template <int MIXTYPE, typename TO, typename TI, typename TA>
    void process__noResampleOneTrack();

    static process_hook_t getProcessHook(int processType, uint32_t channelCount,
            audio_format_t mixerInFormat, audio_format_t mixerOutFormat,
            bool useStereoVolume);

    static void convertMixerFormat(void *out, audio_format_t mixerOutFormat,
            void *in, audio_format_t mixerInFormat, size_t sampleCount);

    // initialization constants
    const uint32_t mSampleRate;
    const size_t mFrameCount;

    process_hook_t mHook = &AudioMixerBase::process__nop;   // one of process__*, never nullptr

    // the size of the type (int32_t) should be the largest of all types supported
    // by the mixer.
    std::unique_ptr<int32_t[]> mOutputTemp;
    std::unique_ptr<int32_t[]> mResampleTemp;

    // track names grouped by main buffer, in no particular order of main buffer.
    // however names for a particular main buffer are in order (by construction).
    std::unordered_map<void * /* mainBuffer */, std::vector<int /* name */>> mGroups;

    // track names that are enabled, in increasing order (by construction).
    std::vector<int /* name */> mEnabled;

    // track smart pointers, by name, in increasing order of name.
    std::map<int /* name */, std::shared_ptr<TrackBase>> mTracks;
};

AudioMixerWhen the object is constructed, it needs to be passed in to perform a mixing operation, the desired output frame number and sampling rate. The specific values ​​of these parameters can be obtained from the device. As in AudioFlinger::MixerThreadand FastMixer::onStateChange(), when creating AudioMixeran object, parameters such as sample rate will be obtained from the opened audio output stream. AudioMixerThe output sampling rate is set by the user through the provided interface.

AudioMixerUse AudioMixer::Track/ AudioMixerBase::TrackBaseto represent the audio source data to be mixed. AudioMixer::Track/ AudioMixerBase::TrackBaseMaintain various information related to audio source data, including input audio data formats and parameters, various objects used in format conversion required before performing actual mixing operations, etc.

Create, add and delete audio sources

AudioMixerAudioMixerBaseThe / create()operation is used to create and add an audio source to the mixer. The definition of this operation (located in frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp ) is as follows:

bool AudioMixerBase::isValidFormat(audio_format_t format) const
{
    switch (format) {
    case AUDIO_FORMAT_PCM_8_BIT:
    case AUDIO_FORMAT_PCM_16_BIT:
    case AUDIO_FORMAT_PCM_24_BIT_PACKED:
    case AUDIO_FORMAT_PCM_32_BIT:
    case AUDIO_FORMAT_PCM_FLOAT:
        return true;
    default:
        return false;
    }
}

bool AudioMixerBase::isValidChannelMask(audio_channel_mask_t channelMask) const
{
    return audio_channel_count_from_out_mask(channelMask) <= MAX_NUM_CHANNELS;
}

std::shared_ptr<AudioMixerBase::TrackBase> AudioMixerBase::preCreateTrack()
{
    return std::make_shared<TrackBase>();
}

status_t AudioMixerBase::create(
        int name, audio_channel_mask_t channelMask, audio_format_t format, int sessionId)
{
    LOG_ALWAYS_FATAL_IF(exists(name), "name %d already exists", name);

    if (!isValidChannelMask(channelMask)) {
        ALOGE("%s invalid channelMask: %#x", __func__, channelMask);
        return BAD_VALUE;
    }
    if (!isValidFormat(format)) {
        ALOGE("%s invalid format: %#x", __func__, format);
        return BAD_VALUE;
    }

    auto t = preCreateTrack();
    {
        // TODO: move initialization to the Track constructor.
        // assume default parameters for the track, except where noted below
        t->needs = 0;

        // Integer volume.
        // Currently integer volume is kept for the legacy integer mixer.
        // Will be removed when the legacy mixer path is removed.
        t->volume[0] = 0;
        t->volume[1] = 0;
        t->prevVolume[0] = 0 << 16;
        t->prevVolume[1] = 0 << 16;
        t->volumeInc[0] = 0;
        t->volumeInc[1] = 0;
        t->auxLevel = 0;
        t->auxInc = 0;
        t->prevAuxLevel = 0;

        // Floating point volume.
        t->mVolume[0] = 0.f;
        t->mVolume[1] = 0.f;
        t->mPrevVolume[0] = 0.f;
        t->mPrevVolume[1] = 0.f;
        t->mVolumeInc[0] = 0.;
        t->mVolumeInc[1] = 0.;
        t->mAuxLevel = 0.;
        t->mAuxInc = 0.;
        t->mPrevAuxLevel = 0.;

        // no initialization needed
        // t->frameCount
        t->channelCount = audio_channel_count_from_out_mask(channelMask);
        t->enabled = false;
        ALOGV_IF(audio_channel_mask_get_bits(channelMask) != AUDIO_CHANNEL_OUT_STEREO,
                "Non-stereo channel mask: %d\n", channelMask);
        t->channelMask = channelMask;
        t->sessionId = sessionId;
        // setBufferProvider(name, AudioBufferProvider *) is required before enable(name)
        t->bufferProvider = NULL;
        t->buffer.raw = NULL;
        // no initialization needed
        // t->buffer.frameCount
        t->hook = NULL;
        t->mIn = NULL;
        t->sampleRate = mSampleRate;
        // setParameter(name, TRACK, MAIN_BUFFER, mixBuffer) is required before enable(name)
        t->mainBuffer = NULL;
        t->auxBuffer = NULL;
        t->mMixerFormat = AUDIO_FORMAT_PCM_16_BIT;
        t->mFormat = format;
        t->mMixerInFormat = kUseFloat && kUseNewMixer ?
                AUDIO_FORMAT_PCM_FLOAT : AUDIO_FORMAT_PCM_16_BIT;
        t->mMixerChannelMask = audio_channel_mask_from_representation_and_bits(
                AUDIO_CHANNEL_REPRESENTATION_POSITION, AUDIO_CHANNEL_OUT_STEREO);
        t->mMixerChannelCount = audio_channel_count_from_out_mask(t->mMixerChannelMask);
        status_t status = postCreateTrack(t.get());
        if (status != OK) return status;
        mTracks[name] = t;
        return OK;
    }
}

When calling create()the operation, you need to pass in the audio source identifier name, the channel mask (number of channels) channelMask, sampling format formatand session ID of the audio source input data. The execution process of this operation is as follows:

  1. nameCheck whether the corresponding audio source already exists according to , if it already exists, fail and exit, otherwise continue execution;
  2. Check whether the channel mask (number of channels) of the audio source channelMaskis valid, mainly checking whether the number of channels exceeds the limit of 12. If it is invalid, an error code will be returned, otherwise the execution will continue;
  3. Check whether the sampling format formatis valid, if not, return an error code, otherwise continue execution;
  4. Call preCreateTrack()to create a Track object;
  5. Initialize the Track object, such as initializing the channel mask, channel number, sampling format and session ID of the audio source input data as the incoming parameters, initializing the sampling rate of the single-channel audio source as the output sampling rate, initializing the mixer sampling format AudioMixer, Mixer input sampling format, mixer channel mask, mixer channel number, etc. The mixer sampling format can only be AUDIO_FORMAT_PCM_FLOATor AUDIO_FORMAT_PCM_16_BIT;
  6. Call postCreateTrack()to further initialize the Track object;
  7. Save the created and initialized Track object in a map.

From AudioMixerthe perspective of , create()it is not very reasonable for the user to pass in the audio source identifier for the operation. This destroys AudioMixerthe cohesion of . It is better to AudioMixergenerate this identifier internally and return it to the user.

AudioMixerpreCreateTrack()The class overrides the and functions seen above postCreateTrack(). The definitions of these two functions in AudioMixerthe class (located in frameworks/av/media/libaudioprocessing/AudioMixer.cpp ) are as follows:

std::shared_ptr<AudioMixerBase::TrackBase> AudioMixer::preCreateTrack()
{
    return std::make_shared<Track>();
}

status_t AudioMixer::postCreateTrack(TrackBase *track)
{
    Track* t = static_cast<Track*>(track);

    audio_channel_mask_t channelMask = t->channelMask;
    t->mHapticChannelMask = static_cast<audio_channel_mask_t>(
            channelMask & AUDIO_CHANNEL_HAPTIC_ALL);
    t->mHapticChannelCount = audio_channel_count_from_out_mask(t->mHapticChannelMask);
    channelMask = static_cast<audio_channel_mask_t>(channelMask & ~AUDIO_CHANNEL_HAPTIC_ALL);
    t->channelCount = audio_channel_count_from_out_mask(channelMask);
    ALOGV_IF(audio_channel_mask_get_bits(channelMask) != AUDIO_CHANNEL_OUT_STEREO,
            "Non-stereo channel mask: %d\n", channelMask);
    t->channelMask = channelMask;
    t->mInputBufferProvider = NULL;
    t->mDownmixRequiresFormat = AUDIO_FORMAT_INVALID; // no format required
    t->mPlaybackRate = AUDIO_PLAYBACK_RATE_DEFAULT;
    // haptic
    t->mHapticPlaybackEnabled = false;
    t->mHapticIntensity = os::HapticScale::NONE;
    t->mHapticMaxAmplitude = NAN;
    t->mMixerHapticChannelMask = AUDIO_CHANNEL_NONE;
    t->mMixerHapticChannelCount = 0;
    t->mAdjustInChannelCount = t->channelCount + t->mHapticChannelCount;
    t->mAdjustOutChannelCount = t->channelCount;
    t->mKeepContractedChannels = false;
    // Check the downmixing (or upmixing) requirements.
    status_t status = t->prepareForDownmix();
    if (status != OK) {
        ALOGE("AudioMixer::getTrackName invalid channelMask (%#x)", channelMask);
        return BAD_VALUE;
    }
    // prepareForDownmix() may change mDownmixRequiresFormat
    ALOGVV("mMixerFormat:%#x  mMixerInFormat:%#x\n", t->mMixerFormat, t->mMixerInFormat);
    t->prepareForReformat();
    t->prepareForAdjustChannels(mFrameCount);
    return OK;
}

AudioMixer::preCreateTrack()Just simply create the object. Some unique fields AudioMixer::postCreateTrack()will be initialized , the channel mask will be corrected to include only the audio data channel mask and not the haptic channel, and a data processing pipeline will be built for the mixed audio source.AudioMixer::Track

AudioMixerAudioMixerBaseThe / destroy()operation is used to remove an audio source from the mixer. The definition of this operation (located in frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp ) is as follows:

void AudioMixerBase::destroy(int name)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    ALOGV("deleteTrackName(%d)", name);

    if (mTracks[name]->enabled) {
        invalidate();
    }
    mTracks.erase(name); // deallocate track
}

AudioMixerBase::destroy(int name)Executed when the audio source to be removed is enabled invalidate(), requesting a reinitialization of the mix processing. After that, simply remove the audio source you want to remove.

Audio source data processing pipeline

Each audio source of the mixer AudioMixer, that is, Track, maintains an audio data processing pipeline to convert the input audio source data into the AudioMixerrequired format and configuration data that can be used to perform mixing operations. The nodes of Track's audio data processing pipeline are represented by AudioBufferProvider/ PassthruBufferProvider. Above we see AudioMixer::Trackthat the definition of contains multiple member variables of this type of object:

        AudioBufferProvider* mInputBufferProvider;    // externally provided buffer provider.
        std::unique_ptr<PassthruBufferProvider> mAdjustChannelsBufferProvider;
        std::unique_ptr<PassthruBufferProvider> mReformatBufferProvider;
        std::unique_ptr<PassthruBufferProvider> mDownmixerBufferProvider;
        std::unique_ptr<PassthruBufferProvider> mPostDownmixReformatBufferProvider;
        std::unique_ptr<PassthruBufferProvider> mTimestretchBufferProvider;

The inheritance hierarchy of AudioBufferProvider/ related classes in libaudioprocessing is as follows:PassthruBufferProvider

AudioBufferProvider/PassthruBufferProvider

AudioBufferProviderIt is an interface class, and all the interface functions it contains are shown in the figure above. In frameworks/av/media/libaudioprocessing/include/media/BufferProviders.h , CopyBufferProviderthe and PassthruBufferProviderclasses are defined as follows:

class PassthruBufferProvider : public AudioBufferProvider {
public:
    PassthruBufferProvider() : mTrackBufferProvider(NULL) { }

    virtual ~PassthruBufferProvider() { }

    // call this to release the buffer to the upstream provider.
    // treat it as an audio discontinuity for future samples.
    virtual void reset() { }

    // set the upstream buffer provider. Consider calling "reset" before this function.
    virtual void setBufferProvider(AudioBufferProvider *p) {
        mTrackBufferProvider = p;
    }

protected:
    AudioBufferProvider *mTrackBufferProvider;
};

// Base AudioBufferProvider class used for DownMixerBufferProvider, RemixBufferProvider,
// and ReformatBufferProvider.
// It handles a private buffer for use in converting format or channel masks from the
// input data to a form acceptable by the mixer.
// TODO: Make a ResamplerBufferProvider when integers are entirely removed from the
// processing pipeline.
class CopyBufferProvider : public PassthruBufferProvider {
public:
    // Use a private buffer of bufferFrameCount frames (each frame is outputFrameSize bytes).
    // If bufferFrameCount is 0, no private buffer is created and in-place modification of
    // the upstream buffer provider's buffers is performed by copyFrames().
    CopyBufferProvider(size_t inputFrameSize, size_t outputFrameSize,
            size_t bufferFrameCount);
    virtual ~CopyBufferProvider();

    // Overrides AudioBufferProvider methods
    virtual status_t getNextBuffer(Buffer *buffer);
    virtual void releaseBuffer(Buffer *buffer);

    // Overrides PassthruBufferProvider
    virtual void reset();
    void setBufferProvider(AudioBufferProvider *p) override;

    // this function should be supplied by the derived class.  It converts
    // #frames in the *src pointer to the *dst pointer.  It is public because
    // some providers will allow this to work on arbitrary buffers outside
    // of the internal buffers.
    virtual void copyFrames(void *dst, const void *src, size_t frames) = 0;

protected:
    const size_t         mInputFrameSize;
    const size_t         mOutputFrameSize;
private:
    AudioBufferProvider::Buffer mBuffer;
    const size_t         mLocalBufferFrameCount;
    void                *mLocalBufferData;
    size_t               mConsumed;
};

PassthruBufferProviderThe class adds setBufferProvider()functions to facilitate the connection of different audio data processing pipeline nodes, and CopyBufferProviderthe class adds copyFrames()functions to facilitate the transfer of data between different audio data processing pipeline nodes. The functions implemented by each specific audio data processing pipeline node copyFrames()implement corresponding data processing or format conversion operations.

CopyBufferProviderWhen a class object is constructed, a memory buffer is allocated as needed. When the object is destructed, the used buffer is released. The relevant logic definition (located in frameworks/av/media/libaudioprocessing/BufferProviders.cpp ) is as follows:

CopyBufferProvider::CopyBufferProvider(size_t inputFrameSize,
        size_t outputFrameSize, size_t bufferFrameCount) :
        mInputFrameSize(inputFrameSize),
        mOutputFrameSize(outputFrameSize),
        mLocalBufferFrameCount(bufferFrameCount),
        mLocalBufferData(NULL),
        mConsumed(0)
{
    ALOGV("CopyBufferProvider(%p)(%zu, %zu, %zu)", this,
            inputFrameSize, outputFrameSize, bufferFrameCount);
    LOG_ALWAYS_FATAL_IF(inputFrameSize < outputFrameSize && bufferFrameCount == 0,
            "Requires local buffer if inputFrameSize(%zu) < outputFrameSize(%zu)",
            inputFrameSize, outputFrameSize);
    if (mLocalBufferFrameCount) {
        (void)posix_memalign(&mLocalBufferData, 32, mLocalBufferFrameCount * mOutputFrameSize);
    }
    mBuffer.frameCount = 0;
}

CopyBufferProvider::~CopyBufferProvider()
{
    ALOGV("%s(%p) %zu %p %p",
           __func__, this, mBuffer.frameCount, mTrackBufferProvider, mLocalBufferData);
    if (mBuffer.frameCount != 0) {
        mTrackBufferProvider->releaseBuffer(&mBuffer);
    }
    free(mLocalBufferData);
}

In order to avoid unnecessary memory consumption, CopyBufferProvidermemory allocation during class object construction is not necessary, but when the input audio frame size is smaller than the output audio frame size, a buffer must be allocated. CopyBufferProviderThe constructor of the class determines the size of the allocated buffer based on the number of local buffer frames passed in and the output audio frame size.

Track's audio data processing pipeline is driven by the function of its last node getNextBuffer(). In CopyBufferProviderthe class, the definition of this function (located in frameworks/av/media/libaudioprocessing/BufferProviders.cpp ) is as follows:

status_t CopyBufferProvider::getNextBuffer(AudioBufferProvider::Buffer *pBuffer)
{
    //ALOGV("CopyBufferProvider(%p)::getNextBuffer(%p (%zu))",
    //        this, pBuffer, pBuffer->frameCount);
    if (mLocalBufferFrameCount == 0) {
        status_t res = mTrackBufferProvider->getNextBuffer(pBuffer);
        if (res == OK) {
            copyFrames(pBuffer->raw, pBuffer->raw, pBuffer->frameCount);
        }
        return res;
    }
    if (mBuffer.frameCount == 0) {
        mBuffer.frameCount = pBuffer->frameCount;
        status_t res = mTrackBufferProvider->getNextBuffer(&mBuffer);
        // At one time an upstream buffer provider had
        // res == OK and mBuffer.frameCount == 0, doesn't seem to happen now 7/18/2014.
        //
        // By API spec, if res != OK, then mBuffer.frameCount == 0.
        // but there may be improper implementations.
        ALOG_ASSERT(res == OK || mBuffer.frameCount == 0);
        if (res != OK || mBuffer.frameCount == 0) { // not needed by API spec, but to be safe.
            pBuffer->raw = NULL;
            pBuffer->frameCount = 0;
            return res;
        }
        mConsumed = 0;
    }
    ALOG_ASSERT(mConsumed < mBuffer.frameCount);
    size_t count = std::min(mLocalBufferFrameCount, mBuffer.frameCount - mConsumed);
    count = std::min(count, pBuffer->frameCount);
    pBuffer->raw = mLocalBufferData;
    pBuffer->frameCount = count;
    copyFrames(pBuffer->raw, (uint8_t*)mBuffer.raw + mConsumed * mInputFrameSize,
            pBuffer->frameCount);
    return OK;
}

If the incoming local buffer frame number is 0 when the object is constructed, this usually means that the buffer will not be allocated locally, and the buffer size required by the current node of Track's audio data processing pipeline will not be larger than that of the previous node. If the need is large, obtain the processed data from the previous node, then call copyFrames()the function of the current node to perform the processing of the current node, and return it to the caller. For this situation, a specific implementation subclass is required to ensure that its audio data processing does not need to change the sampling rate of the audio data processed by the previous node.

If the number of local buffer frames passed in when the object is constructed is not 0, there is a buffer mBufferused to save data obtained from the previous node. In the function CopyBufferProviderof class getNextBuffer(), if mBufferthe number of frames in is 0, it means that the data previously obtained from the previous node has been consumed, and a piece of data is obtained from the previous node again. Then call the function of the current node copyFrames()to process the data obtained from the previous node and return it to the caller. Here we directly compare the number of frames of the current node mLocalBufferFrameCountwith the number of frames of the data of the previous node mBuffer.frameCount - mConsumed, but the two should not be compared directly. For example, according to the configuration, the current node needs to output audio data with a sampling rate of 48 kHz, and the number of output frames is 10 ms of data, that is, 480 frames. If the previous node outputs data with a sampling rate of 16 kHz according to the configuration, then the current node only needs 160 frames of data from the previous node to output data once. In other words, it also implies that the data processing of the current node does not change the sampling rate.

The audio data obtained from CopyBufferProviderthe class getNextBuffer()function needs to be released after use. CopyBufferProvider::releaseBuffer()The function is defined as follows:

void CopyBufferProvider::releaseBuffer(AudioBufferProvider::Buffer *pBuffer)
{
    //ALOGV("CopyBufferProvider(%p)::releaseBuffer(%p(%zu))",
    //        this, pBuffer, pBuffer->frameCount);
    if (mLocalBufferFrameCount == 0) {
        mTrackBufferProvider->releaseBuffer(pBuffer);
        return;
    }
    // LOG_ALWAYS_FATAL_IF(pBuffer->frameCount == 0, "Invalid framecount");
    mConsumed += pBuffer->frameCount; // TODO: update for efficiency to reuse existing content
    if (mConsumed != 0 && mConsumed >= mBuffer.frameCount) {
        mTrackBufferProvider->releaseBuffer(&mBuffer);
        ALOG_ASSERT(mBuffer.frameCount == 0);
    }
    pBuffer->raw = NULL;
    pBuffer->frameCount = 0;
}

The buffer is released wherever it is obtained. mConsumedIndicates the frame number of the audio data of the previous node, and pBuffer->frameCountindicates the frame number of the audio data of the current node. If the sampling rates of the two nodes are different, the two frame numbers have different meanings, and therefore cannot be added directly. It is also implicit here that the data processing of the current node does not change the sampling rate.

CopyBufferProviderreset()The and operations of the class setBufferProvider()are defined as follows:

void CopyBufferProvider::reset()
{
    if (mBuffer.frameCount != 0) {
        mTrackBufferProvider->releaseBuffer(&mBuffer);
    }
    mConsumed = 0;
}

void CopyBufferProvider::setBufferProvider(AudioBufferProvider *p) {
    ALOGV("%s(%p): mTrackBufferProvider:%p  mBuffer.frameCount:%zu",
            __func__, p, mTrackBufferProvider, mBuffer.frameCount);
    if (mTrackBufferProvider == p) {
        return;
    }
    mBuffer.frameCount = 0;
    PassthruBufferProvider::setBufferProvider(p);
}

The different PassthruBufferProviderand CopyBufferProvidersubclasses we saw above provide different processing operations. DownmixerBufferProviderIt is used to convert the channel mask format/channel number of audio data. The sampling rate and sampling format of the audio data remain unchanged. It is implemented based on the sound effect HAL. DownmixerBufferProviderThe implementation of each member function of the class is as follows:

DownmixerBufferProvider::DownmixerBufferProvider(
        audio_channel_mask_t inputChannelMask,
        audio_channel_mask_t outputChannelMask, audio_format_t format,
        uint32_t sampleRate, int32_t sessionId, size_t bufferFrameCount) :
        CopyBufferProvider(
            audio_bytes_per_sample(format) * audio_channel_count_from_out_mask(inputChannelMask),
            audio_bytes_per_sample(format) * audio_channel_count_from_out_mask(outputChannelMask),
            bufferFrameCount)  // set bufferFrameCount to 0 to do in-place
{
    ALOGV("DownmixerBufferProvider(%p)(%#x, %#x, %#x %u %d %d)",
            this, inputChannelMask, outputChannelMask, format,
            sampleRate, sessionId, (int)bufferFrameCount);
    if (!sIsMultichannelCapable) {
        ALOGE("DownmixerBufferProvider() error: not multichannel capable");
        return;
    }
    mEffectsFactory = EffectsFactoryHalInterface::create();
    if (mEffectsFactory == 0) {
        ALOGE("DownmixerBufferProvider() error: could not obtain the effects factory");
        return;
    }
    if (mEffectsFactory->createEffect(&sDwnmFxDesc.uuid,
                                      sessionId,
                                      SESSION_ID_INVALID_AND_IGNORED,
                                      AUDIO_PORT_HANDLE_NONE,
                                      &mDownmixInterface) != 0) {
         ALOGE("DownmixerBufferProvider() error creating downmixer effect");
         mDownmixInterface.clear();
         mEffectsFactory.clear();
         return;
     }
     // channel input configuration will be overridden per-track
     mDownmixConfig.inputCfg.channels = inputChannelMask;   // FIXME: Should be bits
     mDownmixConfig.outputCfg.channels = outputChannelMask; // FIXME: should be bits
     mDownmixConfig.inputCfg.format = format;
     mDownmixConfig.outputCfg.format = format;
     mDownmixConfig.inputCfg.samplingRate = sampleRate;
     mDownmixConfig.outputCfg.samplingRate = sampleRate;
     mDownmixConfig.inputCfg.accessMode = EFFECT_BUFFER_ACCESS_READ;
     mDownmixConfig.outputCfg.accessMode = EFFECT_BUFFER_ACCESS_WRITE;
     // input and output buffer provider, and frame count will not be used as the downmix effect
     // process() function is called directly (see DownmixerBufferProvider::getNextBuffer())
     mDownmixConfig.inputCfg.mask = EFFECT_CONFIG_SMP_RATE | EFFECT_CONFIG_CHANNELS |
             EFFECT_CONFIG_FORMAT | EFFECT_CONFIG_ACC_MODE;
     mDownmixConfig.outputCfg.mask = mDownmixConfig.inputCfg.mask;

     mInFrameSize =
             audio_bytes_per_sample(format) * audio_channel_count_from_out_mask(inputChannelMask);
     mOutFrameSize =
             audio_bytes_per_sample(format) * audio_channel_count_from_out_mask(outputChannelMask);
     status_t status;
     status = mEffectsFactory->mirrorBuffer(
             nullptr, mInFrameSize * bufferFrameCount, &mInBuffer);
     if (status != 0) {
         ALOGE("DownmixerBufferProvider() error %d while creating input buffer", status);
         mDownmixInterface.clear();
         mEffectsFactory.clear();
         return;
     }
     status = mEffectsFactory->mirrorBuffer(
             nullptr, mOutFrameSize * bufferFrameCount, &mOutBuffer);
     if (status != 0) {
         ALOGE("DownmixerBufferProvider() error %d while creating output buffer", status);
         mInBuffer.clear();
         mDownmixInterface.clear();
         mEffectsFactory.clear();
         return;
     }
     mDownmixInterface->setInBuffer(mInBuffer);
     mDownmixInterface->setOutBuffer(mOutBuffer);

     int cmdStatus;
     uint32_t replySize = sizeof(int);

     // Configure downmixer
     status = mDownmixInterface->command(
             EFFECT_CMD_SET_CONFIG /*cmdCode*/, sizeof(effect_config_t) /*cmdSize*/,
             &mDownmixConfig /*pCmdData*/,
             &replySize, &cmdStatus /*pReplyData*/);
     if (status != 0 || cmdStatus != 0) {
         ALOGE("DownmixerBufferProvider() error %d cmdStatus %d while configuring downmixer",
                 status, cmdStatus);
         mOutBuffer.clear();
         mInBuffer.clear();
         mDownmixInterface.clear();
         mEffectsFactory.clear();
         return;
     }

     // Enable downmixer
     replySize = sizeof(int);
     status = mDownmixInterface->command(
             EFFECT_CMD_ENABLE /*cmdCode*/, 0 /*cmdSize*/, NULL /*pCmdData*/,
             &replySize, &cmdStatus /*pReplyData*/);
     if (status != 0 || cmdStatus != 0) {
         ALOGE("DownmixerBufferProvider() error %d cmdStatus %d while enabling downmixer",
                 status, cmdStatus);
         mOutBuffer.clear();
         mInBuffer.clear();
         mDownmixInterface.clear();
         mEffectsFactory.clear();
         return;
     }

     // Set downmix type
     // parameter size rounded for padding on 32bit boundary
     const int psizePadded = ((sizeof(downmix_params_t) - 1)/sizeof(int) + 1) * sizeof(int);
     const int downmixParamSize =
             sizeof(effect_param_t) + psizePadded + sizeof(downmix_type_t);
     effect_param_t * const param = (effect_param_t *) malloc(downmixParamSize);
     param->psize = sizeof(downmix_params_t);
     const downmix_params_t downmixParam = DOWNMIX_PARAM_TYPE;
     memcpy(param->data, &downmixParam, param->psize);
     const downmix_type_t downmixType = DOWNMIX_TYPE_FOLD;
     param->vsize = sizeof(downmix_type_t);
     memcpy(param->data + psizePadded, &downmixType, param->vsize);
     replySize = sizeof(int);
     status = mDownmixInterface->command(
             EFFECT_CMD_SET_PARAM /* cmdCode */, downmixParamSize /* cmdSize */,
             param /*pCmdData*/, &replySize, &cmdStatus /*pReplyData*/);
     free(param);
     if (status != 0 || cmdStatus != 0) {
         ALOGE("DownmixerBufferProvider() error %d cmdStatus %d while setting downmix type",
                 status, cmdStatus);
         mOutBuffer.clear();
         mInBuffer.clear();
         mDownmixInterface.clear();
         mEffectsFactory.clear();
         return;
     }
     ALOGV("DownmixerBufferProvider() downmix type set to %d", (int) downmixType);
}

DownmixerBufferProvider::~DownmixerBufferProvider()
{
    ALOGV("~DownmixerBufferProvider (%p)", this);
    if (mDownmixInterface != 0) {
        mDownmixInterface->close();
    }
}

void DownmixerBufferProvider::copyFrames(void *dst, const void *src, size_t frames)
{
    mInBuffer->setExternalData(const_cast<void*>(src));
    mInBuffer->setFrameCount(frames);
    mInBuffer->update(mInFrameSize * frames);
    mOutBuffer->setFrameCount(frames);
    mOutBuffer->setExternalData(dst);
    if (dst != src) {
        // Downmix may be accumulating, need to populate the output buffer
        // with the dst data.
        mOutBuffer->update(mOutFrameSize * frames);
    }
    // may be in-place if src == dst.
    status_t res = mDownmixInterface->process();
    if (res == OK) {
        mOutBuffer->commit(mOutFrameSize * frames);
    } else {
        ALOGE("DownmixBufferProvider error %d", res);
    }
}

/* call once in a pthread_once handler. */
/*static*/ status_t DownmixerBufferProvider::init()
{
    // find multichannel downmix effect if we have to play multichannel content
    sp<EffectsFactoryHalInterface> effectsFactory = EffectsFactoryHalInterface::create();
    if (effectsFactory == 0) {
        ALOGE("AudioMixer() error: could not obtain the effects factory");
        return NO_INIT;
    }
    uint32_t numEffects = 0;
    int ret = effectsFactory->queryNumberEffects(&numEffects);
    if (ret != 0) {
        ALOGE("AudioMixer() error %d querying number of effects", ret);
        return NO_INIT;
    }
    ALOGV("EffectQueryNumberEffects() numEffects=%d", numEffects);

    for (uint32_t i = 0 ; i < numEffects ; i++) {
        if (effectsFactory->getDescriptor(i, &sDwnmFxDesc) == 0) {
            ALOGV("effect %d is called %s", i, sDwnmFxDesc.name);
            if (memcmp(&sDwnmFxDesc.type, EFFECT_UIID_DOWNMIX, sizeof(effect_uuid_t)) == 0) {
                ALOGI("found effect \"%s\" from %s",
                        sDwnmFxDesc.name, sDwnmFxDesc.implementor);
                sIsMultichannelCapable = true;
                break;
            }
        }
    }
    ALOGW_IF(!sIsMultichannelCapable, "unable to find downmix effect");
    return NO_INIT;
}

/*static*/ bool DownmixerBufferProvider::sIsMultichannelCapable = false;
/*static*/ effect_descriptor_t DownmixerBufferProvider::sDwnmFxDesc;

DownmixerBufferProviderThis class encapsulates the audio HAL interface and implements the conversion of the number of audio data channels. When the first AudioMixerobject is created, DownmixerBufferProviderclass-level initialization is performed, mainly to check whether the sound effect HAL supports channel mask format/channel number conversion for audio data. When the class object is created, prepare the interface objects such as and etc. DownmixerBufferProviderrequired to perform conversion , as well as the sound effect HAL output and input buffer, and send commands to the sound effect HAL to configure and enable it. Class Transforms a piece of data. The specific strategy for converting audio data into channel mask format/channel number depends on the specific implementation of the sound effect HAL.sp<EffectsFactoryHalInterface>sp<EffectHalInterface>DownmixerBufferProvidercopyFrames()

RemixBufferProviderIt is used to convert the channel mask format/channel number of audio data. The sampling rate and sampling format of the audio data remain unchanged. The RemixBufferProviderimplementation of each member function of the class is as follows:

RemixBufferProvider::RemixBufferProvider(audio_channel_mask_t inputChannelMask,
        audio_channel_mask_t outputChannelMask, audio_format_t format,
        size_t bufferFrameCount) :
        CopyBufferProvider(
                audio_bytes_per_sample(format)
                    * audio_channel_count_from_out_mask(inputChannelMask),
                audio_bytes_per_sample(format)
                    * audio_channel_count_from_out_mask(outputChannelMask),
                bufferFrameCount),
        mFormat(format),
        mSampleSize(audio_bytes_per_sample(format)),
        mInputChannels(audio_channel_count_from_out_mask(inputChannelMask)),
        mOutputChannels(audio_channel_count_from_out_mask(outputChannelMask))
{
    ALOGV("RemixBufferProvider(%p)(%#x, %#x, %#x) %zu %zu",
            this, format, inputChannelMask, outputChannelMask,
            mInputChannels, mOutputChannels);
    (void) memcpy_by_index_array_initialization_from_channel_mask(
            mIdxAry, ARRAY_SIZE(mIdxAry), outputChannelMask, inputChannelMask);
}

void RemixBufferProvider::copyFrames(void *dst, const void *src, size_t frames)
{
    memcpy_by_index_array(dst, mOutputChannels,
            src, mInputChannels, mIdxAry, mSampleSize, frames);
}

RemixBufferProviderThe way to convert the channel mask format/channel number of audio data is to first find the index in each frame of input data corresponding to each channel of the audio output data according to the input and output audio channel mask formats, and then based on the obtained index array, Copies the input audio data to the output data buffer frame by frame. memcpy_by_index_array_initialization_from_channel_mask()The function used to obtain the index array is defined in the system/media/audio_utils/format.c file:

size_t memcpy_by_index_array_initialization_from_channel_mask(int8_t *idxary, size_t arysize,
        audio_channel_mask_t dst_channel_mask, audio_channel_mask_t src_channel_mask)
{
    const audio_channel_representation_t src_representation =
            audio_channel_mask_get_representation(src_channel_mask);
    const audio_channel_representation_t dst_representation =
            audio_channel_mask_get_representation(dst_channel_mask);
    const uint32_t src_bits = audio_channel_mask_get_bits(src_channel_mask);
    const uint32_t dst_bits = audio_channel_mask_get_bits(dst_channel_mask);

    switch (src_representation) {
    case AUDIO_CHANNEL_REPRESENTATION_POSITION:
        switch (dst_representation) {
        case AUDIO_CHANNEL_REPRESENTATION_POSITION:
            return memcpy_by_index_array_initialization(idxary, arysize,
                    dst_bits, src_bits);
        case AUDIO_CHANNEL_REPRESENTATION_INDEX:
            return memcpy_by_index_array_initialization_dst_index(idxary, arysize,
                    dst_bits, src_bits);
        default:
            return 0;
        }
        break;
    case AUDIO_CHANNEL_REPRESENTATION_INDEX:
        switch (dst_representation) {
        case AUDIO_CHANNEL_REPRESENTATION_POSITION:
            return memcpy_by_index_array_initialization_src_index(idxary, arysize,
                    dst_bits, src_bits);
        case AUDIO_CHANNEL_REPRESENTATION_INDEX:
            return memcpy_by_index_array_initialization(idxary, arysize,
                    dst_bits, src_bits);
        default:
            return 0;
        }
        break;
    default:
        return 0;
    }
}

memcpy_by_index_array_initialization_from_channel_mask()The function creates index arrays through different functions according to the channel mask format of the input and output data. audio_channel_mask_tThere are two types of audio channel masks represented. For POSITIONthe type audio channel mask, it contains the number of channels; for INDEXthe type audio channel mask, each audio channel is assigned a binary bit. When the audio data contains the corresponding channel When , the corresponding position is set to 1, otherwise it is set to 0. The audio channel mask is obtained by bitwise OR operation of each audio channel; audio_channel_mask_tit can represent up to 30 channels, and the 30th and 31st bits are used to represent the type. This can be seen through the function in system/media/audio/include/system/audio.h that obtains the number of channels based on the audio channel mask:audio_channel_count_from_out_mask()

static inline uint32_t audio_channel_count_from_out_mask(audio_channel_mask_t channel)
{
    uint32_t bits = audio_channel_mask_get_bits(channel);
    switch (audio_channel_mask_get_representation(channel)) {
    case AUDIO_CHANNEL_REPRESENTATION_POSITION:
        // TODO: We can now merge with from_in_mask and remove anding
        bits &= AUDIO_CHANNEL_OUT_ALL;
        FALLTHROUGH_INTENDED;
    case AUDIO_CHANNEL_REPRESENTATION_INDEX:
        return __builtin_popcount(bits);
    default:
        return 0;
    }
}

memcpy_by_index_array_initialization_from_channel_mask()Several functions called by the function are defined in the system/media/audio_utils/primitives.c file:

size_t memcpy_by_index_array_initialization(int8_t *idxary, size_t idxcount,
        uint32_t dst_mask, uint32_t src_mask)
{
    size_t n = 0;
    int srcidx = 0;
    uint32_t bit, ormask = src_mask | dst_mask;

    while (ormask && n < idxcount) {
        bit = ormask & -ormask;          /* get lowest bit */
        ormask ^= bit;                   /* remove lowest bit */
        if (src_mask & dst_mask & bit) { /* matching channel */
            idxary[n++] = srcidx++;
        } else if (src_mask & bit) {     /* source channel only */
            ++srcidx;
        } else {                         /* destination channel only */
            idxary[n++] = -1;
        }
    }
    return n + __builtin_popcount(ormask & dst_mask);
}

size_t memcpy_by_index_array_initialization_src_index(int8_t *idxary, size_t idxcount,
        uint32_t dst_mask, uint32_t src_mask) {
    size_t dst_count = __builtin_popcount(dst_mask);
    if (idxcount == 0) {
        return dst_count;
    }
    if (dst_count > idxcount) {
        dst_count = idxcount;
    }

    size_t src_idx, dst_idx;
    for (src_idx = 0, dst_idx = 0; dst_idx < dst_count; ++dst_idx) {
        if (src_mask & 1) {
            idxary[dst_idx] = src_idx++;
        } else {
            idxary[dst_idx] = -1;
        }
        src_mask >>= 1;
    }
    return dst_idx;
}

size_t memcpy_by_index_array_initialization_dst_index(int8_t *idxary, size_t idxcount,
        uint32_t dst_mask, uint32_t src_mask) {
    size_t src_idx, dst_idx;
    size_t dst_count = __builtin_popcount(dst_mask);
    size_t src_count = __builtin_popcount(src_mask);
    if (idxcount == 0) {
        return dst_count;
    }
    if (dst_count > idxcount) {
        dst_count = idxcount;
    }
    for (src_idx = 0, dst_idx = 0; dst_idx < dst_count; ++src_idx) {
        if (dst_mask & 1) {
            idxary[dst_idx++] = src_idx < src_count ? (signed)src_idx : -1;
        }
        dst_mask >>= 1;
    }
    return dst_idx;
}

memcpy_by_index_array_initialization_src_index()I strongly suspect that the and functions implemented here memcpy_by_index_array_initialization_dst_index()are wrong. For memcpy_by_index_array_initialization_src_index()the function, its dst_maskparameter is the number of channels, so there is no need to call __builtin_popcount()to calculate the number of output channels. For memcpy_by_index_array_initialization_dst_index()the function, the number of input channels does not need to be calculated, but the number of output channels does. In addition, memcpy_by_index_array_initialization_from_channel_mask()in the function, POSITIONwhen the input audio channel mask and the output audio channel mask are both of type, memcpy_by_index_array_initialization()it is not right to call the function to create an index array.

The function definition for copying data based on the index array memcpy_by_index_array()(located in system/media/audio_utils/primitives.c ) is as follows:

/*
 * C macro to do copying by index array, to rearrange samples
 * within a frame.  This is independent of src/dst sample type.
 * Don't pass in any expressions for the macro arguments here.
 */
#define copy_frame_by_idx(dst, dst_channels, src, src_channels, idxary, count, zero) \
{ \
    unsigned i; \
    int index; \
    for (; (count) > 0; --(count)) { \
        for (i = 0; i < (dst_channels); ++i) { \
            index = (idxary)[i]; \
            *(dst)++ = index < 0 ? (zero) : (src)[index]; \
        } \
        (src) += (src_channels); \
    } \
}

void memcpy_by_index_array(void *dst, uint32_t dst_channels,
        const void *src, uint32_t src_channels,
        const int8_t *idxary, size_t sample_size, size_t count)
{
    switch (sample_size) {
    case 1: {
        uint8_t *udst = (uint8_t*)dst;
        const uint8_t *usrc = (const uint8_t*)src;

        copy_frame_by_idx(udst, dst_channels, usrc, src_channels, idxary, count, 0);
    } break;
    case 2: {
        uint16_t *udst = (uint16_t*)dst;
        const uint16_t *usrc = (const uint16_t*)src;

        copy_frame_by_idx(udst, dst_channels, usrc, src_channels, idxary, count, 0);
    } break;
    case 3: { /* could be slow.  use a struct to represent 3 bytes of data. */
        uint8x3_t *udst = (uint8x3_t*)dst;
        const uint8x3_t *usrc = (const uint8x3_t*)src;
        static const uint8x3_t zero;

        copy_frame_by_idx(udst, dst_channels, usrc, src_channels, idxary, count, zero);
    } break;
    case 4: {
        uint32_t *udst = (uint32_t*)dst;
        const uint32_t *usrc = (const uint32_t*)src;

        copy_frame_by_idx(udst, dst_channels, usrc, src_channels, idxary, count, 0);
    } break;
    default:
        abort(); /* illegal value */
        break;
    }
}

ChannelMixBufferProviderIt is also used to convert the channel mask format/channel number of audio data. The sampling rate and sampling format of the audio data remain unchanged. The ChannelMixBufferProviderimplementation of each member function of the class is as follows:

ChannelMixBufferProvider::ChannelMixBufferProvider(audio_channel_mask_t inputChannelMask,
        audio_channel_mask_t outputChannelMask, audio_format_t format,
        size_t bufferFrameCount) :
        CopyBufferProvider(
                audio_bytes_per_sample(format)
                    * audio_channel_count_from_out_mask(inputChannelMask),
                audio_bytes_per_sample(format)
                    * audio_channel_count_from_out_mask(outputChannelMask),
                bufferFrameCount)
{
    ALOGV("ChannelMixBufferProvider(%p)(%#x, %#x, %#x)",
            this, format, inputChannelMask, outputChannelMask);
    if (outputChannelMask == AUDIO_CHANNEL_OUT_STEREO && format == AUDIO_FORMAT_PCM_FLOAT) {
        mIsValid = mChannelMix.setInputChannelMask(inputChannelMask);
    }
}

void ChannelMixBufferProvider::copyFrames(void *dst, const void *src, size_t frames)
{
    mChannelMix.process(static_cast<const float *>(src), static_cast<float *>(dst),
            frames, false /* accumulate */);
}

ChannelMixBufferProviderBased on audio_utils::channels::ChannelMiximplementation, the output data is required to be two-channel stereo, and the sampling format is floating point.

ReformatBufferProviderUsed to convert the sampling format of audio data. The sampling rate and channel mask format of the audio data remain unchanged. The ReformatBufferProviderimplementation of each member function of the class is as follows:

ReformatBufferProvider::ReformatBufferProvider(int32_t channelCount,
        audio_format_t inputFormat, audio_format_t outputFormat,
        size_t bufferFrameCount) :
        CopyBufferProvider(
                channelCount * audio_bytes_per_sample(inputFormat),
                channelCount * audio_bytes_per_sample(outputFormat),
                bufferFrameCount),
        mChannelCount(channelCount),
        mInputFormat(inputFormat),
        mOutputFormat(outputFormat)
{
    ALOGV("ReformatBufferProvider(%p)(%u, %#x, %#x)",
            this, channelCount, inputFormat, outputFormat);
}

void ReformatBufferProvider::copyFrames(void *dst, const void *src, size_t frames)
{
    memcpy_by_audio_format(dst, mOutputFormat, src, mInputFormat, frames * mChannelCount);
}

ClampFloatBufferProviderIt is used to limit the intensity of the audio data to -3 dB, while other things remain unchanged. ClampFloatBufferProviderThe implementation of each member function of the class is as follows:

ClampFloatBufferProvider::ClampFloatBufferProvider(int32_t channelCount, size_t bufferFrameCount) :
        CopyBufferProvider(
                channelCount * audio_bytes_per_sample(AUDIO_FORMAT_PCM_FLOAT),
                channelCount * audio_bytes_per_sample(AUDIO_FORMAT_PCM_FLOAT),
                bufferFrameCount),
        mChannelCount(channelCount)
{
    ALOGV("ClampFloatBufferProvider(%p)(%u)", this, channelCount);
}

void ClampFloatBufferProvider::copyFrames(void *dst, const void *src, size_t frames)
{
    memcpy_to_float_from_float_with_clamping((float*)dst, (const float*)src,
                                             frames * mChannelCount,
                                             FLOAT_NOMINAL_RANGE_HEADROOM);
}

ClampFloatBufferProviderThe input and output sampling format is required to be floating point.

TimestretchBufferProviderIt is used to time stretch audio data. It is implemented based on the third-party library sonic . TimestretchBufferProviderUsed when playing at double speed.

AdjustChannelsBufferProviderUsed to adjust sampled data, expanding or contracting it from one interleaved channel format to another. Additional extended channels are padded with zeros and placed at the end of each audio frame. Shrinking channels are copied to the end of the output buffer (storage should be allocated appropriately). Shrinking channels can be written to the output buffer and adjusted. When the shrink channels in the shrink buffer are adjusted, the input channel count will be calculated as 'inChannelCount - outChannelCount'. The output channel count is provided by the caller, it is ' contractedOutChannelCount '. Currently, audio-coupled haptic playback mainly adopts the method of adjusting the contraction channel. If the device supports two haptic channels and the application only provides one haptic channel, the second haptic channel will duplicate the data from the first haptic channel. If the device supports a single haptic channel and the application provides two haptic channels, the second channel will be shrunk.

In AudioMixer::postCreateTrack()the function, after initializing the Track object, nodes will be created for the audio data processing pipeline and the pipeline will be built:

void AudioMixer::Track::unprepareForDownmix() {
    ALOGV("AudioMixer::unprepareForDownmix(%p)", this);

    if (mPostDownmixReformatBufferProvider.get() != nullptr) {
        // release any buffers held by the mPostDownmixReformatBufferProvider
        // before deallocating the mDownmixerBufferProvider.
        mPostDownmixReformatBufferProvider->reset();
    }

    mDownmixRequiresFormat = AUDIO_FORMAT_INVALID;
    if (mDownmixerBufferProvider.get() != nullptr) {
        // this track had previously been configured with a downmixer, delete it
        mDownmixerBufferProvider.reset(nullptr);
        reconfigureBufferProviders();
    } else {
        ALOGV(" nothing to do, no downmixer to delete");
    }
}

status_t AudioMixer::Track::prepareForDownmix()
{
    ALOGV("AudioMixer::prepareForDownmix(%p) with mask 0x%x",
            this, channelMask);

    // discard the previous downmixer if there was one
    unprepareForDownmix();
    // MONO_HACK Only remix (upmix or downmix) if the track and mixer/device channel masks
    // are not the same and not handled internally, as mono for channel position masks is.
    if (channelMask == mMixerChannelMask
            || (channelMask == AUDIO_CHANNEL_OUT_MONO
                    && isAudioChannelPositionMask(mMixerChannelMask))) {
        return NO_ERROR;
    }
    // DownmixerBufferProvider is only used for position masks.
    if (audio_channel_mask_get_representation(channelMask)
                == AUDIO_CHANNEL_REPRESENTATION_POSITION
            && DownmixerBufferProvider::isMultichannelCapable()) {

        // Check if we have a float or int16 downmixer, in that order.
        for (const audio_format_t format : { AUDIO_FORMAT_PCM_FLOAT, AUDIO_FORMAT_PCM_16_BIT }) {
            mDownmixerBufferProvider.reset(new DownmixerBufferProvider(
                    channelMask, mMixerChannelMask,
                    format,
                    sampleRate, sessionId, kCopyBufferFrameCount));
            if (static_cast<DownmixerBufferProvider *>(mDownmixerBufferProvider.get())
                    ->isValid()) {
                mDownmixRequiresFormat = format;
                reconfigureBufferProviders();
                return NO_ERROR;
            }
        }
        // mDownmixerBufferProvider reset below.
    }

    // See if we should use our built-in non-effect downmixer.
    if (mMixerInFormat == AUDIO_FORMAT_PCM_FLOAT
            && mMixerChannelMask == AUDIO_CHANNEL_OUT_STEREO
            && audio_channel_mask_get_representation(channelMask)
                    == AUDIO_CHANNEL_REPRESENTATION_POSITION) {
        mDownmixerBufferProvider.reset(new ChannelMixBufferProvider(channelMask,
                mMixerChannelMask, mMixerInFormat, kCopyBufferFrameCount));
        if (static_cast<ChannelMixBufferProvider *>(mDownmixerBufferProvider.get())
                ->isValid()) {
            mDownmixRequiresFormat = mMixerInFormat;
            reconfigureBufferProviders();
            ALOGD("%s: Fallback using ChannelMix", __func__);
            return NO_ERROR;
        } else {
            ALOGD("%s: ChannelMix not supported for channel mask %#x", __func__, channelMask);
        }
    }

    // Effect downmixer does not accept the channel conversion.  Let's use our remixer.
    mDownmixerBufferProvider.reset(new RemixBufferProvider(channelMask,
            mMixerChannelMask, mMixerInFormat, kCopyBufferFrameCount));
    // Remix always finds a conversion whereas Downmixer effect above may fail.
    reconfigureBufferProviders();
    return NO_ERROR;
}

void AudioMixer::Track::unprepareForReformat() {
    ALOGV("AudioMixer::unprepareForReformat(%p)", this);
    bool requiresReconfigure = false;
    if (mReformatBufferProvider.get() != nullptr) {
        mReformatBufferProvider.reset(nullptr);
        requiresReconfigure = true;
    }
    if (mPostDownmixReformatBufferProvider.get() != nullptr) {
        mPostDownmixReformatBufferProvider.reset(nullptr);
        requiresReconfigure = true;
    }
    if (requiresReconfigure) {
        reconfigureBufferProviders();
    }
}

status_t AudioMixer::Track::prepareForReformat()
{
    ALOGV("AudioMixer::prepareForReformat(%p) with format %#x", this, mFormat);
    // discard previous reformatters
    unprepareForReformat();
    // only configure reformatters as needed
    const audio_format_t targetFormat = mDownmixRequiresFormat != AUDIO_FORMAT_INVALID
            ? mDownmixRequiresFormat : mMixerInFormat;
    bool requiresReconfigure = false;
    if (mFormat != targetFormat) {
        mReformatBufferProvider.reset(new ReformatBufferProvider(
                audio_channel_count_from_out_mask(channelMask),
                mFormat,
                targetFormat,
                kCopyBufferFrameCount));
        requiresReconfigure = true;
    } else if (mFormat == AUDIO_FORMAT_PCM_FLOAT) {
        // Input and output are floats, make sure application did not provide > 3db samples
        // that would break volume application (b/68099072)
        // TODO: add a trusted source flag to avoid the overhead
        mReformatBufferProvider.reset(new ClampFloatBufferProvider(
                audio_channel_count_from_out_mask(channelMask),
                kCopyBufferFrameCount));
        requiresReconfigure = true;
    }
    if (targetFormat != mMixerInFormat) {
        mPostDownmixReformatBufferProvider.reset(new ReformatBufferProvider(
                audio_channel_count_from_out_mask(mMixerChannelMask),
                targetFormat,
                mMixerInFormat,
                kCopyBufferFrameCount));
        requiresReconfigure = true;
    }
    if (requiresReconfigure) {
        reconfigureBufferProviders();
    }
    return NO_ERROR;
}

void AudioMixer::Track::unprepareForAdjustChannels()
{
    ALOGV("AUDIOMIXER::unprepareForAdjustChannels");
    if (mAdjustChannelsBufferProvider.get() != nullptr) {
        mAdjustChannelsBufferProvider.reset(nullptr);
        reconfigureBufferProviders();
    }
}

status_t AudioMixer::Track::prepareForAdjustChannels(size_t frames)
{
    ALOGV("AudioMixer::prepareForAdjustChannels(%p) with inChannelCount: %u, outChannelCount: %u",
            this, mAdjustInChannelCount, mAdjustOutChannelCount);
    unprepareForAdjustChannels();
    if (mAdjustInChannelCount != mAdjustOutChannelCount) {
        uint8_t* buffer = mKeepContractedChannels
                ? (uint8_t*)mainBuffer + frames * audio_bytes_per_frame(
                        mMixerChannelCount, mMixerFormat)
                : nullptr;
        mAdjustChannelsBufferProvider.reset(new AdjustChannelsBufferProvider(
                mFormat, mAdjustInChannelCount, mAdjustOutChannelCount, frames,
                mKeepContractedChannels ? mMixerFormat : AUDIO_FORMAT_INVALID,
                buffer, mMixerHapticChannelCount));
        reconfigureBufferProviders();
    }
    return NO_ERROR;
}

void AudioMixer::Track::clearContractedBuffer()
{
    if (mAdjustChannelsBufferProvider.get() != nullptr) {
        static_cast<AdjustChannelsBufferProvider*>(
                mAdjustChannelsBufferProvider.get())->clearContractedFrames();
    }
}

After creating nodes for the audio data processing pipeline, AudioMixer::Track::reconfigureBufferProviders()build the pipeline by:

void AudioMixer::Track::reconfigureBufferProviders()
{
    // configure from upstream to downstream buffer providers.
    bufferProvider = mInputBufferProvider;
    if (mAdjustChannelsBufferProvider.get() != nullptr) {
        mAdjustChannelsBufferProvider->setBufferProvider(bufferProvider);
        bufferProvider = mAdjustChannelsBufferProvider.get();
    }
    if (mReformatBufferProvider.get() != nullptr) {
        mReformatBufferProvider->setBufferProvider(bufferProvider);
        bufferProvider = mReformatBufferProvider.get();
    }
    if (mDownmixerBufferProvider.get() != nullptr) {
        mDownmixerBufferProvider->setBufferProvider(bufferProvider);
        bufferProvider = mDownmixerBufferProvider.get();
    }
    if (mPostDownmixReformatBufferProvider.get() != nullptr) {
        mPostDownmixReformatBufferProvider->setBufferProvider(bufferProvider);
        bufferProvider = mPostDownmixReformatBufferProvider.get();
    }
    if (mTimestretchBufferProvider.get() != nullptr) {
        mTimestretchBufferProvider->setBufferProvider(bufferProvider);
        bufferProvider = mTimestretchBufferProvider.get();
    }
}

AudioMixer::TrackThe built audio data processing pipeline will look like the following figure:

Audio pipeline in AudioMixer::Track

AudioMixer::TrackThe built audio data processing pipeline does not perform resampling.

AudioMixerA function is provided setBufferProvider()to set the audio data source for an audio source of the mixer. The definition of this function is as follows:

void AudioMixer::setBufferProvider(int name, AudioBufferProvider* bufferProvider)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    const std::shared_ptr<Track> &track = getTrack(name);

    if (track->mInputBufferProvider == bufferProvider) {
        return; // don't reset any buffer providers if identical.
    }
    // reset order from downstream to upstream buffer providers.
    if (track->mTimestretchBufferProvider.get() != nullptr) {
        track->mTimestretchBufferProvider->reset();
    } else if (track->mPostDownmixReformatBufferProvider.get() != nullptr) {
        track->mPostDownmixReformatBufferProvider->reset();
    } else if (track->mDownmixerBufferProvider != nullptr) {
        track->mDownmixerBufferProvider->reset();
    } else if (track->mReformatBufferProvider.get() != nullptr) {
        track->mReformatBufferProvider->reset();
    } else if (track->mAdjustChannelsBufferProvider.get() != nullptr) {
        track->mAdjustChannelsBufferProvider->reset();
    }

    track->mInputBufferProvider = bufferProvider;
    track->reconfigureBufferProviders();
}

setBufferProvider()The function needs to pass in the identifier and object of an audio source AudioBufferProvider. Adding the audio data source node, AudioMixer::Trackthe audio data processing pipeline built in this way will look like the following figure:

Audio pipeline in AudioMixer::Track 2

Mix processing

AudioMixerThe user calls its process()function to perform mixing operations. The data of each audio source during mixing ultimately comes from AudioBufferProviderthe object set for it. The data after mixing is saved in the object set for each audio source MAIN_BUFFER. AudioMixerEach audio source Track contained in will MAIN_BUFFERbe grouped according to its , and MAIN_BUFFERaudio data sharing the same audio source will be mixed together.

AudioMixerfunction , that is, the function of process()its parent class , the function definition (located in frameworks/av/media/libaudioprocessing/include/media/AudioMixerBase.h ) is as follows:AudioMixerBaseprocess()

    void        process() {
        preProcess();
        (this->*mHook)();
        postProcess();
    }
 . . . . . .
    // preProcess is called before the process hook, postProcess after,
    // see the implementation of process() method.
    virtual void preProcess() {}
    virtual void postProcess() {}
 . . . . . .
    // Called when track info changes and a new process hook should be determined.
    void invalidate() {
        mHook = &AudioMixerBase::process__validate;
    }
 . . . . . .
    process_hook_t mHook = &AudioMixerBase::process__nop;   // one of process__*, never nullptr

process()In different situations, the function mHookcalls different functions through the class member function pointer process__*to perform processing operations. It is mHookcalled before being called to perform pre-processing preProcess(), and it is called after being called to postProcess()perform post-processing. AudioMixerThe mixing pre-processing and post-processing are as shown in the following code (located in frameworks/av/media/libaudioprocessing/AudioMixer.cpp ):

void AudioMixer::preProcess()
{
    for (const auto &pair : mTracks) {
        // Clear contracted buffer before processing if contracted channels are saved
        const std::shared_ptr<TrackBase> &tb = pair.second;
        Track *t = static_cast<Track*>(tb.get());
        if (t->mKeepContractedChannels) {
            t->clearContractedBuffer();
        }
    }
}

void AudioMixer::postProcess()
{
    // Process haptic data.
    // Need to keep consistent with VibrationEffect.scale(int, float, int)
    for (const auto &pair : mGroups) {
        // process by group of tracks with same output main buffer.
        const auto &group = pair.second;
        for (const int name : group) {
            const std::shared_ptr<Track> &t = getTrack(name);
            if (t->mHapticPlaybackEnabled) {
                size_t sampleCount = mFrameCount * t->mMixerHapticChannelCount;
                uint8_t* buffer = (uint8_t*)pair.first + mFrameCount * audio_bytes_per_frame(
                        t->mMixerChannelCount, t->mMixerFormat);
                switch (t->mMixerFormat) {
                // Mixer format should be AUDIO_FORMAT_PCM_FLOAT.
                case AUDIO_FORMAT_PCM_FLOAT: {
                    os::scaleHapticData((float*) buffer, sampleCount, t->mHapticIntensity,
                                        t->mHapticMaxAmplitude);
                } break;
                default:
                    LOG_ALWAYS_FATAL("bad mMixerFormat: %#x", t->mMixerFormat);
                    break;
                }
                break;
            }
        }
    }
}

They mainly deal with the tactile channel.

AudioMixerBaseThe mHookdefault points to AudioMixerBase::process__nop()the function, AudioMixerBase::process__nop()the function definition (located in frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp ) is as follows:

void AudioMixerBase::process__nop()
{
    ALOGVV("process__nop\n");

    for (const auto &pair : mGroups) {
        // process by group of tracks with same output buffer to
        // avoid multiple memset() on same buffer
        const auto &group = pair.second;

        const std::shared_ptr<TrackBase> &t = mTracks[group[0]];
        memset(t->mainBuffer, 0,
                mFrameCount * audio_bytes_per_frame(t->getMixerChannelCount(), t->mMixerFormat));

        // now consume data
        for (const int name : group) {
            const std::shared_ptr<TrackBase> &t = mTracks[name];
            size_t outFrames = mFrameCount;
            while (outFrames) {
                t->buffer.frameCount = outFrames;
                t->bufferProvider->getNextBuffer(&t->buffer);
                if (t->buffer.raw == NULL) break;
                outFrames -= t->buffer.frameCount;
                t->bufferProvider->releaseBuffer(&t->buffer);
            }
        }
    }
}

AudioMixerBase::process__nop()The function is processed according to the audio source group. It clears the first element of the audio source group MAIN_BUFFERand obtains a piece of data from each audio source Track to prevent timing caused by the accumulation of audio data in the Track's audio data processing pipeline. question. Referring to AudioMixerBase::process__validate()the definition of functions, AudioMixerBase::process__nop()it seems that functions should also be grouped first.

When AudioMixerthe audio source of is enabled, disabled, or some parameters are modified, the function AudioMixerof invalidate()will be called to reset mHookthe pointing AudioMixerBase::process__validate()function. As shown in the following code (located in frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp ):

void AudioMixerBase::destroy(int name)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    ALOGV("deleteTrackName(%d)", name);

    if (mTracks[name]->enabled) {
        invalidate();
    }
    mTracks.erase(name); // deallocate track
}

void AudioMixerBase::enable(int name)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    const std::shared_ptr<TrackBase> &track = mTracks[name];

    if (!track->enabled) {
        track->enabled = true;
        ALOGV("enable(%d)", name);
        invalidate();
    }
}

void AudioMixerBase::disable(int name)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    const std::shared_ptr<TrackBase> &track = mTracks[name];

    if (track->enabled) {
        track->enabled = false;
        ALOGV("disable(%d)", name);
        invalidate();
    }
}
 . . . . . .
void AudioMixerBase::setParameter(int name, int target, int param, void *value)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    const std::shared_ptr<TrackBase> &track = mTracks[name];

    int valueInt = static_cast<int>(reinterpret_cast<uintptr_t>(value));
    int32_t *valueBuf = reinterpret_cast<int32_t*>(value);

    switch (target) {

    case TRACK:
        switch (param) {
        case CHANNEL_MASK: {
            const audio_channel_mask_t trackChannelMask =
                static_cast<audio_channel_mask_t>(valueInt);
            if (setChannelMasks(name, trackChannelMask, track->mMixerChannelMask)) {
                ALOGV("setParameter(TRACK, CHANNEL_MASK, %x)", trackChannelMask);
                invalidate();
            }
            } break;
        case MAIN_BUFFER:
            if (track->mainBuffer != valueBuf) {
                track->mainBuffer = valueBuf;
                ALOGV("setParameter(TRACK, MAIN_BUFFER, %p)", valueBuf);
                invalidate();
            }
            break;
        case AUX_BUFFER:
            if (track->auxBuffer != valueBuf) {
                track->auxBuffer = valueBuf;
                ALOGV("setParameter(TRACK, AUX_BUFFER, %p)", valueBuf);
                invalidate();
            }
            break;
        case FORMAT: {
            audio_format_t format = static_cast<audio_format_t>(valueInt);
            if (track->mFormat != format) {
                ALOG_ASSERT(audio_is_linear_pcm(format), "Invalid format %#x", format);
                track->mFormat = format;
                ALOGV("setParameter(TRACK, FORMAT, %#x)", format);
                invalidate();
            }
            } break;

AudioMixerBase::process__validate()The function is the entrance to the actual mixing processing operation. The function definition (located in frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp ) is as follows:

void AudioMixerBase::process__validate()
{
    // TODO: fix all16BitsStereNoResample logic to
    // either properly handle muted tracks (it should ignore them)
    // or remove altogether as an obsolete optimization.
    bool all16BitsStereoNoResample = true;
    bool resampling = false;
    bool volumeRamp = false;

    mEnabled.clear();
    mGroups.clear();
    for (const auto &pair : mTracks) {
        const int name = pair.first;
        const std::shared_ptr<TrackBase> &t = pair.second;
        if (!t->enabled) continue;

        mEnabled.emplace_back(name);  // we add to mEnabled in order of name.
        mGroups[t->mainBuffer].emplace_back(name); // mGroups also in order of name.

        uint32_t n = 0;
        // FIXME can overflow (mask is only 3 bits)
        n |= NEEDS_CHANNEL_1 + t->channelCount - 1;
        if (t->doesResample()) {
            n |= NEEDS_RESAMPLE;
        }
        if (t->auxLevel != 0 && t->auxBuffer != NULL) {
            n |= NEEDS_AUX;
        }

        if (t->volumeInc[0]|t->volumeInc[1]) {
            volumeRamp = true;
        } else if (!t->doesResample() && t->volumeRL == 0) {
            n |= NEEDS_MUTE;
        }
        t->needs = n;

        if (n & NEEDS_MUTE) {
            t->hook = &TrackBase::track__nop;
        } else {
            if (n & NEEDS_AUX) {
                all16BitsStereoNoResample = false;
            }
            if (n & NEEDS_RESAMPLE) {
                all16BitsStereoNoResample = false;
                resampling = true;
                if ((n & NEEDS_CHANNEL_COUNT__MASK) == NEEDS_CHANNEL_1
                        && t->channelMask == AUDIO_CHANNEL_OUT_MONO // MONO_HACK
                        && isAudioChannelPositionMask(t->mMixerChannelMask)) {
                    t->hook = TrackBase::getTrackHook(
                            TRACKTYPE_RESAMPLEMONO, t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                } else if ((n & NEEDS_CHANNEL_COUNT__MASK) >= NEEDS_CHANNEL_2
                        && t->useStereoVolume()) {
                    t->hook = TrackBase::getTrackHook(
                            TRACKTYPE_RESAMPLESTEREO, t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                } else {
                    t->hook = TrackBase::getTrackHook(
                            TRACKTYPE_RESAMPLE, t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                }
                ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
                        "Track %d needs downmix + resample", name);
            } else {
                if ((n & NEEDS_CHANNEL_COUNT__MASK) == NEEDS_CHANNEL_1){
                    t->hook = TrackBase::getTrackHook(
                            (isAudioChannelPositionMask(t->mMixerChannelMask)  // TODO: MONO_HACK
                                    && t->channelMask == AUDIO_CHANNEL_OUT_MONO)
                                ? TRACKTYPE_NORESAMPLEMONO : TRACKTYPE_NORESAMPLE,
                            t->mMixerChannelCount,
                            t->mMixerInFormat, t->mMixerFormat);
                    all16BitsStereoNoResample = false;
                }
                if ((n & NEEDS_CHANNEL_COUNT__MASK) >= NEEDS_CHANNEL_2){
                    t->hook = TrackBase::getTrackHook(
                            t->useStereoVolume() ? TRACKTYPE_NORESAMPLESTEREO
                                    : TRACKTYPE_NORESAMPLE,
                            t->mMixerChannelCount, t->mMixerInFormat,
                            t->mMixerFormat);
                    ALOGV_IF((n & NEEDS_CHANNEL_COUNT__MASK) > NEEDS_CHANNEL_2,
                            "Track %d needs downmix", name);
                }
            }
        }
    }

    // select the processing hooks
    mHook = &AudioMixerBase::process__nop;
    if (mEnabled.size() > 0) {
        if (resampling) {
            if (mOutputTemp.get() == nullptr) {
                mOutputTemp.reset(new int32_t[MAX_NUM_CHANNELS * mFrameCount]);
            }
            if (mResampleTemp.get() == nullptr) {
                mResampleTemp.reset(new int32_t[MAX_NUM_CHANNELS * mFrameCount]);
            }
            mHook = &AudioMixerBase::process__genericResampling;
        } else {
            // we keep temp arrays around.
            mHook = &AudioMixerBase::process__genericNoResampling;
            if (all16BitsStereoNoResample && !volumeRamp) {
                if (mEnabled.size() == 1) {
                    const std::shared_ptr<TrackBase> &t = mTracks[mEnabled[0]];
                    if ((t->needs & NEEDS_MUTE) == 0) {
                        // The check prevents a muted track from acquiring a process hook.
                        //
                        // This is dangerous if the track is MONO as that requires
                        // special case handling due to implicit channel duplication.
                        // Stereo or Multichannel should actually be fine here.
                        mHook = getProcessHook(PROCESSTYPE_NORESAMPLEONETRACK,
                                t->mMixerChannelCount, t->mMixerInFormat, t->mMixerFormat,
                                t->useStereoVolume());
                    }
                }
            }
        }
    }

    ALOGV("mixer configuration change: %zu "
        "all16BitsStereoNoResample=%d, resampling=%d, volumeRamp=%d",
        mEnabled.size(), all16BitsStereoNoResample, resampling, volumeRamp);

    process();

    // Now that the volume ramp has been done, set optimal state and
    // track hooks for subsequent mixer process
    if (mEnabled.size() > 0) {
        bool allMuted = true;

        for (const int name : mEnabled) {
            const std::shared_ptr<TrackBase> &t = mTracks[name];
            if (!t->doesResample() && t->volumeRL == 0) {
                t->needs |= NEEDS_MUTE;
                t->hook = &TrackBase::track__nop;
            } else {
                allMuted = false;
            }
        }
        if (allMuted) {
            mHook = &AudioMixerBase::process__nop;
        } else if (all16BitsStereoNoResample) {
            if (mEnabled.size() == 1) {
                //const int i = 31 - __builtin_clz(enabledTracks);
                const std::shared_ptr<TrackBase> &t = mTracks[mEnabled[0]];
                // Muted single tracks handled by allMuted above.
                mHook = getProcessHook(PROCESSTYPE_NORESAMPLEONETRACK,
                        t->mMixerChannelCount, t->mMixerInFormat, t->mMixerFormat,
                        t->useStereoVolume());
            }
        }
    }
}

AudioMixerBase::process__validate()The execution process of the function is as follows:

  1. MAIN_BUFFERGroup all enabled audio sources according to their and initialize each audio source hook. The latter is used to mix the data of the current audio source into the output data, while checking whether there is a need for resampling and whether a volume ramp is performed. Operation needs, and whether all are 16-bit two-channel stereo and do not require resampling. The large initialization t->hookcode here seems to define a function specifically for TrackBase, and it is more reasonable to put this code in ;
  2. According to the overall situation of all audio sources in the mixer, reset AudioMixerto mHookpoint to the appropriate processing function, and allocate the temporary buffer and output buffer used in resampling as needed;
  3. Recursively call process()the function to perform mixing processing operations;
  4. At this time, the volume ramp operation has been completed, and the audio source Track of the mixer hookand the mixing processing operation callback of the mixer are reset mHookas needed for subsequent mixing operations.

If AudioMixerthe mixing processing operation of is flattened, it will have an execution process similar to the following figure:

AudioMixer process

When AudioMixerthe audio sources of are enabled, disabled, or certain parameters are modified, AudioMixerBase::process__validate()the function will first be executed to MAIN_BUFFERgroup them according to the audio sources of the mixer and initialize the subsequent mixing processing according to their configuration. operation, and then AudioMixerBase::process__*perform mixing processing through other functions.

When there is an audio source that needs to be resampled during the mixing operation, the mixing operation is mainly AudioMixerBase::process__genericResampling()completed by the function, which is defined as follows:

// generic code with resampling
void AudioMixerBase::process__genericResampling()
{
    ALOGVV("process__genericResampling\n");
    int32_t * const outTemp = mOutputTemp.get(); // naked ptr
    size_t numFrames = mFrameCount;

    for (const auto &pair : mGroups) {
        const auto &group = pair.second;
        const std::shared_ptr<TrackBase> &t1 = mTracks[group[0]];

        // clear temp buffer
        memset(outTemp, 0, sizeof(*outTemp) * t1->mMixerChannelCount * mFrameCount);
        for (const int name : group) {
            const std::shared_ptr<TrackBase> &t = mTracks[name];
            int32_t *aux = NULL;
            if (CC_UNLIKELY(t->needs & NEEDS_AUX)) {
                aux = t->auxBuffer;
            }

            // this is a little goofy, on the resampling case we don't
            // acquire/release the buffers because it's done by
            // the resampler.
            if (t->needs & NEEDS_RESAMPLE) {
                (t.get()->*t->hook)(outTemp, numFrames, mResampleTemp.get() /* naked ptr */, aux);
            } else {

                size_t outFrames = 0;

                while (outFrames < numFrames) {
                    t->buffer.frameCount = numFrames - outFrames;
                    t->bufferProvider->getNextBuffer(&t->buffer);
                    t->mIn = t->buffer.raw;
                    // t->mIn == nullptr can happen if the track was flushed just after having
                    // been enabled for mixing.
                    if (t->mIn == nullptr) break;

                    (t.get()->*t->hook)(
                            outTemp + outFrames * t->mMixerChannelCount, t->buffer.frameCount,
                            mResampleTemp.get() /* naked ptr */,
                            aux != nullptr ? aux + outFrames : nullptr);
                    outFrames += t->buffer.frameCount;

                    t->bufferProvider->releaseBuffer(&t->buffer);
                }
            }
        }
        convertMixerFormat(t1->mainBuffer, t1->mMixerFormat,
                outTemp, t1->mMixerInFormat, numFrames * t1->mMixerChannelCount);
    }
}

The execution process of this function is roughly as follows:

  1. Perform mixing operations for each audio source group of the mixer;
  2. For each audio source in the audio source group, if it needs to be resampled, call the audio source Track to hookresample the audio data of the audio source and mix it into a temporary buffer. If it does not need to be resampled, repeat Obtain a piece of audio data from the audio source Track and mix it into a temporary buffer until the obtained data reaches the number of output audio frames requested by the mixer. The code here implies that each time it is obtained from the audio source Track The number of audio data and the number of mixer output data are aligned, that is, the latter is an integer multiple of the former. As you can see here, the interface semantics of the audio source Track hookare very confusing and inconsistent. When the audio source needs to be resampled, hookThe operation generates and mixes all the audio frames requested by the mixer at once. When the audio source does not require resampling, each call only mixes a small piece of data that has been obtained. In fact, when the audio source does not require resampling, The entire loop here is not as TrackBaseclear and neat as encapsulating it into a function, similar to the situation where mixing is required ;
  3. Call convertMixerFormat()the function to convert the mixed data into the format and save it into the audio source group MAIN_BUFFER.

AudioMixerBase::convertMixerFormat()The function is defined as follows:

void AudioMixerBase::convertMixerFormat(void *out, audio_format_t mixerOutFormat,
        void *in, audio_format_t mixerInFormat, size_t sampleCount)
{
    switch (mixerInFormat) {
    case AUDIO_FORMAT_PCM_FLOAT:
        switch (mixerOutFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            memcpy(out, in, sampleCount * sizeof(float)); // MEMCPY. TODO optimize out
            break;
        case AUDIO_FORMAT_PCM_16_BIT:
            memcpy_to_i16_from_float((int16_t*)out, (float*)in, sampleCount);
            break;
        default:
            LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
            break;
        }
        break;
    case AUDIO_FORMAT_PCM_16_BIT:
        switch (mixerOutFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            memcpy_to_float_from_q4_27((float*)out, (const int32_t*)in, sampleCount);
            break;
        case AUDIO_FORMAT_PCM_16_BIT:
            memcpy_to_i16_from_q4_27((int16_t*)out, (const int32_t*)in, sampleCount);
            break;
        default:
            LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
            break;
        }
        break;
    default:
        LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
        break;
    }
}

AudioMixerBase::convertMixerFormat()memcpy*Functions perform format conversion through different functions.

When there is no audio source that needs to be resampled during the mixing operation, the mixing operation is mainly AudioMixerBase::process__genericNoResampling()completed by the function, which is defined as follows:

void AudioMixerBase::process__genericNoResampling()
{
    ALOGVV("process__genericNoResampling\n");
    int32_t outTemp[BLOCKSIZE * MAX_NUM_CHANNELS] __attribute__((aligned(32)));

    for (const auto &pair : mGroups) {
        // process by group of tracks with same output main buffer to
        // avoid multiple memset() on same buffer
        const auto &group = pair.second;

        // acquire buffer
        for (const int name : group) {
            const std::shared_ptr<TrackBase> &t = mTracks[name];
            t->buffer.frameCount = mFrameCount;
            t->bufferProvider->getNextBuffer(&t->buffer);
            t->frameCount = t->buffer.frameCount;
            t->mIn = t->buffer.raw;
        }

        int32_t *out = (int *)pair.first;
        size_t numFrames = 0;
        do {
            const size_t frameCount = std::min((size_t)BLOCKSIZE, mFrameCount - numFrames);
            memset(outTemp, 0, sizeof(outTemp));
            for (const int name : group) {
                const std::shared_ptr<TrackBase> &t = mTracks[name];
                int32_t *aux = NULL;
                if (CC_UNLIKELY(t->needs & NEEDS_AUX)) {
                    aux = t->auxBuffer + numFrames;
                }
                for (int outFrames = frameCount; outFrames > 0; ) {
                    // t->in == nullptr can happen if the track was flushed just after having
                    // been enabled for mixing.
                    if (t->mIn == nullptr) {
                        break;
                    }
                    size_t inFrames = (t->frameCount > outFrames)?outFrames:t->frameCount;
                    if (inFrames > 0) {
                        (t.get()->*t->hook)(
                                outTemp + (frameCount - outFrames) * t->mMixerChannelCount,
                                inFrames, mResampleTemp.get() /* naked ptr */, aux);
                        t->frameCount -= inFrames;
                        outFrames -= inFrames;
                        if (CC_UNLIKELY(aux != NULL)) {
                            aux += inFrames;
                        }
                    }
                    if (t->frameCount == 0 && outFrames) {
                        t->bufferProvider->releaseBuffer(&t->buffer);
                        t->buffer.frameCount = (mFrameCount - numFrames) -
                                (frameCount - outFrames);
                        t->bufferProvider->getNextBuffer(&t->buffer);
                        t->mIn = t->buffer.raw;
                        if (t->mIn == nullptr) {
                            break;
                        }
                        t->frameCount = t->buffer.frameCount;
                    }
                }
            }

            const std::shared_ptr<TrackBase> &t1 = mTracks[group[0]];
            convertMixerFormat(out, t1->mMixerFormat, outTemp, t1->mMixerInFormat,
                    frameCount * t1->mMixerChannelCount);
            // TODO: fix ugly casting due to choice of out pointer type
            out = reinterpret_cast<int32_t*>((uint8_t*)out
                    + frameCount * t1->mMixerChannelCount
                    * audio_bytes_per_sample(t1->mMixerFormat));
            numFrames += frameCount;
        } while (numFrames < mFrameCount);

        // release each track's buffer
        for (const int name : group) {
            const std::shared_ptr<TrackBase> &t = mTracks[name];
            t->bufferProvider->releaseBuffer(&t->buffer);
        }
    }
}

The execution process of this function can AudioMixerBase::process__genericResampling()be compared with the execution process of the function. They both perform mixing operations for each audio source group of the mixer, but the former takes a small piece of data from all audio sources each time. Mixing, after a small piece of data is processed, another small piece of data is processed until the number of output audio frames requested by the mixer is reached; the latter is to process one audio source at a time and convert a sufficient number of audio sources into The audio data is mixed into the output data, and then a sufficient amount of audio data from the next audio source is mixed until all audio sources have been processed. In addition, the latter can process up to 16 audio frames at a time, which is 16 audio frames BLOCKSIZE.

If AudioMixerBase::process__genericResampling()the implementation of functions already seems confusing, AudioMixerBase::process__genericNoResampling()the implementation of functions is even more confusing. If you look at these two functions carefully, you will think, what is the meaning of defining these two functions separately? Merging them will look better. In addition, I don't see AudioMixerBase::process__genericNoResampling()the need to process up to 16 audio frames at a time? It is clear that most of the logic of these two functions is the same. Even if there is a slight difference in the writing method of the two functions, it is difficult to hide the fact that they are duplicate codes.

Seeing such poor code written in Android AOSP really makes people wonder how much role code quality plays in Android's success.

AudioMixerBaseAudioMixerBase::process__oneTrack16BitsStereoNoResampling()There are also two optimized processing functions and for situations where there is only one audio source in the mixer and no resampling is required AudioMixerBase::process__noResampleOneTrack(). The usage conditions of these two functions are as AudioMixerBase::getProcessHook()shown in the definition of the function:

/* Returns the proper process hook for mixing tracks. Currently works only for
 * PROCESSTYPE_NORESAMPLEONETRACK, a mix involving one track, no resampling.
 *
 * TODO: Due to the special mixing considerations of duplicating to
 * a stereo output track, the input track cannot be MONO.  This should be
 * prevented by the caller.
 */
/* static */
AudioMixerBase::process_hook_t AudioMixerBase::getProcessHook(
        int processType, uint32_t channelCount,
        audio_format_t mixerInFormat, audio_format_t mixerOutFormat,
        bool stereoVolume)
{
    if (processType != PROCESSTYPE_NORESAMPLEONETRACK) { // Only NORESAMPLEONETRACK
        LOG_ALWAYS_FATAL("bad processType: %d", processType);
        return NULL;
    }
    if (!kUseNewMixer && channelCount == FCC_2 && mixerInFormat == AUDIO_FORMAT_PCM_16_BIT) {
        return &AudioMixerBase::process__oneTrack16BitsStereoNoResampling;
    }
    LOG_ALWAYS_FATAL_IF(channelCount > MAX_NUM_CHANNELS);

    if (stereoVolume) { // templated arguments require explicit values.
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            switch (mixerOutFormat) {
            case AUDIO_FORMAT_PCM_FLOAT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, float /*TO*/,
                        float /*TI*/, TYPE_AUX>;
            case AUDIO_FORMAT_PCM_16_BIT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, int16_t /*TO*/,
                        float /*TI*/, TYPE_AUX>;
            default:
                LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                break;
            }
            break;
        case AUDIO_FORMAT_PCM_16_BIT:
            switch (mixerOutFormat) {
            case AUDIO_FORMAT_PCM_FLOAT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, float /*TO*/,
                        int16_t /*TI*/, TYPE_AUX>;
            case AUDIO_FORMAT_PCM_16_BIT:
                return &AudioMixerBase::process__noResampleOneTrack<
                        MIXTYPE_MULTI_SAVEONLY_STEREOVOL, int16_t /*TO*/,
                        int16_t /*TI*/, TYPE_AUX>;
            default:
                LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                break;
            }
            break;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
    } else {
          switch (mixerInFormat) {
          case AUDIO_FORMAT_PCM_FLOAT:
              switch (mixerOutFormat) {
              case AUDIO_FORMAT_PCM_FLOAT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, float /*TO*/,
                          float /*TI*/, TYPE_AUX>;
              case AUDIO_FORMAT_PCM_16_BIT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, int16_t /*TO*/,
                          float /*TI*/, TYPE_AUX>;
              default:
                  LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                  break;
              }
              break;
          case AUDIO_FORMAT_PCM_16_BIT:
              switch (mixerOutFormat) {
              case AUDIO_FORMAT_PCM_FLOAT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, float /*TO*/,
                          int16_t /*TI*/, TYPE_AUX>;
              case AUDIO_FORMAT_PCM_16_BIT:
                  return &AudioMixerBase::process__noResampleOneTrack<
                          MIXTYPE_MULTI_SAVEONLY, int16_t /*TO*/,
                          int16_t /*TI*/, TYPE_AUX>;
              default:
                  LOG_ALWAYS_FATAL("bad mixerOutFormat: %#x", mixerOutFormat);
                  break;
              }
              break;
          default:
              LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
              break;
          }
    }
    return NULL;
}

The detailed processes of these two processing operations will not be looked at here.

We saw earlier that the audio stream Track hookmixes the audio data of the current stream into the output data. hookThe actual function TrackBase::getTrackHook()is determined by parameters such as whether resampling is required, the input and output sampling format and the number of channels. The specific strategy is as follows: Show:

AudioMixerBase::hook_t AudioMixerBase::TrackBase::getTrackHook(int trackType, uint32_t channelCount,
        audio_format_t mixerInFormat, audio_format_t mixerOutFormat __unused)
{
    if (!kUseNewMixer && channelCount == FCC_2 && mixerInFormat == AUDIO_FORMAT_PCM_16_BIT) {
        switch (trackType) {
        case TRACKTYPE_NOP:
            return &TrackBase::track__nop;
        case TRACKTYPE_RESAMPLE:
            return &TrackBase::track__genericResample;
        case TRACKTYPE_NORESAMPLEMONO:
            return &TrackBase::track__16BitsMono;
        case TRACKTYPE_NORESAMPLE:
            return &TrackBase::track__16BitsStereo;
        default:
            LOG_ALWAYS_FATAL("bad trackType: %d", trackType);
            break;
        }
    }
    LOG_ALWAYS_FATAL_IF(channelCount > MAX_NUM_CHANNELS);
    switch (trackType) {
    case TRACKTYPE_NOP:
        return &TrackBase::track__nop;
    case TRACKTYPE_RESAMPLE:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI, float /*TO*/, float /*TI*/, TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI, int32_t /*TO*/, int16_t /*TI*/, TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_RESAMPLESTEREO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI_STEREOVOL, float /*TO*/, float /*TI*/,
                    TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_MULTI_STEREOVOL, int32_t /*TO*/, int16_t /*TI*/,
                    TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    // RESAMPLEMONO needs MIXTYPE_STEREOEXPAND since resampler will upmix mono
    // track to stereo track
    case TRACKTYPE_RESAMPLEMONO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_STEREOEXPAND, float /*TO*/, float /*TI*/,
                    TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__Resample<
                    MIXTYPE_STEREOEXPAND, int32_t /*TO*/, int16_t /*TI*/,
                    TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_NORESAMPLEMONO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                            MIXTYPE_MONOEXPAND, float /*TO*/, float /*TI*/, TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                            MIXTYPE_MONOEXPAND, int32_t /*TO*/, int16_t /*TI*/, TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_NORESAMPLE:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI, float /*TO*/, float /*TI*/, TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI, int32_t /*TO*/, int16_t /*TI*/, TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    case TRACKTYPE_NORESAMPLESTEREO:
        switch (mixerInFormat) {
        case AUDIO_FORMAT_PCM_FLOAT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI_STEREOVOL, float /*TO*/, float /*TI*/,
                    TYPE_AUX>;
        case AUDIO_FORMAT_PCM_16_BIT:
            return (AudioMixerBase::hook_t) &TrackBase::track__NoResample<
                    MIXTYPE_MULTI_STEREOVOL, int32_t /*TO*/, int16_t /*TI*/,
                    TYPE_AUX>;
        default:
            LOG_ALWAYS_FATAL("bad mixerInFormat: %#x", mixerInFormat);
            break;
        }
        break;
    default:
        LOG_ALWAYS_FATAL("bad trackType: %d", trackType);
        break;
    }
    return NULL;
}

When the audio source of the mixer does not need to be mixed, the audio source Track is hook, TrackBase::track__nop()it does nothing, the function is defined (located frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) as follows:

void AudioMixerBase::TrackBase::track__nop(int32_t* out __unused,
        size_t outFrameCount __unused, int32_t* temp __unused, int32_t* aux __unused)
{
}

When the audio source Track of the mixer needs to be resampled before mixing, and the output sampling format of the audio source Track is, and the number of AUDIO_FORMAT_PCM_16_BIToutput channels of the mixer is two-channel stereo, the audio source Track hookis TrackBase::track__genericResample(), and this function is defined (at frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) as follows:

void AudioMixerBase::TrackBase::track__genericResample(
        int32_t* out, size_t outFrameCount, int32_t* temp, int32_t* aux)
{
    ALOGVV("track__genericResample\n");
    mResampler->setSampleRate(sampleRate);

    // ramp gain - resample to temp buffer and scale/mix in 2nd step
    if (aux != NULL) {
        // always resample with unity gain when sending to auxiliary buffer to be able
        // to apply send level after resampling
        mResampler->setVolume(UNITY_GAIN_FLOAT, UNITY_GAIN_FLOAT);
        memset(temp, 0, outFrameCount * mMixerChannelCount * sizeof(int32_t));
        mResampler->resample(temp, outFrameCount, bufferProvider);
        if (CC_UNLIKELY(volumeInc[0]|volumeInc[1]|auxInc)) {
            volumeRampStereo(out, outFrameCount, temp, aux);
        } else {
            volumeStereo(out, outFrameCount, temp, aux);
        }
    } else {
        if (CC_UNLIKELY(volumeInc[0]|volumeInc[1])) {
            mResampler->setVolume(UNITY_GAIN_FLOAT, UNITY_GAIN_FLOAT);
            memset(temp, 0, outFrameCount * MAX_NUM_CHANNELS * sizeof(int32_t));
            mResampler->resample(temp, outFrameCount, bufferProvider);
            volumeRampStereo(out, outFrameCount, temp, aux);
        }

        // constant gain
        else {
            mResampler->setVolume(mVolume[0], mVolume[1]);
            mResampler->resample(out, outFrameCount, bufferProvider);
        }
    }
}

void AudioMixerBase::TrackBase::track__nop(int32_t* out __unused,
        size_t outFrameCount __unused, int32_t* temp __unused, int32_t* aux __unused)
{
}

void AudioMixerBase::TrackBase::volumeRampStereo(
        int32_t* out, size_t frameCount, int32_t* temp, int32_t* aux)
{
    int32_t vl = prevVolume[0];
    int32_t vr = prevVolume[1];
    const int32_t vlInc = volumeInc[0];
    const int32_t vrInc = volumeInc[1];

    //ALOGD("[0] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
    //        t, vlInc/65536.0f, vl/65536.0f, volume[0],
    //       (vl + vlInc*frameCount)/65536.0f, frameCount);

    // ramp volume
    if (CC_UNLIKELY(aux != NULL)) {
        int32_t va = prevAuxLevel;
        const int32_t vaInc = auxInc;
        int32_t l;
        int32_t r;

        do {
            l = (*temp++ >> 12);
            r = (*temp++ >> 12);
            *out++ += (vl >> 16) * l;
            *out++ += (vr >> 16) * r;
            *aux++ += (va >> 17) * (l + r);
            vl += vlInc;
            vr += vrInc;
            va += vaInc;
        } while (--frameCount);
        prevAuxLevel = va;
    } else {
        do {
            *out++ += (vl >> 16) * (*temp++ >> 12);
            *out++ += (vr >> 16) * (*temp++ >> 12);
            vl += vlInc;
            vr += vrInc;
        } while (--frameCount);
    }
    prevVolume[0] = vl;
    prevVolume[1] = vr;
    adjustVolumeRamp(aux != NULL);
}

void AudioMixerBase::TrackBase::volumeStereo(
        int32_t* out, size_t frameCount, int32_t* temp, int32_t* aux)
{
    const int16_t vl = volume[0];
    const int16_t vr = volume[1];

    if (CC_UNLIKELY(aux != NULL)) {
        const int16_t va = auxLevel;
        do {
            int16_t l = (int16_t)(*temp++ >> 12);
            int16_t r = (int16_t)(*temp++ >> 12);
            out[0] = mulAdd(l, vl, out[0]);
            int16_t a = (int16_t)(((int32_t)l + r) >> 1);
            out[1] = mulAdd(r, vr, out[1]);
            out += 2;
            aux[0] = mulAdd(a, va, aux[0]);
            aux++;
        } while (--frameCount);
    } else {
        do {
            int16_t l = (int16_t)(*temp++ >> 12);
            int16_t r = (int16_t)(*temp++ >> 12);
            out[0] = mulAdd(l, vl, out[0]);
            out[1] = mulAdd(r, vr, out[1]);
            out += 2;
        } while (--frameCount);
    }
}

TrackBase::track__genericResample()The basic processing flow of the function is roughly as follows:

  1. Resample the audio data of the audio source Track;
  2. Mixing is done while processing the volume. If there is Aux data, it is also processed at the same time. The volume processing here has two meanings, that is, if a volume ramp is needed, the volume ramp is done, otherwise the volume gain is applied.

The Android resampler supports resampling while applying volume gain and mixing the processed audio data into the output data. Therefore, if there is no need for volume ramping on the audio source Track and there is no Aux data, the resampling, volume gain application and mixing can be completed directly by the resampler. TrackBase::track__genericResample()The implementation of the function seems to have some repetitive code. If it is written in a way that is more consistent with its overall execution flow, it can be written as follows:

void AudioMixerBase::TrackBase::track__genericResample(
        int32_t* out, size_t outFrameCount, int32_t* temp, int32_t* aux)
{
    ALOGVV("track__genericResample\n");
    mResampler->setSampleRate(sampleRate);

    float left = mVolume[0];
    float right = mVolume[1];
    int32_t* out_inter = out;
    if (aux != NULL || CC_UNLIKELY(volumeInc[0]|volumeInc[1])) {
        left = UNITY_GAIN_FLOAT;
        right = UNITY_GAIN_FLOAT;

        memset(temp, 0, outFrameCount * mMixerChannelCount * sizeof(int32_t));
        out_inter = temp;
    }

    mResampler->setVolume(left, right);
    mResampler->resample(out_inter, outFrameCount, bufferProvider);

    if (CC_UNLIKELY(volumeInc[0]|volumeInc[1]) || (aux != NULL && auxInc)) {
        volumeRampStereo(out, outFrameCount, out_inter, aux);
    } else if (aux != NULL) {
        volumeStereo(out, outFrameCount, out_inter, aux);
    }
}

TrackBase::track__genericResample()The functions call volumeRampStereo()and volumeStereo()to mix the resampled data, the former is called when volume ramping needs to be performed, and the latter is called when it is not needed. AUX data is generated from data from the left and right channels of audio. These two functions can also be written in a more concise way, as follows:

void AudioMixerBase::TrackBase::volumeRampStereo(
        int32_t* out, size_t frameCount, int32_t* temp, int32_t* aux)
{
    int32_t vl = prevVolume[0];
    int32_t vr = prevVolume[1];
    const int32_t vlInc = volumeInc[0];
    const int32_t vrInc = volumeInc[1];

    //ALOGD("[0] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
    //        t, vlInc/65536.0f, vl/65536.0f, volume[0],
    //       (vl + vlInc*frameCount)/65536.0f, frameCount);

    // ramp volume
    int32_t va = prevAuxLevel;
    const int32_t vaInc = auxInc;
    int32_t l;
    int32_t r;

    do {
        l = (*temp++ >> 12);
        r = (*temp++ >> 12);
        *out++ += (vl >> 16) * l;
        *out++ += (vr >> 16) * r;
        vl += vlInc;
        vr += vrInc;
        if (CC_UNLIKELY(aux != NULL)) {
            *aux++ += (va >> 17) * (l + r);
            va += vaInc;
        }
    } while (--frameCount);
    prevAuxLevel = va;
    prevVolume[0] = vl;
    prevVolume[1] = vr;
    adjustVolumeRamp(aux != NULL);
}

void AudioMixerBase::TrackBase::volumeStereo(
        int32_t* out, size_t frameCount, int32_t* temp, int32_t* aux)
{
    const int16_t vl = volume[0];
    const int16_t vr = volume[1];
    const int16_t va = auxLevel;
    do {
        int16_t l = (int16_t)(*temp++ >> 12);
        int16_t r = (int16_t)(*temp++ >> 12);
        out[0] = mulAdd(l, vl, out[0]);
        out[1] = mulAdd(r, vr, out[1]);
        out += 2;

        if (CC_UNLIKELY(aux != NULL)) {
            int16_t a = (int16_t)(((int32_t)l + r) >> 1);
            aux[0] = mulAdd(a, va, aux[0]);
            aux++;
        }
    } while (--frameCount);
}

When the audio source Track of the mixer can be mixed without resampling, and the output sampling format of the audio source Track is AUDIO_FORMAT_PCM_16_BIT, the number of output channels of the audio source Track is mono, and the number of output channels of the mixer is two-channel stereo. , the audio source Track hookis TrackBase::track__16BitsMono(), the function is defined (located frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) as follows:

void AudioMixerBase::TrackBase::track__16BitsMono(
        int32_t* out, size_t frameCount, int32_t* temp __unused, int32_t* aux)
{
    ALOGVV("track__16BitsMono\n");
    const int16_t *in = static_cast<int16_t const *>(mIn);

    if (CC_UNLIKELY(aux != NULL)) {
        // ramp gain
        if (CC_UNLIKELY(volumeInc[0]|volumeInc[1]|auxInc)) {
            int32_t vl = prevVolume[0];
            int32_t vr = prevVolume[1];
            int32_t va = prevAuxLevel;
            const int32_t vlInc = volumeInc[0];
            const int32_t vrInc = volumeInc[1];
            const int32_t vaInc = auxInc;

            // ALOGD("[2] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
            //         t, vlInc/65536.0f, vl/65536.0f, volume[0],
            //         (vl + vlInc*frameCount)/65536.0f, frameCount);

            do {
                int32_t l = *in++;
                *out++ += (vl >> 16) * l;
                *out++ += (vr >> 16) * l;
                *aux++ += (va >> 16) * l;
                vl += vlInc;
                vr += vrInc;
                va += vaInc;
            } while (--frameCount);

            prevVolume[0] = vl;
            prevVolume[1] = vr;
            prevAuxLevel = va;
            adjustVolumeRamp(true);
        }
        // constant gain
        else {
            const int16_t vl = volume[0];
            const int16_t vr = volume[1];
            const int16_t va = (int16_t)auxLevel;
            do {
                int16_t l = *in++;
                out[0] = mulAdd(l, vl, out[0]);
                out[1] = mulAdd(l, vr, out[1]);
                out += 2;
                aux[0] = mulAdd(l, va, aux[0]);
                aux++;
            } while (--frameCount);
        }
    } else {
        // ramp gain
        if (CC_UNLIKELY(volumeInc[0]|volumeInc[1])) {
            int32_t vl = prevVolume[0];
            int32_t vr = prevVolume[1];
            const int32_t vlInc = volumeInc[0];
            const int32_t vrInc = volumeInc[1];

            // ALOGD("[2] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
            //         t, vlInc/65536.0f, vl/65536.0f, volume[0],
            //         (vl + vlInc*frameCount)/65536.0f, frameCount);

            do {
                int32_t l = *in++;
                *out++ += (vl >> 16) * l;
                *out++ += (vr >> 16) * l;
                vl += vlInc;
                vr += vrInc;
            } while (--frameCount);

            prevVolume[0] = vl;
            prevVolume[1] = vr;
            adjustVolumeRamp(false);
        }
        // constant gain
        else {
            const int16_t vl = volume[0];
            const int16_t vr = volume[1];
            do {
                int16_t l = *in++;
                out[0] = mulAdd(l, vl, out[0]);
                out[1] = mulAdd(l, vr, out[1]);
                out += 2;
            } while (--frameCount);
        }
    }
    mIn = in;
}

This function mainly handles four situations:

  1. Aux data needs to be generated and volume ramp processing needs to be performed;
  2. Aux data needs to be generated and volume ramp processing does not need to be performed;
  3. There is no need to generate Aux data, and volume ramp processing needs to be performed;
  4. No Aux data needs to be generated, no volume ramp processing needs to be performed.

This function can also be written in a more concise and clear way, as follows: This

void AudioMixerBase::TrackBase::track__16BitsMono(
        int32_t* out, size_t frameCount, int32_t* temp , int32_t* aux)
{
    ALOGVV("track__16BitsMono\n");
    const int16_t *in = static_cast<int16_t const *>(mIn);

    // ramp gain
    if (CC_UNLIKELY(volumeInc[0]|volumeInc[1]) || (CC_UNLIKELY(aux != NULL && auxInc))) {
        int32_t vl = prevVolume[0];
        int32_t vr = prevVolume[1];
        int32_t va = prevAuxLevel;
        const int32_t vlInc = volumeInc[0];
        const int32_t vrInc = volumeInc[1];
        const int32_t vaInc = auxInc;

        // ALOGD("[2] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
        //         t, vlInc/65536.0f, vl/65536.0f, volume[0],
        //         (vl + vlInc*frameCount)/65536.0f, frameCount);

        do {
            int32_t l = *in++;
            *out++ += (vl >> 16) * l;
            *out++ += (vr >> 16) * l;
            vl += vlInc;
            vr += vrInc;
            if (aux != NULL) {
                *aux++ += (va >> 16) * l;
                va += vaInc;
            }
        } while (--frameCount);

        prevVolume[0] = vl;
        prevVolume[1] = vr;
        prevAuxLevel = va;
        adjustVolumeRamp(true);
    }
    // constant gain
    else {
        const int16_t vl = volume[0];
        const int16_t vr = volume[1];
        const int16_t va = (int16_t)auxLevel;
        do {
            int16_t l = *in++;
            out[0] = mulAdd(l, vl, out[0]);
            out[1] = mulAdd(l, vr, out[1]);
            out += 2;
            if (aux != NULL) {
                aux[0] = mulAdd(l, va, aux[0]);
                aux++;
            }
        } while (--frameCount);
    }
    mIn = in;
}

That is, whether the volume ramp needs to be executed is the main logical division dimension, and whether AUX data needs to be generated is the secondary logical division dimension.

When the audio source Track of the mixer can be mixed without resampling, and the output sampling format of the audio source Track is AUDIO_FORMAT_PCM_16_BIT, the number of output channels of the audio source Track is two-channel stereo, and the number of output channels of the mixer is two-channel stereo. When , the audio source Track hookis TrackBase::track__16BitsStereo(), and the function is defined (located frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) as follows:

void AudioMixerBase::TrackBase::track__16BitsStereo(
        int32_t* out, size_t frameCount, int32_t* temp __unused, int32_t* aux)
{
    ALOGVV("track__16BitsStereo\n");
    const int16_t *in = static_cast<const int16_t *>(mIn);

    if (CC_UNLIKELY(aux != NULL)) {
        int32_t l;
        int32_t r;
        // ramp gain
        if (CC_UNLIKELY(volumeInc[0]|volumeInc[1]|auxInc)) {
            int32_t vl = prevVolume[0];
            int32_t vr = prevVolume[1];
            int32_t va = prevAuxLevel;
            const int32_t vlInc = volumeInc[0];
            const int32_t vrInc = volumeInc[1];
            const int32_t vaInc = auxInc;
            // ALOGD("[1] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
            //        t, vlInc/65536.0f, vl/65536.0f, volume[0],
            //        (vl + vlInc*frameCount)/65536.0f, frameCount);

            do {
                l = (int32_t)*in++;
                r = (int32_t)*in++;
                *out++ += (vl >> 16) * l;
                *out++ += (vr >> 16) * r;
                *aux++ += (va >> 17) * (l + r);
                vl += vlInc;
                vr += vrInc;
                va += vaInc;
            } while (--frameCount);

            prevVolume[0] = vl;
            prevVolume[1] = vr;
            prevAuxLevel = va;
            adjustVolumeRamp(true);
        }

        // constant gain
        else {
            const uint32_t vrl = volumeRL;
            const int16_t va = (int16_t)auxLevel;
            do {
                uint32_t rl = *reinterpret_cast<const uint32_t *>(in);
                int16_t a = (int16_t)(((int32_t)in[0] + in[1]) >> 1);
                in += 2;
                out[0] = mulAddRL(1, rl, vrl, out[0]);
                out[1] = mulAddRL(0, rl, vrl, out[1]);
                out += 2;
                aux[0] = mulAdd(a, va, aux[0]);
                aux++;
            } while (--frameCount);
        }
    } else {
        // ramp gain
        if (CC_UNLIKELY(volumeInc[0]|volumeInc[1])) {
            int32_t vl = prevVolume[0];
            int32_t vr = prevVolume[1];
            const int32_t vlInc = volumeInc[0];
            const int32_t vrInc = volumeInc[1];

            // ALOGD("[1] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
            //        t, vlInc/65536.0f, vl/65536.0f, volume[0],
            //        (vl + vlInc*frameCount)/65536.0f, frameCount);

            do {
                *out++ += (vl >> 16) * (int32_t) *in++;
                *out++ += (vr >> 16) * (int32_t) *in++;
                vl += vlInc;
                vr += vrInc;
            } while (--frameCount);

            prevVolume[0] = vl;
            prevVolume[1] = vr;
            adjustVolumeRamp(false);
        }

        // constant gain
        else {
            const uint32_t vrl = volumeRL;
            do {
                uint32_t rl = *reinterpret_cast<const uint32_t *>(in);
                in += 2;
                out[0] = mulAddRL(1, rl, vrl, out[0]);
                out[1] = mulAddRL(0, rl, vrl, out[1]);
                out += 2;
            } while (--frameCount);
        }
    }
    mIn = in;
}

TrackBase::track__16BitsStereo()The function is roughly the same as the function above TrackBase::track__16BitsMono(), but there are two differences between them:

  1. The two channels of data in the output audio data come from different sources. For the former, the data of the two channels of the output audio data come from the data of different channels in the input data. The latter copies the data of a single channel in the input data and expands it into the output. Dual channel data in the data;
  2. The former uses an operation on some platforms that can use machine instructions to optimize performance when there is no need to perform a volume ramp operation mulAddRL(). However, this operation can also be used when a volume ramp operation needs to be performed .

It seems that it is not impossible to merge TrackBase::track__16BitsStereo()the and functions into one function, as shown below, the version that does not use the operation:TrackBase::track__16BitsMono()mulAddRL()

template<bool stereo>
void AudioMixerBase::TrackBase::track__16Bits(
        int32_t* out, size_t frameCount, int32_t* temp , int32_t* aux)
{
    ALOGVV("track__16BitsMono\n");
    const int16_t *in = static_cast<int16_t const *>(mIn);

    // ramp gain
    if (CC_UNLIKELY(volumeInc[0]|volumeInc[1]) || (CC_UNLIKELY(aux != NULL && auxInc))) {
        int32_t l;
        int32_t r;

        int32_t vl = prevVolume[0];
        int32_t vr = prevVolume[1];
        int32_t va = prevAuxLevel;
        const int32_t vlInc = volumeInc[0];
        const int32_t vrInc = volumeInc[1];
        const int32_t vaInc = auxInc;

        // ALOGD("[2] %p: inc=%f, v0=%f, v1=%d, final=%f, count=%d",
        //         t, vlInc/65536.0f, vl/65536.0f, volume[0],
        //         (vl + vlInc*frameCount)/65536.0f, frameCount);

        do {
            l = *in++;
            if (stereo) {
                r = *in++;
            } else {
                r = l;
            }
            *out++ += (vl >> 16) * l;
            *out++ += (vr >> 16) * r;
            vl += vlInc;
            vr += vrInc;
            if (aux != NULL) {
                if (stereo) {
                    *aux++ += (va >> 17) * (l + r);;
                } else {
                    *aux++ += (va >> 16) * l;
                }
                va += vaInc;
            }
        } while (--frameCount);

        prevVolume[0] = vl;
        prevVolume[1] = vr;
        prevAuxLevel = va;
        adjustVolumeRamp(true);
    }
    // constant gain
    else {
        const int16_t vl = volume[0];
        const int16_t vr = volume[1];
        const int16_t va = (int16_t)auxLevel;
        int16_t l;
        int16_t r;
        do {
            l = *in++;
            if (stereo) {
                r = *in++;
            } else {
                r = l;
            }
            out[0] = mulAdd(l, vl, out[0]);
            out[1] = mulAdd(r, vr, out[1]);
            out += 2;
            if (aux != NULL) {
                if (stereo) {
                    aux[0] = mulAdd((l + r) >> 1, va, aux[0]);
                } else {
                    aux[0] = mulAdd(l, va, aux[0]);
                }
                aux++;
            }
        } while (--frameCount);
    }

    mIn = in;
}

The mixing operation functions of various audio source Tracks above are available when the use of a new mixer is not forced. When a new mixer is forced to be used, or the configuration parameters of the mixer and audio source Track are different from those listed above, mainly depending on whether mixing is required, the template functions and are used respectively. The definitions of AudioMixerBase::TrackBase::track__Resample()these AudioMixerBase::TrackBase::track__NoResample()two functions (located frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) are as follows:

/* This track hook is called to do resampling then mixing,
 * pulling from the track's upstream AudioBufferProvider.
 *
 * MIXTYPE     (see AudioMixerOps.h MIXTYPE_* enumeration)
 * TO: int32_t (Q4.27) or float
 * TI: int32_t (Q4.27) or int16_t (Q0.15) or float
 * TA: int32_t (Q4.27) or float
 */
template <int MIXTYPE, typename TO, typename TI, typename TA>
void AudioMixerBase::TrackBase::track__Resample(TO* out, size_t outFrameCount, TO* temp, TA* aux)
{
    ALOGVV("track__Resample\n");
    mResampler->setSampleRate(sampleRate);
    const bool ramp = needsRamp();
    if (MIXTYPE == MIXTYPE_MONOEXPAND || MIXTYPE == MIXTYPE_STEREOEXPAND // custom volume handling
            || ramp || aux != NULL) {
        // if ramp:        resample with unity gain to temp buffer and scale/mix in 2nd step.
        // if aux != NULL: resample with unity gain to temp buffer then apply send level.

        mResampler->setVolume(UNITY_GAIN_FLOAT, UNITY_GAIN_FLOAT);
        memset(temp, 0, outFrameCount * mMixerChannelCount * sizeof(TO));
        mResampler->resample((int32_t*)temp, outFrameCount, bufferProvider);

        volumeMix<MIXTYPE, std::is_same_v<TI, float> /* USEFLOATVOL */, true /* ADJUSTVOL */>(
                out, outFrameCount, temp, aux, ramp);

    } else { // constant volume gain
        mResampler->setVolume(mVolume[0], mVolume[1]);
        mResampler->resample((int32_t*)out, outFrameCount, bufferProvider);
    }
}

/* This track hook is called to mix a track, when no resampling is required.
 * The input buffer should be present in in.
 *
 * MIXTYPE     (see AudioMixerOps.h MIXTYPE_* enumeration)
 * TO: int32_t (Q4.27) or float
 * TI: int32_t (Q4.27) or int16_t (Q0.15) or float
 * TA: int32_t (Q4.27) or float
 */
template <int MIXTYPE, typename TO, typename TI, typename TA>
void AudioMixerBase::TrackBase::track__NoResample(
        TO* out, size_t frameCount, TO* temp __unused, TA* aux)
{
    ALOGVV("track__NoResample\n");
    const TI *in = static_cast<const TI *>(mIn);

    volumeMix<MIXTYPE, std::is_same_v<TI, float> /* USEFLOATVOL */, true /* ADJUSTVOL */>(
            out, frameCount, in, aux, needsRamp());

    // MIXTYPE_MONOEXPAND reads a single input channel and expands to NCHAN output channels.
    // MIXTYPE_MULTI reads NCHAN input channels and places to NCHAN output channels.
    in += (MIXTYPE == MIXTYPE_MONOEXPAND) ? frameCount : frameCount * mMixerChannelCount;
    mIn = in;
}

AudioMixerBase::TrackBase::track__Resample()Function is AudioMixerBase::TrackBase::track__genericResample()somewhat similar to , but the former is a more general implementation.

Modify or set mixer and its audio source parameters

AudioMixerBaseand AudioMixerboth provide setParameter()functions to modify or set the parameters of the mixer and its audio source. AudioMixer::setParameter()The function definition (located frameworks/av/media/libaudioprocessing/AudioMixer.cpp) is as follows:

void AudioMixer::setParameter(int name, int target, int param, void *value)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    const std::shared_ptr<Track> &track = getTrack(name);

    int valueInt = static_cast<int>(reinterpret_cast<uintptr_t>(value));
    int32_t *valueBuf = reinterpret_cast<int32_t*>(value);

    switch (target) {

    case TRACK:
        switch (param) {
        case CHANNEL_MASK: {
            const audio_channel_mask_t trackChannelMask =
                static_cast<audio_channel_mask_t>(valueInt);
            if (setChannelMasks(name, trackChannelMask,
                    static_cast<audio_channel_mask_t>(
                            track->mMixerChannelMask | track->mMixerHapticChannelMask))) {
                ALOGV("setParameter(TRACK, CHANNEL_MASK, %x)", trackChannelMask);
                invalidate();
            }
            } break;
        case MAIN_BUFFER:
            if (track->mainBuffer != valueBuf) {
                track->mainBuffer = valueBuf;
                ALOGV("setParameter(TRACK, MAIN_BUFFER, %p)", valueBuf);
                if (track->mKeepContractedChannels) {
                    track->prepareForAdjustChannels(mFrameCount);
                }
                invalidate();
            }
            break;
        case AUX_BUFFER:
            AudioMixerBase::setParameter(name, target, param, value);
            break;
        case FORMAT: {
            audio_format_t format = static_cast<audio_format_t>(valueInt);
            if (track->mFormat != format) {
                ALOG_ASSERT(audio_is_linear_pcm(format), "Invalid format %#x", format);
                track->mFormat = format;
                ALOGV("setParameter(TRACK, FORMAT, %#x)", format);
                track->prepareForReformat();
                invalidate();
            }
            } break;
        // FIXME do we want to support setting the downmix type from AudioFlinger?
        //         for a specific track? or per mixer?
        /* case DOWNMIX_TYPE:
            break          */
        case MIXER_FORMAT: {
            audio_format_t format = static_cast<audio_format_t>(valueInt);
            if (track->mMixerFormat != format) {
                track->mMixerFormat = format;
                ALOGV("setParameter(TRACK, MIXER_FORMAT, %#x)", format);
                if (track->mKeepContractedChannels) {
                    track->prepareForAdjustChannels(mFrameCount);
                }
            }
            } break;
        case MIXER_CHANNEL_MASK: {
            const audio_channel_mask_t mixerChannelMask =
                    static_cast<audio_channel_mask_t>(valueInt);
            if (setChannelMasks(name, static_cast<audio_channel_mask_t>(
                                    track->channelMask | track->mHapticChannelMask),
                    mixerChannelMask)) {
                ALOGV("setParameter(TRACK, MIXER_CHANNEL_MASK, %#x)", mixerChannelMask);
                invalidate();
            }
            } break;
        case HAPTIC_ENABLED: {
            const bool hapticPlaybackEnabled = static_cast<bool>(valueInt);
            if (track->mHapticPlaybackEnabled != hapticPlaybackEnabled) {
                track->mHapticPlaybackEnabled = hapticPlaybackEnabled;
                track->mKeepContractedChannels = hapticPlaybackEnabled;
                track->prepareForAdjustChannels(mFrameCount);
            }
            } break;
        case HAPTIC_INTENSITY: {
            const os::HapticScale hapticIntensity = static_cast<os::HapticScale>(valueInt);
            if (track->mHapticIntensity != hapticIntensity) {
                track->mHapticIntensity = hapticIntensity;
            }
            } break;
        case HAPTIC_MAX_AMPLITUDE: {
            const float hapticMaxAmplitude = *reinterpret_cast<float*>(value);
            if (track->mHapticMaxAmplitude != hapticMaxAmplitude) {
                track->mHapticMaxAmplitude = hapticMaxAmplitude;
            }
            } break;
        default:
            LOG_ALWAYS_FATAL("setParameter track: bad param %d", param);
        }
        break;

    case RESAMPLE:
    case RAMP_VOLUME:
    case VOLUME:
        AudioMixerBase::setParameter(name, target, param, value);
        break;
    case TIMESTRETCH:
        switch (param) {
        case PLAYBACK_RATE: {
            const AudioPlaybackRate *playbackRate =
                    reinterpret_cast<AudioPlaybackRate*>(value);
            ALOGW_IF(!isAudioPlaybackRateValid(*playbackRate),
                    "bad parameters speed %f, pitch %f",
                    playbackRate->mSpeed, playbackRate->mPitch);
            if (track->setPlaybackRate(*playbackRate)) {
                ALOGV("setParameter(TIMESTRETCH, PLAYBACK_RATE, STRETCH_MODE, FALLBACK_MODE "
                        "%f %f %d %d",
                        playbackRate->mSpeed,
                        playbackRate->mPitch,
                        playbackRate->mStretchMode,
                        playbackRate->mFallbackMode);
                // invalidate();  (should not require reconfigure)
            }
        } break;
        default:
            LOG_ALWAYS_FATAL("setParameter timestretch: bad param %d", param);
        }
        break;

    default:
        LOG_ALWAYS_FATAL("setParameter: bad target %d", target);
    }
}

AudioMixer::setParameter()The parameters supported by the function include the following:

  • TRACK
    • CHANNEL_MASK : Set the channel mask format for a single audio source of the mixer;
    • MAIN_BUFFER : Set the mixing output data buffer for a single audio source of the mixer. This value should not belong to a single audio source ;
    • AUX_BUFFER : AudioMixerBase::setParameter()Handled by ;
    • FORMAT : Set the sampling format for a single audio source of the mixer;
    • MIXER_FORMAT : Sets the mixer sampling format for a single audio source of the mixer. This value should not belong to a single audio source ;
    • MIXER_CHANNEL_MASK : Sets the mixer channel mask format for a single audio source of the mixer. This value should not belong to a single audio source ;
    • HAPTIC_ENABLED : Sets a haptic switch for a single audio source of the mixer;
    • HAPTIC_INTENSITY : Sets the haptic intensity for a single audio source in the mixer;
    • HAPTIC_MAX_AMPLITUDE : Sets the haptic maximum amplitude for a single audio source in the mixer;
  • RESAMPLE / RAMP_VOLUME / VOLUME : AudioMixerBase::setParameter()Handled by;
  • TIMESTRETCH
    • PLAYBACK_RATE : Sets the playback rate for the mixer's individual audio sources.

AudioMixerBase::setParameter()The function definition (located frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) is as follows:

void AudioMixerBase::setParameter(int name, int target, int param, void *value)
{
    LOG_ALWAYS_FATAL_IF(!exists(name), "invalid name: %d", name);
    const std::shared_ptr<TrackBase> &track = mTracks[name];

    int valueInt = static_cast<int>(reinterpret_cast<uintptr_t>(value));
    int32_t *valueBuf = reinterpret_cast<int32_t*>(value);

    switch (target) {

    case TRACK:
        switch (param) {
        case CHANNEL_MASK: {
            const audio_channel_mask_t trackChannelMask =
                static_cast<audio_channel_mask_t>(valueInt);
            if (setChannelMasks(name, trackChannelMask, track->mMixerChannelMask)) {
                ALOGV("setParameter(TRACK, CHANNEL_MASK, %x)", trackChannelMask);
                invalidate();
            }
            } break;
        case MAIN_BUFFER:
            if (track->mainBuffer != valueBuf) {
                track->mainBuffer = valueBuf;
                ALOGV("setParameter(TRACK, MAIN_BUFFER, %p)", valueBuf);
                invalidate();
            }
            break;
        case AUX_BUFFER:
            if (track->auxBuffer != valueBuf) {
                track->auxBuffer = valueBuf;
                ALOGV("setParameter(TRACK, AUX_BUFFER, %p)", valueBuf);
                invalidate();
            }
            break;
        case FORMAT: {
            audio_format_t format = static_cast<audio_format_t>(valueInt);
            if (track->mFormat != format) {
                ALOG_ASSERT(audio_is_linear_pcm(format), "Invalid format %#x", format);
                track->mFormat = format;
                ALOGV("setParameter(TRACK, FORMAT, %#x)", format);
                invalidate();
            }
            } break;
        case MIXER_FORMAT: {
            audio_format_t format = static_cast<audio_format_t>(valueInt);
            if (track->mMixerFormat != format) {
                track->mMixerFormat = format;
                ALOGV("setParameter(TRACK, MIXER_FORMAT, %#x)", format);
            }
            } break;
        case MIXER_CHANNEL_MASK: {
            const audio_channel_mask_t mixerChannelMask =
                    static_cast<audio_channel_mask_t>(valueInt);
            if (setChannelMasks(name, track->channelMask, mixerChannelMask)) {
                ALOGV("setParameter(TRACK, MIXER_CHANNEL_MASK, %#x)", mixerChannelMask);
                invalidate();
            }
            } break;
        default:
            LOG_ALWAYS_FATAL("setParameter track: bad param %d", param);
        }
        break;

    case RESAMPLE:
        switch (param) {
        case SAMPLE_RATE:
            ALOG_ASSERT(valueInt > 0, "bad sample rate %d", valueInt);
            if (track->setResampler(uint32_t(valueInt), mSampleRate)) {
                ALOGV("setParameter(RESAMPLE, SAMPLE_RATE, %u)",
                        uint32_t(valueInt));
                invalidate();
            }
            break;
        case RESET:
            track->resetResampler();
            invalidate();
            break;
        case REMOVE:
            track->mResampler.reset(nullptr);
            track->sampleRate = mSampleRate;
            invalidate();
            break;
        default:
            LOG_ALWAYS_FATAL("setParameter resample: bad param %d", param);
        }
        break;

    case RAMP_VOLUME:
    case VOLUME:
        switch (param) {
        case AUXLEVEL:
            if (setVolumeRampVariables(*reinterpret_cast<float*>(value),
                    target == RAMP_VOLUME ? mFrameCount : 0,
                    &track->auxLevel, &track->prevAuxLevel, &track->auxInc,
                    &track->mAuxLevel, &track->mPrevAuxLevel, &track->mAuxInc)) {
                ALOGV("setParameter(%s, AUXLEVEL: %04x)",
                        target == VOLUME ? "VOLUME" : "RAMP_VOLUME", track->auxLevel);
                invalidate();
            }
            break;
        default:
            if ((unsigned)param >= VOLUME0 && (unsigned)param < VOLUME0 + MAX_NUM_VOLUMES) {
                if (setVolumeRampVariables(*reinterpret_cast<float*>(value),
                        target == RAMP_VOLUME ? mFrameCount : 0,
                        &track->volume[param - VOLUME0],
                        &track->prevVolume[param - VOLUME0],
                        &track->volumeInc[param - VOLUME0],
                        &track->mVolume[param - VOLUME0],
                        &track->mPrevVolume[param - VOLUME0],
                        &track->mVolumeInc[param - VOLUME0])) {
                    ALOGV("setParameter(%s, VOLUME%d: %04x)",
                            target == VOLUME ? "VOLUME" : "RAMP_VOLUME", param - VOLUME0,
                                    track->volume[param - VOLUME0]);
                    invalidate();
                }
            } else {
                LOG_ALWAYS_FATAL("setParameter volume: bad param %d", param);
            }
        }
        break;

    default:
        LOG_ALWAYS_FATAL("setParameter: bad target %d", target);
    }
}

AudioMixerBase::setParameter()The parameters supported by the function include the following:

  • TRACK
    • CHANNEL_MASK / MAIN_BUFFER / FORMAT / MIXER_FORMAT / MIXER_CHANNEL_MASK : AudioMixer::setParameter()These parameters are also processed in the function ;
    • AUX_BUFFER : Sets the auxiliary data buffer for a single audio source of the mixer;
  • RESAMPLE
    • SAMPLE_RATE : Sets the input sampling rate of the resampler for a single audio source of the mixer, which is also the sampling rate of the audio source itself, and the output sampling rate is the output sampling rate of the mixer;
    • RESET : Resets the mixer's resampler for a single audio source;
    • REMOVE : Removes the resampler of a single audio source from the mixer;
  • RAMP_VOLUME/VOLUME
    • AUXLEVEL : Sets the auxiliary data volume for a single audio source of the mixer;
    • default : Sets the volume for a single audio source of the mixer.

These parameters, regardless of whether they logically belong to a single audio source, are all set to a single audio source. Modifying some parameters of a single audio source, such as channel mask format, sampling format, and double-speed playback, may trigger the reconfiguration of the audio data processing pipeline of a single audio source. Changes to many parameters will cause the mixer invalidate()to execute and reinitialize the mixing process, including the grouping of audio sources, etc.

Here's a closer look at setting the resampler's input sample rate for a single audio source in the mixer. This setting is AudioMixerBase::TrackBase::setResampler()handled by the function, which is defined (located frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) as follows:

bool AudioMixerBase::TrackBase::setResampler(uint32_t trackSampleRate, uint32_t devSampleRate)
{
    if (trackSampleRate != devSampleRate || mResampler.get() != nullptr) {
        if (sampleRate != trackSampleRate) {
            sampleRate = trackSampleRate;
            if (mResampler.get() == nullptr) {
                ALOGV("Creating resampler from track %d Hz to device %d Hz",
                        trackSampleRate, devSampleRate);
                AudioResampler::src_quality quality;
                // force lowest quality level resampler if use case isn't music or video
                // FIXME this is flawed for dynamic sample rates, as we choose the resampler
                // quality level based on the initial ratio, but that could change later.
                // Should have a way to distinguish tracks with static ratios vs. dynamic ratios.
                if (isMusicRate(trackSampleRate)) {
                    quality = AudioResampler::DEFAULT_QUALITY;
                } else {
                    quality = AudioResampler::DYN_LOW_QUALITY;
                }

                // TODO: Remove MONO_HACK. Resampler sees #channels after the downmixer
                // but if none exists, it is the channel count (1 for mono).
                const int resamplerChannelCount = getOutputChannelCount();
                ALOGVV("Creating resampler:"
                        " format(%#x) channels(%d) devSampleRate(%u) quality(%d)\n",
                        mMixerInFormat, resamplerChannelCount, devSampleRate, quality);
                mResampler.reset(AudioResampler::create(
                        mMixerInFormat,
                        resamplerChannelCount,
                        devSampleRate, quality));
            }
            return true;
        }
    }
    return false;
}

AudioMixerBase::TrackBase::setResampler()The function receives two parameters, one is the sampling rate of the audio source, and the other is the device sampling rate, which is the output sampling rate of the resampler. AudioMixerBase::TrackBase::setResampler()When the function is called, the device sampling rate passed in is always the output sampling rate of the mixer. For a specific mixer, it will remain unchanged, which means that the AudioMixerBase::TrackBase::setResampler()device sampling rate parameter of the function will be a fixed value.

A resampler is required when the sample rate of the audio source and the device sample rate are different. If the resampler does not exist, the resampler will be created. When creating a resampler, you only need to pass in the output sample rate, not the input sample rate. When creating a resampler, the input quality parameter is calculated based on the sampling rate of the audio source, that is, the input sampling rate. When creating a resampler, the number of input audio data channels is obtained from the audio source configuration.

For the set sample rate of the audio source, just do a simple recording after the resampler is created. During the mixing operation processing, AudioMixerBase::TrackBasethe related function sets the input sample rate for the resampler.

AudioMixerBase::TrackBase::setResampler()There are some problems with the implementation of the function:

  1. From the interface point of view, parameters such as channel number and quality may change after the resampler is created, but after the Android resampler is created, these parameters will not change;
  2. Passing in a basically fixed device sample rate parameter seems pretty lame.

In addition, resampling is an operation that cannot be finished after it is started. If the audio source sample rate is initially different from the device sample rate, but later becomes the same as the device sample rate, the resampler will not be destroyed at this time.

Setting the volume setVolumeRampVariables()is done by the function, which is defined (located frameworks/av/media/libaudioprocessing/AudioMixerBase.cpp) as follows:

static inline bool setVolumeRampVariables(float newVolume, int32_t ramp,
        int16_t *pIntSetVolume, int32_t *pIntPrevVolume, int32_t *pIntVolumeInc,
        float *pSetVolume, float *pPrevVolume, float *pVolumeInc) {
    // check floating point volume to see if it is identical to the previously
    // set volume.
    // We do not use a tolerance here (and reject changes too small)
    // as it may be confusing to use a different value than the one set.
    // If the resulting volume is too small to ramp, it is a direct set of the volume.
    if (newVolume == *pSetVolume) {
        return false;
    }
    if (newVolume < 0) {
        newVolume = 0; // should not have negative volumes
    } else {
        switch (fpclassify(newVolume)) {
        case FP_SUBNORMAL:
        case FP_NAN:
            newVolume = 0;
            break;
        case FP_ZERO:
            break; // zero volume is fine
        case FP_INFINITE:
            // Infinite volume could be handled consistently since
            // floating point math saturates at infinities,
            // but we limit volume to unity gain float.
            // ramp = 0; break;
            //
            newVolume = AudioMixerBase::UNITY_GAIN_FLOAT;
            break;
        case FP_NORMAL:
        default:
            // Floating point does not have problems with overflow wrap
            // that integer has.  However, we limit the volume to
            // unity gain here.
            // TODO: Revisit the volume limitation and perhaps parameterize.
            if (newVolume > AudioMixerBase::UNITY_GAIN_FLOAT) {
                newVolume = AudioMixerBase::UNITY_GAIN_FLOAT;
            }
            break;
        }
    }

    // set floating point volume ramp
    if (ramp != 0) {
        // when the ramp completes, *pPrevVolume is set to *pSetVolume, so there
        // is no computational mismatch; hence equality is checked here.
        ALOGD_IF(*pPrevVolume != *pSetVolume, "previous float ramp hasn't finished,"
                " prev:%f  set_to:%f", *pPrevVolume, *pSetVolume);
        const float inc = (newVolume - *pPrevVolume) / ramp; // could be inf, nan, subnormal
        // could be inf, cannot be nan, subnormal
        const float maxv = std::max(newVolume, *pPrevVolume);

        if (isnormal(inc) // inc must be a normal number (no subnormals, infinite, nan)
                && maxv + inc != maxv) { // inc must make forward progress
            *pVolumeInc = inc;
            // ramp is set now.
            // Note: if newVolume is 0, then near the end of the ramp,
            // it may be possible that the ramped volume may be subnormal or
            // temporarily negative by a small amount or subnormal due to floating
            // point inaccuracies.
        } else {
            ramp = 0; // ramp not allowed
        }
    }

    // compute and check integer volume, no need to check negative values
    // The integer volume is limited to "unity_gain" to avoid wrapping and other
    // audio artifacts, so it never reaches the range limit of U4.28.
    // We safely use signed 16 and 32 bit integers here.
    const float scaledVolume = newVolume * AudioMixerBase::UNITY_GAIN_INT; // not neg, subnormal, nan
    const int32_t intVolume = (scaledVolume >= (float)AudioMixerBase::UNITY_GAIN_INT) ?
            AudioMixerBase::UNITY_GAIN_INT : (int32_t)scaledVolume;

    // set integer volume ramp
    if (ramp != 0) {
        // integer volume is U4.12 (to use 16 bit multiplies), but ramping uses U4.28.
        // when the ramp completes, *pIntPrevVolume is set to *pIntSetVolume << 16, so there
        // is no computational mismatch; hence equality is checked here.
        ALOGD_IF(*pIntPrevVolume != *pIntSetVolume << 16, "previous int ramp hasn't finished,"
                " prev:%d  set_to:%d", *pIntPrevVolume, *pIntSetVolume << 16);
        const int32_t inc = ((intVolume << 16) - *pIntPrevVolume) / ramp;

        if (inc != 0) { // inc must make forward progress
            *pIntVolumeInc = inc;
        } else {
            ramp = 0; // ramp not allowed
        }
    }

    // if no ramp, or ramp not allowed, then clear float and integer increments
    if (ramp == 0) {
        *pVolumeInc = 0;
        *pPrevVolume = newVolume;
        *pIntVolumeInc = 0;
        *pIntPrevVolume = intVolume << 16;
    }
    *pSetVolume = newVolume;
    *pIntSetVolume = intVolume;
    return true;
}

The user of the mixer sets the volume gain value as a float. This function handles the volume ramp and generates an integer volume value.

at last

Go back and read the questions at the beginning of the article and answer them based on the content of the article.

How to represent the audio source data to be mixed?
The Android mixer uses AudioMixerBase::TrackBase/ AudioMixer::Trackto represent the audio source data to be mixed.

How to set the audio data source for an audio source to be mixed?
The audio data source of an audio source to be mixed is AudioBufferProviderrepresented by , and AudioMixer::setBufferProvider(int name, AudioBufferProvider* bufferProvider)the function can set the audio data source for an audio source to be mixed.

How to set the buffer for data output after mixing, or how to obtain the data after mixing?
The buffer for data output after mixing is called the main buffer. AudioMixer::setParameter(int name, int target, int param, void *value)You can set the main buffer for the audio source to be mixed through . From setParameter()the interface point of view, the main buffer seems to be specific to one audio source to be mixed, but in fact it may be to set the same main buffer for multiple audio sources to be mixed. Audio sources with the same main buffer will be mixed together. The main buffer is maintained by the user of the mixer. After the user of the mixer drives the mixer to perform mixing, it reads data from the main buffer for further processing, such as throwing it to the device for playback.

How to determine the output data format and configuration of the mixer, that is, is it set by an external user through the provided interface, or is it dynamically calculated based on the data format and configuration of each audio source?
This information is set through various settings interfaces. For example, the sampling rate AudioMixeris passed in when the object is constructed. The number of output channels and sampling format of the mixer are similar to the processing of the main buffer, and are set to the audio source to be mixed.

When the data format and configuration of the audio source are different from the output data format and configuration, how to convert the data format and configuration?
AudioMixer::TrackAn audio data processing pipeline is maintained to perform a variety of audio data processing including audio data format conversion, volume limiting, and variable speed playback. Resampling is not AudioMixer::Trackhandled by the audio data processing pipeline, but it has a resampler to handle resampling.

How to add or create an audio source to the mixer?
AudioMixerBase::create()Functions can be used to create and add an audio source to the mixer.

How to modify the parameters of an audio source of the mixer?
AudioMixerBase::TrackBase/ AudioMixer::TrackProvides relatively few interfaces, generally set through AudioMixerBase::setParameter(int name, int target, int param, void *value)functions and AudioMixer::setParameter(int name, int target, int param, void *value)functions. These components of the Android mixer are not well packaged, or even poorly encapsulated.

How to delete an audio source from the mixer?
AudioMixerBase::disable(int name)This function disables an audio source without removing it from the mixer. AudioMixerBase::destroy(int name)This function removes an audio source from the mixer.

When mixing, what should I do if the data after mixing crosses the boundary?
Simple limit cap.

How is the mixing operation driven? Does the mixer start a thread inside to perform the mixing operation and throw the result through the callback, or does the user actively call the mixing operation?
AudioMixerBaseThe class has a process()function, and the user of the mixer calls this method to drive the execution of the mixing operation. After this function is executed, the user of the mixer obtains the mixed data from the main buffer.

Android's mixer can perform multiple sets of mixing operations at the same time, that is, it can produce multiple outputs at the same time.

Done.

Guess you like

Origin blog.csdn.net/tq08g2z/article/details/130114227