Simple Android video player based on FFmpeg

1. Module division

First of all, we need to be clear about the components used in this video player. The player is built on three main frameworks:

1. Decoding: FFmpeg

2. Audio output: OpenSLES

3. Video rendering: OpenGLES

All of these frameworks expose C APIs, so this time our main work will focus on the NDK side. NDK basics have been covered in previous blog posts, so this project is a comprehensive application of that earlier knowledge.

Based on the player's functions, we divide the work into the following modules:

  1. Image display
  2. Audio output
  3. Decoding
  4. Playback control
  5. Audio and video synchronization

To improve portability, the key components are defined behind abstract interfaces that standardize their APIs.

1. IAudioPlayer: the audio player interface. It is declared as follows:

// Abstract audio player. The concrete implementation (here based on OpenSL ES)
// pulls decoded PCM data from the attached IAudioFrameProvider.
class IAudioPlayer {
public:
    virtual ~IAudioPlayer() = default;
    virtual bool create() = 0;
    virtual void release() = 0;
    virtual void start() = 0;
    virtual void stop() = 0;
    virtual bool isPlaying() = 0;
    virtual void setAudioFrameProvider(IAudioFrameProvider *provider) = 0;
    virtual void removeAudioFrameProvider(IAudioFrameProvider *provider) = 0;
};

2. IVideoPlayer: the video player interface.

// Abstract video renderer (here based on OpenGL ES). refresh() asks it to
// pull the next frame from the attached IVideoFrameProvider and draw it.
class IVideoPlayer {
public:
    virtual ~IVideoPlayer() = default;
    virtual bool create() = 0;
    virtual void release() = 0;
    virtual void refresh() = 0;
    virtual void setVideoFrameProvider(IVideoFrameProvider *provider) = 0;
    virtual void removeVideoFrameProvider(IVideoFrameProvider *provider) = 0;
    virtual void setWindow(void *window) = 0;
    virtual void setSize(int32_t width, int32_t height) = 0;
    virtual bool isReady() = 0;
};

3. AudioFrame: stores decoded audio data.

The player outputs 16-bit PCM with a 44.1 kHz sample rate and two channels. To avoid re-allocating memory for every piece of audio data, AudioFrame objects are reused, so each one needs a fixed maximum storage size.

#include <cstdint>
#include <cstdlib>
#include <cstring>

struct AudioFrame {
    // presentation time stamp
    int64_t pts;
    // PCM samples (16-bit, interleaved stereo)
    int16_t *data;
    int32_t sampleCount;

    int32_t maxDataSizeInByte = 0;

    AudioFrame(int32_t dataLenInByte)
    {
        this->maxDataSizeInByte = dataLenInByte;
        pts = 0;
        sampleCount = 0;
        data = (int16_t *)malloc(maxDataSizeInByte);
        memset(data, 0, maxDataSizeInByte);
    }

    ~AudioFrame(){
        if(data != nullptr)
        {
            free(data);
        }
    }
};
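
For example, with the target format above (44.1 kHz, 16-bit, two channels), a frame holding 1024 samples per channel could be allocated like this (the 1024-sample frame size is an illustrative assumption, not a value taken from the project):

// 1024 samples per channel * 2 channels * 2 bytes per 16-bit sample = 4096 bytes
const int samplesPerChannel = 1024;
AudioFrame *audioFrame = new AudioFrame(samplesPerChannel * 2 * sizeof(int16_t));
audioFrame->sampleCount = samplesPerChannel;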

4. VideoFrame: stores decoded video data.

Inside the player, the video format is fixed at a resolution of 1920*1080 with the RGB888 pixel format: each color channel occupies one byte, so one pixel occupies 3 bytes. VideoFrame objects are reused in the same way as AudioFrame.

struct VideoFrame
{
    // presentation time stamp
    int64_t pts;
    // packed RGB888 pixels, 3 bytes per pixel
    uint8_t *data;
    int32_t width;
    int32_t height;

    int32_t maxDataSizeInByte = 0;

    VideoFrame(int32_t dataLenInByte)
    {
        this->maxDataSizeInByte = dataLenInByte;
        pts = 0;
        width = 0;
        height = 0;
        data = (uint8_t *)malloc(maxDataSizeInByte);
        memset(data, 0, maxDataSizeInByte);
    }

    ~VideoFrame()
    {
        if(data != nullptr)
        {
            free(data);
        }
    }
};
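
Applying the arithmetic above, a reusable frame for the maximum supported size is allocated once:

// 1920 * 1080 pixels * 3 bytes per RGB888 pixel, roughly 5.9 MB per frame
VideoFrame *videoFrame = new VideoFrame(1920 * 1080 * 3);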

5. IAudioFrameProvider: the source of decoded audio data for IAudioPlayer.

Since AudioFrame objects are reused, the interface also lets IAudioPlayer return used frames to us.

class IAudioFrameProvider {
public:
    virtual AudioFrame* getAudioFrame() = 0;
    virtual void putBackUsed(AudioFrame *data) = 0;
};

6. IVideoFrameProvider: the video counterpart of IAudioFrameProvider.

class IVideoFrameProvider {
public:
    virtual VideoFrame* getVideoFrame() = 0;
    virtual void putBackUsed(VideoFrame* data) = 0;
};

7. IMediaDataReceiver: the interface through which the decoder hands over decoded audio and video data.

Its implementer stores both freshly decoded frames and used (recyclable) frames.

class IMediaDataReceiver {
public:
    virtual void receiveAudioFrame(AudioFrame *audioData) = 0;
    virtual void receiveVideoFrame(VideoFrame *videoData) = 0;
    virtual AudioFrame* getUsedAudioFrame() = 0;
    virtual VideoFrame* getUsedVideoFrame() = 0;

    virtual void putUsedAudioFrame(AudioFrame *audioData) = 0;
    virtual void putUsedVideoFrame(VideoFrame *videoData) = 0;
};

8. BlockRecyclerQueue: a thread-safe recycling queue.

The C++ standard library has no thread-safe queue, so we implement one ourselves. Because a lot of data in the player needs to be reused, a recycling function is added to this queue: internally it keeps two queues, one for fresh (unused) data and one for used data, each protected by its own lock. You could also handle this at a finer granularity with a single thread-protected queue, leaving it entirely to the upper layer to decide whether the data stored in it is used or unused.

All multithreading in the player uses std::thread, introduced in C++11.

This recycling queue is essentially the pipe in a producer-consumer pattern. It has the following characteristics:

1. If capacity = -1, the queue size is unlimited. If the size is limited, a put operation waits while the queue is full. This prevents a decoder that runs too fast from driving memory usage too high.

2. Both get and put take a wait flag that decides whether to block when the queue is empty or full. For get: if the queue is empty and wait = true, the call blocks until data arrives; if wait = false, it returns NULL immediately. For put: if the queue is full and wait = true, the call blocks until the queue is no longer full; if wait = false, the element is stored regardless of capacity, so size may exceed capacity.

3. To prevent deadlock at the end of playback, two methods release all waiting get and put operations. This covers the case where the decoder has finished decoding but the player would otherwise wait forever.

4. Everything above applies to the fresh-data queue. For the recycle queue, put and get only guarantee thread safety and never wait: it has no maximum capacity, every put executes as soon as the lock is acquired, every get does the same, and if no recycled data is available, get returns NULL immediately.

5. The discardAll(void (*discardCallback)(T)) method moves all fresh data into the recycle queue at once; the function pointer lets the caller clean up each element before it lands in the recycle queue. This exists for the seek operation, because all already-decoded data must be discarded when seeking. The class declaration and a sketch of the core operations follow.

#include <list>
#include <mutex>
#include <condition_variable>

using namespace std;

template <class T>
class BlockRecyclerQueue {
public:
    // If capacity == -1, the size of the data queue is unlimited and put operations never wait.
    BlockRecyclerQueue(int capacity = -1);
    ~BlockRecyclerQueue();

    int getCapacity();
    int getSize();

    // Put an element. If wait = true, the call blocks until the data queue holds fewer elements than capacity.
    void put(T t, bool wait = true);

    // Get an element. If wait = true, the call blocks until the data queue is not empty; if wait = false,
    // it returns NULL when the queue is empty. It may return NULL even when wait = true: in that case
    // someone called notifyWaitGet() while the queue was still empty.
    T get(bool wait = true);

    void putToUsed(T t);
    T getUsed();

    // Move all elements of the data queue into the used queue, invoking discardCallback on each one.
    void discardAll(void (*discardCallback)(T));

    // Make all waiting put operations stop waiting; they succeed immediately.
    void notifyWaitPut();

    // Make all waiting get operations return immediately; if the data queue is still empty, they return NULL.
    void notifyWaitGet();

private:
    int capacity = 0;
    mutex queueMu;
    mutex usedQueueMu;
    condition_variable notFullSignal;
    condition_variable notEmptySignal;
    list<T> queue;
    list<T> usedQueue;

    bool allowNotifyPut = false;
    bool allowNotifyGet = false;
};
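
The following is a minimal sketch of how put, get, and notifyWaitGet can be implemented with the two condition variables. It is my reconstruction under the semantics described above, not necessarily the repository's exact code, and it assumes T is a pointer type so that NULL is a valid sentinel:

template <class T>
void BlockRecyclerQueue<T>::put(T t, bool wait) {
    unique_lock<mutex> lock(queueMu);
    if (wait && capacity != -1) {
        // Block while the queue is full, unless notifyWaitPut() releases all waiters.
        notFullSignal.wait(lock, [this] {
            return (int)queue.size() < capacity || allowNotifyPut;
        });
    }
    queue.push_back(t);
    notEmptySignal.notify_all();
}

template <class T>
T BlockRecyclerQueue<T>::get(bool wait) {
    unique_lock<mutex> lock(queueMu);
    if (wait) {
        // Block while the queue is empty, unless notifyWaitGet() releases all waiters.
        notEmptySignal.wait(lock, [this] {
            return !queue.empty() || allowNotifyGet;
        });
    }
    if (queue.empty()) {
        return NULL; // empty and not waiting, or woken by notifyWaitGet()
    }
    T element = queue.front();
    queue.pop_front();
    notFullSignal.notify_all();
    return element;
}

template <class T>
void BlockRecyclerQueue<T>::notifyWaitGet() {
    unique_lock<mutex> lock(queueMu);
    allowNotifyGet = true;
    notEmptySignal.notify_all();
}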

2. Decoder implementation

The decoding part still uses FFmpeg, and the overall flow is similar to the audio-only decoding described in an earlier post.

First, we need two threads to decode audio and video separately.

Second, another thread is needed to read the file. When we decoded audio alone, reading a packet from the file and decoding it into a frame happened in the same thread, because we only cared about the file's audio stream. Now the read operation moves into its own thread, and the decoder maintains two queues that store audio AVPackets and video AVPackets respectively; both queues can reuse BlockRecyclerQueue. In effect the file-reading thread is the producer and the audio and video decoding threads are the consumers, as sketched below. The full code is in VideoFileDecoder.cpp.
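
As an illustration, the producer-consumer structure can be sketched as follows. This is a minimal sketch; names such as formatCtx, stopFlag, and audioPacketQueue are illustrative assumptions rather than the actual members of VideoFileDecoder.cpp:

// Producer: read packets from the file and route them to the per-stream queues.
void VideoFileDecoder::readThreadLoop() {
    while (!stopFlag) {
        AVPacket *packet = av_packet_alloc();
        if (av_read_frame(formatCtx, packet) < 0) {
            av_packet_free(&packet);
            break; // end of file or read error
        }
        if (packet->stream_index == audioStreamIndex) {
            audioPacketQueue->put(packet);      // blocks when the queue is full
        } else if (packet->stream_index == videoStreamIndex) {
            videoPacketQueue->put(packet);
        } else {
            av_packet_free(&packet);            // a stream we don't play
        }
    }
}

// Consumer: pull audio packets and decode them into frames.
// The video decoding thread is structured the same way.
void VideoFileDecoder::audioDecodeLoop() {
    AVFrame *frame = av_frame_alloc();
    while (!stopFlag) {
        AVPacket *packet = audioPacketQueue->get(); // blocks when empty
        if (packet == NULL) {
            break; // released by notifyWaitGet() during shutdown
        }
        if (avcodec_send_packet(audioCodecCtx, packet) == 0) {
            while (avcodec_receive_frame(audioCodecCtx, frame) == 0) {
                // resample with swr_convert and hand the result to the
                // IMediaDataReceiver (see below)
            }
        }
        av_packet_free(&packet);
    }
    av_frame_free(&frame);
}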

Note that the seek operation is also performed in the decoder, because seeking has to operate on the media file. When seeking, all previously read AVPackets must be discarded as well; a sketch of this follows.
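
A sketch of how that seek can look (again illustrative; the helper names are assumptions):

// Callback for discardAll: release each packet's payload so the
// AVPacket struct itself can be reused from the recycle queue.
static void discardPacket(AVPacket *packet) {
    av_packet_unref(packet);
}

void VideoFileDecoder::seekTo(int64_t positionUs) {
    // Throw away every packet that was read before the seek.
    audioPacketQueue->discardAll(discardPacket);
    videoPacketQueue->discardAll(discardPacket);

    // Drop the codecs' internal buffers as well.
    avcodec_flush_buffers(audioCodecCtx);
    avcodec_flush_buffers(videoCodecCtx);

    // Seek on the default stream; the timestamp is in AV_TIME_BASE
    // (microsecond) units, backwards to the nearest keyframe.
    av_seek_frame(formatCtx, -1, positionUs, AVSEEK_FLAG_BACKWARD);
}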

Since source files come in different encodings, we need FFmpeg's swr_convert to resample the audio data and sws_scale to convert the video data.
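
As a minimal sketch, assuming the target formats described above (16-bit stereo PCM at 44.1 kHz and RGB888) and that frame is the freshly decoded AVFrame, the two conversions look roughly like this. maxOutSamples is an assumed capacity derived from the AudioFrame buffer, and in real code the contexts would be created once and reused:

extern "C" {
#include <libswresample/swresample.h>
#include <libswscale/swscale.h>
}

// Audio: resample the decoded frame into 44.1 kHz, stereo, signed 16-bit PCM.
// (swr_alloc_set_opts is the classic API; FFmpeg 5.1+ prefers swr_alloc_set_opts2.)
SwrContext *swrCtx = swr_alloc_set_opts(NULL,
        AV_CH_LAYOUT_STEREO, AV_SAMPLE_FMT_S16, 44100,            // output format
        frame->channel_layout, (AVSampleFormat)frame->format,     // input format
        frame->sample_rate, 0, NULL);
swr_init(swrCtx);
uint8_t *outBuf = (uint8_t *)audioFrame->data;
int convertedSamples = swr_convert(swrCtx, &outBuf, maxOutSamples,
        (const uint8_t **)frame->data, frame->nb_samples);

// Video: convert the decoded frame into packed RGB24 (RGB888).
SwsContext *swsCtx = sws_getContext(
        frame->width, frame->height, (AVPixelFormat)frame->format,
        frame->width, frame->height, AV_PIX_FMT_RGB24,
        SWS_BILINEAR, NULL, NULL, NULL);
uint8_t *dstData[4] = { videoFrame->data, NULL, NULL, NULL };
int dstLinesize[4] = { frame->width * 3, 0, 0, 0 };
sws_scale(swsCtx, frame->data, frame->linesize, 0, frame->height,
        dstData, dstLinesize);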

3. Playback Control

VideoPlayController.cpp provides the player's unified control interface. It is also responsible for notifying the upper layer of playback progress, managing the audio and video players and the decoder, managing decoded data, and so on. It is therefore declared as follows:

class VideoPlayController: public IMediaDataReceiver, public IAudioFrameProvider, public IVideoFrameProvider

It implements three interfaces, so it can accept the data decoded by the decoder and provide audio data and video data to the audio and video players respectively.
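
Roughly, the wiring looks like this (a sketch only; the member names are assumptions, and the real VideoPlayController.cpp may differ):

class VideoPlayController : public IMediaDataReceiver,
                            public IAudioFrameProvider,
                            public IVideoFrameProvider {
public:
    // IMediaDataReceiver: the decoder pushes frames in and pulls recycled ones back.
    void receiveAudioFrame(AudioFrame *audioData) override { audioQueue->put(audioData); }
    void receiveVideoFrame(VideoFrame *videoData) override { videoQueue->put(videoData); }
    AudioFrame* getUsedAudioFrame() override { return audioQueue->getUsed(); }
    VideoFrame* getUsedVideoFrame() override { return videoQueue->getUsed(); }
    void putUsedAudioFrame(AudioFrame *audioData) override { audioQueue->putToUsed(audioData); }
    void putUsedVideoFrame(VideoFrame *videoData) override { videoQueue->putToUsed(videoData); }

    // IAudioFrameProvider / IVideoFrameProvider: the players pull frames out.
    AudioFrame* getAudioFrame() override;   // also drives A/V synchronization, see section 4
    VideoFrame* getVideoFrame() override { return videoQueue->get(false); }
    void putBackUsed(AudioFrame *data) override { audioQueue->putToUsed(data); }
    void putBackUsed(VideoFrame *data) override { videoQueue->putToUsed(data); }

private:
    BlockRecyclerQueue<AudioFrame*> *audioQueue;
    BlockRecyclerQueue<VideoFrame*> *videoQueue;
    IAudioPlayer *audioPlayer;
    IVideoPlayer *videoPlayer;
};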

4. Audio and video synchronization

The audio "frame rate" is usually much higher than the video frame rate: the audio sample rate in a typical video is 44.1 kHz or 48 kHz, while the video generally runs at 25 fps.

There are usually two ways to synchronize audio and video:

  1. Play video against the audio time base, taking advantage of the higher audio frame rate.
  2. Synchronize both audio and video to an additional clock.

Generally speaking, the extra-clock method is better: first, it has higher precision; second, it still works when a file contains only video or only audio; third, if your audio player does not actively request audio data, you need an extra clock anyway to feed data to the audio and video players on schedule. Its disadvantage is that it consumes more resources.

Here I synchronize against the audio time base, because OpenSL ES actively requests audio data. Each time the audio player requests data, we can read the pts of the current AudioFrame to learn the current playback progress, and use that progress to decide whether to send a refresh command to the video player.

Naturally, the play and pause functions are also implemented by starting and pausing the audio player.

Audio and video synchronization also lives in VideoPlayController.cpp, in the AudioFrame *VideoPlayController::getAudioFrame() method, sketched below.
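
A minimal sketch of the idea (my reconstruction; nextVideoPts and the exact bookkeeping are assumptions):

// Called from the OpenSL ES buffer callback whenever the audio player needs data.
AudioFrame* VideoPlayController::getAudioFrame() {
    AudioFrame *audioFrame = audioQueue->get();
    if (audioFrame == NULL) {
        return NULL; // queue was released at the end of playback
    }

    // The audio pts acts as the master clock and doubles as playback progress.
    int64_t audioPts = audioFrame->pts;

    // If the pending video frame's pts has been reached, ask the video player
    // to refresh; it will then pull that frame through getVideoFrame().
    if (nextVideoPts >= 0 && nextVideoPts <= audioPts && videoPlayer->isReady()) {
        videoPlayer->refresh();
    }
    return audioFrame;
}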

5. Summary

At this point, the key parts of the player have been covered. Please check the code on my GitHub; the link is at the top of the blog. The player still has several problems:

1. In some cases there is an ANR when exiting video playback, probably because a thread has deadlocked or is waiting forever.

2. It can only play low-resolution videos smoothly, because decoding is not hardware-accelerated and is therefore too slow: testing shows that decoding one 1920*1080 frame takes about 70 ms.
