Carrying AAC stream data in RTP packets

Table of contents

1. Introduction

2. Introduction to RTP protocol

3. Introduction to AAC

1. AAC format

2. ADTS

4. Combination of RTP and AAC

5. Hands-on code

6. Results


1. Introduction

        In audio and video calls, RTP packets are normally used to carry the encoded audio and video bitstream. For example, the microphone captures input audio, the encoder turns it into frames, and each frame is placed in an RTP packet and sent to the streaming server. This article explains how an AAC bitstream is carried in RTP, using JRTPLIB to send the stream and VLC to play it.

2. Introduction to RTP protocol

        Refer to this blog.

3. Introduction to AAC

1. AAC format

AAC has two formats: ADIF, ADTS.

ADIF (Audio Data Interchange Format): the header information needed for decoding and playback (sampling rate, number of channels, and so on) is stored only once, at the start of the file, so decoding must begin from the beginning. It is generally used for files stored on a local disk.

ADTS (Audio Data Transport Stream): the data is a sequence of individual audio frames, and every frame carries its own header with the information needed for decoding and playback (sampling rate, number of channels, and so on). Decoding can therefore start at any frame, which makes ADTS better suited to streaming.
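
The two formats can be told apart from the first bytes of the data. The following is a minimal sketch, not part of the original project: DetectAacContainer and the AacContainer enum are hypothetical names. It relies on an ADIF file beginning with the ASCII identifier "ADIF" and on an ADTS frame beginning with the all-ones 12-bit syncword described in the next section.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper type: which AAC container the data appears to use.
enum class AacContainer { Adif, Adts, Unknown };

// Guess the container from the first bytes of a file or buffer.
AacContainer DetectAacContainer(const uint8_t* data, size_t len) {
    // An ADIF file starts with the four ASCII characters "ADIF".
    if (len >= 4 && std::memcmp(data, "ADIF", 4) == 0) {
        return AacContainer::Adif;
    }
    // An ADTS frame starts with a 12-bit syncword of all ones.
    // The mask 0xF6 also checks that the 2-bit layer field is 0 and
    // ignores the MPEG version bit and the protection-absent bit.
    if (len >= 2 && data[0] == 0xFF && (data[1] & 0xF6) == 0xF0) {
        return AacContainer::Adts;
    }
    return AacContainer::Unknown;
}
```

A two-byte match can also occur by chance in arbitrary data, so a stricter check would parse the frame length and confirm that another syncword follows.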

2. ADTS

An AAC bitstream in ADTS format is a sequence of ADTS frames, one after another.

Each ADTS frame consists of a header (a fixed header followed by a variable header) and the AAC data. The header fields and their meanings are as follows.

| No. | Field | Length (bits) | Description |
|-----|-------|---------------|-------------|
| 1 | Syncword | 12 | The first field of the ADTS frame header; all 12 bits are 1 |
| 2 | MPEG version | 1 | 0 means MPEG-4, 1 means MPEG-2 |
| 3 | Layer | 2 | Always 0 |
| 4 | Protection Absent | 1 | Whether there is a CRC check: 0 means the CRC field is present, 1 means there is no CRC field |
| 5 | Profile | 2 | 0 means AAC Main, 1 means AAC LC, 2 means AAC SSR |
| 6 | MPEG-4 Sampling Frequency Index | 4 | Sampling rate: 0 means 96000 Hz, 4 means 44100 Hz, 11 means 8000 Hz (see here) |
| 7 | Private Stream | 1 | Set to 0 when encoding; ignored when decoding |
| 8 | MPEG-4 Channel Configuration | 3 | Number of channels |
| 9 | Originality | 1 | Set to 0 when encoding; ignored when decoding |
| 10 | Home | 1 | Set to 0 when encoding; ignored when decoding |
| 11 | Copyrighted Stream | 1 | Set to 0 when encoding; ignored when decoding |
| 12 | Copyrighted Start | 1 | Set to 0 when encoding; ignored when decoding |
| 13 | Frame Length | 13 | ADTS frame length, including the length of the header |
| 14 | Buffer Fullness | 11 | A value of 0x7FF means variable bit rate |
| 15 | Number of AAC Frames | 2 | The number of AAC frames in the ADTS frame minus one. For compatibility, an ADTS frame generally contains one AAC frame |
| 16 | CRC | 16 | CRC check code |

This website provides a tool for parsing the AAC ADTS Frame Header. You can enter the 7 or 9 bytes of data in the header, and click Submit to see the meaning of each field in the header.
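
The same decoding can be done in a few lines of code. The sketch below reads the fields from the first 7 bytes of a frame, following the bit widths listed in the table. AdtsHeader, ParseAdtsHeader and SampleRateFromIndex are illustrative names and are not taken from the article's aac.h.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative structure; field names are assumptions, not the article's.
struct AdtsHeader {
    bool     mpeg2;             // MPEG version bit: false = MPEG-4, true = MPEG-2
    bool     protectionAbsent;  // true = no CRC, so the header is 7 bytes
    uint8_t  profile;           // 0 Main, 1 LC, 2 SSR
    uint8_t  samplingIndex;     // index into the sampling rate table
    uint8_t  channelConfig;     // channel configuration
    uint16_t frameLength;       // header plus AAC data, in bytes
    uint16_t bufferFullness;    // 0x7FF means variable bit rate
    uint8_t  rawBlocks;         // number of AAC frames minus one
    size_t   headerLength;      // 7 without CRC, 9 with CRC
};

// Map a sampling frequency index to Hz; returns 0 for reserved indexes.
int SampleRateFromIndex(uint8_t idx) {
    static const int kRates[13] = {96000, 88200, 64000, 48000, 44100, 32000,
                                   24000, 22050, 16000, 12000, 11025, 8000, 7350};
    return idx < 13 ? kRates[idx] : 0;
}

// Parse the fixed and variable header from at least 7 bytes.
bool ParseAdtsHeader(const uint8_t* p, size_t len, AdtsHeader* h) {
    if (len < 7) return false;
    if (p[0] != 0xFF || (p[1] & 0xF0) != 0xF0) return false;  // 12-bit syncword
    if ((p[1] & 0x06) != 0) return false;                      // layer must be 0
    h->mpeg2            = (p[1] & 0x08) != 0;
    h->protectionAbsent = (p[1] & 0x01) != 0;
    h->profile          = static_cast<uint8_t>((p[2] >> 6) & 0x03);
    h->samplingIndex    = static_cast<uint8_t>((p[2] >> 2) & 0x0F);
    // Bit 1 of p[2] is the private bit; the channel configuration spans p[2] and p[3].
    h->channelConfig    = static_cast<uint8_t>(((p[2] & 0x01) << 2) | ((p[3] >> 6) & 0x03));
    // Bits 5..2 of p[3] are originality, home and the two copyright bits.
    h->frameLength      = static_cast<uint16_t>(((p[3] & 0x03) << 11) | (p[4] << 3) | ((p[5] >> 5) & 0x07));
    h->bufferFullness   = static_cast<uint16_t>(((p[5] & 0x1F) << 6) | ((p[6] >> 2) & 0x3F));
    h->rawBlocks        = static_cast<uint8_t>(p[6] & 0x03);
    h->headerLength     = h->protectionAbsent ? 7 : 9;
    return h->frameLength >= h->headerLength;
}
```

For example, the header bytes FF F1 50 80 2E 7F FC decode to MPEG-4, AAC LC, sampling index 4 (44100 Hz), channel configuration 2, and a frame length of 371 bytes.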

        Opening an AAC file in a hex viewer shows this layout directly. The first 12 bits of the first ADTS frame are the all-ones syncword. Parsing the rest of the header gives the frame length, and at that offset the second ADTS frame begins, again with a 12-bit syncword of all ones.

4. Combination of RTP and AAC

        When RTP carries video, a single video frame is often large enough to need several RTP packets. Audio frames are much smaller, so one RTP packet can generally carry a complete audio frame. The AAC data of an ADTS frame is placed in the RTP payload as follows.

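        First, a 4-byte header section is placed at the start of the RTP payload. The first two bytes give the length of the AU headers in bits, which is 16 for a single AU header: payload[0] = 0x00, payload[1] = 0x10. The next two bytes are the AU header itself, a 13-bit AU size followed by a 3-bit AU index of 0: payload[2] = (auSize & 0x1FE0) >> 5, payload[3] = (auSize & 0x1F) << 3. Here auSize is the number of AAC data bytes that follow, that is, the ADTS frame length minus the ADTS header length, because the header is not sent.

        Next, copy the AAC data of the ADTS frame into the payload starting at payload[4]. Note that the ADTS frame header is not copied.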
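
The two steps above can be sketched as a small function. BuildAacRtpPayload is a hypothetical helper name, and the sketch assumes one AAC frame per packet with the ADTS header already stripped. The size written into the 13-bit field is the number of AAC bytes that follow the 4-byte prefix, which is how RFC 3640 defines AU-size.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: wrap one AAC frame (without its ADTS header)
// in the 4-byte header section used by the mpeg4-generic payload format.
// Returns an empty vector if the frame is empty or too large for 13 bits.
std::vector<uint8_t> BuildAacRtpPayload(const uint8_t* au, size_t auSize) {
    std::vector<uint8_t> payload;
    if (au == nullptr || auSize == 0 || auSize > 0x1FFF) {
        return payload;
    }
    payload.resize(4 + auSize);
    payload[0] = 0x00;  // AU-headers-length, high byte
    payload[1] = 0x10;  // AU-headers-length = 16 bits (one AU header)
    payload[2] = static_cast<uint8_t>((auSize & 0x1FE0) >> 5);  // upper 8 bits of the 13-bit size
    payload[3] = static_cast<uint8_t>((auSize & 0x1F) << 3);    // lower 5 bits, then AU-index = 0
    std::memcpy(payload.data() + 4, au, auSize);
    return payload;
}
```

A receiver recovers the size with ((payload[2] << 8) | payload[3]) >> 3. For a 364-byte frame the two header bytes are 0x0B and 0x60.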

5. Hands-on code

jrtp_aac.cpp

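#include <jrtplib3/rtpsession.h>
#include <jrtplib3/rtplibraryversion.h>
#include <jrtplib3/rtpudpv4transmitter.h>
#include <jrtplib3/rtpsessionparams.h>
#include <jrtplib3/rtppacket.h>
#include <jrtplib3/rtperrors.h>
#include <arpa/inet.h>   // inet_addr, ntohl
#include <cstdint>       // uint8_t, uint16_t, uint32_t
#include <cstdlib>       // exit
#include <cstring>       // memset, memcpy
#include <iostream>
#include <stdio.h>
#include <string>
#include "aac/aac.h"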

using namespace std;
using namespace jrtplib;

const string SSRC = "10001";
const string AAC_FILE_PATH = "movie_file/lantingxv.aac";
const int MTU_SIZE = 1500;
const int MAX_RTP_PACKET_LENGTH = 1360;
const int AAC_PAYLOAD_TYPE = 97;
const int AAC_SAMPLE_NUM_PER_FRAME = 1024;

static void checkerror(int rtperr) {
    if (rtperr < 0) {
        std::cout << "ERROR: " << RTPGetErrorString(rtperr) << std::endl;
        exit(-1);
    }
}

int main(int argc, char** argv) {

    FILE* faac = fopen(AAC_FILE_PATH.c_str(), "rb");
    if (faac == NULL) {
        std::cout << "Failed to open the AAC file" << std::endl;
        exit(-1);
    }

    AdtsFrame* aframe = AllocAdtsFrame();
    int size = GetAdtsFrame(faac, aframe);
    if (size <= 0) {
        exit(0);
    }
    int frequence = GetFrequenceFromIndex(aframe->frequenceIdx);
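    int frameRate = frequence / AAC_SAMPLE_NUM_PER_FRAME;  // approximate frames per second (integer division)
    // The RTP clock runs at the sampling rate and each AAC frame holds 1024 samples,
    // so the timestamp advances by exactly that many ticks per frame.
    uint32_t timestampInc = AAC_SAMPLE_NUM_PER_FRAME;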
    fseek(faac, 0, SEEK_SET);

    // Read the local sending port and the peer's IP address and port
    uint16_t localport;
    std::cout << "Enter local port(even): ";
    std::cin >> localport;

    std::string ipstr;
    std::cout << "Enter the destination IP address: ";
    std::cin >> ipstr;
    uint32_t destip = inet_addr(ipstr.c_str());
    if (destip == INADDR_NONE) {
        std::cerr << "Bad IP address specified" << std::endl;
        return -1;
    }
    destip = ntohl(destip);  // JRTPLIB expects the address in host byte order

    uint16_t destport;
    std::cout << "Enter the destination port: ";
    std::cin >> destport;

    // Set up the RTP session parameters
    RTPUDPv4TransmissionParams tranparams;
    tranparams.SetPortbase(localport);
 
    RTPSessionParams sessparams;
    sessparams.SetOwnTimestampUnit(1.0/frequence);
 
    RTPSession sess;
    int status = sess.Create(sessparams, &tranparams);
    checkerror(status);
 
    RTPIPv4Address destAddr(destip, destport);
    status = sess.AddDestination(destAddr);
	checkerror(status);

    sess.SetDefaultPayloadType(AAC_PAYLOAD_TYPE);
    sess.SetDefaultMark(true);
    sess.SetDefaultTimestampIncrement(timestampInc);

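    // Pace sending at the real duration of one frame: 1024 samples at the stream's sampling rate
    RTPTime sendDelay(0, static_cast<uint32_t>(1000000LL * AAC_SAMPLE_NUM_PER_FRAME / frequence));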
    uint8_t sendbuf[MTU_SIZE] = { 0 };

    while (true) {
        if (feof(faac)) {
            fseek(faac, 0, SEEK_SET);
        }
        int size = GetAdtsFrame(faac, aframe);
        if (size == 0) {
            continue;
        } else if (size < 0) {
            exit(0);
        } else {
            std::cout << "Adts Frame, profile: " << (int) aframe->profile << ", frequenceIdx: " << (int) aframe->frequenceIdx
                      << ", frameLength: " << aframe->frameLength << ", headerLen: " << aframe->headerLen << ", bodyLen: " << aframe->bodyLen << std::endl;

            if (size <= MAX_RTP_PACKET_LENGTH) {
                memset(sendbuf, 0, MTU_SIZE);
                sendbuf[0] = 0x00;
                sendbuf[1] = 0x10;
                // AU size is the length of the AAC data that follows, without the ADTS header
                sendbuf[2] = (aframe->bodyLen & 0x1FE0) >> 5;
                sendbuf[3] = (aframe->bodyLen & 0x1F) << 3;
                memcpy(sendbuf+4, aframe->body, aframe->bodyLen);
                sess.SendPacket((void*) sendbuf, aframe->bodyLen+4, AAC_PAYLOAD_TYPE, true, timestampInc);
            } else {
                std::cout << "frame size too large, just ignore it" << std::endl;
            }
            RTPTime::Wait(sendDelay);
        }
    }
    FreeAdtsFrame(aframe);
    if (faac) {
        fclose(faac);
        faac = NULL;
    }
    sess.BYEDestroy(RTPTime(3, 0), 0, 0);

    return 0;
}
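
The timing values used for sending follow from simple arithmetic. The RTP clock runs at the sampling rate (see the SetOwnTimestampUnit(1.0/frequence) call), and each frame holds AAC_SAMPLE_NUM_PER_FRAME = 1024 samples. The sketch below works this out; the function names are hypothetical, and it assumes 1024 samples per frame as the constant in the listing does.

```cpp
#include <cstdint>

// One AAC frame is assumed to decode to 1024 samples per channel.
constexpr uint32_t kSamplesPerAacFrame = 1024;

// RTP timestamp step between consecutive frames, in units of the
// RTP clock, which ticks once per audio sample.
constexpr uint32_t RtpTimestampIncrement() {
    return kSamplesPerAacFrame;
}

// Real-time duration of one frame in microseconds, rounded down.
// Returns 0 for an invalid sampling rate.
uint32_t FrameDurationMicros(uint32_t sampleRate) {
    if (sampleRate == 0) return 0;
    return static_cast<uint32_t>(1000000ULL * kSamplesPerAacFrame / sampleRate);
}

// RTP timestamp of the n-th frame (n starts at 0), given the timestamp
// of the first frame. Unsigned arithmetic wraps modulo 2^32, which is
// also how RTP timestamps wrap.
uint32_t RtpTimestampOfFrame(uint32_t firstTimestamp, uint32_t n) {
    return firstTimestamp + n * kSamplesPerAacFrame;
}
```

At 44100 Hz one frame lasts about 23.2 ms. Deriving the interval from an integer frame rate (44100 / 1024 truncates to 43) gives a slightly longer delay and a timestamp step of 1025 instead of 1024, so the sender slowly drifts from the real playback rate.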

aac.h

#pragma once

#include <cstdint>  // uint8_t, uint16_t, uint32_t
#include <cstdio>   // FILE

struct AdtsFrame {
    bool crcProtectionAbsent;
    uint8_t profile;
    uint8_t frequenceIdx;
    uint16_t frameLength;

    uint8_t* buf;
    uint32_t maxSize;
    uint32_t len;
    uint8_t* header;
    uint32_t headerLen;
    uint8_t* body;
    uint32_t bodyLen;
};

int GetAdtsFrame(FILE* f, AdtsFrame* aframe);
AdtsFrame* AllocAdtsFrame();
AdtsFrame* AllocAdtsFrame(uint32_t bufferSize);
void FreeAdtsFrame(AdtsFrame* aframe);
int GetFrequenceFromIndex(uint8_t idx);
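
aac.h only declares the ADTS helpers. As a rough idea of what a reader such as GetAdtsFrame has to do, here is a minimal sketch that pulls one frame at a time from a file. ReadAdtsFrame and its signature are hypothetical and deliberately differ from the declarations above; it is not the author's implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical reader: loads the next ADTS frame (header included) into
// `frame` and reports the header length (7 or 9) in `headerLen`.
// Returns the frame length on success, 0 at end of file,
// and -1 if the stream is truncated or out of sync.
int ReadAdtsFrame(FILE* f, std::vector<uint8_t>& frame, size_t& headerLen) {
    uint8_t hdr[7];
    size_t n = fread(hdr, 1, sizeof(hdr), f);
    if (n == 0) return 0;             // clean end of file
    if (n < sizeof(hdr)) return -1;   // truncated header
    if (hdr[0] != 0xFF || (hdr[1] & 0xF0) != 0xF0) return -1;  // no syncword

    // Protection Absent = 1 means no CRC, so the header is 7 bytes.
    headerLen = (hdr[1] & 0x01) ? 7 : 9;

    // 13-bit frame length: 2 bits from hdr[3], 8 from hdr[4], 3 from hdr[5].
    size_t frameLength = ((hdr[3] & 0x03u) << 11) | (hdr[4] << 3) | (hdr[5] >> 5);
    if (frameLength < headerLen) return -1;

    frame.assign(hdr, hdr + sizeof(hdr));
    frame.resize(frameLength);
    size_t rest = frameLength - sizeof(hdr);
    if (fread(frame.data() + sizeof(hdr), 1, rest, f) != rest) return -1;
    return static_cast<int>(frameLength);
}
```

The AAC data to send over RTP is then the range from frame.data() + headerLen to the end of the vector, and its size is frame.size() - headerLen.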

Compile: g++ jrtp_aac.cpp aac/aac.cpp -ljrtp -o jrtp_aac

6. Results

        After jrtp_aac starts, enter the local sending port and the peer's address and port, and the program begins sending packets. On the receiving side, open the following SDP description in VLC to receive and play the stream. The destination port entered in jrtp_aac must match the port in the m= line.

m=audio 10004 RTP/AVP 97
a=rtpmap:97 mpeg4-generic/44100/2
a=fmtp:97 streamtype=5; profile-level-id=15; mode=AAC-hbr; sizelength=13; indexlength=3; indexdeltalength=3;
c=IN IP4 127.0.0.1
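
The values in this SDP are tied to the stream and to the payload layout. The clock rate and channel count in a=rtpmap have to match the AAC file being sent, and sizelength=13 with indexlength=3 describe the 13-bit size and 3-bit index of the AU header from section 4. As a small sketch, the description can be generated from those values instead of being typed by hand. BuildAacSdp is a hypothetical helper that only formats the four lines shown above.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: format the SDP lines used in this article for a
// given receive address, port, payload type, sampling rate and channel count.
std::string BuildAacSdp(const std::string& ip, int port, int payloadType,
                        int sampleRate, int channels) {
    std::ostringstream sdp;
    sdp << "m=audio " << port << " RTP/AVP " << payloadType << "\n"
        << "a=rtpmap:" << payloadType << " mpeg4-generic/"
        << sampleRate << "/" << channels << "\n"
        << "a=fmtp:" << payloadType
        << " streamtype=5; profile-level-id=15; mode=AAC-hbr;"
           " sizelength=13; indexlength=3; indexdeltalength=3;\n"
        << "c=IN IP4 " << ip << "\n";
    return sdp.str();
}
```

Calling BuildAacSdp("127.0.0.1", 10004, 97, 44100, 2) reproduces the description above; the result can be saved to a .sdp file and opened in VLC.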

Origin blog.csdn.net/weixin_38102771/article/details/128304673