Detailed FLV file format


For the full format definition, refer to the FLV specification.


This section describes the fields of the Tag shown in the figure above. Each Tag consists of two parts: the Tag Header and the Tag Data.

1. Tag Header

Name | Length | Description
Tag type | 1 byte | 8: audio; 9: video; 18: script data (metadata); other: reserved
Data area length | 3 bytes | Length of the data area in bytes
Timestamp | 3 bytes | Integer, in milliseconds. Always 0 for script tags
Timestamp extension | 1 byte | Extends the timestamp to 4 bytes by supplying the upper 8 bits. Rarely used
StreamID | 3 bytes | Always 0
Data area (data) | Given by the data area length | The tag payload
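
As a rough illustration of the table above, the 11-byte Tag Header can be decoded as in the sketch below. The names `TagHeader` and `parse_tag_header` are made up for this example, and the code assumes all multi-byte fields are big-endian, as the FLV specification states.

```python
from typing import NamedTuple

# Tag type values from the table above.
TAG_TYPES = {8: "audio", 9: "video", 18: "script"}


class TagHeader(NamedTuple):  # hypothetical helper type
    tag_type: int
    data_size: int
    timestamp_ms: int
    stream_id: int


def parse_tag_header(buf: bytes) -> TagHeader:
    """Decode the 11-byte tag header found at the start of buf."""
    if len(buf) < 11:
        raise ValueError("a tag header needs 11 bytes")
    tag_type = buf[0]
    data_size = int.from_bytes(buf[1:4], "big")    # 3-byte data area length
    ts_low = int.from_bytes(buf[4:7], "big")       # lower 24 bits of the timestamp
    ts_ext = buf[7]                                # upper 8 bits of the timestamp
    stream_id = int.from_bytes(buf[8:11], "big")   # always 0
    return TagHeader(tag_type, data_size, (ts_ext << 24) | ts_low, stream_id)
```

The tag payload then occupies the next `data_size` bytes after the header.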

2. Tag Data

The data area of the tag falls into one of three types, depending on the tag type: audio data, video data, and script data (metadata).

2.1 Audio data

The first byte carries the audio information, in the following format:
Name | Length | Description
Audio format | 4 bits | 0 = Linear PCM, platform endian; 1 = ADPCM; 2 = MP3; 3 = Linear PCM, little endian; 4 = Nellymoser 16-kHz mono; 5 = Nellymoser 8-kHz mono; 6 = Nellymoser; 7 = G.711 A-law logarithmic PCM; 8 = G.711 mu-law logarithmic PCM; 9 = reserved; 10 = AAC; 11 = Speex; 14 = MP3 8-kHz; 15 = Device-specific sound
Sampling rate | 2 bits | 0 = 5.5-kHz; 1 = 11-kHz; 2 = 22-kHz; 3 = 44-kHz. Always 3 for AAC
Sample length | 1 bit | 0 = snd8Bit; 1 = snd16Bit. Compressed audio is always 16-bit
Audio type | 1 bit | 0 = sndMono; 1 = sndStereo. Always 1 for AAC
Audio data | 8 bits × n | The audio payload
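
The bit layout of that first byte can be unpacked with shifts and masks. The sketch below is only an illustration: `parse_audio_info` is a hypothetical name, the lookup table lists just a few of the formats from the table above, and the fields are assumed to be packed from the most significant bit downward.

```python
# A few audio format values from the table above (not the full list).
AUDIO_FORMATS = {0: "Linear PCM, platform endian", 1: "ADPCM", 2: "MP3",
                 3: "Linear PCM, little endian", 10: "AAC", 11: "Speex"}
SAMPLING_RATES = ["5.5-kHz", "11-kHz", "22-kHz", "44-kHz"]


def parse_audio_info(first_byte: int) -> dict:
    """Split the first byte of an audio tag's data area into its four fields."""
    audio_format = (first_byte >> 4) & 0x0F   # upper 4 bits
    rate_index = (first_byte >> 2) & 0x03     # next 2 bits
    sample_bits = 16 if (first_byte >> 1) & 0x01 else 8
    channels = 2 if first_byte & 0x01 else 1
    return {
        "format": audio_format,
        "format_name": AUDIO_FORMATS.get(audio_format, "other"),
        "sampling_rate": SAMPLING_RATES[rate_index],
        "sample_bits": sample_bits,
        "channels": channels,
    }
```

For example, a first byte of 0xAF decodes as AAC, 44-kHz, 16-bit, stereo, which matches the "always 3" and "always 1" rules for AAC.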
     

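For AAC, the audio data begins with a 1-byte AACPacketType (0 = AAC sequence header, 1 = AAC raw). It is followed by the AudioSpecificConfig when the type is 0, or by a raw AAC frame when the type is 1.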


2.2 Video data

The first byte carries the video information, in the following format:
Name | Length | Description
Frame type | 4 bits | 1: keyframe (for AVC, a seekable frame); 2: inter frame (for AVC, a non-seekable frame); 3: disposable inter frame (H.263 only); 4: generated keyframe (reserved for server use only); 5: video info/command frame
Codec ID | 4 bits | 1: JPEG (currently unused); 2: Sorenson H.263; 3: Screen video; 4: On2 VP6; 5: On2 VP6 with alpha channel; 6: Screen video version 2; 7: AVC
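
A minimal sketch of splitting that byte into its two 4-bit fields follows. The function name `parse_video_info` is invented for this example, and the frame type is assumed to sit in the upper 4 bits.

```python
FRAME_TYPES = {1: "keyframe", 2: "inter frame", 3: "disposable inter frame",
               4: "generated keyframe", 5: "video info/command frame"}
CODEC_IDS = {1: "JPEG", 2: "Sorenson H.263", 3: "Screen video", 4: "On2 VP6",
             5: "On2 VP6 with alpha channel", 6: "Screen video version 2", 7: "AVC"}


def parse_video_info(first_byte: int) -> dict:
    """Split the first byte of a video tag's data area into frame type and codec ID."""
    frame_type = (first_byte >> 4) & 0x0F   # upper 4 bits
    codec_id = first_byte & 0x0F            # lower 4 bits
    return {
        "frame_type": frame_type,
        "frame_type_name": FRAME_TYPES.get(frame_type, "unknown"),
        "codec_id": codec_id,
        "codec_name": CODEC_IDS.get(codec_id, "unknown"),
        "is_keyframe": frame_type == 1,
    }
```

With this layout, 0x17 is an AVC keyframe and 0x27 is an AVC inter frame.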

If the frame type is 5, the tag data does not contain video data. Instead it holds a single 8-bit value with the following meaning:
  • 0: start of client-side seeking video frame sequence
  • 1: end of client-side seeking video frame sequence
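For AVC, the first byte is followed by a 1-byte AVCPacketType (0 = AVC sequence header, 1 = AVC NALU, 2 = AVC end of sequence), then a 3-byte signed CompositionTime (the composition time offset when AVCPacketType is 1, otherwise 0), and then the data itself.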
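
As a sketch of the AVC case, the FLV specification places a 1-byte AVCPacketType and a 3-byte signed, big-endian CompositionTime directly after the frame type / codec ID byte. The helper name `parse_avc_packet_header` is hypothetical, and the function expects the tag data with that first byte already removed.

```python
AVC_PACKET_TYPES = {0: "sequence header", 1: "NALU", 2: "end of sequence"}


def parse_avc_packet_header(data: bytes):
    """Return (packet_type, composition_time_ms, payload).

    `data` is the video tag data AFTER the frame type / codec ID byte.
    """
    if len(data) < 4:
        raise ValueError("an AVC packet header needs 4 bytes")
    packet_type = data[0]
    # CompositionTime is a signed 24-bit integer, so negative offsets are possible.
    composition_time = int.from_bytes(data[1:4], "big", signed=True)
    return packet_type, composition_time, data[4:]
```

When the packet type is 1, adding the composition time to the tag timestamp gives the presentation time of the frame.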


2.3 Script data

There is generally only one script tag, and it is the first tag in the FLV file. It stores information about the file, such as duration, audiodatarate, creator, and width.

First, a word on script data types. Every value is laid out as data type + (data length) + data. The data type occupies 1 byte. Whether a data length field is present depends on the data type, and the data itself follows.

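The type values are as follows: 0 = Number, 1 = Boolean, 2 = String, 3 = Object, 4 = MovieClip (reserved), 5 = Null, 6 = Undefined, 7 = Reference, 8 = ECMA array, 9 = Object end marker, 10 = Strict array, 11 = Date, 12 = Long string.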



If the type is String, the next 2 bytes give the length of the string (4 bytes for Long String), followed by the string data. If the type is Number, the next 8 bytes are a Double. If the type is Boolean, the next 1 byte is the Bool value.
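
The rules above can be sketched as a small reader for these simple types. The function name `read_script_value` is made up for this example. The type codes 0 (Number), 1 (Boolean), and 12 (Long String) are taken from the FLV specification rather than from this article, and strings are assumed to be UTF-8 with big-endian lengths.

```python
import struct


def read_script_value(buf: bytes, pos: int = 0):
    """Read one typed value starting at buf[pos]; return (value, next_pos)."""
    type_id = buf[pos]
    pos += 1
    if type_id == 0:    # Number: 8-byte big-endian double
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if type_id == 1:    # Boolean: 1 byte, non-zero means true
        return buf[pos] != 0, pos + 1
    if type_id == 2:    # String: 2-byte length, then the bytes
        length = int.from_bytes(buf[pos:pos + 2], "big")
        pos += 2
        return buf[pos:pos + length].decode("utf-8"), pos + length
    if type_id == 12:   # Long String: 4-byte length, then the bytes
        length = int.from_bytes(buf[pos:pos + 4], "big")
        pos += 4
        return buf[pos:pos + length].decode("utf-8"), pos + length
    raise ValueError(f"type {type_id} is not handled in this sketch")
```

Returning the next position alongside the value makes it easy to read several values back to back.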

With that in mind, consider the script tag in an FLV file. It generally begins with 0x02, which indicates the String type. The next 2 bytes are the string length, usually 0x000a (the length of "onMetaData"), followed by the string "onMetaData" itself. FLV files generally carry an onMetaData tag, which is made available to ActionScript during playback. Next comes 0x08, indicating the ECMA array type, followed by a 4-byte approximate element count. An ECMA array is similar to a Map: each entry is a key followed by a value. The keys are always Strings, so the leading 0x02 is omitted and each key starts directly with the string length, then the string. The value follows in the typed format described above. The array ends with the three bytes 0x00 0x00 0x09.

The onMetaData tag contains stream information. General information includes:

  • duration: a DOUBLE indicating the total duration of the file in seconds
  • width: a DOUBLE indicating the width of the video in pixels
  • height: a DOUBLE indicating the height of the video in pixels
  • videodatarate: a DOUBLE indicating the video bit rate in kilobits per second 
  • framerate: a DOUBLE indicating the number of frames per second
  • videocodecid: a DOUBLE indicating the video codec ID used in the file (see section 2.2 for the available CodecID values)
  • audiosamplerate: a DOUBLE indicating the frequency at which the audio stream is replayed
  • audiosamplesize: a DOUBLE indicating the resolution of a single audio sample
  • stereo: a BOOL indicating whether the data is stereo
  • audiocodecid: a DOUBLE indicating the audio codec ID used in the file (see section 2.1 for the available SoundFormat values)
  • filesize: a DOUBLE indicating the total size of the file in bytes 
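
Putting the script tag layout together, a simple onMetaData payload could be read into a dictionary roughly as follows. This is a sketch, not a complete parser: `parse_on_metadata` and the helpers are hypothetical names, only Number, Boolean, and String values are handled, and the 4-byte count after 0x08 and the 0x00 0x00 0x09 end marker are taken from the FLV specification.

```python
import struct


def _read_string(buf: bytes, pos: int):
    """Read a 2-byte length followed by that many UTF-8 bytes."""
    length = int.from_bytes(buf[pos:pos + 2], "big")
    pos += 2
    return buf[pos:pos + length].decode("utf-8"), pos + length


def _read_value(buf: bytes, pos: int):
    """Read a type byte followed by a Number, Boolean, or String."""
    type_id = buf[pos]
    pos += 1
    if type_id == 0:
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if type_id == 1:
        return buf[pos] != 0, pos + 1
    if type_id == 2:
        return _read_string(buf, pos)
    raise ValueError(f"type {type_id} is not handled in this sketch")


def parse_on_metadata(data: bytes) -> dict:
    """Parse the data area of a script tag into a dict of metadata fields."""
    if data[0] != 0x02:
        raise ValueError("expected a String (0x02) first")
    name, pos = _read_string(data, 1)
    if name != "onMetaData":
        raise ValueError(f"unexpected script name {name!r}")
    if data[pos] != 0x08:
        raise ValueError("expected an ECMA array (0x08)")
    pos += 1 + 4  # skip the type byte and the 4-byte approximate count
    meta = {}
    # Keys have no type byte; the array ends with 0x00 0x00 0x09.
    while pos + 3 <= len(data) and data[pos:pos + 3] != b"\x00\x00\x09":
        key, pos = _read_string(data, pos)
        value, pos = _read_value(data, pos)
        meta[key] = value
    return meta
```

Real files often include nested objects and arrays (for example a keyframes entry), which this sketch would reject with a ValueError rather than skip.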

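2.4 Keyframes index information

The official documentation does not describe a keyframes index. However, unlike TS, FLV tags have no sync header, so without a keyframes index every tag has to be read in order, and seeking, fast-forward, and rewind perform very poorly. Later, while working on merging FLV files, I found that some FLV files on the web store keyframes information in the Script Tag.

keyframes is essentially an unofficial, de facto standard. Two commonly used metadata tools, flvtool2 and FLVMDI, both treat keyframes as a default metadata item. The FLVMDI home page describes it as follows: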

keyframes: (Object) This object is added only if you specify the /k switch. 'keyframes' is known to FLVMDI and if /k switch is not specified, 'keyframes' object will be deleted. 'keyframes' object has 2 arrays: 'filepositions' and 'times'. Both arrays have the same number of elements, which is equal to the number of key frames in the FLV. Values in times array are in 'seconds'. Each correspond to the timestamp of the n'th key frame. Values in filepositions array are in 'bytes'. Each correspond to the fileposition of the nth key frame video tag (which starts with byte tag type 9).

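In other words, keyframes contains two items, "filepositions" and "times", which give the file position and the PTS of each keyframe respectively. With keyframes you can build your own index and then, during seek, fast-forward, and rewind operations, jump quickly and efficiently to the keyframe you are looking for.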
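
One way to use such an index for seeking is a binary search over the times array, as in the sketch below. It assumes the keyframes object has already been parsed into a Python dict with parallel "times" and "filepositions" lists sorted by time. `seek_offset` is a hypothetical name.

```python
from bisect import bisect_right


def seek_offset(keyframes: dict, target_seconds: float) -> int:
    """Return the file position of the last keyframe at or before target_seconds."""
    times = keyframes["times"]
    positions = keyframes["filepositions"]
    if not times or len(times) != len(positions):
        raise ValueError("keyframes index is empty or inconsistent")
    # bisect_right finds the first keyframe strictly after the target,
    # so the one just before it is the keyframe to start decoding from.
    index = bisect_right(times, target_seconds) - 1
    index = max(index, 0)  # targets before the first keyframe clamp to it
    return int(positions[index])
```

A player would then move the file cursor to the returned offset and resume reading tags from there, instead of scanning every tag from the start.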

3. FLV analysis tools



Origin blog.csdn.net/beyond702/article/details/78929334