For the exact format, refer to the FLV spec.
The following mainly introduces the fields of the Tag shown in the figure above. Each Tag consists of two parts: Tag Header and Tag Data.
1. Tag Header
Name | Length | Description |
---|---|---|
Tag type | 1 byte | 8: audio; 9: video; 18: script data (metadata); other: reserved |
Data area length | 3 bytes | The length of the data area in bytes |
Timestamp | 3 bytes | An integer in milliseconds. Always 0 for script tags |
Timestamp extension | 1 byte | Extends the timestamp to 4 bytes; this byte holds the upper 8 bits. Rarely used |
StreamID | 3 bytes | Always 0 |
Data area (data) | Given by the data area length | The tag data itself |
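The table above can be sketched as a small parser. This is a minimal illustration, not a complete demuxer: the names `TagHeader` and `parse_tag_header` are hypothetical, and it assumes all multi-byte fields are big-endian, as the FLV spec defines them.

```python
from typing import NamedTuple


class TagHeader(NamedTuple):
    tag_type: int      # 8 = audio, 9 = video, 18 = script data
    data_size: int     # length of the data area in bytes
    timestamp_ms: int  # 3-byte timestamp combined with the extension byte
    stream_id: int     # always 0


def parse_tag_header(buf: bytes) -> TagHeader:
    """Decode the 11-byte tag header described in the table above."""
    if len(buf) < 11:
        raise ValueError("a tag header needs 11 bytes")
    tag_type = buf[0]
    data_size = int.from_bytes(buf[1:4], "big")
    ts_low = int.from_bytes(buf[4:7], "big")
    ts_ext = buf[7]  # upper 8 bits of the 32-bit timestamp
    stream_id = int.from_bytes(buf[8:11], "big")
    return TagHeader(tag_type, data_size, (ts_ext << 24) | ts_low, stream_id)


# Hand-built example: a video tag with 5 bytes of data at 1000 ms.
header = parse_tag_header(bytes([9, 0, 0, 5, 0, 0x03, 0xE8, 0, 0, 0, 0]))
print(header)
```

The data area of `header.data_size` bytes follows immediately after these 11 bytes.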
2. Tag Data
2.1 Audio data
Name | Length | Description |
---|---|---|
Audio format | 4 bits | 0 = Linear PCM, platform endian; 1 = ADPCM; 2 = MP3; 3 = Linear PCM, little endian; 4 = Nellymoser 16-kHz mono; 5 = Nellymoser 8-kHz mono; 6 = Nellymoser; 7 = G.711 A-law logarithmic PCM; 8 = G.711 mu-law logarithmic PCM; 9 = reserved; 10 = AAC; 11 = Speex; 14 = MP3 8-kHz; 15 = Device-specific sound |
Sampling rate | 2 bits | 0 = 5.5 kHz; 1 = 11 kHz; 2 = 22 kHz; 3 = 44 kHz. Always 3 for AAC |
Sample size | 1 bit | 0 = snd8Bit; 1 = snd16Bit. Compressed audio is always 16-bit |
Audio type | 1 bit | 0 = sndMono; 1 = sndStereo. Always 1 for AAC |
Audio data | 8 bits × n | |
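The first four fields of the table share a single byte. The sketch below unpacks that byte, assuming the fields are packed from the most significant bit down in the order listed; the function name `parse_audio_flags` and the returned dictionary keys are illustrative only.

```python
SOUND_FORMATS = {
    0: "Linear PCM, platform endian", 1: "ADPCM", 2: "MP3",
    3: "Linear PCM, little endian", 4: "Nellymoser 16-kHz mono",
    5: "Nellymoser 8-kHz mono", 6: "Nellymoser",
    7: "G.711 A-law logarithmic PCM", 8: "G.711 mu-law logarithmic PCM",
    9: "reserved", 10: "AAC", 11: "Speex", 14: "MP3 8-kHz",
    15: "Device-specific sound",
}
SAMPLE_RATES = ("5.5 kHz", "11 kHz", "22 kHz", "44 kHz")


def parse_audio_flags(first_byte: int) -> dict:
    """Split the first byte of audio tag data into its four fields."""
    return {
        "format": SOUND_FORMATS.get(first_byte >> 4, "unknown"),  # top 4 bits
        "rate": SAMPLE_RATES[(first_byte >> 2) & 0x03],           # next 2 bits
        "bits": 16 if (first_byte >> 1) & 0x01 else 8,            # next 1 bit
        "channels": 2 if first_byte & 0x01 else 1,                # lowest bit
    }


# 0xAF = 1010 11 1 1: AAC, 44 kHz, 16-bit, stereo.
print(parse_audio_flags(0xAF))
```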
In the case of AAC, the audio data begins with a 1-byte AACPacketType (0 = AAC sequence header, 1 = AAC raw data), followed by the AAC data itself.
2.2 Video data
The first byte is video information, in the following format:
Name | Length | Description |
---|---|---|
Frame type | 4 bits | 1: keyframe (for AVC, a seekable frame); 2: inter frame (for AVC, a non-seekable frame); 3: disposable inter frame (H.263 only); 4: generated keyframe (reserved for server use only); 5: video info/command frame |
Codec ID | 4 bits | 1: JPEG (currently unused); 2: Sorenson H.263; 3: Screen video; 4: On2 VP6; 5: On2 VP6 with alpha channel; 6: Screen video version 2; 7: AVC |
If the frame type is 5, the tag data does not contain video data. Instead it holds a single 8-bit value with the following meaning:
- 0: start of client-side seeking video frame sequence
- 1: end of client-side seeking video frame sequence
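As a rough sketch, the video information byte and the frame-type-5 case could be decoded as follows. The function name `parse_video_tag_data` and the dictionary keys are hypothetical; the bit layout assumes the frame type occupies the upper 4 bits and the codec ID the lower 4 bits.

```python
FRAME_TYPES = {
    1: "keyframe", 2: "inter frame", 3: "disposable inter frame",
    4: "generated keyframe", 5: "video info/command frame",
}
CODECS = {
    1: "JPEG", 2: "Sorenson H.263", 3: "Screen video", 4: "On2 VP6",
    5: "On2 VP6 with alpha channel", 6: "Screen video version 2", 7: "AVC",
}
SEEK_COMMANDS = {0: "start of client-side seeking", 1: "end of client-side seeking"}


def parse_video_tag_data(data: bytes) -> dict:
    """Decode the first byte of video tag data, plus the command byte if any."""
    if not data:
        raise ValueError("video tag data is empty")
    frame_type = data[0] >> 4   # upper 4 bits
    codec_id = data[0] & 0x0F   # lower 4 bits
    info = {
        "frame_type": FRAME_TYPES.get(frame_type, "unknown"),
        "codec": CODECS.get(codec_id, "unknown"),
    }
    # Frame type 5 carries a one-byte command instead of video data.
    if frame_type == 5 and len(data) > 1:
        info["command"] = SEEK_COMMANDS.get(data[1], "unknown")
    return info


print(parse_video_tag_data(b"\x17"))      # AVC keyframe
print(parse_video_tag_data(b"\x57\x00"))  # info/command frame
```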
2.3 Script data
First, the script data types. Every value is encoded as data type + (data length) + data. The data type occupies 1 byte; whether a length field follows depends on the data type, and the data itself comes last.
The format is as follows:
For a String, the next 2 bytes are the length of the string (4 bytes for a Long String), followed by the string data; for a Number, the next 8 bytes are a Double; for a Boolean, the next 1 byte is the Boolean value.
With that in mind, look at the script data in an FLV file. It generally begins with 0x02, the String type; the next 2 bytes are the string length, usually 0x000a (the length of "onMetaData"), followed by the string "onMetaData". FLV files generally carry an onMetaData tag, which is used when running ActionScript. Next comes 0x08, the ECMA Array type, which resembles a map: after a 4-byte element count, each key is followed by its value. The keys are always Strings, so their leading 0x02 is omitted and each key consists of just the string length and the string. The value follows, starting with its type byte, encoded as described above.
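The layout just described can be sketched as a small recursive reader. Several details here come from the FLV spec rather than from the text above and should be treated as assumptions of this sketch: the type codes 0x00 (Number), 0x01 (Boolean) and 0x0C (Long String), the 4-byte count after 0x08, and the 3-byte end marker `00 00 09` that closes the array. The helper names `read_string` and `read_value` are hypothetical.

```python
import struct

END_MARKER = b"\x00\x00\x09"  # closes an ECMA array (per the FLV spec)


def read_string(buf: bytes, pos: int, width: int):
    """Read a length-prefixed UTF-8 string; width is 2 (String) or 4 (Long String)."""
    length = int.from_bytes(buf[pos:pos + width], "big")
    start = pos + width
    return buf[start:start + length].decode("utf-8"), start + length


def read_value(buf: bytes, pos: int):
    """Return (value, next_pos) for the typed value starting at buf[pos]."""
    kind = buf[pos]
    pos += 1
    if kind == 0x00:  # Number: 8-byte big-endian double
        return struct.unpack_from(">d", buf, pos)[0], pos + 8
    if kind == 0x01:  # Boolean: 1 byte
        return buf[pos] != 0, pos + 1
    if kind == 0x02:  # String: 2-byte length
        return read_string(buf, pos, 2)
    if kind == 0x0C:  # Long String: 4-byte length
        return read_string(buf, pos, 4)
    if kind == 0x08:  # ECMA Array: count, then key/value pairs
        pos += 4      # the count is skipped; the end marker decides where to stop
        result = {}
        while pos + 3 <= len(buf) and buf[pos:pos + 3] != END_MARKER:
            key, pos = read_string(buf, pos, 2)  # keys omit the 0x02 type byte
            result[key], pos = read_value(buf, pos)
        return result, pos + 3
    raise ValueError(f"unsupported script data type 0x{kind:02x}")


# Hand-built script data: "onMetaData" followed by a two-entry ECMA array.
sample = (
    b"\x02\x00\x0aonMetaData"
    + b"\x08" + (2).to_bytes(4, "big")
    + b"\x00\x08duration" + b"\x00" + struct.pack(">d", 12.5)
    + b"\x00\x06stereo" + b"\x01\x01"
    + END_MARKER
)
name, pos = read_value(sample, 0)
meta, pos = read_value(sample, pos)
print(name, meta)
```

Other types that appear in real files, such as nested objects and strict arrays, are deliberately left out to keep the example short; the reader raises `ValueError` when it meets one.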
The onMetaData tag contains information about the stream. Common fields include:
- duration: a DOUBLE indicating the total duration of the file in seconds
- width: a DOUBLE indicating the width of the video in pixels
- height: a DOUBLE indicating the height of the video in pixels
- videodatarate: a DOUBLE indicating the video bit rate in kilobits per second
- framerate: a DOUBLE indicating the number of frames per second
- videocodecid: a DOUBLE indicating the video codec ID used in the file (see the Codec ID values in section 2.2)
- audiosamplerate: a DOUBLE indicating the frequency at which the audio stream is replayed
- audiosamplesize: a DOUBLE indicating the resolution of a single audio sample
- stereo: a BOOL indicating whether the data is stereo
- audiocodecid: a DOUBLE indicating the audio codec ID used in the file (see the audio format values in section 2.1)
- filesize: a DOUBLE indicating the total size of the file in bytes
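2.4 Keyframes index information
The official documentation does not describe the keyframes index. However, FLV tags, unlike TS packets, carry no sync header, so without a keyframes index every tag has to be read in sequence, and seeking, fast-forward, and rewind perform very poorly. Later, while working on merging FLV files, I found that some FLV files on the web hide keyframes information in the Script Tag.
keyframes is practically an unofficial standard, that is, a community convention. Two commonly used metadata tools, flvtool2 and FLVMDI, both treat keyframes as a default metadata item. The FLVMDI home page describes it as follows: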
keyframes: (Object) This object is added only if you specify the /k switch. 'keyframes' is known to FLVMDI and if /k switch is not specified, 'keyframes' object will be deleted. 'keyframes' object has 2 arrays: 'filepositions' and 'times'. Both arrays have the same number of elements, which is equal to the number of key frames in the FLV. Values in times array are in 'seconds'. Each correspond to the timestamp of the n'th key frame. Values in filepositions array are in 'bytes'. Each correspond to the fileposition of the nth key frame video tag (which starts with byte tag type 9).
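In other words, keyframes contains two items, "filepositions" and "times", which give the file position and the PTS of each keyframe respectively. With keyframes you can build your own index and then, when seeking, fast-forwarding, or rewinding, jump quickly and efficiently to the keyframe you are looking for.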
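A minimal sketch of such an index lookup, assuming the keyframes object has already been decoded into a dictionary with 'times' and 'filepositions' lists of equal length, and that 'times' is sorted in ascending order. The function name `find_keyframe` and the sample values are made up for illustration.

```python
from bisect import bisect_right


def find_keyframe(keyframes: dict, target_seconds: float):
    """Return (time, file position) of the last keyframe at or before the target."""
    times = keyframes["times"]
    positions = keyframes["filepositions"]
    if not times or len(times) != len(positions):
        raise ValueError("'times' and 'filepositions' must be non-empty and equal in length")
    # Binary search for the rightmost keyframe whose time is <= target.
    index = bisect_right(times, target_seconds) - 1
    index = max(index, 0)  # targets before the first keyframe map to the first one
    return times[index], int(positions[index])


# Hypothetical index: AMF numbers are doubles, so positions arrive as floats.
keyframes = {
    "times": [0.0, 2.0, 4.0, 6.0],
    "filepositions": [13.0, 52000.0, 101000.0, 149500.0],
}
time_s, offset = find_keyframe(keyframes, 4.7)
print(time_s, offset)  # 4.0 101000
```

A player would then move the file pointer to `offset`, which is where the video tag (tag type 9) of that keyframe starts, and resume reading tags from there.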
3. FLV analysis tools
- http://www.flvmeta.com/
- yamdi: converts an FLV into an indexed FLV: yamdi -i i.flv -o o.flv
- flvlib: pip install flvlib; to view the index information: debug-flv --metadata file.flv
- flvcheck: http://www.adobe.com/products/adobe-media-server-family/tool-downloads.html