WAV format file analysis
Introduction to WAV format
WAV is one of the most common sound file formats. It is a standard digital audio file format developed by Microsoft for Windows. A WAV file can record mono or stereo sound and preserve it without distortion. It conforms to the Resource Interchange File Format (RIFF) specification, is used to store audio resources on the Windows platform, and is widely supported by Windows and its applications. The WAVE format supports compression algorithms such as MSADPCM, CCITT A-law and CCITT μ-law, as well as a variety of bit depths, sampling frequencies and channel counts. It is the most popular sound file format on PCs; however, its files are large, so it is mostly used for storing short sound clips.
Source: Baidu Encyclopedia
WAV format composition
WAV files follow the RIFF rules: their content is stored in chunks, the smallest unit of storage. A WAV file generally consists of three chunks: the RIFF chunk, the Format chunk and the Data chunk. The file may also contain optional chunks, such as the Fact chunk and the PlayList chunk. This analysis focuses on the first three: the RIFF chunk, the Format chunk and the Data chunk.
The structure of each chunk is given in detail below:
RIFF Chunk
| Name | Offset address | Number of bytes | Endian | Content |
| --- | --- | --- | --- | --- |
| ID | 0x00 | 4 | big endian | RIFF (0x52494646) |
| Size | 0x04 | 4 | little endian | fileSize - 8 |
| Type | 0x08 | 4 | big endian | WAVE (0x57415645) |
- ID: identifies the chunk as RIFF.
- Size: the size of the entire file minus the combined length of ID and Size, that is, fileSize - 8.
- Type: WAVE, which indicates that two sub-chunks follow: Format and Data.
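The three fields above can be decoded from the first 12 bytes of a file. The following is a minimal sketch, not a complete parser; `RiffInfo` and `ParseRiffHeader` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical result type for this sketch; not part of any library.
struct RiffInfo {
    bool valid;         // true if ID is "RIFF" and Type is "WAVE"
    uint32_t size;      // value of the Size field
    uint64_t fileSize;  // Size + 8, because ID and Size are not counted in Size
};

// Decodes the RIFF chunk header from the first 12 bytes of a WAV file.
RiffInfo ParseRiffHeader(const unsigned char* bytes, size_t count) {
    RiffInfo info{false, 0, 0};
    if (count < 12) return info;
    // Size is stored little-endian at offset 0x04: lowest byte first.
    info.size = static_cast<uint32_t>(bytes[4]) |
                (static_cast<uint32_t>(bytes[5]) << 8) |
                (static_cast<uint32_t>(bytes[6]) << 16) |
                (static_cast<uint32_t>(bytes[7]) << 24);
    info.fileSize = static_cast<uint64_t>(info.size) + 8;
    // ID and Type are compared byte by byte in file order.
    info.valid = std::memcmp(bytes, "RIFF", 4) == 0 &&
                 std::memcmp(bytes + 8, "WAVE", 4) == 0;
    return info;
}
```

Given the 12 bytes `52 49 46 46 1C B1 00 00 57 41 56 45`, the function reports a valid header with a Size of 45340 and a file size of 45348 bytes.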
Format Chunk
| Name | Offset address | Number of bytes | Endian | Content |
| --- | --- | --- | --- | --- |
| ID | 0x00 | 4 | big endian | fmt (0x666D7420) |
| Size | 0x04 | 4 | little endian | 16/18 |
| AudioFormat | 0x08 | 2 | little endian | Audio format |
| NumChannels | 0x0A | 2 | little endian | Number of channels |
| SampleRate | 0x0C | 4 | little endian | Sampling rate |
| ByteRate | 0x10 | 4 | little endian | Data bytes per second |
| BlockAlign | 0x14 | 2 | little endian | Data block alignment |
| BitsPerSample | 0x16 | 2 | little endian | Bits per sample |
- ID: identifies the chunk as fmt.
- Size: the length of the chunk data, excluding the length of ID and Size. A value of 16 means the WAV header contains no additional information.
- AudioFormat: the format of the audio data stored in the Data chunk. The value for PCM audio data is 1.
- NumChannels: the number of channels of audio data. 1: mono, 2: dual channel (stereo).
- SampleRate: the sample rate of the audio data.
- ByteRate: the number of data bytes per second, SampleRate * NumChannels * BitsPerSample / 8.
- BlockAlign: the number of bytes required per sample across all channels, NumChannels * BitsPerSample / 8.
- BitsPerSample: the number of bits stored for each sample. Typical values are 8, 16 and 32.
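The two formulas above can be checked with a few lines of code. This is a small sketch; the `FormatFields` struct and the function names are assumptions made for this example, not part of any WAV library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for the three fmt fields the formulas depend on.
struct FormatFields {
    uint16_t numChannels;
    uint32_t sampleRate;
    uint16_t bitsPerSample;
};

// BlockAlign = NumChannels * BitsPerSample / 8
uint16_t BlockAlign(const FormatFields& f) {
    return static_cast<uint16_t>(f.numChannels * f.bitsPerSample / 8);
}

// ByteRate = SampleRate * NumChannels * BitsPerSample / 8
//          = SampleRate * BlockAlign
uint32_t ByteRate(const FormatFields& f) {
    return f.sampleRate * BlockAlign(f);
}
```

For 16-bit mono audio at 22050 Hz, `BlockAlign` gives 2 and `ByteRate` gives 44100. For 16-bit dual-channel audio at 44100 Hz they give 4 and 176400.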
Data Chunk
| Name | Offset address | Number of bytes | Endian | Content |
| --- | --- | --- | --- | --- |
| ID | 0x00 | 4 | big endian | data (0x64617461) |
| Size | 0x04 | 4 | little endian | Depends on the actual file |
| Data | 0x08 | Depends on file size | little endian | Audio data |
- ID: identifies the chunk as data.
- Size: the length of the audio data, ByteRate * Seconds.
- Data: the audio data itself.
In the Data chunk, the layout of the data depends on the number of channels and the number of bits per sample. Each data cell below is 1 byte, listed in file order:
8 bit mono
| Sample 1 | Data 1 |
| Sample 2 | Data 2 |
8 bit dual channel
| Sample 1 | Channel 1 Data 1 | Channel 2 Data 1 |
| Sample 2 | Channel 1 Data 2 | Channel 2 Data 2 |
16 bit mono
| Sample 1 | Data 1 low byte | Data 1 high byte |
| Sample 2 | Data 2 low byte | Data 2 high byte |
16 bit dual channel
| Sample 1 | Channel 1 Data 1 low byte | Channel 1 Data 1 high byte | Channel 2 Data 1 low byte | Channel 2 Data 1 high byte |
| Sample 2 | Channel 1 Data 2 low byte | Channel 1 Data 2 high byte | Channel 2 Data 2 low byte | Channel 2 Data 2 high byte |
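The 16-bit layouts above can be turned back into per-channel samples by walking the bytes in order. The sketch below assumes 16-bit PCM samples are signed, which is the usual convention but is not stated above; `Deinterleave16` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits interleaved 16-bit little-endian PCM bytes into one vector per channel.
std::vector<std::vector<int16_t>> Deinterleave16(
        const std::vector<unsigned char>& data, size_t numChannels) {
    std::vector<std::vector<int16_t>> channels(numChannels);
    if (numChannels == 0) return channels;
    const size_t blockAlign = numChannels * 2;       // bytes per sample across all channels
    const size_t frames = data.size() / blockAlign;  // a trailing partial sample is ignored
    for (size_t f = 0; f < frames; ++f) {
        for (size_t c = 0; c < numChannels; ++c) {
            const size_t i = f * blockAlign + c * 2;
            // The low byte comes first, then the high byte.
            const uint16_t raw = static_cast<uint16_t>(data[i] | (data[i + 1] << 8));
            channels[c].push_back(static_cast<int16_t>(raw));
        }
    }
    return channels;
}
```

With two channels and the bytes `01 00 FF FF 00 80 34 12`, channel 1 receives the samples 1 and -32768, and channel 2 receives -1 and 4660 (0x1234).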
The tables above refer repeatedly to big-endian and little-endian byte order, which is explained here.
Big-endian and little-endian
WAV files store their numeric data in little-endian order.
- Big-endian mode stores the most significant byte of a value at the lowest memory address and the least significant byte at the highest address. The PNG file format is an example.
- Little-endian mode stores the least significant byte of a value at the lowest memory address and the most significant byte at the highest address.
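The difference between the two modes shows up when the same four bytes are decoded both ways. This is an illustrative sketch; the function names are made up for this example and the byte values are chosen only for demonstration.

```cpp
#include <cassert>
#include <cstdint>

// Little-endian: the first byte in the file is the least significant.
uint32_t ReadLittleEndian32(const unsigned char* b) {
    return static_cast<uint32_t>(b[0]) |
           (static_cast<uint32_t>(b[1]) << 8) |
           (static_cast<uint32_t>(b[2]) << 16) |
           (static_cast<uint32_t>(b[3]) << 24);
}

// Big-endian: the first byte in the file is the most significant.
uint32_t ReadBigEndian32(const unsigned char* b) {
    return (static_cast<uint32_t>(b[0]) << 24) |
           (static_cast<uint32_t>(b[1]) << 16) |
           (static_cast<uint32_t>(b[2]) << 8) |
           static_cast<uint32_t>(b[3]);
}
```

The bytes `1C B1 00 00` decode to 45340 (0x0000B11C) as little-endian but to 0x1CB10000 as big-endian. Read as big-endian, the bytes `52 49 46 46` give 0x52494646, the RIFF identifier listed in the table above.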
Let's analyze a specific .wav file:
Actual file analysis
RIFF Chunk
| Name | Description |
| --- | --- |
| ID | Consistent with the description above |
| Size | The Size field is 45340, so the entire file is 45340 + 8 = 45348 bytes |
| Type | The file type is WAVE |
Format Chunk
| Name | Description |
| --- | --- |
| ID | Consistent with the description above |
| Size | The size is 16, so the header contains no additional information |
| AudioFormat | PCM audio data |
| NumChannels | Mono audio |
| SampleRate | The sample rate is 22050 |
| ByteRate | 44100 bytes of data per second |
| BlockAlign | The number of bytes required per sample is 2 |
| BitsPerSample | Each sample stores 16 bits |
Data Chunk
| Name | Description |
| --- | --- |
| ID | Consistent with the description above |
| Size | The data length is 45304 bytes |
| Data | The actually stored audio data, too long to show here |
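The values read from this file can be cross-checked against each other. The sketch below uses hypothetical helper names and assumes a 16-byte Format chunk with no optional chunks; it derives the sample count and duration from the Data chunk size and recomputes the RIFF Size field.

```cpp
#include <cassert>
#include <cstdint>

// Number of samples (one per channel each) held in the Data chunk.
uint32_t FrameCount(uint32_t dataSize, uint16_t blockAlign) {
    return blockAlign == 0 ? 0 : dataSize / blockAlign;
}

// Size = ByteRate * Seconds, so Seconds = Size / ByteRate.
double DurationSeconds(uint32_t dataSize, uint32_t byteRate) {
    return byteRate == 0 ? 0.0 : static_cast<double>(dataSize) / byteRate;
}

// RIFF Size under the assumption stated above:
// Type (4) + Format chunk (8 + 16) + Data chunk (8 + dataSize).
uint32_t ExpectedRiffSize(uint32_t dataSize) {
    return 4 + (8 + 16) + (8 + dataSize);
}
```

For this file, `FrameCount(45304, 2)` gives 22652 samples, `DurationSeconds(45304, 44100)` gives roughly 1.03 seconds, and `ExpectedRiffSize(45304)` gives 45340, which matches the Size field of the RIFF chunk.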
Are there other optional chunks?
To check whether the file contains any optional chunks, the following code lists every chunk ID in it:
#include <cstdint>
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
using namespace std;

const string path = "test.wav";
vector<string> ans;

// Reads a 4-byte chunk ID such as "RIFF", "fmt " or "data".
bool ReadId(ifstream& in, string& id) {
    char buffer[4];
    if (!in.read(buffer, 4)) return false;
    id.assign(buffer, 4);
    return true;
}

// Reads a 4-byte little-endian unsigned integer (lowest byte first).
bool ReadSize(ifstream& in, uint32_t& value) {
    unsigned char len[4];
    if (!in.read(reinterpret_cast<char*>(len), 4)) return false;
    value = len[0] | (len[1] << 8) | (len[2] << 16) |
            (static_cast<uint32_t>(len[3]) << 24);
    return true;
}

struct RiffHeader {
    string id;
    string type;
    uint32_t length = 0;  // fileSize - 8
    bool GetHead(ifstream& in) {
        return ReadId(in, id) && ReadSize(in, length) && ReadId(in, type);
    }
};

// Header of any chunk after the RIFF header: fmt, data or an optional chunk.
struct ChunkHeader {
    string id;
    uint32_t length = 0;
    bool GetHead(ifstream& in) {
        if (!ReadId(in, id) || !ReadSize(in, length)) return false;
        // Skip the content instead of loading it.
        // RIFF pads odd-sized chunks with one extra byte.
        in.seekg(static_cast<streamoff>(length) + (length & 1), ios::cur);
        return true;
    }
};

int main() {
    ifstream in(path, ios::binary);
    RiffHeader riff;
    if (!in || !riff.GetHead(in)) {
        cerr << "Cannot read a RIFF header from " << path << endl;
        return 1;
    }
    ans.push_back(riff.id);
    // Test the read itself rather than in.eof(),
    // which only becomes true after a read has already failed.
    ChunkHeader chunk;
    while (chunk.GetHead(in)) ans.push_back(chunk.id);
    cout << "All chunks : " << endl;
    for (const auto& i : ans) cout << "Chunk id : " << i << endl;
    return 0;
}
Running it on the file gives:
All chunks :
Chunk id : RIFF
Chunk id : fmt
Chunk id : data
There are no optional chunks in this file.