Music Classification

Fourier transform

Fourier principle: any continuously measured time signal can be represented as a superposition, possibly infinite, of sine waves of different frequencies.
Time-domain analysis: describes how the strength of a signal varies with time. For example, a time-domain waveform shows how the signal changes over time.
Frequency-domain analysis: describes a signal in terms of its frequency content rather than its variation over time; it is the counterpart of the time-domain view. The signal is treated as a combination of single-frequency components, and those components are its frequency-domain characteristics. One important rule of the frequency domain is that the sine wave is the only waveform that exists there: everything is described in terms of sine waves, because any time-domain waveform can be synthesized from sine waves.
In general, the time-domain representation is more intuitive, while frequency-domain analysis is more concise. The Fourier transform is the bridge between the two: it converts a signal that is hard to process in the time domain into a frequency-domain signal that is easy to analyze.
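
To make the two views concrete, here is a minimal sketch that builds a signal from two sine waves and then recovers their frequencies from the spectrum. The sample rate, frequencies and amplitudes are made-up values chosen for illustration, and NumPy's FFT is used instead of SciPy's only to keep the example short.

```python
import numpy as np

sample_rate = 1000  # samples per second (assumed for this example)
duration = 1.0      # seconds
t = np.arange(int(sample_rate * duration)) / sample_rate

# Time domain: a 50 Hz sine (amplitude 1.0) plus a 120 Hz sine (amplitude 0.5).
signal = 1.0 * np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

# Frequency domain: magnitude of the real-input FFT, scaled so that each
# peak approximates the amplitude of the corresponding sine wave.
spectrum = np.abs(np.fft.rfft(signal)) * 2 / len(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)

# The two largest peaks sit at the two component frequencies.
top_two = sorted(freqs[np.argsort(spectrum)[-2:]].tolist())
print(top_two)  # [50.0, 120.0]
```

The time-domain array has 1000 values that all look alike, whereas the spectrum is essentially two spikes. That is the sense in which the frequency-domain description is the more concise one.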


Fourier transform of audio files

Applying the Fourier transform to a WAV file

# In current SciPy, scipy.fft is a module, so import the fft function from it
from scipy.fft import fft
from scipy.io import wavfile
from matplotlib.pyplot import specgram
import matplotlib.pyplot as plt


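# Apply the Fourier transform to a single track
# Figure setup: figsize=(9, 6) is the width and height in inches, dpi=80 is the resolution
plt.figure(figsize=(9, 6), dpi=80)
# sample_rate is the number of samples per second; X holds the audio data read from the file.
# Here the data is mono, so X is a 1-D array (a stereo file would give a 2-D array: left and right channels)
# Sampling rate: the number of samples taken per second from a continuous signal to form a discrete signal, expressed in hertz (Hz)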
sample_rate, X = wavfile.read("D:/music/jazz/converted/jazz.00001.au.wav")
print(sample_rate, X, type(X), len(X))
plt.subplot(211)
# Plot the spectrogram (time-frequency analysis) of the WAV file
specgram(X, Fs=sample_rate)
plt.xlabel("time")
plt.ylabel("frequency")

plt.subplot(212)
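# Fast Fourier transform: abs(fft(X)) gives the amplitude of each frequency component
fft_X = abs(fft(X))
print("fft_x", fft_X, len(fft_X), type(fft_X))
# Plot amplitude against frequency. Bin k corresponds to k * sample_rate / N Hz;
# only the first half is drawn because the spectrum of a real signal is symmetric.
half = len(fft_X) // 2
plt.plot([k * sample_rate / len(fft_X) for k in range(half)], fft_X[:half])
plt.xlabel("frequency")
plt.ylabel("amplitude")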
plt.savefig("D:/music/jazz.00000.au.wav.fft.png")
plt.show()
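
The script above assumes a mono file. A rough sketch of a helper that also copes with stereo input might look like the following. The name `fft_features`, the channel-averaging step and the synthetic 440 Hz test tone are all assumptions made for this example, not part of the original code.

```python
import os
import tempfile

import numpy as np
from scipy.fft import fft
from scipy.io import wavfile


def fft_features(path, n_features=1000):
    """Read a WAV file and return its first n_features FFT magnitudes."""
    sample_rate, data = wavfile.read(path)
    if data.ndim == 2:
        # Stereo: average the left and right channels into a single track.
        data = data.mean(axis=1)
    return np.abs(fft(data))[:n_features]


# Demo on a synthetic one-second 440 Hz stereo tone instead of a real track.
rate = 8000
t = np.arange(rate) / rate
tone = (0.5 * np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
stereo = np.column_stack([tone, tone])

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.wav")
    wavfile.write(demo_path, rate, stereo)
    features = fft_features(demo_path)

# With one second of audio each FFT bin is 1 Hz wide, so the peak is bin 440.
print(features.shape, int(np.argmax(features)))
```

Averaging the channels is only one way to get a mono signal; keeping just the left channel would work as well for this purpose.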


Applying the Fourier transform to every track yields a set of arrays. Using each track's original genre as its label, we can then train a logistic regression model:


from sklearn.linear_model import LogisticRegression

X = ...  # feature values for all tracks
Y = ...  # labels for all tracks

model = LogisticRegression()
model.fit(X, Y)


# To classify a new WAV file, apply the Fourier transform first, then predict with the model
sample_rate, test = wavfile.read("./test.wav")
print(sample_rate, test, len(test))
testdata_fft_features = abs(fft(test))[:1000]
# model.predict expects a 2-D input, so the single sample is wrapped in a list; the result is an array, array([class])
type_index = model.predict([testdata_fft_features])
# Look up the genre that corresponds to type_index
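
Since the real feature arrays are not shown, here is a self-contained sketch of the whole pipeline on synthetic data. Everything in it is an assumption for illustration: the two "genres" are just sine tones at 200 Hz and 600 Hz, the class names are hypothetical, and dividing the FFT magnitudes by the clip length is my own addition to keep the feature values small.

```python
import numpy as np
from scipy.fft import fft
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
rate = 8000                          # samples per second (assumed)
t = np.arange(rate) / rate           # one second of sample times
genres = ["low_tone", "high_tone"]   # hypothetical class names, index = label


def make_clip(freq, amplitude):
    """Synthetic stand-in for a track: a sine wave plus a little noise."""
    return amplitude * np.sin(2 * np.pi * freq * t) + 0.1 * rng.standard_normal(rate)


def fft_features(clip):
    """First 1000 FFT magnitudes, divided by the clip length to keep values small."""
    return np.abs(fft(clip))[:1000] / len(clip)


# Build the training set: 20 clips per class with random amplitudes.
X, Y = [], []
for label, freq in enumerate([200, 600]):
    for amplitude in rng.uniform(0.5, 1.0, size=20):
        X.append(fft_features(make_clip(freq, amplitude)))
        Y.append(label)

model = LogisticRegression()
model.fit(np.array(X), Y)

# Classify two unseen clips and map each predicted index back to its name.
test_X = [fft_features(make_clip(200, 0.8)), fft_features(make_clip(600, 0.8))]
predicted = [genres[i] for i in model.predict(test_X)]
print(predicted)
```

With real music the list of names would hold the actual genres in the same order as the integer labels used for training, and the features would come from the FFT of each WAV file rather than from generated tones.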


Origin blog.csdn.net/weixin_41634974/article/details/104094679