python机器学习模型DT/RF/SVM/XGBoost实现婴儿啼哭声纹识别

在之前的项目中,有关声纹识别和声源定位相关的实践较少,项目的原因,后续会陆续增加这块的投入,周末闲来无事就像先简单开发实践基础项目,这里主要是基于mfcc、csc、zcr几种特征提取方法集合决策树、支持向量机、随机森林和XGBoost几种模型来实现声纹识别,话不多说,首先看下效果:

简单看下数据集:

awake:

diaper:

hug:

hungry:

sleepy:

uncomfortable:

数据解析特征提取核心实现如下所示:

feature=[]
for one_label in os.listdir(dataDir):
    oneDir=dataDir+one_label+"/"
    for one_file in os.listdir(oneDir):
        print("one_file: ", one_file)
        try:
            one_path=oneDir+one_file
            data,RATE = librosa.load(one_path, 44100)
            f1=mfcc(data)
            f2=csc(data)
            f3=zcr(data)
            one_vec=np.hstack([f1, f2, f3]).tolist()
            one_vec.append(one_label)
            feature.append(one_vec)
        except Exception as e:
            print("Exception: ", e)
print("feature_length: ", len(feature))
with open(save_path,"w") as f:
    f.write(json.dumps(feature))

生成feature文件后,接下来初始化搭建模型,这里以RF为例如下:

with open(data) as f:
    data_list = json.load(f)
x_list = [one[:-1] for one in data_list]
y_list = [one[-1] for one in data_list]
print("y_list: ", list(set(y_list)))
X_train, X_test, y_train, y_test = splitData(x_list, y_list, ratio=rationum)
try:
    model = loadModel(model_path="rf.pkl")
except:
    model = RandomForestClassifier(n_estimators=100)
    model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("RF model accuracy: ", model.score(X_test, y_test))
# 指标计算
res_dict = {}
Precision, Recall, F1 = calThree(y_test, y_pred)
res_dict["precision"] = Precision
res_dict["recall"] = Recall
res_dict["F1"] = F1
res_dict["accuracy"] = F1
print("res_dict: ", res_dict)
saveModel(model, save_path=model_path)
return res_dict

结果如下:

============================Loading classifyModel============================
y_list:  ['hug', 'diaper', 'hungry', 'uncomfortable', 'awake', 'sleepy']
=================splitData shape============================
688 688
230 230
DT model accuracy:  0.48695652173913045
F1:  0.47760615805170265
Precision:  0.47929344540393237
Recall:  0.479810176927286
res_dict:  {'precision': 0.47929344540393237, 'recall': 0.479810176927286, 'F1': 0.47760615805170265, 'accuracy': 0.47760615805170265}
y_list:  ['hug', 'diaper', 'hungry', 'uncomfortable', 'awake', 'sleepy']
=================splitData shape============================
688 688
230 230
RF model accuracy:  0.6434782608695652
F1:  0.6319053593191524
Precision:  0.6449388328420587
Recall:  0.6348164749454582
res_dict:  {'precision': 0.6449388328420587, 'recall': 0.6348164749454582, 'F1': 0.6319053593191524, 'accuracy': 0.6319053593191524}
y_list:  ['hug', 'diaper', 'hungry', 'uncomfortable', 'awake', 'sleepy']
=================splitData shape============================
688 688
230 230
SVM model accuracy:  0.2217391304347826
F1:  0.11388339920948616
Precision:  0.15566037735849056
Recall:  0.20020325203252032
res_dict:  {'precision': 0.15566037735849056, 'recall': 0.20020325203252032, 'F1': 0.11388339920948616, 'accuracy': 0.11388339920948616}
y_list:  ['hug', 'diaper', 'hungry', 'uncomfortable', 'awake', 'sleepy']
=================splitData shape============================
688 688
230 230
XGBOOST model accuracy:  0.5
F1:  0.48469331276155714
Precision:  0.48176557040929696
Recall:  0.49042707402902413
res_dict:  {'precision': 0.48176557040929696, 'recall': 0.49042707402902413, 'F1': 0.48469331276155714, 'accuracy': 0.48469331276155714}

四种模型对比结果如下:

{
    "DT": {
        "precision": 0.47929344540393239,
        "recall": 0.479810176927286,
        "F1": 0.47760615805170267,
        "accuracy": 0.47760615805170267
    },
    "RF": {
        "precision": 0.6449388328420587,
        "recall": 0.6348164749454582,
        "F1": 0.6319053593191524,
        "accuracy": 0.6319053593191524
    },
    "SVM": {
        "precision": 0.15566037735849057,
        "recall": 0.20020325203252033,
        "F1": 0.11388339920948616,
        "accuracy": 0.11388339920948616
    },
    "XGBOOST": {
        "precision": 0.48176557040929698,
        "recall": 0.49042707402902416,
        "F1": 0.48469331276155716,
        "accuracy": 0.48469331276155716
    }
}

为了直观对比这里对其进行可视化展示,如下所示:

猜你喜欢

转载自blog.csdn.net/Together_CZ/article/details/129637168