A First Look at Sound Type Classification

Hi, I'm picking up where I left off before going home; it's still the same problem.

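For now I'm mainly working on sound classification and leaving scene classification aside. DCASE has scene classification tasks, for example the one shown in the figure below, a task from DCASE 2019.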

There is also an h5 model trained by the author, but it may not be helpful for my sound classification.

I just need to get the features from the waveform, so let me try.

The feature is a mel spectrogram with 40 mel bins and a time length of 500; the model is CNN-based.
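To make the feature concrete, here is a rough NumPy-only sketch of a log-mel spectrogram with 40 mel bins, cropped or padded to 500 frames. Only the 40 and the 500 come from the notes above; the sample rate (16 kHz), FFT size (1024), hop (320 samples) and the function names are my own assumptions, not the settings of the author's model.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters spaced evenly on the mel scale,
    # shape (n_mels, n_fft // 2 + 1).
    hz_points = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel(wave, sr=16000, n_fft=1024, hop=320, n_mels=40, n_frames=500):
    # sr, n_fft and hop are assumed values; n_mels and n_frames follow the notes.
    wave = np.asarray(wave, dtype=np.float64)
    if len(wave) < n_fft:
        wave = np.pad(wave, (0, n_fft - len(wave)))
    num = 1 + (len(wave) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(num)[:, None]
    frames = wave[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2       # (num, n_fft // 2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T      # (num, n_mels)
    logmel = np.log(mel + 1e-10).T                         # (n_mels, num)
    if logmel.shape[1] >= n_frames:
        return logmel[:, :n_frames]                        # crop long clips
    pad = n_frames - logmel.shape[1]
    # Pad short clips on the time axis with the quietest value seen.
    return np.pad(logmel, ((0, 0), (0, pad)), mode="constant",
                  constant_values=logmel.min())

# Usage: a 10-second 440 Hz tone as a stand-in for a real clip.
t = np.arange(10 * 16000) / 16000.0
feat = log_mel(0.5 * np.sin(2 * np.pi * 440.0 * t))
print(feat.shape)
```

With these assumed settings a 10-second clip gives 497 frames, so the last 3 are padding; a real pipeline should copy whatever framing the pretrained model was trained with.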

If the dataset is big enough, could I use the trained model for transfer learning?
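One way to picture that kind of transfer learning: keep the pretrained layers frozen and train only a new softmax head for my own classes. The sketch below is a toy in plain NumPy. `W_frozen` is a hypothetical stand-in for the pretrained layers (here just fixed random weights, not the author's h5 model), the data is synthetic, and the four class names are the ones I plan to start with.

```python
import numpy as np

rng = np.random.default_rng(0)
CLASSES = ["speech", "background music", "singing", "no sound"]

# Hypothetical stand-in for the pretrained layers: these weights are
# never updated, which is what "frozen" means here.
W_frozen = rng.normal(scale=0.01, size=(40 * 500, 64))

def frozen_features(x):
    # x: (batch, 40, 500) log-mel inputs -> (batch, 64) features
    flat = x.reshape(len(x), -1)
    return np.maximum(0.0, flat @ W_frozen)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_head(feats, labels, n_classes, lr=0.1, epochs=300):
    # Plain gradient descent on cross-entropy; only the new head is trained.
    W = np.zeros((feats.shape[1], n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        grad = (softmax(feats @ W + b) - onehot) / len(feats)
        W -= lr * feats.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

# Synthetic data: each class gets extra energy in its own block of mel bins.
labels = np.repeat(np.arange(len(CLASSES)), 20)
x = rng.normal(size=(len(labels), 40, 500))
for i, c in enumerate(labels):
    x[i, 10 * c:10 * (c + 1), :] += 2.0

W_before = W_frozen.copy()
feats = frozen_features(x)
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
W_head, b_head = train_head(feats, labels, len(CLASSES))
pred = softmax(feats @ W_head + b_head).argmax(axis=1)
train_acc = float((pred == labels).mean())
print("training accuracy:", train_acc)
```

With a real pretrained CNN the idea is the same: load it, mark its layers as non-trainable, and fit a fresh output layer. Whether the frozen features transfer well to my classes is exactly the open question.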

First I'll just recognize four types: a person's voice/talking, background music, a person singing, and no sound. After that I'll recognize the music's category. For now I'll look for some references on music classification.
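The two-stage plan could be wired up roughly like this: an energy gate catches "no sound" first, a coarse model picks the sound type, and a genre model runs only on music. Everything here is an assumption for illustration: the -50 dB threshold, the function names, and the two models, which are passed in as plain callables and faked in the usage lines.

```python
import numpy as np

def rms_db(wave):
    # RMS level in dB relative to full scale (1.0); epsilon avoids log(0).
    rms = np.sqrt(np.mean(np.square(wave)) + 1e-12)
    return 20.0 * np.log10(rms)

def classify_clip(wave, coarse_model, genre_model, silence_db=-50.0):
    # Stage 0: treat very quiet clips as "no sound" without running a model.
    if rms_db(wave) < silence_db:
        return ("no sound", None)
    # Stage 1: coarse type (speech / background music / singing).
    coarse = coarse_model(wave)
    # Stage 2: only music clips go on to the category classifier.
    genre = genre_model(wave) if coarse == "background music" else None
    return (coarse, genre)

# Usage with fake models standing in for the real classifiers.
fake_coarse = lambda w: "background music"
fake_genre = lambda w: "jazz"
t = np.arange(16000) / 16000.0
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
print(classify_clip(np.zeros(16000), fake_coarse, fake_genre))
print(classify_clip(tone, fake_coarse, fake_genre))
```

A fixed energy threshold is crude (quiet speech or noisy silence will fool it), so "no sound" could instead be a fourth class of the coarse model.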

I found two models for the GTZAN dataset. Even though the per-class precision is not good enough, maybe they could serve as embedding extractors: we just want the high-level features before the softmax, is that right? 160-D low-level and high-level features?
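To show what "the features before softmax" means, here is a toy forward pass that returns both the penultimate-layer activations (the embedding) and the class probabilities. The weights are random placeholders, not either of the GTZAN models I found. The 160-D size is my reading of the number above, and the 10 outputs match GTZAN's 10 genres.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder weights standing in for a trained genre classifier.
W1 = rng.normal(scale=0.1, size=(40, 160))   # 40 mel bins -> 160-D hidden layer
b1 = np.zeros(160)
W2 = rng.normal(scale=0.1, size=(160, 10))   # 160-D -> 10 GTZAN genres
b2 = np.zeros(10)

def forward(logmel):
    # logmel: (40, T). Average over time, then two dense layers.
    pooled = logmel.mean(axis=1)
    embedding = np.tanh(pooled @ W1 + b1)    # activations just before softmax
    logits = embedding @ W2 + b2
    e = np.exp(logits - logits.max())
    return embedding, e / e.sum()

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# Usage: compare two random "clips" by their embeddings, ignoring the genre output.
clip_a = rng.normal(size=(40, 500))
clip_b = rng.normal(size=(40, 500))
emb_a, probs_a = forward(clip_a)
emb_b, _ = forward(clip_b)
print(emb_a.shape, cosine(emb_a, emb_b))
```

The point is that even if the softmax output is not accurate enough, the embedding can still be fed to another classifier or compared by distance.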

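If you have related questions, you can join the QQ group to discuss them; there is no WeChat group.

QQ group: 868373192

Speech, Image, Video Deep Learning group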

Reposted from blog.csdn.net/SPESEG/article/details/105103101