Contents
Chapter 1 Introduction
1.1 Introduction
1.2 Background and significance of the topic
1.3 Current status of audio feature parameter analysis
1.4 Main tasks and organization of the thesis
Chapter 2 Basic Theory
2.1 Introduction
2.2 Basics of speech signal processing
2.2.1 Speech signal analysis frames
2.2.2 Speech feature parameter theory
2.2.3 Calculation and extraction of MFCC feature parameters
2.2.4 Improved power-normalized cepstral parameters
2.3 Formant theory of speech signals
2.4 Summary of this chapter
Chapter 3 FFMPEG Format Conversion Technology
3.1 Introduction to FFMPEG technology
3.2 FFMPEG project composition
3.3 FFMPEG technology advantages
3.4 Summary of this chapter
Chapter 4 Introduction to the Software Part
4.1 MATLAB advantages
4.1.1 Development background
4.1.2 Easy-to-understand programming language
4.1.3 Powerful data processing capabilities
4.1.4 Complete graphics processing capabilities
4.1.5 Introduction to the work interface
4.2 C# advantages
4.2.1 Background
4.2.2 C# programming advantages
4.2.3 Introduction to VS2010 software
4.3 Introduction to SQL Server 2008
4.4 Summary of this chapter
Chapter 5 System Construction and Audio Feature Analysis
Introduction
5.1 Preliminary work
5.2 Format conversion of audio data
5.3 Related designs in C#
5.3.1 Implementing the database link with the C# programming language
5.3.2 Implementing audio operations with C# programming
5.4 Related designs in MATLAB
5.4.1 Generation of the MATLAB spectrogram
5.5 Design of the system work interface
5.6 Overall system flow chart
5.7 Summary of this chapter
Chapter 6 Graduation Project Summary and Expectations
6.1 Thesis Summary
6.2 Main Problems and Solutions Encountered in the Graduation Project
6.3 Outlook for the Future
Acknowledgments
Chapter 5 System Construction and Audio Feature Analysis
Introduction
This research topic covers four main aspects: first, format conversion of the collected audio; second, interface design based on C#; third, storage of the converted audio files in the database through interface operations; and finally, processing of the stored audio files in MATLAB, whose graphical output is used to extract feature parameters.
Throughout this step-by-step process, the C# code connects the individual modules and supports joint debugging of the whole system. MATLAB produces and analyzes the spectrogram (time-frequency spectrum) of the audio and is invoked from the C# program. The SQL database stores and returns the audio signal data and manages the associated information.
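The conversion and storage stages can be sketched as follows. This is only a minimal illustration in Python, not the C# and MATLAB code used in the project: the function names, the table layout, the file names, the 16 kHz mono target, and the use of an in-memory SQLite database in place of SQL Server are all assumptions made to keep the example self-contained. The FFmpeg options `-y`, `-i`, `-ar`, and `-ac` are real command-line flags; the command is built but not executed here.

```python
import sqlite3
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000):
    """Return the FFmpeg argument list that converts src to a mono file at dst.

    The list can be passed to subprocess.run(); it is not executed here.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite dst if it already exists
        "-i", str(src),           # input file in its original format
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed value)
        "-ac", "1",               # one audio channel (mono)
        str(dst),
    ]


def store_audio(conn, name, data):
    """Insert one converted audio file into a hypothetical 'audio' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audio ("
        "id INTEGER PRIMARY KEY, name TEXT UNIQUE, data BLOB)"
    )
    cur = conn.execute(
        "INSERT INTO audio (name, data) VALUES (?, ?)", (name, data)
    )
    conn.commit()
    return cur.lastrowid


def load_audio(conn, name):
    """Fetch the stored bytes for name, ready to hand to the analysis stage."""
    row = conn.execute(
        "SELECT data FROM audio WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    # Hypothetical file names; any input format FFmpeg can read would do.
    cmd = build_ffmpeg_command(Path("recording.amr"), Path("recording.wav"))
    print(" ".join(cmd))

    conn = sqlite3.connect(":memory:")
    store_audio(conn, "recording.wav", b"placeholder audio bytes")
    print(len(load_audio(conn, "recording.wav")))
```

In the real system the interface layer would run the conversion, write the result to the database, and later pass the retrieved file to the spectrogram analysis. The sketch keeps those three steps as separate functions so that each can be checked on its own.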
5.1 Preliminary work
The workflow fixed in the project proposal report at the start of the project is as follows:
Figure 5.1 Basic workflow diagram
Therefore, the first preparatory task is to collect a large amount of audio signal data. I set aside enough time for this and recorded under a variety of conditions: multiple speakers with multiple takes each, device battery levels between 5% and 50%, different mobile phone device modes, and different recording environments, among others. To make later classification and storage easier, the details of the environment in which each file was recorded must be noted carefully at the time, and the recordings must then be sorted and saved promptly so that they remain clearly distinguishable.
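One way to keep such records consistent is to store the conditions of each recording in a small structure and derive the storage location from it. The sketch below is an assumption for illustration only: the field names, the folder layout, and the battery bands other than the 5% to 50% range mentioned above are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RecordingInfo:
    """Conditions noted for one recording (all field names are hypothetical)."""
    speaker: str       # who was recorded
    take: int          # repeat number for this speaker
    battery_pct: int   # device battery level when recording started
    device_mode: str   # mobile phone device mode during recording
    environment: str   # where the recording was made


def battery_band(pct):
    """Group a battery percentage into coarse bands used for filing."""
    if not 0 <= pct <= 100:
        raise ValueError("battery percentage must be between 0 and 100")
    if pct < 5:
        return "below-5"
    if pct <= 50:
        return "5-50"      # the range named in the text
    return "above-50"


def storage_path(info, extension="wav"):
    """Build a relative path that encodes every recorded condition."""
    folder = PurePosixPath(
        info.environment, info.device_mode, battery_band(info.battery_pct)
    )
    filename = f"{info.speaker}_take{info.take:02d}.{extension}"
    return folder / filename


if __name__ == "__main__":
    info = RecordingInfo("speaker01", 3, 32, "normal", "office")
    print(storage_path(info))  # office/normal/5-50/speaker01_take03.wav
```

Because the path is derived from the recorded conditions, two files made under different conditions cannot end up in the same folder by accident, which is the degree of distinction the classification step aims for.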
<?xml version="1.0" encoding="utf-8"?>
<root>
<!--
Microsoft ResX Schema
Version 2.0
The primary goals of this format is to allow a simple XML format
that is mostly human readable. The generation and parsing of the
various data types are done through the TypeConverter classes
associated with the data types.
Example:
... ado.net/XML headers & schema ...
<resheader name="resmimetype">text/microsoft-resx</resheader>
<resheader name="version">2.0</resheader>
<resheader name="reader">System.Resources.ResXResourceReader, System.Windows.Forms, ...</resheader>
<resheader name="writer">System.Resources.ResXResourceWriter, System.Windows.Forms, ...</resheader>
<data name="Name1"><value>this is my long string</value><comment>this is a comment</comment></data>
<data name="Color1" type="System.Drawing.Color, System.Drawing">Blue</data>
<data name="Bitmap1" mimetype="application/x-microsoft.net.object.binary.base64">
<value>[base64 mime encoded serialized .NET Framework object]</value>
</data>
<data name="Icon1" type="System.Drawing.Icon, System.Drawing" mimetype="application/x-microsoft.net.object.bytearray.base64">
<value>[base64 mime encoded string representing a byte array form of the .NET Framework object]</value>
<comment>This is a comment</comment>
</data>
There are any number of "resheader" rows that contain simple
name/value pairs.
Each data row contains a name, and value. The row also contains a
type or mimetype. Type corresponds to a .NET class that support
text/value conversion through the TypeConverter architecture.
Classes that don't support this are serialized and stored with the
mimetype set.
The mimetype is used for serialized objects, and tells the
ResXResourceReader how to depersist the object. This is currently not
extensible. For a given mimetype the value must be set accordingly:
Note - application/x-microsoft.net.object.binary.base64 is the format
that the ResXResourceWriter will generate, however the reader can
read any of the formats listed below.
mimetype: application/x-microsoft.net.object.binary.base64
value : The object must be serialized with
: System.Serialization.Formatters.Binary.BinaryFormatter
: and then encoded with base64 encoding.
mimetype: application/x-microsoft.net.object.soap.base64
value : The object must be serialized with
: System.Runtime.Serialization.Formatters.Soap.SoapFormatter
: and then encoded with base64 encoding.
mimetype: application/x-microsoft.net.object.bytearray.base64
value : The object must be serialized into a byte array
: using a System.ComponentModel.TypeConverter
: and then encoded with base64 encoding.
-->
<xsd:schema id="root" xmlns="" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata">
<xsd:element name="root" msdata:IsDataSet="true">
<xsd:complexType>
<xsd:choice maxOccurs="unbounded">
<xsd:element name="metadata">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="value" type="xsd:string" minOccurs="0" />
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" />
<xsd:attribute name="type" type="xsd:string" />
<xsd:attribute name="mimetype" type="xsd:string" />
</xsd:complexType>
</xsd:element>
<xsd:element name="assembly">
<xsd:complexType>
<xsd:attribute name="alias" type="xsd:string" />
<xsd:attribute name="name" type="xsd:string" />
</xsd:complexType>
</xsd:element>
<xsd:element name="data">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="value" type="xsd:string" minOccurs="0" msdata:Ordinal="1" />
<xsd:element name="comment" type="xsd:string" minOccurs="0" msdata:Ordinal="2" />
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" msdata:Ordinal="1" />
<xsd:attribute name="type" type="xsd:string" msdata:Ordinal="3" />
<xsd:attribute name="mimetype" type="xsd:string" msdata:Ordinal="4" />
</xsd:complexType>
</xsd:element>
<xsd:element name="resheader">
<xsd:complexType>
<xsd:sequence>
<xsd:element name="value" type="xsd:string" minOccurs="0" msdata:Ordinal="1" />
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required" />
</xsd:complexType>
</xsd:element>
</xsd:choice>
</xsd:complexType>
</xsd:element>
</xsd:schema>
<resheader name="resmimetype">
<value>text/microsoft-resx</value>
</resheader>
<resheader name="version">
<value>2.0</value>
</resheader>
<resheader name="reader">
<value>System.Resources.ResXResourceReader, System.Windows.Forms, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
<resheader name="writer">
<value>System.Resources.ResXResourceWriter, System.Windows.Forms, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089</value>
</resheader>
</root>