Speech recognition: its main modules and its application in intelligent telephone robots

For humans, exchanging information through spoken language is very easy. For a machine, speech recognition is far from simple: the logic and technology involved are very complicated.

Speech recognition, also known as automatic speech recognition (ASR), is the technology by which a computer automatically converts human speech into the corresponding text and presents it to the user. The technology has about 50 years of history, but it has only begun to be widely applied in recent years. With the growing popularity of mobile devices, wearable devices, smart home devices, smartphones and robot systems, spoken dialogue has become a focus of human-computer interaction.

1, the components of a speech recognition system
A speech recognition system consists of the following basic modules: signal processing and feature extraction, the acoustic model (AM), the language model (LM), the pronunciation dictionary and the decoder.

Signal processing and feature extraction.
This is the first step of a speech recognition system. It receives the raw audio signal and extracts representative feature vectors that suit the later modeling stages. Good signal processing also helps keep the recognition rate as high as possible in relatively noisy environments.
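
As a rough illustration of this step, the sketch below cuts a signal into overlapping frames, applies a Hamming window and computes one log-energy value per frame. It is a simplification: real systems usually extract richer features such as MFCCs or filterbank energies. The 400-sample frame and 160-sample hop (25 ms and 10 ms at 16 kHz), the function names and the synthetic test tone are all assumptions made for the example.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames (a trailing partial frame is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def hamming(n):
    """Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]

def log_energy_features(samples, frame_len=400, hop=160):
    """One log-energy value per windowed frame (a 1-D stand-in for a feature vector)."""
    window = hamming(frame_len)
    feats = []
    for frame in frame_signal(samples, frame_len, hop):
        energy = sum((s * w) ** 2 for s, w in zip(frame, window))
        feats.append(math.log(energy + 1e-10))  # small floor avoids log(0) on silence
    return feats

# Synthetic input: one second of a 440 Hz tone at 16 kHz (no real recording is assumed)
sample_rate = 16000
signal = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
features = log_energy_features(signal)
print(len(features), "frames")
```

Each value in `features` corresponds to one short slice of audio, and this sequence of per-frame values is what the later modules work on.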

Acoustic model.
Any discussion of acoustic models has to mention the well-known hidden Markov model (HMM). A speech recognition system typically uses the acoustic model to model basic units such as words, syllables or phonemes, and builds a model for each of them. Simply put, the acoustic model models sound: it relates the acoustic features to the pronunciation units of the language.
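
To make the HMM idea a little more concrete, here is a minimal sketch of the forward algorithm, which computes how likely a sequence of observations is under a given HMM. The two-state model, its probabilities and the "lo"/"hi" observation symbols are invented for illustration. Real acoustic models score continuous feature vectors, typically with Gaussian mixtures or neural networks, rather than a lookup table.

```python
def forward_likelihood(obs, states, start_p, trans_p, emit_p):
    """P(obs | HMM), summed over all state paths (forward algorithm)."""
    alpha = {s: start_p[s] * emit_p[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit_p[s][o] * sum(alpha[r] * trans_p[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())

# Hypothetical two-state model of a single phoneme.
# "lo" and "hi" are made-up quantised acoustic symbols.
states = ["begin", "end"]
start_p = {"begin": 1.0, "end": 0.0}
trans_p = {"begin": {"begin": 0.6, "end": 0.4},
           "end":   {"begin": 0.0, "end": 1.0}}
emit_p = {"begin": {"lo": 0.8, "hi": 0.2},
          "end":   {"lo": 0.3, "hi": 0.7}}

likelihood = forward_likelihood(["lo", "lo", "hi"], states, start_p, trans_p, emit_p)
print(likelihood)
```

With one such model per phoneme, the system can compare how well each phoneme explains a stretch of audio.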

Language model.
The language model models the language that the system is expected to recognize. Many kinds of language models can be used, including context-free grammars. Today, most speech recognition systems use N-gram models and their variants. An N-gram model estimates the probability of a word sequence from the statistics, learned from training data, of how words follow one another.
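
The N-gram idea can be sketched with a bigram model (N = 2), where the probability of each word depends only on the word before it. The three-sentence corpus and the function names are made up for the example, and add-one smoothing is just one simple way to avoid zero probabilities; it is an assumption here, not something the systems above are known to use.

```python
from collections import Counter

def train_bigram(sentences):
    """Count unigrams and bigrams over sentences padded with <s> and </s>."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def sentence_probability(sentence, unigrams, bigrams):
    """P(sentence) under a bigram model with add-one (Laplace) smoothing."""
    vocab_size = len(unigrams) - 1  # every type except <s> can follow a word
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    prob = 1.0
    for prev, word in zip(tokens, tokens[1:]):
        prob *= (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)
    return prob

# Hypothetical toy training corpus
corpus = ["turn on the light", "turn off the light", "turn on the radio"]
unigrams, bigrams = train_bigram(corpus)

print(sentence_probability("turn on the light", unigrams, bigrams))
print(sentence_probability("light the on turn", unigrams, bigrams))
```

The sentence in natural word order gets a much higher probability than the scrambled one, which is exactly the signal a recognizer needs when choosing between candidate transcriptions.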

Pronunciation dictionary.
The pronunciation dictionary contains the set of words the system can handle, together with their pronunciations. It provides the mapping between the modeling units of the acoustic model and the modeling units of the language model. By connecting the two, it forms the search state space in which the decoder does its work.
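
In its simplest form a pronunciation dictionary is a lookup table from words to phoneme sequences. The sketch below uses a handful of invented entries whose phoneme symbols loosely follow ARPAbet without stress marks. The table contents and the helper names are assumptions for illustration only.

```python
# Hypothetical lexicon: word -> list of pronunciations (each a list of phonemes)
LEXICON = {
    "speech": [["S", "P", "IY", "CH"]],
    "the": [["DH", "AH"], ["DH", "IY"]],      # a word may have several pronunciations
    "read": [["R", "IY", "D"], ["R", "EH", "D"]],
    "red": [["R", "EH", "D"]],
}

def pronunciations(word):
    """Return all phoneme sequences listed for a word (empty list if unknown)."""
    return LEXICON.get(word.lower(), [])

def expand(sentence):
    """Expand a word sequence into every phoneme sequence the lexicon allows."""
    sequences = [[]]
    for word in sentence.split():
        variants = pronunciations(word)
        if not variants:
            raise KeyError(f"out-of-vocabulary word: {word}")
        sequences = [seq + v for seq in sequences for v in variants]
    return sequences

for seq in expand("the speech"):
    print(" ".join(seq))
```

Words live on the language model side and phonemes on the acoustic model side, so a table like this is the bridge between the two. Note that "read" and "red" can share a pronunciation, which is one reason the language model is needed to choose between them.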

Decoder.
The decoder is one of the core parts of a speech recognition system, and whether a telephone robot is really easy to use depends largely on it. Its main task is to read the feature sequence of the input speech signal and, using the acoustic model, the language model and the pronunciation dictionary, output the word string with the highest probability.
Decoding is easiest to understand relative to encoding. Signal processing and feature extraction are the encoding process, which turns the original speech into speech vectors. Decoding then works on those vectors, and it requires the acoustic and language models described above.
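
A common way to find the highest-probability path through an HMM is the Viterbi algorithm, sketched below in the log domain. The two-state model and its numbers are invented for the example. A real decoder searches a far larger graph built from the acoustic model, the pronunciation dictionary and the language model, and usually prunes unlikely paths as it goes.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the single most likely path."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[s] = (score of the best path ending in s, that path)
    best = {s: (log(start_p[s]) + log(emit_p[s][obs[0]]), [s]) for s in states}
    for o in obs[1:]:
        new_best = {}
        for s in states:
            score, path = max(
                ((best[r][0] + log(trans_p[r][s]), best[r][1]) for r in states),
                key=lambda item: item[0],
            )
            new_best[s] = (score + log(emit_p[s][o]), path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])

# Hypothetical two-state model; "lo" and "hi" are made-up acoustic symbols
states = ["begin", "end"]
start_p = {"begin": 1.0, "end": 0.0}
trans_p = {"begin": {"begin": 0.6, "end": 0.4},
           "end":   {"begin": 0.0, "end": 1.0}}
emit_p = {"begin": {"lo": 0.8, "hi": 0.2},
          "end":   {"lo": 0.3, "hi": 0.7}}

log_prob, path = viterbi(["lo", "lo", "hi"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

Working with log-probabilities turns long products into sums, which avoids numerical underflow on long utterances.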

2, how a machine recognizes speech
Compared with computer vision, speech recognition is a purer task, because it has only one core job: turn human speech into data the machine can process, convert that data into text, and present it. In simple terms, the sound is cut into frames, the frames are recognized as phonemes, and the phonemes are combined into words, so that speech is converted into text.
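
The frames-to-phonemes-to-words chain can be sketched in a few lines. The per-frame labels and the tiny word table below are invented, and merging runs of identical labels is a deliberate simplification: it cannot tell a long phoneme from the same phoneme spoken twice in a row, which real systems handle inside the decoder.

```python
from itertools import groupby

# Hypothetical per-frame phoneme labels, as an acoustic model might emit them
frame_labels = ["sil", "HH", "HH", "HH", "AY", "AY", "AY", "AY", "sil", "sil"]

# Hypothetical reverse lexicon: phoneme sequence -> word
WORDS = {("HH", "AY"): "hi", ("Y", "EH", "S"): "yes"}

def collapse(labels, silence="sil"):
    """Merge runs of identical frame labels and drop silence."""
    return [label for label, _ in groupby(labels) if label != silence]

def to_word(phonemes):
    """Look up a phoneme sequence; returns None if it is not in the table."""
    return WORDS.get(tuple(phonemes))

phonemes = collapse(frame_labels)
print(phonemes, "->", to_word(phonemes))
```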

3, application and development of speech recognition
As the technology described above has matured, it has given rise to the telephone robot, an artificial intelligence product that has emerged in recent years. Take the "Europe can" intelligent robot as an example: it mainly places bulk calls to potential clients, communicates with them and filters the information, which helps companies pick out the customers who show real intent. Companies that use telephone robots can greatly reduce labor costs and improve efficiency, and many companies stand to benefit in their development.

Taken as a whole, speech recognition technology still needs improvement in many areas. Dialect recognition and recognition in high-noise environments, for example, still have some way to go. But it is undeniable that, with the continuous development of information technology, speech recognition will keep making breakthroughs and has broad room for development.

Origin blog.51cto.com/14387331/2411431