Technology liberates productivity: voice-to-text

Speech transcription, as the name suggests, means converting speech into text.

In practice, this can mean turning a speech into a manuscript, dictating messages in an IM tool, and so on.

The emergence of this technology has greatly liberated productivity, making both work and communication more efficient.

Let's look at one example of the convenience speech-to-text brings.

For journalists, turning a press conference into a written manuscript is always the first task afterwards.

In the past, a reporter had to record the audio on the spot, return to the office, play it back sentence by sentence, transcribe it entirely by hand, and only then write up the report.

Now the reporter can call a speech-to-text service on the spot and convert the speech into text in real time. Back in the office, the report can be written directly from that manuscript.

Alternatively, the reporter can still record the event with a voice recorder and, on the way back to the office, use software to convert the audio file into a text manuscript. Either way, once back in the office, all that remains is to write the report from the manuscript.

With that case in mind, it is time for today's protagonist to make its debut:

IBM Watson!

You might think it is just a simple speech-to-text service (an API), but it is in fact a full-fledged cognitive computing system!

Let's go back to 2011, when a report like this appeared:

"In 2011, Watson appeared on 'Jeopardy!', America's most popular TV quiz show, and beat the human quiz champions in one fell swoop. Today, Watson has grown into a commercial, cloud-based cognitive system that is applied in all walks of life and is gradually making our lives better."

On top of this strong foundation, the piece we will use today is its speech recognition service, Speech to Text.

First, as a general user, your needs are probably the basic ones from the case above. For that, the IBM team provides a free web-based speech-to-text demo that you can use right away:

https://speech-to-text-demo.mybluemix.net/ (If the page does not open for you, you may need a VPN to reach it.)

The first thing you will see is that there are two ways to input audio: recording live through the device's microphone, or uploading an audio file.

Note that uploaded files must be in .wav, .flac, or .opus format. As an aside, we recommend transcoding locally recorded audio to Opus, because Opus preserves sound quality better at low bit rates. That means you can compress an audio file to a smaller size without losing much sound quality, and without reducing IBM Watson's recognition ability.
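
As a rough illustration of the transcoding step, the sketch below builds an ffmpeg command that re-encodes a recording to Opus and estimates the resulting file size from the bit rate. It assumes ffmpeg with the libopus encoder is installed. The 24 kbit/s bit rate, the file names, and the helper names are illustrative choices, not values recommended by IBM.

```python
import subprocess
from pathlib import Path
from typing import List


def opus_command(src: str, bitrate_kbps: int = 24) -> List[str]:
    """Build an ffmpeg command that transcodes `src` to an .opus file beside it."""
    dst = str(Path(src).with_suffix(".opus"))
    return ["ffmpeg", "-i", src, "-c:a", "libopus", "-b:a", f"{bitrate_kbps}k", dst]


def estimated_size_bytes(duration_s: float, bitrate_kbps: int) -> int:
    """Approximate audio size: bit rate x duration / 8 (container overhead ignored)."""
    return int(duration_s * bitrate_kbps * 1000 / 8)


def transcode(src: str, bitrate_kbps: int = 24) -> None:
    """Run the transcode; requires ffmpeg on PATH."""
    subprocess.run(opus_command(src, bitrate_kbps), check=True)


if __name__ == "__main__":
    # Hypothetical file name, used only to show the generated command.
    print(" ".join(opus_command("press_conference.wav")))
    # Size estimate in bytes for a one-hour recording at 24 kbit/s.
    print(estimated_size_bytes(3600, 24))
```

By this estimate, an hour of audio at 24 kbit/s comes to roughly 10.8 MB, which is far smaller than the same recording as uncompressed .wav.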

Second, you may have noticed in the screenshot a drop-down menu for the recognition language, currently set to English. Besides English, let's see which other languages it supports.

The list is clear at a glance, and the highlighted entry is our native language, Mandarin.

Now we can try the two input methods just mentioned. The first is to upload a recorded audio file.

The second is to record speech and have it converted into text in real time.

That is what the simple speech-to-text demo looks like. As a developer, of course, you will not be satisfied with just the demo. You can register for the Bluemix service for free; Speech to Text is included in it. With its powerful API and complete documentation, you can easily build Speech to Text into your own application scenarios.
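
To give a feel for what calling the service from your own code might look like, here is a minimal sketch using only the Python standard library. It prepares a POST to the service's `/v1/recognize` endpoint and extracts the transcript from a JSON response. Several things here are assumptions to check against the official documentation: the service URL and API key are placeholders, authentication is assumed to be HTTP basic with the user name `apikey` (older Bluemix credentials were a user name and password pair), and the model name `zh-CN_BroadbandModel` and the response layout are written from memory of the service's conventions.

```python
import base64
import json
import urllib.parse
import urllib.request
from pathlib import Path

# Content types assumed for the three formats the demo accepts.
CONTENT_TYPES = {
    ".wav": "audio/wav",
    ".flac": "audio/flac",
    ".opus": "audio/ogg;codecs=opus",
}


def build_request(service_url, api_key, filename, audio, model="zh-CN_BroadbandModel"):
    """Build (but do not send) a POST request to the recognize endpoint."""
    suffix = Path(filename).suffix.lower()
    if suffix not in CONTENT_TYPES:
        raise ValueError(f"unsupported audio format: {suffix}")
    query = urllib.parse.urlencode({"model": model})
    url = f"{service_url.rstrip('/')}/v1/recognize?{query}"
    token = base64.b64encode(f"apikey:{api_key}".encode()).decode()
    headers = {"Content-Type": CONTENT_TYPES[suffix], "Authorization": f"Basic {token}"}
    return urllib.request.Request(url, data=audio, method="POST", headers=headers)


def extract_transcript(response_body):
    """Join the top alternative of every result into one manuscript string."""
    payload = json.loads(response_body)
    return "".join(r["alternatives"][0]["transcript"] for r in payload.get("results", []))


def transcribe(service_url, api_key, path):
    """Send the audio file at `path` and return the recognized text (needs network access)."""
    request = build_request(service_url, api_key, path, Path(path).read_bytes())
    with urllib.request.urlopen(request) as response:
        return extract_transcript(response.read().decode("utf-8"))
```

With real credentials, a call such as `transcribe(service_url, api_key, "press_conference.opus")` would return the text manuscript for the reporter in the example above. Building the request separately from sending it keeps the sketch easy to inspect and test without a network connection.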

Let's look forward to the future: more convenient and powerful services keep appearing, and today's dreams become tomorrow's everyday life.
