Are there any open-source Python projects for Chinese speech-to-text?

Speech recognition has matured steadily and is now a core part of many smart applications, such as smart-home devices and voice assistants. Chinese remains one of the more challenging languages to recognize. To help programmers get started, here are ten options for Chinese speech-to-text that can be used from Python. Some are open-source engines that run offline; others are commercial cloud services that publish a Python SDK.

vosk

vosk is a lightweight speech recognition library that supports multiple languages, including Chinese. It is built on deep learning models and transcribes speech quickly. Its main advantages are speed, accuracy, and the ability to run fully offline once a language model has been downloaded. GitHub link: https://github.com/alphacep/vosk-api
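As a rough illustration of offline use, the sketch below transcribes a WAV file with the vosk Python package. It assumes vosk is installed with pip and that a Chinese model has been downloaded and unpacked locally. The helper names and the file paths in the closing comment are hypothetical and are not part of vosk.

```python
import json
import wave


def text_from_result(result_json: str) -> str:
    """Pull the recognized text out of one Vosk JSON result string."""
    return json.loads(result_json).get("text", "")


def transcribe_wav(wav_path: str, model_dir: str) -> str:
    """Transcribe a 16-bit mono PCM WAV file with a local Vosk model."""
    # Imported lazily so text_from_result works without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)
    pieces = []
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono PCM WAV file")
        rec = KaldiRecognizer(model, wf.getframerate())
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance has ended.
            if rec.AcceptWaveform(data):
                pieces.append(text_from_result(rec.Result()))
        # Flush whatever audio is still buffered in the recognizer.
        pieces.append(text_from_result(rec.FinalResult()))
    return " ".join(p for p in pieces if p)


# Example call (hypothetical paths, adjust to your own files):
# print(transcribe_wav("test.wav", "vosk-model-small-cn-0.22"))
```

Audio in other formats would need to be converted to 16-bit mono PCM first, for example with ffmpeg.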

Kaldi-python

Kaldi-python is a set of Python wrappers for Kaldi, a widely used speech recognition engine known for high accuracy. Kaldi itself is language-independent, so Chinese support depends on the models you train or download. With Kaldi-python, Kaldi functionality can be used directly from Python. GitHub link: https://github.com/janchorowski/kaldi-python

PocketSphinx

PocketSphinx is an open-source speech recognition engine from the CMU Sphinx project. It supports multiple languages, including Chinese, and is light enough to run in resource-constrained environments such as mobile devices. GitHub link: https://github.com/cmusphinx/pocketsphinx

py-kaldi-asr

py-kaldi-asr is another Kaldi-based Python speech recognition toolkit; as with Kaldi-python, Chinese support depends on the model you load. Compared with Kaldi-python, py-kaldi-asr offers a higher-level API and is described as supporting features such as multi-threaded recognition. GitHub link: https://github.com/gooofy/py-kaldi-asr

AssemblyAI

AssemblyAI is a hosted speech recognition API that uses deep learning and supports multiple languages, including Chinese. The recognition engine is a commercial cloud service, so audio is uploaded for processing; only the Python SDK is open source. GitHub link: https://github.com/AssemblyAI/assemblyai-python-sdk

Google Cloud Speech-to-Text

Google Cloud Speech-to-Text is the speech recognition API of the Google Cloud platform. It supports multiple languages, including Chinese, and uses Google's own recognition engine, which achieves high accuracy. It is a paid cloud service with an open-source Python client library. GitHub link: https://github.com/googleapis/python-speech
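A minimal sketch of a synchronous request with the google-cloud-speech client might look like the following. It assumes the package is installed, Google Cloud credentials are configured in the environment, and the audio is a short 16 kHz, 16-bit mono PCM recording (synchronous requests are only meant for short clips). The function names are illustrative, and "cmn-Hans-CN" is used here as the language code for Mandarin.

```python
def join_best_alternatives(results) -> str:
    """Concatenate the top-ranked transcript of each result in a response."""
    return "".join(r.alternatives[0].transcript for r in results if r.alternatives)


def transcribe_file(path: str,
                    language_code: str = "cmn-Hans-CN",
                    sample_rate_hz: int = 16000) -> str:
    """Send one short LINEAR16 audio file to Cloud Speech-to-Text."""
    # Imported lazily so the helper above works without the client installed.
    from google.cloud import speech

    client = speech.SpeechClient()
    with open(path, "rb") as f:
        audio = speech.RecognitionAudio(content=f.read())
    config = speech.RecognitionConfig(
        encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
        sample_rate_hertz=sample_rate_hz,
        language_code=language_code,
    )
    response = client.recognize(config=config, audio=audio)
    return join_best_alternatives(response.results)
```

Each result covers one consecutive portion of the audio, so joining the first alternative of every result gives the full transcript.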

Baidu AI Open Platform

The Baidu AI Open Platform provides a speech recognition API that supports multiple languages, including Chinese. It runs on Baidu's own recognition engine, which achieves high accuracy. The platform also offers offline and real-time recognition, although the Python SDK linked here calls the online service and therefore needs network access. GitHub link: https://github.com/Baidu-AIP/python-sdk
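The following sketch shows one way to call the Baidu service through the baidu-aip package. It assumes you have registered an application on the platform to obtain an app ID, API key and secret key, and that the audio is raw 16 kHz, 16-bit mono PCM. The function names are hypothetical; dev_pid 1537 is the identifier used for the Mandarin model.

```python
def text_from_response(resp: dict) -> str:
    """Return the recognized text, or raise if the service reported an error."""
    if resp.get("err_no", 0) != 0:
        raise RuntimeError(
            f"Baidu ASR error {resp.get('err_no')}: {resp.get('err_msg')}"
        )
    # "result" is a list of candidate strings.
    return "".join(resp.get("result", []))


def transcribe_pcm(pcm_path: str, app_id: str, api_key: str, secret_key: str) -> str:
    """Send a raw PCM file to Baidu's short-speech recognition API."""
    # Imported lazily so text_from_response works without baidu-aip installed.
    from aip import AipSpeech

    client = AipSpeech(app_id, api_key, secret_key)
    with open(pcm_path, "rb") as f:
        data = f.read()
    # Arguments: audio bytes, format, sample rate, options (1537 = Mandarin).
    resp = client.asr(data, "pcm", 16000, {"dev_pid": 1537})
    return text_from_response(resp)
```

Checking err_no before reading the result keeps authentication or audio-format problems from being silently treated as empty transcripts.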

iFLYTEK

iFLYTEK offers a speech recognition API that supports multiple languages, including Chinese. It uses deep learning and achieves high accuracy, and it supports both offline and real-time recognition. GitHub link: https://github.com/iFLYTEK-Speech/python_sdk

DeepSpeech

DeepSpeech is Mozilla's open-source speech recognition toolkit. It uses deep learning and achieves high accuracy. Its advantage is that it runs offline, and its releases include a pre-trained Chinese (Mandarin) model alongside the English one. GitHub link: https://github.com/mozilla/DeepSpeech
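Offline transcription with the deepspeech Python package can be sketched roughly as below. This assumes deepspeech and numpy are installed and that a Chinese acoustic model file (and optionally a scorer file) has been downloaded from the project's releases. The helper names and any file paths you pass in are illustrative.

```python
import wave


def read_pcm16(wav_path: str, expected_rate: int = 16000) -> bytes:
    """Return the raw 16-bit mono PCM samples of a WAV file, checking its format."""
    with wave.open(wav_path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected a 16-bit mono PCM WAV file")
        if wf.getframerate() != expected_rate:
            raise ValueError(
                f"expected {expected_rate} Hz audio, got {wf.getframerate()} Hz"
            )
        return wf.readframes(wf.getnframes())


def transcribe(wav_path: str, model_path: str, scorer_path: str = None) -> str:
    """Run DeepSpeech on one WAV file and return the transcript."""
    # Imported lazily so read_pcm16 works without deepspeech or numpy installed.
    import numpy as np
    from deepspeech import Model

    ds = Model(model_path)
    if scorer_path:
        # The external scorer (language model) is optional but usually helps.
        ds.enableExternalScorer(scorer_path)
    # The model reports the sample rate it was trained on.
    pcm = read_pcm16(wav_path, ds.sampleRate())
    return ds.stt(np.frombuffer(pcm, dtype=np.int16))
```

The sample-rate check matters because the model expects audio at the rate it was trained on; resample beforehand if your recordings differ.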

vosk-api-python

vosk-api-python is the Python binding of the vosk project, kept in the python directory of the vosk-api repository and installable with pip install vosk. It exposes vosk's recognizer to Python code, giving the same fast, offline recognition, and is described as supporting features such as multi-threaded recognition. GitHub link: https://github.com/alphacep/vosk-api/tree/master/python

Origin blog.csdn.net/devid008/article/details/129656356