AI > List of Open Source Speech Recognition Projects

| Name | Developer | Use cases | Advantages and disadvantages | Technical features | Share (estimated) | Description |
| --- | --- | --- | --- | --- | --- | --- |
| CMU Sphinx | Carnegie Mellon University | Embedded devices, server applications | Pros: runs on both embedded devices and servers. Cons: relatively low accuracy and a limited scope of application. | Supports multiple language models and tools. Suitable for embedded devices and server applications. | Medium | CMU Sphinx is an open source speech recognition system for embedded devices and server applications. It provides a variety of language models and tools, but its accuracy is relatively low and its scope of application is limited. |
| DeepSpeech | Mozilla | Multi-platform applications, speech-to-text conversion | Pros: supports multiple platforms. Cons: training is slow and the model is large. | Based on deep learning. Supports multiple platforms. | Low | DeepSpeech is an open source speech recognition engine developed by Mozilla. It is based on deep learning and supports multi-platform applications. Because training is slow and the model is large, it can demand substantial computing resources and time. |
| Kaldi | Kaldi team | Academia and industry, large-scale speech recognition | Pros: powerful speech recognition toolkit. Cons: steep learning curve. | Powerful speech recognition toolkit. | Medium | Kaldi is a powerful speech recognition toolkit widely used in academia and industry, providing a variety of modern speech recognition algorithms. Because of its complexity, it takes some time to learn. |
| OpenSeq2Seq | NVIDIA | End-to-end speech recognition, large-scale speech recognition | Pros: supports end-to-end speech recognition. Cons: requires substantial computing resources. | End-to-end speech recognition system based on TensorFlow. Supports large-scale speech recognition. | Low | OpenSeq2Seq is an open source project developed by NVIDIA that supports end-to-end speech recognition for large-scale tasks. Because end-to-end systems usually need substantial computing resources, it may not suit resource-constrained devices. |
| Julius | Not specified | Fast, real-time, large-vocabulary continuous speech recognition | Pros: fast and real-time, suited to large-vocabulary recognition. Cons: developer not specified. | Fast, real-time continuous speech recognition with a large vocabulary. | Low | Julius is a fast, real-time, large-vocabulary continuous speech recognition engine for multiple languages, aimed at scenarios that need real-time recognition over a large vocabulary. Its developer was not specified. |
| Pocketsphinx.js | Carnegie Mellon University | Speech recognition running in the browser | Pros: works in the browser. Cons: relatively low accuracy. | Speech recognition that runs in the browser. | Low | Pocketsphinx.js is a JavaScript port of CMU Sphinx that runs speech recognition in the browser. Its accuracy may be relatively low. |
| Vosk | Not specified | Offline speech recognition | Pros: supports offline speech recognition. Cons: developer not specified. | Supports offline speech recognition. | Unknown | Vosk is an open source toolkit for offline speech recognition that supports multiple languages and platforms. Its developer was not specified. |

Note that this information may change over time, so check each project's official website or developer community for the latest details before adopting it. The "share" values in the list are rough estimates based on currently available information, not precise market share data.
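Several entries in the list describe accuracy only as "relatively low". One common way to compare engines on your own recordings is word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the engine's output into the reference transcript, divided by the number of reference words. The sketch below is a minimal, standard-library-only illustration; the function name `word_error_rate` and the sample transcripts are made up for this example and do not come from any of the listed projects.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "turn on the kitchen light"
    # Hypothetical engine outputs, invented purely for illustration.
    outputs = {
        "engine_a": "turn on the kitchen light",
        "engine_b": "turn on kitchen lights",
    }
    for name, text in outputs.items():
        print(f"{name}: WER = {word_error_rate(reference, text):.2f}")
```

A lower WER means fewer word-level mistakes. In practice you would feed the same set of recordings to each candidate engine and average the WER over all reference transcripts, ideally after normalizing punctuation and casing consistently.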

Douyin: dilo_Abel

Bilibili: dilo_Abel's personal space


Origin blog.csdn.net/DL_62532/article/details/131892217