Name | Developer | Use cases | Pros and cons | Technical features | Share (estimated) | Description |
---|---|---|---|---|---|---|
CMU Sphinx | Carnegie Mellon University | Embedded devices, server applications | Pros: Runs on both embedded devices and servers. Cons: Relatively low accuracy and a limited range of applications. | - Supports multiple language models and tools. - Suitable for embedded devices and server applications. | Medium | CMU Sphinx is an open source speech recognition system for embedded devices and server applications. It provides a variety of language models and tools, but its accuracy is relatively low and its range of applications is limited. |
DeepSpeech | Mozilla | Multi-platform applications, speech-to-text conversion, speech recognition | Pros: Supports multiple platforms. Cons: Training is slow and the models are large. | - Based on deep learning. - Supports multiple platforms. | Low | DeepSpeech is an open source speech recognition engine developed by Mozilla. It is based on deep learning and supports multi-platform applications. Because training the deep learning model is slow and the model is large, it may require substantial computing resources and time. |
Kaldi | Kaldi team | Academia and industry, large-scale speech recognition | Pros: Powerful speech recognition toolkit. Cons: Steep learning curve. | - Powerful speech recognition toolkit. - Provides a variety of modern speech recognition algorithms. | Medium | Kaldi is a powerful speech recognition toolkit widely used in academia and industry, providing a variety of modern speech recognition algorithms. Because of its complexity, it takes some time to learn. |
OpenSeq2Seq | NVIDIA | End-to-end speech recognition, large-scale speech recognition | Pros: Supports end-to-end speech recognition. Cons: Requires substantial computing resources. | - End-to-end speech recognition system based on TensorFlow. - Supports large-scale speech recognition. | Low | OpenSeq2Seq is an open source project developed by NVIDIA that supports end-to-end speech recognition for large-scale tasks. Because end-to-end systems usually require substantial computing resources, it may not be suitable for resource-constrained devices. |
Julius | Not specified | Fast, real-time, large-vocabulary continuous speech recognition | Pros: Fast and real-time, suitable for large-vocabulary recognition. Cons: Developer not identified in this comparison. | - Fast, real-time continuous speech recognition with a large vocabulary. | Low | Julius is a fast, real-time, large-vocabulary continuous speech recognition engine that supports multiple languages. It is best suited to scenarios that need both real-time response and a large vocabulary. Its developer is not identified in this comparison. |
Pocketsphinx.js | Carnegie Mellon University | Speech recognition running in the browser | Pros: Works in the browser. Cons: Relatively low accuracy. | - Speech recognition that runs in the browser. | Low | Pocketsphinx.js is a JavaScript port of CMU Sphinx that runs speech recognition directly in the browser, although its accuracy may be relatively low. |
Vosk | Not specified | Offline speech recognition | Pros: Supports offline speech recognition. Cons: Developer not identified in this comparison. | - Supports offline speech recognition. - Supports multiple languages and platforms. | Unknown | Vosk is an open source toolkit for offline speech recognition that supports multiple languages and platforms. Its developer is not identified in this comparison. |
Note that this information may change over time. Check each project's official website or developer community for the latest details before adopting it. The "share" column is a rough estimate based on the information currently available, not precise market share data.
Douyin: dilo_Abel