Jinglianwen Technology can provide data collection support for multi-modal speech translation models

On August 22, 2023, Meta Platforms, the parent company of Facebook, released SeamlessM4T, an artificial intelligence model that can translate and transcribe dozens of languages. It can give users more convenient translation and transcription services in daily life and business communication.

Compared with traditional text translation, the biggest difference is that this technology performs end-to-end speech translation: it translates speech in one language directly into another language, so people can communicate without a complicated intermediate conversion step.

SeamlessM4T supports:

1. Speech recognition in nearly 100 languages.

2. Speech-to-text translation for nearly 100 input and output languages.

3. Speech-to-speech translation for nearly 100 input languages and 36 output languages.

4. Text-to-text translation for nearly 100 languages.

5. Text-to-speech translation for nearly 100 input languages and 35 output languages.

The speech-to-speech translation supported by SeamlessM4T requires a large amount of high-quality end-to-end data. Relying solely on manual transcription and translation of speech makes it difficult to meet the demand for speech translation in nearly 100 languages, because building a speech translation dataset is complex and costly. Once authorized audio has been obtained, it must be transcribed and translated; the audio, transcription, and translation must then be segmented, and finally aligned and filtered to obtain valid data.
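The segment, align, and filter steps described above can be illustrated with a minimal Python sketch. It is not the pipeline used by Meta or by Jinglianwen Technology: the `Segment` class, the function names, and every threshold (duration bounds, character-length ratio) are hypothetical choices made only for illustration, and it assumes the audio spans, transcripts, and translations have already been segmented into parallel lists.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float      # segment start time in the audio, in seconds
    end: float        # segment end time in the audio, in seconds
    transcript: str   # source-language transcription
    translation: str  # target-language translation


def align(spans, transcripts, translations):
    """Pair the i-th audio span with the i-th transcript and translation."""
    if not (len(spans) == len(transcripts) == len(translations)):
        raise ValueError("segment counts differ; re-segment before aligning")
    return [
        Segment(start, end, src.strip(), tgt.strip())
        for (start, end), src, tgt in zip(spans, transcripts, translations)
    ]


def is_valid(seg, min_dur=0.5, max_dur=30.0, max_len_ratio=3.0):
    """Keep segments with a plausible duration, non-empty text, and similar lengths.

    The thresholds are illustrative assumptions. A character-length ratio is a
    crude check and would need tuning for language pairs with different scripts.
    """
    duration = seg.end - seg.start
    if not (min_dur <= duration <= max_dur):
        return False
    if not seg.transcript or not seg.translation:
        return False
    ratio = len(seg.transcript) / len(seg.translation)
    return 1 / max_len_ratio <= ratio <= max_len_ratio


def build_dataset(spans, transcripts, translations):
    """Align the three parallel lists, then drop segments that fail the filter."""
    return [s for s in align(spans, transcripts, translations) if is_valid(s)]


# Made-up example data: three segments from one English recording with German text.
spans = [(0.0, 2.4), (2.4, 2.5), (2.5, 6.0)]
transcripts = ["Hello, how are you?", "Um", "The meeting starts at nine."]
translations = ["Hallo, wie geht es dir?", "", "Das Meeting beginnt um neun Uhr."]

dataset = build_dataset(spans, transcripts, translations)
print(len(dataset))  # the very short, untranslated filler segment is dropped
```

A production pipeline would typically add further checks, such as language identification and audio quality screening, which is part of why building such datasets is costly.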

Jinglianwen Technology has extensive experience in voice data collection and annotation projects. It has built its own professional voice collection and recording studio, which can closely reproduce real-world scenes. It has nearly 10,000 data collection personnel in more than 30 provinces and cities across the country, as well as collection channels that support voice collection in multiple languages and dialects. Its in-house data management platform closes the data loop and can carry out data distribution, cleaning, annotation, and quality inspection in an orderly manner, delivering high-quality training data, improving the efficiency of enterprise AI data training, and accelerating the implementation and iteration cycle of artificial intelligence applications.

Jinglianwen Technology | Data Collection | Data Annotation

Promote artificial intelligence technology and empower the intelligent transformation and upgrading of traditional industries

The copyright of the article's graphics and text belongs to Jinglianwen Technology. For commercial reprinting, please contact Jinglianwen Technology for authorization. For non-commercial reprinting, please indicate the source.

Origin blog.csdn.net/weixin_55551028/article/details/132759907