Speech Recognition with the iFlytek API and an SWT Client

This post uses the iFlytek (科大讯飞) API for speech recognition and Java SWT to wrap it in a desktop client.

The iFlytek API

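The speech recognition API comes with a 5-hour free trial. Many vendors, such as Baidu and Alibaba, have opened up speech recognition APIs; this post uses iFlytek's. You could also train your own model to do speech recognition, although the recognition accuracy would probably not be very high. The links below describe how such a system works; I may look into it when I have time.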

https://blog.ailemon.me/2018/08/29/asrt-a-chinese-speech-recognition-system/

https://github.com/nl8590687/ASRT_SpeechRecognition

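The acoustic model uses a convolutional neural network (CNN) with connectionist temporal classification (CTC), trained on a large Chinese speech dataset, to transcribe audio into Chinese pinyin; a language model then converts the pinyin sequence into Chinese text.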
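To make the CTC step more concrete, here is a minimal sketch of greedy CTC decoding: take the best label for each audio frame, merge consecutive repeats, then drop the blank symbol. This is only the simplest decoding strategy and is not taken from the ASRT project; the class name, the blank index 0, and the tiny pinyin vocabulary are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class CtcGreedyDecoder {
    /** Index reserved for the CTC blank symbol (an assumption of this sketch). */
    public static final int BLANK = 0;

    /**
     * Collapses per-frame best labels into an output sequence:
     * consecutive repeats are merged, then blanks are dropped.
     */
    public static List<String> decode(int[] framePredictions, String[] vocabulary) {
        List<String> output = new ArrayList<>();
        int previous = BLANK;
        for (int label : framePredictions) {
            // Emit a label only when it is not blank and differs from the previous frame.
            if (label != BLANK && label != previous) {
                output.add(vocabulary[label]);
            }
            previous = label;
        }
        return output;
    }

    public static void main(String[] args) {
        // Hypothetical vocabulary: index 0 is the blank, the rest are pinyin syllables.
        String[] vocabulary = {"<blank>", "ni3", "hao3"};
        int[] frames = {0, 1, 1, 0, 0, 2, 2, 2, 0};
        System.out.println(decode(frames, vocabulary)); // [ni3, hao3]
    }
}
```

A blank between two identical labels keeps them separate, which is how CTC can represent a repeated syllable.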

Log in to the iFlytek site: https://www.xfyun.cn/services/lfasr

   

Download the Java SDK

   

Create a new application and obtain its appId and secret

       

Configure the appId and secret in the SDK

# APP ID
app_id=
# secret key
secret_key=
# we support both the http and https protocols
lfasr_host=http://raasr.xfyun.cn/api
# file piece size
file_piece_size=10485760
# store path: this is not the store path for the result json file, but the path for the file piece during upload
store_path=F://
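Since app_id and secret_key ship empty, it can help to check the settings before starting a conversion. The following is a rough sketch that parses text in the format shown above with java.util.Properties and fails fast when a required key is blank. The class name, the choice of required keys, and the sample values are assumptions for illustration; how the SDK itself locates and reads its configuration is not covered here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class LfasrConfigCheck {
    /** Keys this sketch treats as mandatory (an assumption, not an SDK rule). */
    private static final String[] REQUIRED = {"app_id", "secret_key", "lfasr_host"};

    /** Parses properties text and throws if a required key is missing or blank. */
    public static Properties loadAndValidate(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                throw new IllegalStateException("Missing required setting: " + key);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Placeholder credentials; real values come from the iFlytek console.
        String sample = "app_id=demoApp\n"
                + "secret_key=demoSecret\n"
                + "lfasr_host=http://raasr.xfyun.cn/api\n"
                + "file_piece_size=10485760\n";
        Properties props = loadAndValidate(sample);
        System.out.println(props.getProperty("file_piece_size"));
    }
}
```

Note that backslashes are escape characters in the properties format, so a Windows store_path is easier to write with forward slashes, as in the sample above.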

The demo includes a test case.

The usage flow is as follows:

Initialize the LFASRClient instance

        // Initialize the LFASRClient instance
        LfasrClientImp lc = null;
        try {
            lc = LfasrClientImp.initLfasrClient();
        } catch (LfasrException e) {
            // Initialization failed: parse the error description from the exception
            Message initMsg = JSON.parseObject(e.getMessage(), Message.class);
            System.out.println("ecode=" + initMsg.getErr_no());
            System.out.println("failed=" + initMsg.getFailed());
        }

Upload the audio file

            // Upload the audio file
            Message uploadMsg = lc.lfasrUpload(local_file, type, params);

            // Check the return value
            int ok = uploadMsg.getOk();
            if (ok == 0) {
                // Task created successfully
                task_id = uploadMsg.getData();
            }

Poll for the task result:

        // Poll until the audio has been processed
        while (true) {
            try {
                // Wait sleepSecond seconds (20 s here) before querying task progress
                Thread.sleep(sleepSecond * 1000);
                System.out.println("waiting ...");
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            try {
                // Query processing progress
                Message progressMsg = lc.lfasrGetProgress(task_id);

                // A non-zero status means the task failed
                if (progressMsg.getOk() != 0) {
                    System.out.println("task was fail. task_id:" + task_id);
                    System.out.println("ecode=" + progressMsg.getErr_no());
                    System.out.println("failed=" + progressMsg.getFailed());

                    return;
                } else {
                    ProgressStatus progressStatus = JSON.parseObject(progressMsg.getData(), ProgressStatus.class);
                    if (progressStatus.getStatus() == 9) {
                        // Processing finished
                        System.out.println("task was completed. task_id:" + task_id);
                        break;
                    } else {
                        // Processing not finished yet
                        System.out.println("task is incomplete. task_id:" + task_id + ", status:" + progressStatus.getDesc());
                        continue;
                    }
                }
            } catch (LfasrException e) {
                // Failed to query progress: investigate using the returned error info, then query again
                Message progressMsg = JSON.parseObject(e.getMessage(), Message.class);
                System.out.println("ecode=" + progressMsg.getErr_no());
                System.out.println("failed=" + progressMsg.getFailed());
            }
        }
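The loop above has no upper bound: it keeps polling until the service reports status 9 or a failure. A bounded variant might look like the sketch below, which stops after a fixed number of attempts and also stops when the thread is interrupted. The class name and the IntSupplier standing in for lc.lfasrGetProgress are hypothetical; only the status value 9 is taken from the demo.

```java
import java.util.function.IntSupplier;

public class BoundedPoller {
    /** Status code treated as "finished"; 9 matches the check in the demo. */
    public static final int DONE = 9;

    /**
     * Polls the status source until it reports DONE or maxAttempts is used up.
     * Returns true on completion, false on timeout or interruption.
     */
    public static boolean waitUntilDone(IntSupplier status, int maxAttempts, long intervalMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (status.getAsInt() == DONE) {
                return true;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag so callers can see it, then give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Fake status source: reports "in progress" (2) twice, then DONE.
        IntSupplier fake = () -> ++calls[0] < 3 ? 2 : DONE;
        System.out.println(waitUntilDone(fake, 10, 10L));   // true
        System.out.println(waitUntilDone(() -> 2, 3, 10L)); // false
    }
}
```

With a 20 s interval, maxAttempts directly sets the longest time the client will wait for a transcription.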

Fetch the final result:

        // Fetch the task result
        try {
            Message resultMsg = lc.lfasrGetResult(task_id);
            // A status of 0 means the result was fetched successfully
            if (resultMsg.getOk() == 0) {
                // Print the transcription result
                System.out.println(resultMsg.getData());
                System.out.println(Test.getFinalResult(resultMsg.getData()));
            } else {
                // Failed to fetch the task result
                System.out.println("ecode=" + resultMsg.getErr_no());
                System.out.println("failed=" + resultMsg.getFailed());
            }
        } catch (LfasrException e) {
            // Failed to fetch the result: parse the error description from the exception
            Message resultMsg = JSON.parseObject(e.getMessage(), Message.class);
            System.out.println("ecode=" + resultMsg.getErr_no());
            System.out.println("failed=" + resultMsg.getFailed());
        }

resultMsg.getData() returns a JSON array with multiple elements. The "onebest" field of each element is extracted and concatenated to form the final output text.

String str = "[{\"bg\":\"0\",\"ed\":\"2180\",\"onebest\":\"科大讯飞是中国最大!\",\"si\":\"0\",\"speaker\":\"0\","
				+ "\"wordsResultList\":[{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"6\",\"wordEd\":\"114\",\"wordsName\":"
				+ "\"科大讯飞\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"118\",\"wordEd\":\"147\",\"wordsName\""
				+ ":\"是\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"148\",\"wordEd\":\"193\",\"wordsName\":\"中国\","
						+ "\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"194\",\"wordEd\":\"213\",\"wordsName\":\"最\","
						+ "\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"214\",\"wordEd\":\"218\",\"wordsName\":\"大\","
						+ "\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"218\",\"wordEd\":\"218\",\"wordsName\":\"!\","
						+ "\"wp\":\"p\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"218\",\"wordEd\":\"218\",\"wordsName\":\"\","
						+ "\"wp\":\"g\"}]},{\"bg\":\"2190\",\"ed\":\"3080\",\"onebest\":\"的智能。\",\"si\":\"1\",\"speaker\":\"0\","
						+ "\"wordsResultList\":[{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"15\",\"wordEd\":\"42\","
						+ "\"wordsName\":\"的\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"47\",\"wordEd\":\"89\","
						+ "\"wordsName\":\"智能\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"89\",\"wordEd\":\"89\","
						+ "\"wordsName\":\"。\",\"wp\":\"p\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"89\",\"wordEd\":\"89\","
						+ "\"wordsName\":\"\",\"wp\":\"g\"}]},{\"bg\":\"3090\",\"ed\":\"4950\",\"onebest\":\"语音技术提供商,\",\"si\":\"2\","
						+ "\"speaker\":\"0\",\"wordsResultList\":[{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"4\",\"wordEd\":\"46\","
						+ "\"wordsName\":\"语音\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"47\",\"wordEd\":\"92\","
						+ "\"wordsName\":\"技术\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"1.0000\",\"wordBg\":\"93\",\"wordEd\":\"164\","
						+ "\"wordsName\":\"提供商\",\"wp\":\"n\"},{\"alternativeList\":[],\"wc\":\"0.0000\",\"wordBg\":\"164\",\"wordEd\":\"164\","
						+ "\"wordsName\":\",\",\"wp\":\"p\"}]}]";
public static String getFinalResult(String data){
		// Concatenate the "onebest" field of every element in the result array
		JSONArray ja = JSONArray.parseArray(data);
		StringBuilder sb = new StringBuilder();
		for(int i=0; i<ja.size(); i++){
			sb.append(ja.getJSONObject(i).getString("onebest"));
		}
		return sb.toString();
	}

SWT Interface

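Using the SDK directly is a bit cumbersome, so I wanted to wrap it in a client built with SWT. Development is done in Eclipse; the first step is to set up the SWT environment.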

Reference: https://www.cnblogs.com/xinyan123/p/6225194.html

Download the SWT plugin (WindowBuilder): https://www.eclipse.org/windowbuilder/download.php

                                 

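Copy the features and plugins folders from the downloaded package into the corresponding folders of the Eclipse installation directory, then restart Eclipse.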

        

Create a new SWT project

                                     

Create a new ApplicationWindow

                                    

The UI can then be laid out with the graphical designer

                                         

Core SWT logic: the handler for the Start Conversion button

		// Start Conversion button
		Button startThansfer = new Button(container, SWT.NONE);
		startThansfer.addSelectionListener(new SelectionAdapter() {
			@Override
			public void widgetSelected(SelectionEvent e) {
				logDetailText.append(datePrefix + "Conversion started........" + "\n");
				startThansfer.setEnabled(false);
				voicePath = voicePathText.getText();
				textPath = textPathText.getText();									
				int status = 0;					
				// Note: call() is invoked directly below, so the task runs synchronously on the UI thread
				Callable<Integer> f = new TransferThread(logDetailText, countDownLatch, datePrefix, voicePath, textPath);
				//Callable<Integer> f = new TransferThreadAsyc(parent, logDetailText, countDownLatch, datePrefix, voicePath, textPath);
				try {
					status = f.call();
				} catch (Exception e2) {
					// TODO Auto-generated catch block
					e2.printStackTrace();
				}												
				try {
					countDownLatch.await();
				} catch (InterruptedException e1) {
					// TODO Auto-generated catch block
					e1.printStackTrace();
				}				
				if(status == 1){
					logDetailText.append(datePrefix + "Conversion finished" + "\n");
				}else{
					logDetailText.append(datePrefix + "Conversion failed" + "\n");
				}
				startThansfer.setEnabled(true);
			}
		});

Worker class that performs the conversion:

package com.voice.text;

import java.io.FileOutputStream;
import java.util.HashMap;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;

import org.eclipse.swt.widgets.Text;

import com.alibaba.fastjson.JSON;
import com.iflytek.msp.cpdb.lfasr.client.LfasrClientImp;
import com.iflytek.msp.cpdb.lfasr.exception.LfasrException;
import com.iflytek.msp.cpdb.lfasr.model.LfasrType;
import com.iflytek.msp.cpdb.lfasr.model.Message;
import com.iflytek.msp.cpdb.lfasr.model.ProgressStatus;
import com.iflytek.voicecloud.lfasr.demo.Test;

public class TransferThread implements Callable<Integer> {
	
	private Text logDetailText;
	private CountDownLatch countDownLatch;
	private LfasrType type = LfasrType.LFASR_STANDARD_RECORDED_AUDIO;
	private int sleepSecond = 20;
	private String datePrefix;
	private String voicePath;
	private String textPath;
	
	public TransferThread(Text logDetailText, CountDownLatch countDownLatch, String datePrefix, String voicePath, String textPath) {
		this.logDetailText = logDetailText;
		this.countDownLatch = countDownLatch;
		this.datePrefix = datePrefix;
		this.voicePath = voicePath;
		this.textPath = textPath;
	}
	@Override
	public Integer call() throws Exception {
		// Initialize the LFASRClient instance
        LfasrClientImp lc = null;
        try {
            lc = LfasrClientImp.initLfasrClient();
        } catch (LfasrException e) {
            // Initialization failed: parse the error description from the exception
            Message initMsg = JSON.parseObject(e.getMessage(), Message.class);
            logDetailText.append(datePrefix + "ecode=" + initMsg.getErr_no() + "\n");
            logDetailText.append(datePrefix + "failed=" + initMsg.getFailed() + "\n");
            countDownLatch.countDown();
            return -1;
        }

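        // Task ID returned by the upload
        String task_id = "";
        HashMap<String, String> params = new HashMap<String, String>();
        params.put("has_participle", "true");
        // After the product merge, the standard edition can enable the telephone-edition feature
        //params.put("has_seperate", "true");
        try {
            // Upload the audio file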
            Message uploadMsg = lc.lfasrUpload(voicePath, type, params);

            // Check the return value
            int ok = uploadMsg.getOk();
            if (ok == 0) {
                // Task created successfully
                task_id = uploadMsg.getData();
                logDetailText.append(datePrefix + "Task created, task_id=" + task_id + "\n");
            } else {
                // Task creation failed: server-side error
                logDetailText.append(datePrefix + "ecode=" + uploadMsg.getErr_no() + "\n");
                logDetailText.append(datePrefix + "failed=" + uploadMsg.getFailed() + "\n");
                countDownLatch.countDown();
                return -1;
            }
        } catch (LfasrException e) {
            // Upload failed: parse the error description from the exception
            Message uploadMsg = JSON.parseObject(e.getMessage(), Message.class);
            logDetailText.append(datePrefix + "ecode=" + uploadMsg.getErr_no() + "\n");
            logDetailText.append(datePrefix + "failed=" + uploadMsg.getFailed() + "\n");
            countDownLatch.countDown();
            return -1;
        }

        // Poll until the audio has been processed
        while (true) {
            try {
                // Wait sleepSecond seconds (20 s) before querying task progress
                Thread.sleep(sleepSecond * 1000);
                logDetailText.append(datePrefix + "waiting ..." + "\n");
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            try {
                // Query processing progress
                Message progressMsg = lc.lfasrGetProgress(task_id);

                // A non-zero status means the task failed
                if (progressMsg.getOk() != 0) {
                    logDetailText.append(datePrefix + "task was fail. task_id:" + task_id + "\n");
                    logDetailText.append(datePrefix + "ecode=" + progressMsg.getErr_no() + "\n");
                    logDetailText.append(datePrefix + "failed=" + progressMsg.getFailed() + "\n");
                    countDownLatch.countDown();
                    return -1;
                } else {
                    ProgressStatus progressStatus = JSON.parseObject(progressMsg.getData(), ProgressStatus.class);
                    if (progressStatus.getStatus() == 9) {
                        // Processing finished
                        logDetailText.append(datePrefix + "task was completed. task_id:" + task_id + "\n");
                        break;
                    } else {
                        // Processing not finished yet
                        logDetailText.append(datePrefix + "task is incomplete. task_id:" + task_id + ", status:" + progressStatus.getDesc() + "\n");
                        continue;
                    }
                }
            } catch (LfasrException e) {
                // Failed to query progress: investigate using the returned error info, then query again
                Message progressMsg = JSON.parseObject(e.getMessage(), Message.class);
                logDetailText.append(datePrefix + "ecode=" + progressMsg.getErr_no() + "\n");
                logDetailText.append(datePrefix + "failed=" + progressMsg.getFailed() + "\n");
            }
        }

        // Fetch the task result
        try {
            Message resultMsg = lc.lfasrGetResult(task_id);
            // A status of 0 means the result was fetched successfully
            if (resultMsg.getOk() == 0) {
                // Write the transcription to a file and show it in the log
                String result = Test.getFinalResult(resultMsg.getData());
                String output = textPath + "\\" + System.currentTimeMillis() + ".txt";
                // try-with-resources closes the stream; the text is encoded explicitly as UTF-8
                try (FileOutputStream f = new FileOutputStream(output)) {
                    f.write(result.getBytes(java.nio.charset.StandardCharsets.UTF_8));
                }
                logDetailText.append(datePrefix + "Result saved to: " + output + "\n");
                logDetailText.append(datePrefix + "Final transcription: " + "\n");
                logDetailText.append(datePrefix + result + "\n");
            } else {
                // Failed to fetch the task result
                logDetailText.append(datePrefix + "ecode=" + resultMsg.getErr_no() + "\n");
                logDetailText.append(datePrefix + "failed=" + resultMsg.getFailed() + "\n");
                countDownLatch.countDown();
                return -1;
            }
        } catch (LfasrException e) {
            // Failed to fetch the result: parse the error description from the exception
            Message resultMsg = JSON.parseObject(e.getMessage(), Message.class);
            logDetailText.append(datePrefix + "ecode=" + resultMsg.getErr_no() + "\n");
            logDetailText.append(datePrefix + "failed=" + resultMsg.getFailed() + "\n");
            countDownLatch.countDown();
            return -1;
        }
		countDownLatch.countDown();
		return 1;
	}		
} 

Putting the code together, the final result looks like this:

                                         

Source code: https://github.com/ChenWenKaiVN/VoiceToText

Next Optimization Steps

1. The main thread appears to freeze during conversion. The interaction between the SWT UI thread and non-UI threads needs a closer look.

https://blog.csdn.net/dollyn/article/details/38582743/

2. Look into adding a progress bar that shows conversion progress.

3. The settings UI should be tied more closely to the SDK configuration file; many values are still hard-coded there.

4. Look into packaging an executable jar that bundles the JRE.

5. Study the technical principles behind speech recognition.

https://github.com/nl8590687/ASRT_SpeechRecognition
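For the first item, one common direction is to run the long task on a worker thread and hand only the final UI update back to the UI thread. The sketch below shows that pattern in plain Java so it can run without SWT; the class name and method are hypothetical, and the Executor parameter stands in for the UI thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class BackgroundTask {
    /**
     * Runs work on a worker thread, then passes the result (or the exception)
     * to a callback that is executed through uiExecutor.
     */
    public static <T> Future<?> run(ExecutorService worker, Executor uiExecutor,
                                    Callable<T> work, Consumer<T> onDone,
                                    Consumer<Exception> onError) {
        return worker.submit(() -> {
            try {
                T result = work.call();
                uiExecutor.execute(() -> onDone.accept(result));
            } catch (Exception e) {
                uiExecutor.execute(() -> onError.accept(e));
            }
        });
    }

    public static void main(String[] args) throws Exception {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        // Stand-in for the UI thread: runs the callback on the calling thread.
        Executor direct = Runnable::run;
        // Stand-in for the conversion task; returns 1 like a successful TransferThread.
        Callable<Integer> work = () -> 1;
        Consumer<Integer> onDone = status -> System.out.println("status=" + status);
        Consumer<Exception> onError = Throwable::printStackTrace;

        Future<?> pending = run(worker, direct, work, onDone, onError);
        pending.get(); // wait here only because this demo has no event loop
        worker.shutdown();
    }
}
```

In an SWT client, the uiExecutor could be display::asyncExec, since Display.asyncExec(Runnable) queues the runnable on the UI thread. The button handler would then return immediately instead of blocking on call() and the latch.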

Reposted from blog.csdn.net/u014106644/article/details/88824092